HtmlUnit is throwing Out Of Memory and maybe leaking memory

Developer (https://www.devze.com), 2023-04-08 14:35. Source: the web
I use Selenium with HtmlUnitDriver (in Java) with JavaScript enabled, and I get Out Of Memory errors. I just browse the same page, using only a single GET command. How can I overcome this?


I've had a similar issue. It ended up being an issue with auto-loading of frames... a feature that can't be disabled.

Take a look at this: Extremely simple code not working in HtmlUnit

It might be of help.

Update

Current version of HtmlUnit is 2.10. I started using HtmlUnit from version 2.8 and each new version ended up eating more memory. I got to a point in which fetching 5 pages with javascript enabled resulted in a process of 2GB.

There are many ways to improve this situation from a javascript point of view. However, when you can't modify the javascript (eg: if you are crawling a site) your hands are tied. Disabling javascript is, of course, the best way to go. However, this might result in fetched pages being different from the expected ones.

I did manage to overcome this situation, though. After many tests, I noticed that it might not be an issue with HtmlUnit (which I had blamed from the beginning). It seemed to be the JVM. Changing from Sun's JVM to OpenJDK did the trick: the process now requires only 200MB of memory instead of 2GB. I'm adding version information.

Sun's (Oracle) 32-bit JVM:

$java -version
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) Server VM (build 20.1-b02, mixed mode)

OpenJDK 32-bit JVM:

$java -version
java version "1.6.0_18"
OpenJDK Runtime Environment (IcedTea6 1.8.13) (6b18-1.8.13-0+squeeze2)
OpenJDK Server VM (build 14.0-b16, mixed mode)

Operating system:

$ uname -a
Linux vostro1015 2.6.32-5-686-bigmem #1 SMP Sun May 6 04:39:05 UTC 2012 i686 GNU/Linux
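
To compare JVMs on another machine, it can help to log heap usage after each page load. The following is only a minimal sketch using the standard `Runtime` API. The class name `HeapUsageProbe`, the loop count, and the simulated work are hypothetical placeholders; in a real test the placeholder would be the actual page fetch. Note that this reports Java heap usage only, which is just one part of the total process size mentioned above.

```java
class HeapUsageProbe {
    /** Returns the number of heap bytes currently in use by this JVM. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Formats a byte count as whole mebibytes, e.g. "3 MiB". */
    static String formatMiB(long bytes) {
        return (bytes / (1024L * 1024L)) + " MiB";
    }

    public static void main(String[] args) {
        for (int i = 1; i <= 5; i++) {
            // Placeholder for the real work (assumption): in an actual test
            // this would be the page fetch, e.g. a call to driver.get(url).
            byte[] simulatedWork = new byte[1024 * 1024];
            simulatedWork[0] = 1;

            // System.gc() is only a hint to the JVM, so readings are approximate.
            System.gc();
            System.out.println("Heap in use after iteration " + i + ": "
                    + formatMiB(usedHeapBytes()));
        }
    }
}
```

If the reported value keeps climbing across iterations that load the same page, that points to memory being retained between fetches rather than to a single expensive page.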

Please, share your experience with this.


Give more memory to the JVM by adding this to the java command line that starts the JVM in which Selenium is running:

-Xmx512m

This example gives the JVM a maximum heap size of 512 MB.

Where to set the option depends on how you run Selenium. With Maven, add it to the MAVEN_OPTS environment variable; with Eclipse, edit the run configuration for the test class; and so on.
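
Because the option can be set in several places, it is easy to end up with a JVM that never received it. One way to check, sketched here with the standard `Runtime` API, is to print the maximum heap from inside the same JVM that runs the tests. The class name `HeapLimitCheck` is a hypothetical placeholder, and the reported figure may differ slightly from the `-Xmx` value depending on the JVM.

```java
class HeapLimitCheck {
    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // maxMemory() reports the most heap this JVM will attempt to use,
        // which reflects the -Xmx setting when one was passed.
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap available to this JVM: " + toMiB(maxBytes) + " MiB");
    }
}
```

Running it as `java -Xmx512m HeapLimitCheck` should print a value close to 512 MiB; a much smaller or larger number suggests the option did not reach the JVM that actually runs Selenium.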


Related to HtmlUnit:

Do not forget to call webClient.closeAllWindows(). I always put it in a finally block around the code that uses the webClient. This ensures that all JavaScript is stopped and all resources are released.

Also useful are the following settings on the webClient:

    webClient.setJavaScriptTimeout(JAVASCRIPT_TIMOUT);
    webClient.setTimeout(WEB_TIMEOUT);
    webClient.setCssEnabled(false);  // for most pages you do not need css to be enabled
    webClient.setThrowExceptionOnScriptError(false); // I never want Exceptions because of javascript

JAVASCRIPT_TIMOUT should not be too high, because long-running JavaScript may be a cause of memory problems. For WEB_TIMEOUT, think about the maximum time you are willing to wait for a page.
