Do not want the Images to load and CSS to render on Firefox in Selenium WebDriver - Python

Developer https://www.devze.com 2023-03-29 20:11 Source: Internet
I am using Selenium 2 with the Python bindings to fetch some data from our partner's site, but on average the operation takes around 13 seconds.

I am looking for a way to disable images, CSS, Flash and so on.

I am using Firefox 3.6, together with pyvirtualdisplay to prevent the Firefox window from opening. Any other optimization that speeds up Firefox would also be helpful.

I have already tried the network.http.* options, but they do not help much.

I have also set permissions.default.image = 2.


I have figured out a way to prevent Firefox from loading CSS, images and Flash.

from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile

def disableImages(self):
    ## Get the Firefox profile object
    firefoxProfile = FirefoxProfile()
    ## Disable CSS
    firefoxProfile.set_preference('permissions.default.stylesheet', 2)
    ## Disable images
    firefoxProfile.set_preference('permissions.default.image', 2)
    ## Disable Flash (a boolean preference, so pass False rather than the string 'false')
    firefoxProfile.set_preference('dom.ipc.plugins.enabled.libflashplayer.so',
                                  False)
    ## Set the modified profile while creating the browser object
    self.browserHandle = webdriver.Firefox(firefoxProfile)

Thanks again @Simon and @ernie for your suggestions.
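
The preferences from this answer can also be kept as plain data, which makes the value types explicit (integers for the permissions.default.* entries, a real boolean for the Flash switch). This is only a sketch: build_prefs and apply_prefs are hypothetical helper names of mine, and apply_prefs assumes nothing more than an object with a set_preference(name, value) method, such as the FirefoxProfile used above.

```python
def build_prefs(css=True, images=True, flash=True):
    """Return the about:config entries from the answer above as a dict."""
    prefs = {}
    if css:
        # Value 2 is the one the answer uses to disable stylesheets.
        prefs["permissions.default.stylesheet"] = 2
    if images:
        # Value 2 is the one the answer uses to disable images.
        prefs["permissions.default.image"] = 2
    if flash:
        # Boolean preference: pass False, not the string 'false'.
        prefs["dom.ipc.plugins.enabled.libflashplayer.so"] = False
    return prefs


def apply_prefs(profile, prefs):
    """Copy every entry onto any object exposing set_preference()."""
    for name, value in prefs.items():
        profile.set_preference(name, value)
    return profile
```

With Selenium installed, apply_prefs(FirefoxProfile(), build_prefs()) should produce the same profile that disableImages builds. Whether a given Firefox version still honours each preference is a separate question.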


New Edit

It has been a long time since I wrote this, and the field of web automation (whether for testing or for crawling/scraping) has changed a lot. The major browsers now offer a --headless flag and even an interactive shell. There is no longer any need to change the good old DISPLAY variable on Linux.

Firefox has also changed, adopting components from the Servo engine, which is written in Rust. I tried the profile below with a contemporary version (specifically, 62.0). Some preferences worked, some did not. Keep that in mind.


I'm just extending kyrenia's answer to this question. However, disabling CSS might prevent jQuery from manipulating DOM elements. Use QuickJava together with the preferences below:

profile.set_preference("network.http.pipelining", True)
profile.set_preference("network.http.proxy.pipelining", True)
profile.set_preference("network.http.pipelining.maxrequests", 8)
profile.set_preference("content.notify.interval", 500000)
profile.set_preference("content.notify.ontimer", True)
profile.set_preference("content.switch.threshold", 250000)
profile.set_preference("browser.cache.memory.capacity", 65536) # Increase the cache capacity.
profile.set_preference("browser.startup.homepage", "about:blank")
profile.set_preference("reader.parse-on-load.enabled", False) # Disable reader, we won't need that.
profile.set_preference("browser.pocket.enabled", False) # Duck pocket too!
profile.set_preference("loop.enabled", False)
profile.set_preference("browser.chrome.toolbar_style", 1) # Text on Toolbar instead of icons
profile.set_preference("browser.display.show_image_placeholders", False) # Don't show thumbnails on not loaded images.
profile.set_preference("browser.display.use_document_colors", False) # Don't show document colors.
profile.set_preference("browser.display.use_document_fonts", 0) # Don't load document fonts.
profile.set_preference("browser.display.use_system_colors", True) # Use system colors.
profile.set_preference("browser.formfill.enable", False) # Autofill on forms disabled.
profile.set_preference("browser.helperApps.deleteTempFileOnExit", True) # Delete temporary files.
profile.set_preference("browser.shell.checkDefaultBrowser", False)
profile.set_preference("browser.startup.page", 0) # blank
profile.set_preference("browser.tabs.forceHide", True) # Disable tabs, We won't need that.
profile.set_preference("browser.urlbar.autoFill", False) # Disable autofill on URL bar.
profile.set_preference("browser.urlbar.autocomplete.enabled", False) # Disable autocomplete on URL bar.
profile.set_preference("browser.urlbar.showPopup", False) # Disable list of URLs when typing on URL bar.
profile.set_preference("browser.urlbar.showSearch", False) # Disable search bar.
profile.set_preference("extensions.checkCompatibility", False) # Addon update disabled
profile.set_preference("extensions.checkUpdateSecurity", False)
profile.set_preference("extensions.update.autoUpdateEnabled", False)
profile.set_preference("extensions.update.enabled", False)
profile.set_preference("general.startup.browser", False)
profile.set_preference("plugin.default_plugin_disabled", False)
profile.set_preference("permissions.default.image", 2) # Image load disabled again

What does it do? The comment lines explain each setting. I also found a couple of about:config entries that improve performance. For example, the code above does not load the document's fonts or colors, but it still loads CSS, so jQuery (or any other library) can manipulate DOM elements without raising an error. (To elaborate: the CSS is still downloaded, but the browser skips the declarations that define a specific font-family or color. So the browser downloads and loads the CSS, but uses system defaults for styling and renders the page faster.)

For more information, check out this article.


Edit (Tests)

I just ran a performance test. Do not take the results too seriously, since I ran it only once; it is just meant to give you an idea.

I ran the test on an old machine with a 2.2 GHz Intel Pentium processor, 3 GB of RAM and a 4 GB swap area, running Ubuntu 14.04 x64.

The test has three steps:

  • Driver Loading Performance: The seconds spent loading the driver in the webdriver module.
  • Page Loading Performance: The seconds spent loading the page. This depends on the internet speed, but it also includes the rendering process.
  • DOM Inspecting Performance: The seconds spent inspecting the DOM on the page.
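
For anyone who wants to repeat a measurement like this, the three steps can be timed with a small wrapper. This is a sketch and not the script that produced my numbers: timed and run_benchmark are hypothetical names, make_driver stands for any callable that returns a WebDriver, and the string "css selector" is the value of Selenium's By.CSS_SELECTOR constant.

```python
import time


def timed(label, func, *args, **kwargs):
    """Call func, print the elapsed wall-clock seconds, return (result, seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed}")
    return result, elapsed


def run_benchmark(make_driver, url, selector):
    """Time driver start-up, page load and one CSS-selector lookup."""
    driver, t_driver = timed("Driver Loading Performance", make_driver)
    try:
        _, t_page = timed("Page Loading Performance", driver.get, url)
        _, t_dom = timed("DOM Inspecting Performance",
                         driver.find_elements, "css selector", selector)
    finally:
        # Always close the browser, even if a step fails.
        driver.quit()
    return t_driver, t_page, t_dom
```

Running it several times and averaging would give more trustworthy figures than a single pass.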

I used this page as the test subject and inspected it with the CSS selector .xxy a. Then I ran each configuration one at a time.

Selenium, Firefox, No Profile

Driver Loading Performance: 13.124099016189575
Page Loading Performance: 3.2673521041870117
DOM Inspecting Performance: 67.82778096199036

Selenium, Firefox, Profile Above

Driver Loading Performance: 7.535895824432373
Page Loading Performance: 2.9704301357269287
DOM Inspecting Performance: 64.25136017799377
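
To put the two runs side by side, the relative saving per step can be worked out from the figures above. The snippet only does the arithmetic on those reported numbers; reduction_percent is a hypothetical helper name, and since each configuration was run once, the percentages are indicative at best.

```python
# Seconds reported above for each step.
no_profile = {
    "driver": 13.124099016189575,
    "page": 3.2673521041870117,
    "dom": 67.82778096199036,
}
with_profile = {
    "driver": 7.535895824432373,
    "page": 2.9704301357269287,
    "dom": 64.25136017799377,
}


def reduction_percent(before, after):
    """Percentage of time saved going from `before` to `after`."""
    return (before - after) / before * 100.0


for step in no_profile:
    saved = reduction_percent(no_profile[step], with_profile[step])
    print(f"{step}: {saved:.1f}% less time")
```

That works out to roughly 43% less time for driver loading, 9% for page loading and 5% for DOM inspection, so most of the gain is in start-up.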

Edit (About Headlessness)

I ran a test about a month ago, but I could not keep the results. However, I want to mention that the driver loading, page loading and DOM inspecting times drop below ten seconds when Firefox runs headless. That was really cool.
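
A headless start-up could look roughly like the sketch below. It assumes a reasonably recent Selenium whose Firefox Options class has add_argument and set_preference and whose webdriver.Firefox accepts options=, plus geckodriver on the PATH. configure_firefox and make_headless_firefox are hypothetical names of mine, and I have not re-measured anything with this code.

```python
def configure_firefox(options, prefs=None, headless=True):
    """Add the headless argument and any preferences to an Options-like object."""
    if headless:
        # Firefox's command-line switch for running without a window.
        options.add_argument("-headless")
    for name, value in (prefs or {}).items():
        options.set_preference(name, value)
    return options


def make_headless_firefox(prefs=None):
    """Start headless Firefox (Selenium is imported lazily, only when called)."""
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    return webdriver.Firefox(options=configure_firefox(Options(), prefs))
```

For example, make_headless_firefox({"permissions.default.image": 2}) would combine headless mode with the image preference, with the caveat that newer Firefox versions may ignore some preferences.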


Unfortunately, the option firefox_profile.set_preference('permissions.default.image', 2) no longer seems to disable images in the latest version of Firefox [for the reason, see Alecxe's answer to my question "Can't turn off images in Selenium / Firefox"].

The best solution I found was to use the Firefox extension QuickJava, which among other things can disable images: https://addons.mozilla.org/en-us/firefox/addon/quickjava/

My Python code:

 from selenium import webdriver
 firefox_profile = webdriver.FirefoxProfile()

 firefox_profile.add_extension(folder_xpi_file_saved_in + "\\quickjava-2.0.6-fx.xpi")
 firefox_profile.set_preference("thatoneguydotnet.QuickJava.curVersion", "2.0.6.1") ## Prevents loading the 'thank you for installing screen'
 firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.Images", 2)  ## Turns images off
 firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.AnimatedImage", 2)  ## Turns animated images off

 driver = webdriver.Firefox(firefox_profile)
 driver.get(web_address_desired)

Disabling CSS (and, I think, Flash) still works with Firefox preferences, but these and other features can also be switched off by adding the lines:

  firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.CSS", 2)  ## CSS
  firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.Cookies", 2)  ## Cookies
  firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.Flash", 2)  ## Flash
  firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.Java", 2)  ## Java
  firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.JavaScript", 2)  ## JavaScript
  firefox_profile.set_preference("thatoneguydotnet.QuickJava.startupStatus.Silverlight", 2) 


You can disable images/css using the Web Developer toolbar Addon.

https://addons.mozilla.org/en-US/firefox/addon/web-developer/

Go to CSS -> Disable and Images -> Disable.


For everyone still interested in using the original, straightforward approach suggested by Anupam:

Just install Firefox version 20.0.1 (https://ftp.mozilla.org/pub/firefox/releases/20.0.1/); it works perfectly fine.

Other versions may work as well (versions 32 and higher and versions 3.6.9 and lower do NOT work).


Tossing in my 2¢.

It is better to use JavaScript snippets to accomplish this.

driver.execute_script(
   'document.querySelectorAll("img").forEach(function(ev){ev.remove()});'
)

That will remove the img elements. If you do this right after you load the page, they will have little chance to download image data.

Here is a similar solution I found elsewhere on StackOverflow (I can't find it anymore):

driver.execute_script(
   "document.head.parentNode.removeChild(document.head)"
)
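
The two snippets can be wrapped in one helper so they run right after driver.get(). This is a sketch under a few assumptions: strip_page is a hypothetical name, and the stylesheet variant (removing only link[rel="stylesheet"] and style elements instead of the whole head) is my own, gentler substitute for the second snippet. Note that the browser may already have requested some images by the time the script runs, so this mainly cuts rendering work rather than guaranteeing no downloads.

```python
# Same effect as the first snippet: drop every <img> element.
REMOVE_IMAGES_JS = (
    'document.querySelectorAll("img")'
    '.forEach(function (el) { el.remove(); });'
)

# Assumed variant: drop stylesheets only, keeping the rest of <head>.
REMOVE_STYLES_JS = (
    'document.querySelectorAll(\'link[rel="stylesheet"], style\')'
    '.forEach(function (el) { el.remove(); });'
)


def strip_page(driver, images=True, styles=False):
    """Run the selected snippets through driver.execute_script()."""
    scripts = []
    if images:
        scripts.append(REMOVE_IMAGES_JS)
    if styles:
        scripts.append(REMOVE_STYLES_JS)
    for script in scripts:
        driver.execute_script(script)
    return scripts
```

Typical use would be driver.get(url) followed by strip_page(driver, styles=True).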