BeautifulSoup
Fast parsing of links out of a page in Python
I need to parse a large number of pages (say 1000) and replace the links with tinyurl links. [Details]
2023-03-10 05:40 Category: Q&A
How to extract text from a web page that requires logging in using Python and Beautiful Soup?
I have to retrieve some text from a website called morningstar.com. To access that data I have to log in. Once I log in and provide the URL of the web page, I get the HTML text of a normal user (not lo... [Details]
2023-03-09 20:27 Category: Q&A
Using Python and BeautifulSoup, how do I select the desired table in a div?
I would like to be able to select the table containing the "Accounts Payable" text, but I'm not getting anywhere with what I'm trying and I'm pretty much guessing using findall. Can someone show me... [Details]
2023-03-09 09:05 Category: Q&A
BeautifulSoup cannot concatenate str and NoneType objects
Hi, I'm running Python 2.7.1 and BeautifulSoup 3.2.0. If I try to load some XML feed using ifile = open(os.path.join(self.path,str(self.FEED_ID)+'.xml'), 'r')... [Details]
2023-03-08 11:57 Category: Q&A
BeautifulSoup Cannot Extract Metadata
I am trying to create a function which will extract meta keywords from a given URL and return them. However, no matter what URL I pass to it, it always fails. [Details]
2023-03-07 19:20 Category: Q&A
How to find the comment tag <!--...--> with BeautifulSoup?
I tried soup.find('!--') but it doesn't seem to work. Thanks in advance. Edit: Thanks for the tip on how to find all comments. I have a follow-up question. How do I specifically search out... [Details]
2023-03-07 06:05 Category: Q&A
Convert HTML to plain text and maintain structure/formatting, with Ruby
I'd like to convert HTML to plain text. I don't want to just strip the tags though; I'd like to intelligently retain as much formatting as possible. Inserting line breaks for <... [Details]
2023-03-07 02:27 Category: Q&A
Why can't Beautiful Soup display all <td> data in the tables?
I tried to scrape a Wikipedia page a week ago, but I could not figure out why Beautiful Soup only shows some strings from the table column and shows "none" for other table columns. [Details]
2023-03-06 11:35 Category: Q&A
BeautifulSoup - nextSibling
I'm trying to get the content "My home address" using the following, but got an AttributeError: address = soup.find(text="Address:")... [Details]
2023-03-06 07:58 Category: Q&A
Extract content within a tag with BeautifulSoup
I'd like to extract the content Hello world. Please note that there are multiple <table> and similar <td colspan="2"> elements on the page as well: [Details]
2023-03-06 07:37 Category: Q&A