I want to fetch specific rows in an HTML document.
The rows have the following attributes set: bgcolor and valign.
Here is a snippet of the HTML table:
<table>
   <tbody>
      <tr bgcolor="#f01234" valign="top">
        <!--- td's follow ... -->
      </tr>
      <tr bgcolor="#c01234" valign="top">
        <!--- td's follow ... -->
      </tr>
   </tbody>
</table>
I've had a very quick look at BeautifulSoup's documentation. It's not clear what parameters to pass to findAll() to match the rows I want.
Does anyone know what to pass to findAll() to match these rows?
Don't use regex to parse HTML. Use an HTML parser:
import lxml.html
doc = lxml.html.fromstring(your_html)
result = doc.xpath("//tr[(@bgcolor='#f01234' or @bgcolor='#c01234') "
    "and @valign='top']")
print(result)
That extracts every tr element in your HTML that matches. You can then work with the results further: change their text or attribute values, extract them, or search within them.
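If lxml is not installed, the same "use a parser, not a regex" idea can be sketched with the standard library's html.parser. This is only a rough illustration: the RowCollector class, the WANTED_COLOURS set and the sample rows are made up for this example, and it records the attributes of matching rows rather than returning element objects as lxml does.

```python
from html.parser import HTMLParser

# Hypothetical set of background colours we want to match.
WANTED_COLOURS = {"#f01234", "#c01234"}


class RowCollector(HTMLParser):
    """Collects the attributes of every <tr> that matches the filter."""

    def __init__(self):
        super().__init__()
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "tr":
            return
        attributes = dict(attrs)
        if (attributes.get("bgcolor") in WANTED_COLOURS
                and attributes.get("valign") == "top"):
            self.matches.append(attributes)


# Made-up sample: two rows that should match, two that should not.
sample_html = """
<table>
  <tbody>
    <tr bgcolor="#f01234" valign="top"><td>first</td></tr>
    <tr bgcolor="#c01234" valign="top"><td>second</td></tr>
    <tr bgcolor="#ffffff" valign="top"><td>other colour</td></tr>
    <tr bgcolor="#f01234" valign="middle"><td>other alignment</td></tr>
  </tbody>
</table>
"""

collector = RowCollector()
collector.feed(sample_html)
matched_colours = [row["bgcolor"] for row in collector.matches]
print(matched_colours)
```

For anything beyond a simple attribute check, lxml or BeautifulSoup will be far more convenient, since they give you a tree you can navigate and modify.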
Obligatory link:
RegEx match open tags except XHTML self-contained tags
Something like:
import re
soup.findAll('tr', attrs={'bgcolor': re.compile(r'#f01234|#c01234'), 'valign': 'top'})
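As a possible alternative to the regex, here is a minimal self-contained sketch assuming BeautifulSoup 4 (the bs4 package, where findAll is also spelled find_all). It passes a list of accepted values for bgcolor, which as far as I know matches a row if the attribute equals any item in the list. The sample rows and the WANTED_COLOURS name are invented for illustration.

```python
from bs4 import BeautifulSoup

# Made-up sample: two rows that should match, two that should not.
html = """
<table>
  <tbody>
    <tr bgcolor="#f01234" valign="top"><td>first</td></tr>
    <tr bgcolor="#c01234" valign="top"><td>second</td></tr>
    <tr bgcolor="#ffffff" valign="top"><td>other colour</td></tr>
    <tr bgcolor="#f01234" valign="middle"><td>other alignment</td></tr>
  </tbody>
</table>
"""

# "html.parser" is the built-in backend, so lxml is not required here.
soup = BeautifulSoup(html, "html.parser")

# Hypothetical list of accepted bgcolor values; a list means "any of these".
WANTED_COLOURS = ["#f01234", "#c01234"]

rows = soup.find_all("tr", attrs={"bgcolor": WANTED_COLOURS, "valign": "top"})
texts = [row.get_text(strip=True) for row in rows]
print(texts)
```

Unlike the unanchored regex, the list compares whole attribute values, so a value that merely contains one of the colour codes would not match.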