
How to use lxml to get a message from a website?

Developer (https://www.devze.com), 2022-12-10 15:41. Source: the web

The page at exam.com shows the weather:

Tokyo: 25°C

I want to use Django 1.1 and lxml to extract information from the website. I only want the value "25".

The HTML structure of exam.com is as follows:

<p id="resultWeather">
    <b>Weather</b>
    Tokyo:
    <b>25</b>°C
</p>

I'm a student doing a small project with my friends. Please explain in a way that is easy to understand. Thank you very much!


BeautifulSoup is better suited to HTML parsing than lxml.

Something like this can be helpful:

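import urllib  # Python 2 standard library
from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3

def get_weather():
    # Download the raw HTML of the page.
    data = urllib.urlopen('http://exam.com/').read()
    # Parse it into a searchable tree.
    soup = BeautifulSoup(data)
    # Find <p id="resultWeather">, then return the text of its last <b>.
    paragraph = soup.find('p', {'id': 'resultWeather'})
    return paragraph.findAll('b')[-1].string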

This gets the page contents with urllib, parses them with BeautifulSoup, finds the <p> with id="resultWeather", then takes the last <b> inside that <p> and returns its content.
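
Since the question asks about lxml, here is a minimal sketch of the same steps using lxml instead of BeautifulSoup. It assumes Python 3 and a current lxml release rather than the Python 2 setup implied by Django 1.1. The names SAMPLE_HTML and extract_temperature are made up for illustration, and the sample markup is copied from the question so the example runs without touching the network.

```python
from lxml import html

# Markup copied from the question; stands in for the downloaded page.
SAMPLE_HTML = """
<p id="resultWeather">
    <b>Weather</b>
    Tokyo:
    <b>25</b>°C
</p>
"""


def extract_temperature(page_source):
    """Return the text of the last <b> in <p id="resultWeather">, or None."""
    # Parse the markup as a full HTML document (html/body are added if missing).
    tree = html.document_fromstring(page_source)
    # XPath: all <b> elements that are direct children of the target paragraph.
    bold_tags = tree.xpath('//p[@id="resultWeather"]/b')
    if not bold_tags:
        return None
    # The temperature is in the last <b>; strip surrounding whitespace.
    return bold_tags[-1].text_content().strip()


if __name__ == "__main__":
    print(extract_temperature(SAMPLE_HTML))
```

For a live page, the string passed in would come from something like urllib.request.urlopen(url).read(). The function returns None when the paragraph or its <b> tags are missing, so the caller can handle a changed page layout instead of hitting an IndexError.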
