
Extract text from HTML using regex or another method

Developer https://www.devze.com 2023-03-29 06:21 Source: Internet

I am trying to extract the text "abcdef" from the following HTML using regex:

<a href="xyz.com" rel="bookmark" title="hello_world">abc def</a>

I am trying this pattern:

$pattern = "<a href=(.*?) rel='bookmark' title=(.*?)>(.*?)</a>"

It would be helpful if anyone could help me figure out the pattern. I am using PHP.

Thanks.


Use DOMDocument instead. Specifically, DOMDocument::loadHTML. Your life will be much easier.
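A minimal sketch of the DOMDocument approach, using the markup from the question. Filtering on rel="bookmark" is an assumption about which links you care about; drop that check if you want every link's text.

```php
<?php
// The anchor markup from the question.
$html = '<a href="xyz.com" rel="bookmark" title="hello_world">abc def</a>';

$doc = new DOMDocument();

// Collect libxml parse warnings internally instead of emitting them,
// since real-world HTML is rarely perfectly formed.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$texts = [];
foreach ($doc->getElementsByTagName('a') as $link) {
    // Assumption: only links marked rel="bookmark" are of interest.
    if ($link->getAttribute('rel') === 'bookmark') {
        $texts[] = $link->textContent;
    }
}

print_r($texts);
```

Note that the link text in the sample is "abc def" with a space. If you really need "abcdef", you could strip the whitespace afterwards, for example with str_replace(' ', '', $text).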

You could use a pattern like the following, but I really don't recommend using regexes to manipulate HTML:

/<a\s+href\s*=\s*"([^"]+)"\s+rel\s*=\s*"([^"]+)"\s+title\s*=\s*"([^"]+)"\s*>([^<]+)<\/a>/

I also noticed that in your regular expression you have rel='bookmark' whereas the original string has rel="bookmark". This is probably why your original regex is not working.
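If you still want to try the regex route, here is a rough sketch of how the pattern above could be used with preg_match. The variable names are just for illustration. The second pattern is the one from the question with only delimiters added (preg_match needs them), to show the effect of the quote mismatch.

```php
<?php
$html = '<a href="xyz.com" rel="bookmark" title="hello_world">abc def</a>';

// The pattern from this answer. It expects double-quoted attributes
// in exactly this order: href, rel, title.
$pattern = '/<a\s+href\s*=\s*"([^"]+)"\s+rel\s*=\s*"([^"]+)"\s+title\s*=\s*"([^"]+)"\s*>([^<]+)<\/a>/';

$text = null;
if (preg_match($pattern, $html, $matches)) {
    // $matches[1] = href, [2] = rel, [3] = title, [4] = link text
    $text = $matches[4];
    echo $text, PHP_EOL;
}

// The pattern from the question, with / delimiters added.
// It looks for rel='bookmark' (single quotes), but the HTML uses
// double quotes, so preg_match finds nothing and returns 0.
$original = "/<a href=(.*?) rel='bookmark' title=(.*?)>(.*?)<\/a>/";
$found = preg_match($original, $html);
var_dump($found);
```

Keep in mind this only works while the attributes stay in the same order and use double quotes, which is part of why DOMDocument is the safer choice.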
