Convert text link to HTML with context considered

https://www.devze.com 2023-03-29 04:42 Source: web
I want to convert links such as http://google.com/ to HTML, however if they're already in an HTML link, either in the href="" or in the text for the link, I don't want to convert them.

I found this in another question:

preg_replace('@(https?:\/\/([-\w\.]+[-\w])+(:\d+)?(/([\w/_\.#-]*(\?\S+)?[^\.\s])?)?)@', '<a href="$1" target="_blank">$1</a>', $text);

However if I have something such as:

<a href="http://google.com/">http://google.com/</a>

already in the target text in question, it will create two links within that HTML. I can't seem to figure out a pattern that detects whether the URL sits before a closing </a> or inside a quoted attribute value (" ").


Do not use regular expressions for (X)HTML parsing. Use DOM instead! The XPath //text()[not(ancestor::a) and contains(., 'http://')][1] should find the first text node containing at least one HTTP URL that is not itself contained in an anchor tag. You may naively replace the text node with a text node containing preceding text, an anchor element node containing href attribute and href text node, and a text node containing remaining text. Do that until you find no more text nodes matching the XPath.
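
The DOM approach described above could be sketched roughly as follows. This is only a minimal illustration, not a tested production routine: the function name linkify_html, the wrapper id linkify-root, and the simplified URL pattern are all assumptions made for this example, and instead of repeatedly querying for the first match it takes a snapshot of every matching text node and processes them in one pass.

```php
<?php
// Hypothetical helper (name is an assumption): wrap bare URLs in <a> tags,
// leaving any text that already lives inside an <a> element untouched.
function linkify_html(string $html): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // silence warnings on sloppy markup
    // The XML prolog is a common trick to make libxml treat the input as UTF-8.
    // The wrapper div lets us pull the fragment back out afterwards.
    $doc->loadHTML('<?xml encoding="UTF-8"><div id="linkify-root">' . $html . '</div>');
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Snapshot the candidate text nodes first, because we modify the tree below.
    $nodes = iterator_to_array(
        $xpath->query('//text()[not(ancestor::a) and contains(., "http")]')
    );

    // Simplified URL pattern (an assumption, not the pattern from the question):
    // stops at whitespace, quotes and angle brackets, and drops trailing punctuation.
    $urlPattern = '@(https?://[^\s<>"]*[^\s<>".,;:!?)])@';

    foreach ($nodes as $node) {
        // Odd indexes of $parts hold URLs, even indexes hold the text in between.
        $parts = preg_split($urlPattern, $node->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if (count($parts) < 2) {
            continue; // "http" appeared, but nothing that looks like a URL
        }
        $fragment = $doc->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($part === '') {
                continue;
            }
            if ($i % 2 === 1) {
                $a = $doc->createElement('a');
                $a->setAttribute('href', $part);
                $a->appendChild($doc->createTextNode($part));
                $fragment->appendChild($a);
            } else {
                $fragment->appendChild($doc->createTextNode($part));
            }
        }
        $node->parentNode->replaceChild($fragment, $node);
    }

    // Serialize only the children of the wrapper div.
    $root = $xpath->query('//div[@id="linkify-root"]')->item(0);
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$input = 'See http://google.com/ and <a href="http://google.com/">http://google.com/</a>.';
echo linkify_html($input), "\n";
```

Because the XPath excludes text nodes with an a ancestor, and attribute values are never text nodes, neither the href nor the existing link text is touched; only the bare URL after "See" gets wrapped.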


Based on mario's comment to my original post:

preg_replace('@(?<!href="|src="|">)(https?:\/\/([-\w\.]+[-\w])+(:\d+)?(/([\w/_\.#-]*(\?\S+)?[^\.\s])?)?)@', '<a href="$1">$1</a>', $text);

Works perfectly for replacing bbpress's unknown pasta salad.
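
As a quick check of the lookbehind version, here is a small sketch that runs it on text mixing a bare URL with an existing anchor. The wrapper function linkify_regex and the sample string are assumptions made for this example; the pattern itself is copied from the answer above.

```php
<?php
// Hypothetical wrapper (name is an assumption) around the pattern from the answer.
function linkify_regex(string $text): string
{
    // The negative lookbehind rejects URLs directly preceded by href=", src=" or ">.
    $pattern = '@(?<!href="|src="|">)(https?:\/\/([-\w\.]+[-\w])+(:\d+)?(/([\w/_\.#-]*(\?\S+)?[^\.\s])?)?)@';
    return preg_replace($pattern, '<a href="$1">$1</a>', $text);
}

$text = 'Visit http://google.com/ or <a href="http://google.com/">http://google.com/</a>';
echo linkify_regex($text), "\n";
```

Only the bare URL after "Visit" is wrapped; the existing anchor is left alone. Note that the lookbehind only inspects the characters immediately before the URL, so link text such as <a href="...">see http://google.com/</a>, or attributes written with single quotes, would still be matched. The DOM approach does not have that limitation.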
