How to remove all tags from a Wordpress post except a child tag using DOM

Developer https://www.devze.com 2023-04-02 10:45 Source: Internet
I'm trying to remove everything from the following string EXCEPT the object tag:

<p>If a post is marked video, and there is text BEFORE the video, the video player does not appear! We only see the actual text for the url…</p>
<p>&nbsp;</p>
<p><object width="584" height="463"><param value="http://www.youtube.com/v/Clp9AeBdgL0?version=3" name="movie"><param value="true" name="allowFullScreen"><param value="always" name="allowscriptaccess"><embed width="584" height="463" allowfullscreen="true" allowscriptaccess="always" type="application/x-shockwave-flash" src="http://www.youtube.com/v/Clp9AeBdgL0?version=3"></object></p>
<p>Of course, you might even have a paragraph AFTER the video. Could be lots and lots of meaningless text &ndash; we should definitely limit this. Lorem ipsum</p>

As you can see above, the third 'p' tag contains an 'object' tag. I want to get rid of everything except the 'object' tag and its contents. In other words, I'd like to traverse the DOM and remove everything except:

<object width="584" height="463"><param value="http://www.youtube.com/v/Clp9AeBdgL0?version=3" name="movie"><param value="true" name="allowFullScreen"><param value="always" name="allowscriptaccess"><embed width="584" height="463" allowfullscreen="true" allowscriptaccess="always" type="application/x-shockwave-flash" src="http://www.youtube.com/v/Clp9AeBdgL0?version=3"></object>

I was able to write a function that removed any particular tag (p, img, div, etc) and its contents from a string, by traversing the DOM, but I could NOT figure out how to preserve the contents of a child tag like in this case. Can anybody help?


Instead of traversing the DOM with an XML-parsed object (which is what it sounds like you're doing; sorry if I'm incorrect), I'd suggest just running a regular-expression search on your string.

PHP supports PCREs (Perl-compatible regular expressions) through its preg_* functions.

EDIT: It looks like '/<object .*<\/object>/' works; I tested it with the preg_match() function. Note that .* is "greedy": if you have multiple <object>s per page, a single match will run from the first <object to the last </object>, so use the non-greedy .*? in that case. Also, . does not match newlines by default, so add the s modifier if an <object> can span several lines. Lastly, this will not work with nested objects, although I don't expect you'll have them.
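
As a rough sketch of the non-greedy variant mentioned above, assuming the string may contain several <object> blocks: the helper name findObjectBlocks and the sample $html are made up for illustration, and preg_match_all() is swapped in for preg_match() so that every block is returned rather than only the first.

```php
<?php
// Hypothetical helper (not from the original answer): return every
// <object>...</object> block found in $html.
function findObjectBlocks(string $html): array
{
    // .*? is lazy, so each match stops at the nearest </object>.
    // "s" lets "." match newlines; "i" ignores the case of the tag name.
    $pattern = '/<object\b.*?<\/object>/is';
    preg_match_all($pattern, $html, $matches);
    return $matches[0];
}

// Made-up sample input with two separate objects.
$html = '<p>intro</p><object id="a"><param name="movie"></object>'
      . "<p>between</p><OBJECT id=\"b\">\n<embed src=\"x\"></OBJECT>";

foreach (findObjectBlocks($html) as $block) {
    echo $block, PHP_EOL;
}
```

Like the original pattern, this still cannot handle nested objects. For a string with a single <object> on one line, plain preg_match() with the original pattern is enough.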

So the whole snippet might be:

$pattern = '/<object .*<\/object>/';
$subject = ''; /* replace '' with your string containing the HTML */
$matches = array();

// preg_match() returns 1 on the first match and puts the matched text in $matches[0]
if (preg_match($pattern, $subject, $matches))
{
    echo $matches[0];
}
else
{
    echo "No match found.";
}
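
Since the question asks for a DOM-based approach, here is a minimal sketch of that alternative, assuming PHP's bundled DOM extension is available. The function name extractObjectTags is hypothetical, character-encoding handling is left out, and the serialized markup may differ slightly from the input (attribute quoting, for instance), so treat it as a starting point rather than a drop-in solution.

```php
<?php
// Hypothetical helper: collect the HTML of every <object> element,
// dropping all surrounding markup and text.
function extractObjectTags(string $html): array
{
    if (trim($html) === '') {
        return []; // loadHTML() rejects empty input
    }

    $doc = new DOMDocument();

    // Collect parser warnings (e.g. about legacy tags) instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $objects = [];
    foreach ($doc->getElementsByTagName('object') as $node) {
        // saveHTML($node) serializes just this element and its children.
        $objects[] = $doc->saveHTML($node);
    }
    return $objects;
}

// Shortened stand-in for the post content from the question.
$post = '<p>Text before the video.</p>'
      . '<p><object width="584" height="463">'
      . '<param value="true" name="allowFullScreen">'
      . '</object></p>'
      . '<p>Text after the video.</p>';

echo implode(PHP_EOL, extractObjectTags($post)), PHP_EOL;
```

Unlike the regex, this walks the parsed tree, so it does not depend on how the <object> markup is laid out across lines.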