PHP DomDocument removing an element scrambles HTML

Developer https://www.devze.com 2023-04-08 10:49 Source: Internet
I am having issues removing a node using PHP DomDocument.

I have some HTML like so:

<!DOCTYPE HTML "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head> 
<title>Test</title>
<script id="fr21" type="text/javascript" src="jquery.min.js"></script>
</head>
<body> 
</body>
</html>

I attempt to remove the script node like so:

$jquery_node = $doc->getElementById('fr21'); 

$head_node = $jquery_node->parentNode;

$head_node->removeChild($jquery_node); 

I then try to view the HTML by echo:

echo $doc->saveHTML().'<br><br>';

The HTML then becomes this:

<!DOCTYPE HTML>
<html>
<body><p>-//W3C//DTD HTML 4.0 Transitional//EN"&gt;</p> 
<body> 
</body>
</html>

What just happened? Why has the HTML been mangled? Am I not removing the node correctly?

The weird thing is that when I calculate the XPath for the jQuery node, it is shown as if it's attached to the body node rather than the head node:

/html[1]/body[1]/script[1]


If you look at the errors, you will see that it says:

Warning: DOMDocument::loadHTML(): DOCTYPE improperly terminated in Entity, line: 1

Change the DOCTYPE to read

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

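and it will work as expected. The original declaration lacks the PUBLIC keyword, so the parser ends the DOCTYPE early and treats the rest of that line as body text. That is why the stray paragraph appears in the output and why the script element ends up under body instead of head.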
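For reference, here is a minimal sketch of the whole flow with the corrected DOCTYPE. It assumes the markup is held in a string (the variable name $html is made up for this example) and uses libxml_use_internal_errors() so that parser diagnostics, such as the DOCTYPE warning above, can be printed instead of being raised as PHP warnings.

```php
<?php
// Hypothetical input: the question's markup with the corrected DOCTYPE.
$html = <<<'MARKUP'
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>Test</title>
<script id="fr21" type="text/javascript" src="jquery.min.js"></script>
</head>
<body>
</body>
</html>
MARKUP;

// Collect parser diagnostics rather than emitting them as PHP warnings.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

// Print whatever the parser complained about (may be nothing).
foreach (libxml_get_errors() as $error) {
    echo 'line ', $error->line, ': ', trim($error->message), PHP_EOL;
}
libxml_clear_errors();

$jquery_node = $doc->getElementById('fr21');
if ($jquery_node !== null) {
    // Show where the parser actually placed the node before removing it.
    echo $jquery_node->getNodePath(), PHP_EOL;
    $jquery_node->parentNode->removeChild($jquery_node);
}

echo $doc->saveHTML();
```

Printing the node path before removal is a quick way to confirm that the script element sits under head once the DOCTYPE parses cleanly.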


Try this:

// removeChild() must be called on the node's direct parent, not on the document
$script_0 = $doc->getElementsByTagName('script')->item(0);
$script_0->parentNode->removeChild($script_0);
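If more than one script element has to go, the same idea extends to a loop. The sketch below is only an illustration with made-up markup. It copies the node list with iterator_to_array() first, because getElementsByTagName() returns a live list that shrinks as nodes are removed.

```php
<?php
// Made-up markup for illustration: one script in head, one in body.
$doc = new DOMDocument();
$doc->loadHTML(
    '<html><head><title>Test</title><script src="a.js"></script></head>'
    . '<body><script src="b.js"></script><p>Hi</p></body></html>'
);

// Snapshot the live node list so removals do not disturb the iteration.
$scripts = iterator_to_array($doc->getElementsByTagName('script'));

foreach ($scripts as $script) {
    // Each node is removed through its own direct parent.
    $script->parentNode->removeChild($script);
}

echo $doc->saveHTML();
```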