libxml2 - remove child, but not grandchildren

Developer, https://www.devze.com, 2023-01-14 13:27. Source: the web
I'm using libxml2 to parse HTML. I want to remove certain formatting tags like <center>, while keeping their content (for example, a link).

This means I'll have to remove certain child nodes from my xmlNodeSet, but keep that node's children.

Right now, I have this code:

xmlNodePtr parentNode = nodes->nodeTab[i];

if (parentNode != NULL) {
    xmlNodePtr child = parentNode->children;
    xmlNodePtr parentNextSibling = parentNode->next;
    xmlNodePtr grandParent = NULL;

    while (child) {
        xmlUnlinkNode(child);
        if (parentNextSibling != NULL) {
            xmlAddPrevSibling(parentNextSibling, child);
        }
        else {
            if (grandParent == NULL)
                grandParent = parentNode->parent;
            xmlAddChild(grandParent, child);
        }

        child = child->next;
    }

    xmlUnlinkNode(parentNode);
    xmlFree(parentNode);
}

The code does add the child to the document, but it also deletes the node I was adding it as a sibling to. What am I doing wrong?


You're not saving the child->next pointer before you cut the node out of the tree. As soon as you unlink a node, it is no longer part of the tree, so child->next becomes NULL. Then, after you reinsert it into the tree (before parentNode->next), its child->next pointer points to what was previously parentNode->next, so the next time through the loop you unlink parentNode->next itself. Things can only go downhill from there. :-)
