How to match a block of <li></li> using regexp

Developer https://www.devze.com 2023-01-04 19:13 Source: Internet
How can I match a block of

<li>item 1</li>
<li>item 2</li>

regardless of whether there is a blank line before or after the block, and enclose it in <ul></ul> tags using PHP's preg_* functions?

Thank you for your answers.


If this is safe, controlled input, and you just have LIs with missing parent ULs, you can do:

$output = preg_replace('#\s*(?:<li>.*</li>\s*)+#', '<ul>$0</ul>', $input);

(You may want to add some \n to the replacement string before or after the UL.)
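
To make the note about newlines concrete, here is a minimal sketch of one way to do it. The function name wrapListItems and the sample string are made up for illustration, and the pattern is a variation on the one above (it captures the items in a group so that the surrounding blank lines are not copied into the result), not the only way to write it.

```php
<?php
// Hypothetical helper name; the pattern is a variation on the one above.
function wrapListItems(string $input): string
{
    // Group 1 captures the run of <li> lines without the whitespace
    // around it, so blank lines before or after the block are dropped.
    $pattern = '#\s*(<li>.*</li>(?:\s*<li>.*</li>)*)\s*#';

    // Put the <ul> tags on their own lines around the captured items.
    return preg_replace($pattern, "\n<ul>\n" . '$1' . "\n</ul>\n", $input);
}

// Made-up sample: blank lines on both sides of the list items.
echo wrapListItems("Intro text\n\n<li>item 1</li>\n<li>item 2</li>\n\nOutro text");
```

With this sample, the two items should end up between a <ul> line and a </ul> line, with the blank lines on either side collapsed to single newlines. The same limitations listed below still apply.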

NOTE: This will fail if:

  • There are any existing UL/OL lists in the content.
  • There is anything other than whitespace between consecutive list items.
  • Any of the LIs span multiple lines (the . excludes newline by default).
  • There are any attributes on the LIs.
  • Possibly some things I haven't considered.

Some of these can be catered for relatively easily, but I'm not going to. If you don't have known, specific content, you should be using a real HTML parser instead.
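
For what it's worth, two of the cases in the list (attributes on the LIs, and LIs that span multiple lines) could be handled with something like the sketch below. The function name wrapListItemsLoose is hypothetical and the pattern is only one possible adjustment; it still assumes trusted input with no existing UL/OL lists and no nested lists.

```php
<?php
// Hypothetical helper name. Differences from the pattern above:
//   <li\b[^>]*>  allows attributes on the opening tag
//   .*?          is lazy, so each item stops at the first </li>
//   s modifier   lets . match newlines, so an item may span lines
//   i modifier   also accepts upper-case <LI> tags
function wrapListItemsLoose(string $input): string
{
    return preg_replace('#\s*(?:<li\b[^>]*>.*?</li>\s*)+#is', '<ul>$0</ul>', $input);
}

// Made-up sample: one item has an attribute and spans two lines.
echo wrapListItemsLoose("<li class=\"first\">item\n1</li>\n<li>item 2</li>");
```

The \b after <li keeps the pattern from matching other tags that merely start with those letters, such as <link>.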

The 'Regular' in Regular Expressions has a specific meaning, and full HTML is not a Regular language, so trying to handle all the intricacies of HTML with simple regex is liable to fail.
If you use a bad regex on user-supplied HTML, you may be introducing HTML injection vulnerabilities into your code.
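
As a rough illustration of the parser route, the sketch below uses PHP's bundled DOM extension (DOMDocument) to wrap runs of parentless LIs. The function name wrapOrphanListItems, the wrapper <div>, and the sample markup are all assumptions made for this example; treat it as a starting point rather than a finished implementation.

```php
<?php
// Hypothetical helper: wraps each run of parentless <li> elements in a <ul>
// using PHP's bundled DOM extension rather than a regex.
function wrapOrphanListItems(string $html): string
{
    $doc = new DOMDocument();

    // Collect libxml's parse warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // The wrapper <div> is an assumption: it gives one known container to walk.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The wrapper is the first <div> in document order.
    $container = $doc->getElementsByTagName('div')->item(0);

    $ul = null;
    // Copy the live node list first, because the loop moves nodes around.
    foreach (iterator_to_array($container->childNodes) as $node) {
        $isItem  = $node instanceof DOMElement && strtolower($node->nodeName) === 'li';
        $isBlank = $node instanceof DOMText && trim($node->nodeValue) === '';

        if ($isItem) {
            if ($ul === null) {
                // First item of a run: open a new <ul> in its place.
                $ul = $doc->createElement('ul');
                $container->insertBefore($ul, $node);
            }
            $ul->appendChild($node); // appendChild moves the existing node
        } elseif (!$isBlank) {
            // Any non-whitespace content ends the current run.
            $ul = null;
        }
    }

    // Serialize the container's children, leaving out the wrapper itself.
    $out = '';
    foreach ($container->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Made-up sample: two parentless items between two paragraphs.
echo wrapOrphanListItems("<p>Intro</p>\n<li>item 1</li>\n<li>item 2</li>\n<p>Outro</p>");
```

Because only direct children of the wrapper are inspected, LIs that already sit inside a UL or OL are left alone. The exact whitespace in the serialized output depends on libxml, and non-ASCII input would need its encoding declared before loading.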


As David already stated, PHP has XML and HTML parsers. However, if you really want to use a regex, it would probably be something like:

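// Lazy quantifier: capture the contents of the first <li>...</li>
preg_match('#<li>(.*?)</li>#', $string, $matches);
// Same thing, using the U (ungreedy) modifier instead
preg_match('#<li>(.*)</li>#U', $string, $matches);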
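
Note that preg_match stops at the first match. If the goal is to get at every item, preg_match_all with the same lazy pattern is probably closer to what is wanted. A small sketch, where the function name listItemContents and the sample string are made up for illustration:

```php
<?php
// Hypothetical helper: returns the text inside every <li>...</li> pair.
function listItemContents(string $html): array
{
    // With the default flags, $matches[1] holds the first capture group
    // of every match, in order of appearance.
    preg_match_all('#<li>(.*?)</li>#', $html, $matches);
    return $matches[1];
}

// Made-up sample input with two items.
print_r(listItemContents("<li>item 1</li>\n<li>item 2</li>"));
```

For the sample above this should return the two strings "item 1" and "item 2". It shares the limitations of the single-match version: no attributes on the LIs and no items that span several lines.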