PHP: substring of a regular expression match, and regular expression not always working

Developer https://www.devze.com 2023-04-01 08:03 Source: the web
I am trying to create an HTML parser similar to BBCode. For example, I want to extract items from HTML text in the following format: .....html..... [I]Item1[/I].....html....[I]Item2[/I]......

So I am using a regular expression to find each [I]XXXXX[/I] tag. I also want the regex to return only Item1, so that I can avoid str_replace. At the moment I use str_replace to replace [I] and [/I] with "" to get Item1. The problem is that the regular expression does not always work.

I am using the code below:

$pattern="/\[I]([^\[].)+\[\/I]/m";
preg_match_all($pattern,$string,$out,PREG_SET_ORDER);
foreach($out as $i)
{
    $temp=$i[0];
    echo "Found!";
    $i[0]=str_replace("[I]","",$i[0]);
    $i[0]=str_replace("[/I]","",$i[0]);
    ......
}

My regular expression means: start with [I], continue with any character except [ (to avoid nesting such as [I] [I] [/I] [/I]), and end with [/I]. Some strings fail, such as aaaaa, while others, like aaa aa, are found! Maybe there is a better way to create such an HTML parser?

Thank you!

Edit: OK, I found the solution, but I can't understand why my original pattern doesn't work! The solution was $pattern='#\[i\](.*?)\[/i\]#is', but what's the difference?
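To make the difference concrete, here is a minimal sketch of what the working pattern accepts. The sample string and the variable names ($string, $out, $items) are made up for illustration. The lazy (.*?) takes as few characters as possible before the closing tag, the i flag makes the tags case-insensitive, and the s flag lets the dot match newlines.

```php
<?php
// Hypothetical input: an upper-case tag, a lower-case tag spanning two lines,
// and content that itself contains "[" characters.
$string = "[I]Item1[/I] text [i]two\nlines[/i] text [I]a[0][/I]";

// "#" is used as the delimiter, so "/" needs no escaping.
// i = case-insensitive tags, s = "." also matches newlines.
$pattern = '#\[i\](.*?)\[/i\]#is';

preg_match_all($pattern, $string, $out);

// $out[0] holds the full matches, $out[1] holds only the captured contents.
$items = $out[1];
print_r($items);
```

Because the group is lazy rather than restricted to "not [", content such as a[0] is captured as well.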

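Edit 2: Raider was correct; the main problem was in ([^\[].)+. Each repetition of the group consumes exactly two characters (one that is not [, then any character), so the pattern accepts the language [I](a)^2n[/I]: it will match [I]aa[/I], but not [I]aaaaa[/I]!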
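The even-length behaviour can be checked with a short script. This is only a sketch: the pattern is the one from the question, while the sample strings and the $samples / $results names are chosen here for illustration.

```php
<?php
// The pattern from the question: every repetition of the group
// consumes two characters, "[^\[]" and then ".".
$pattern = '/\[I]([^\[].)+\[\/I]/m';

// Hypothetical samples with content of even and odd length.
$samples = ['[I]aa[/I]', '[I]aaaaa[/I]', '[I]aaa aa[/I]', '[I]Item1[/I]'];

$results = [];
foreach ($samples as $sample) {
    // preg_match returns 1 when the pattern matches and 0 when it does not.
    $results[$sample] = preg_match($pattern, $sample);
    printf("%-14s %s\n", $sample, $results[$sample] ? 'match' : 'no match');
}
```

Only the samples whose content has an even number of characters (aa, aaa aa) match; the five-character contents aaaaa and Item1 do not.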


I think your subpattern ([^\[].)+ is the problem. Try ([^\[]+)
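As a rough sketch of that suggestion, the following applies ([^\[]+) to the example string from the question. The variable names $match and $items are illustrative, not part of the original code.

```php
<?php
$string = '.....html..... [I]Item1[/I].....html....[I]Item2[/I]......';

// [^\[]+ takes one or more characters that are not "[",
// so the content may have any length, odd or even.
$pattern = '/\[I]([^\[]+)\[\/I]/';

preg_match_all($pattern, $string, $out, PREG_SET_ORDER);

$items = [];
foreach ($out as $match) {
    // Index 1 is the captured group, i.e. the text between the tags.
    $items[] = $match[1];
}
print_r($items);
```

One trade-off of this form: content that itself contains a [ character will not match, which is what the question intended in order to rule out nested tags.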


Try to use something like this:

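$parsed_str = '[I]Item1[/I].....html....[I]Item2[/I].....';
// [^\[]+? matches a run of characters other than "["; a "." inside the class would also exclude literal dots
preg_match_all('~\[I\]([^\[]+?)\[\/I\]~i', $parsed_str, $result);
print_r($result[1]); // the captured contents: Item1, Item2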

For this input, the same result is given by:

preg_match_all('~\[I\]([^\[].+?)\[\/I\]~i', $parsed_str, $result);


Your problem is in the line

$temp=$i[0];

Index 0 contains the entire matched pattern. Instead you need to use index 1 - the first parenthesised part of the regexp:

$temp = $i[1];
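Put together, the loop from the question could look roughly like the sketch below, with no str_replace needed. It assumes the lazy pattern from the asker's edit (without the i flag); $whole and $found are names introduced here for illustration.

```php
<?php
$string = '.....html..... [I]Item1[/I].....html....[I]Item2[/I]......';
$pattern = '#\[I\](.*?)\[/I\]#s';

// With PREG_SET_ORDER, each element of $out is one complete match set.
preg_match_all($pattern, $string, $out, PREG_SET_ORDER);

$found = [];
foreach ($out as $i) {
    $whole = $i[0]; // the entire match, tags included, e.g. "[I]Item1[/I]"
    $temp  = $i[1]; // only the first parenthesised group, e.g. "Item1"
    $found[] = [$whole, $temp];
    echo "Found: $temp\n";
}
```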