I have been trying to capture code blocks in a similar fashion to wiki tags:
{{code:
code goes here
}}
Example code is shown below:
$strings = array('AbCd1zyZ9', 'foo!#$bar');
foreach ($strings as $testcase) {
if (ctype_alnum($testcase)) {
echo "The string $testcase consists of all letters or digits.\n";
} else {
echo "The string $testcase does not consist of all letters or digits.\n";
}
}
Essentially I want to capture anything between the {{..}}. There are multiple blocks like this embedded in an HTML page.
I would appreciate any help.
Well, to start off, regex is not a good way to solve this problem. The right approach is to write a parser that understands the language's syntax and can tease out the subtleties. Having said that, if you still want a quick-and-dirty regex-based approach that works in most cases but has a couple of acknowledged bugs (see the end of this answer), here you go:
You can use preg_match_all(). Here is a proof of concept:
$input = "
<html>
<head>
<title>{{code:echo 'Hello World';}}</title>
</head>
<body>
<h1>{{code:\$strings = array('AbCd1zyZ9', 'foo!#\$bar');
foreach (\$strings as \$testcase) {
if (ctype_alnum(\$testcase)) {
echo \"The string \$testcase consists of all letters or digits.\\n\";
} else {
echo \"The string \$testcase does not consist of all letters or digits.\\n\";
}
}
}}</h1>
</body>
</html>
";
$matches = array();
// [^\x00]*? lazily matches any character except NUL, newlines included
preg_match_all('/{{code:([^\x00]*?)}}/', $input, $matches);
print_r($matches[1]);
Outputs the following:
Array
(
[0] => echo 'Hello World';
[1] => $strings = array('AbCd1zyZ9', 'foo!#$bar');
foreach ($strings as $testcase) {
if (ctype_alnum($testcase)) {
echo "The string $testcase consists of all letters or digits.\n";
} else {
echo "The string $testcase does not consist of all letters or digits.\n";
}
}
)
Be careful. There are some edge-case bugs involving early termination when }} is encountered within a "code" block:
- If }} appears in a quoted string, the regex matches too early.
- If } is the last character of your "code" block and it's immediately followed by }}, you'll lose the closing } from your code block.
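Both edge cases are easy to reproduce. Here is a minimal sketch that assumes the same pattern as the proof of concept; the two sample inputs and the variable names $quoted and $brace are made up for illustration:

```php
<?php
// Same lazy pattern as in the proof of concept.
$pattern = '/{{code:([^\x00]*?)}}/';

// Case 1: "}}" inside a quoted string ends the match too early.
preg_match($pattern, "<div>{{code:\$myvar = '}}';}}</div>", $quoted);

// Case 2: the code ends in "}" directly before the closing "}}".
preg_match($pattern, '<p>{{code:if ($a) { b(); }}}</p>', $brace);

echo $quoted[1], "\n"; // $myvar = '
echo $brace[1], "\n";  // if ($a) { b();   (the closing brace is lost)
```

In both cases the lazy quantifier stops at the first }} it sees, which is exactly what makes the pattern simple and also what makes it fragile.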
As I've said in the comments, Asaph's answer is a good, solid regex, but it breaks down when }} is contained within the code block. Hopefully this won't be a problem, but since it is a possibility, it would be best to make your regex a little more expansive. If we can assume that any }} appearing between two single quotes does not signify the end of the code, as in Asaph's example of <div>{{code:$myvar = '}}';}}</div>, we can expand our regex a bit:
{{code:((?:[^']*?'[^']*?')*?[^']*?)}}
[^']*?' looks for a set of non-' characters, followed by a single quote, and [^']*?'[^']*?' looks for two of them in succession. This "swallows" strings like '}}'. We lazily look for any number of these strings, then the rest of any non-string code with [^']*?, and finally our ending }}.
This allows us to match the entire string {{code:$myvar = '}}';}} rather than just {{code:$myvar = '}}.
There are still problems with this method, however. Escaping a quote within a string, such as in {{code:$myvar = '\'}}\'';}}, will not work, as we will "swallow" '\' first and end with the }} immediately following. It may be possible to detect these escaped single quotes as well, or to add support for double-quoted strings, but you need to ask yourself at what point using a code parser is a better idea.
See the entire Regex in action here. (If it doesn't match anything at first, just click the window.)
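For anyone who wants to try this locally in PHP, here is a rough sketch of the behaviour described above. The variable names and the nowdoc inputs are my own; only the pattern comes from the answer:

```php
<?php
// Quote-aware pattern from the answer. A double-quoted PHP string is used
// so the single quotes inside the pattern need no escaping.
$pattern = "/{{code:((?:[^']*?'[^']*?')*?[^']*?)}}/";

// "}}" inside a single-quoted string no longer ends the block.
$simple = <<<'TXT'
<div>{{code:$myvar = '}}';}}</div>
TXT;
preg_match($pattern, $simple, $ok);

// An escaped quote still trips it up: '\' is swallowed as a complete string.
$escaped = <<<'TXT'
<div>{{code:$myvar = '\'}}\'';}}</div>
TXT;
preg_match($pattern, $escaped, $early);

echo $ok[1], "\n";    // $myvar = '}}';
echo $early[1], "\n"; // $myvar = '\'
```

The nowdoc syntax (<<<'TXT') is only there to keep the sample inputs literal, so no $ or backslash has to be escaped by hand.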
How can I use the result to, say, place it in a new <div>?
Use the replace function:
preg_replace($expression, "<div>$0</div>", $input)
$0 inserts the entire match, so the whole {{code:...}} block ends up inside a new <div> element. Alternatively, if you just want the actual source code, use $1, since the regex captures the source code in its own capture group.
Again, see the replacement here.
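As a small sketch of the difference between $0 and $1, assuming $expression holds the quote-aware pattern from this answer (the sample $input and the result variable names are invented for the example):

```php
<?php
$expression = "/{{code:((?:[^']*?'[^']*?')*?[^']*?)}}/";
$input = "<p>{{code:echo 'Hello World';}}</p>";

// $0 is the whole match, including the {{code: ... }} delimiters.
$whole = preg_replace($expression, '<div>$0</div>', $input);

// $1 is only the first capture group, i.e. the code itself.
$codeOnly = preg_replace($expression, '<div>$1</div>', $input);

echo $whole, "\n";    // <p><div>{{code:echo 'Hello World';}}</div></p>
echo $codeOnly, "\n"; // <p><div>echo 'Hello World';</div></p>
```

The replacement strings are single-quoted here purely out of habit, so PHP never tries to interpolate anything inside them.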
I went deeper down the rabbit hole...
{{code:((?:(?:[^']|\\')*?(?<!\\)'(?:[^']|\\')*?(?<!\\)')*?(?:[^']|\\')*?)}}
This won't break with escaped single-quotes, and correctly matches {{code:$myvar = '\'}}\'';}}.
Ta-da.
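A hedged sketch of how this could be dropped into PHP: the pattern is the one above, wrapped in / delimiters and stored in a nowdoc so the backslashes survive untouched. The sample inputs and variable names are assumptions of mine. The second check shows a limitation that I believe still remains, namely that double-quoted strings are not understood:

```php
<?php
// Nowdoc keeps the pattern literal, so the backslashes and single quotes
// reach PCRE exactly as written in the answer.
$pattern = <<<'RE'
/{{code:((?:(?:[^']|\\')*?(?<!\\)'(?:[^']|\\')*?(?<!\\)')*?(?:[^']|\\')*?)}}/
RE;

$input = <<<'TXT'
<div>{{code:$myvar = '\'}}\'';}}</div>
TXT;
preg_match($pattern, $input, $m);
echo $m[1], "\n"; // $myvar = '\'}}\'';

// Limitation: a lone apostrophe inside a double-quoted string leaves the
// single quotes unbalanced, so this block is not matched at all.
$count = preg_match($pattern, '<p>{{code:echo "it\'s";}}</p>', $unused);
var_dump($count); // int(0)
```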
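Use
preg_match_all('/{{(.*?)}}/s', $text, $match);
where $text is the text that might contain code.
This captures anything between {{ and }}: the lazy .*? stops at the first }}, and the s modifier lets . match newlines so multi-line blocks are captured too.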
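To see why the placement of the quantifier and the s modifier matter, here is a quick comparison of /{{(.)*}}/ with /{{(.*?)}}/s. The sample $text and the variable names are invented for illustration:

```php
<?php
$text = "a {{x}} b {{y}} c {{two\nlines}} d";

// (.)* is greedy and re-captures one character per repetition, and
// without the s modifier "." does not match a newline.
preg_match_all('/{{(.)*}}/', $text, $greedy);

// (.*?) captures the whole body lazily; s lets "." cross line breaks.
preg_match_all('/{{(.*?)}}/s', $text, $lazy);

print_r($greedy[0]); // one match: "{{x}} b {{y}}"
print_r($greedy[1]); // one capture: "y"
print_r($lazy[1]);   // three captures: "x", "y", "two\nlines"
```

The greedy form runs on to the last }} on the line and keeps only the final character, while the multi-line block is not matched at all.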