Is there a theoretical expression size limit for the "or" operator in Regex.Replace?

Developer https://www.devze.com 2023-04-07 15:50 Source: Internet
Is there a theoretical limit on the size of an expression that uses the "or" (alternation) operator in Regex.Replace, such as Regex.Replace("abc", "(a|c|d|e...continue say 500000 elements here)", "zzz")?

Is there any risk of a StackOverflowException in .NET's implementation?

Thanks


There is no theoretical limit, though each regular expression engine has its own implementation limits. In this case, since you are using .NET, the limit comes from the amount of memory available to the .NET runtime.

A regular expression with one million alternations works fine for me:

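using System.Linq;
using System.Text.RegularExpressions;

string input = "a<142>c";
// Build one million alternatives: <0>|<1>|...|<999999>
var options = Enumerable.Range(0, 1000000).Select(x => "<" + x + ">");
string pattern = string.Join("|", options);
string result = Regex.Replace(input, pattern, "zzz");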

Result:

azzzc

It's very slow, though. Increasing the number of options to 10 million gives me an OutOfMemoryException.

You would probably benefit from a different approach.
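
The answer does not say which other approach it has in mind. One possibility, offered only as a sketch, is to match a short general pattern and check each match against a hash set instead of listing every alternative in the pattern. The class name TokenReplacer, the method ReplaceKnownTokens, and the assumption that every token has the form <digits> are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class TokenReplacer
{
    // Hypothetical helper: assumes every token looks like "<digits>".
    // The regex only finds candidates; the set decides whether to replace.
    public static string ReplaceKnownTokens(string input, ISet<string> tokens, string replacement)
    {
        return Regex.Replace(
            input,
            "<[0-9]+>",
            m => tokens.Contains(m.Value) ? replacement : m.Value);
    }

    // Builds the same one million tokens as the alternation example: <0> .. <999999>.
    public static HashSet<string> BuildTokens(int count)
    {
        return new HashSet<string>(Enumerable.Range(0, count).Select(x => "<" + x + ">"));
    }
}
```

With tokens = TokenReplacer.BuildTokens(1000000), calling TokenReplacer.ReplaceKnownTokens("a<142>c", tokens, "zzz") returns azzzc, the same result as the alternation. The pattern stays tiny however many tokens there are, and each lookup is a hash probe. This only works when the alternatives share a shape that a short pattern can describe.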


The way regular expressions work means that the memory requirements and performance of a simple a|b|c.....|x|y|z expression like the one described are not too bad, even for a very large number of variants.

However, if your expression is even slightly more complex than that, its performance can degrade exponentially and its memory footprint can grow massively. A large number of "or" options like this can force the engine to do massive amounts of backtracking whenever other parts of the expression don't match immediately.

You may therefore want to exercise caution when doing this sort of thing. Even if it works now, it would only take a small and relatively innocent change to bring the whole thing to a grinding halt.
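
If you go ahead with a large pattern anyway, one way to limit the damage from runaway backtracking is the match timeout that .NET 4.5 and later accept on Regex.Replace. The following is a minimal sketch, not something the answers above propose. The class SafeRegex and the method TryReplace are hypothetical names, and the right timeout value depends on your workload.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SafeRegex
{
    // Hypothetical helper: returns false and leaves the input unchanged
    // if the regex engine exceeds the given timeout while matching.
    public static bool TryReplace(string input, string pattern, string replacement,
                                  TimeSpan timeout, out string result)
    {
        try
        {
            result = Regex.Replace(input, pattern, replacement, RegexOptions.None, timeout);
            return true;
        }
        catch (RegexMatchTimeoutException)
        {
            // Matching took too long, most likely due to heavy backtracking.
            result = input;
            return false;
        }
    }
}
```

A call such as SafeRegex.TryReplace("abc", "a|c", "z", TimeSpan.FromSeconds(1), out var r) returns true and sets r to zbz. The timeout does not make a pathological pattern faster. It only turns a hang into a failure you can handle.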
