开发者

Backtracking a balancing group in a greedy repetition may cause imbalance?

开发者 https://www.devze.com 2023-01-16 07:09 出处:网络
As a generically brewed example for the purpose of this question, my intent is to match some number of a\'s, then an equal number of b\'s, plus one more b.

As a generically brewed example for the purpose of this question, my intent is to match some number of a's, then an equal number of b's, plus one more b.

Examine the two patterns exhibited in this snippet (also on ideone.com):

var r1 = new Regex(@"(?xn)
   (?<A> a)+   (?<B-A> b)+    (?(A)(?!))   b
");
var r2 = new Regex(@"(?xn)
   (?<A> a)+   (?<B-A&g开发者_运维问答t; b)+?   (?(A)(?!))   b
");

Console.WriteLine(r1.Match("aaabbb"));
// aaabbb

Console.WriteLine(r2.Match("aaabbb"));
// aabbb

Note that there is a difference in the matches of the two patterns. r1, which uses a greedy repetition on the balancing group construct, matches 3 a's and 3 b's, which is NOT as intended. r2, which uses a reluctant repetition, gives me 2 a's and 3 b's, which IS as intended.

The only way I can explain this is that when (?<B-A> b)+ backtracks to match one less b, it pops from the B stack but DOES NOT push back what was correspondingly popped from the A stack. Thus, even though one less b is now matched due to backtracking, the A stack remains empty. This is the only way I can explain how r1 can match aaabbb.

Note that using reluctant +? in r2 doesn't cause this problem. The way I see it, this is because unlike greedy repetition, a reluctant repetition doesn't have to "undo the damage" to the A stack, so-to-speak. By contrast, the greedy repetition causes as much "damage" as possible, but the backtracking fails to "leave things as they were" to the A stack.

Is this a correct analysis of what happened? And if so, is this behavior by design? Because what it basically looks like to me is that backtracking a balancing group in a greedy repetition may cause imbalance, and thus this could potentially be categorized as a bug (or at the very least a somewhat astonishing behavior that is inadequately documented).


This is a bug in Mono.

The reason why people are getting .NET-like Environment.Version on IdeOne is Mono requirement of backward compatibility with .NET, including compatibility with applications that take decisions based on the framework version.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号