开发者

Resolving ambiguous grammar without resorting to GLR-Parser

开发者 https://www.devze.com 2023-01-28 21:22 出处:网络
I have a grammar that has two different possibilities when parsing \'if\' expr \'then\'. There\'s a simple \"assignment\", such as if foo then bar=1; else bar=0;then there\'s what I\'m calling an \"if

I have a grammar that has two different possibilities when parsing 'if' expr 'then'. There's a simple "assignment", such as if foo then bar=1; else bar=0; then there's what I'm calling an "if_block" of code which can contain one or more "assignments":

if foo then
{
    bar = 1;
    if xyz then abc = -1;
}
else
{
    bar = 0;
    if xyz then
   开发者_Python百科 {
       abc = 0;
    }
}

I'm handling nested if_blocks by way of the dangling else matched/unmatched block.

My (very simplified) grammar is basically:

program : if_blocks
if_blocks : if_block | if_block if_blocks
if_block : assignments
assignments : assignment | assignment assignments
assignment : simple_assignment | if_assignment

So my predicament is with an assignment followed by an if_block. For example:

foo = bar;
if foo then
{
   foo = foo + 1;
}

foo = bar; is an assignment that, in this case, should be reduced to an if_block. if foo then { ... } is itself an if_block. Thus this whole code is if_block+if_block (reduced to if_blocks). But after foo = bar; is reduced to an assignment, there's not enough lookahead to know whether if foo then is another assignment (within the foo = bar; if_block) or if it's a separate if_block.

I've added %glr-parser, which seems to resolve this, but I run into other situations where multiple branches of the parsing survive and I can't seem to reconcile the different S/R branches. What's the accepted practice for this kind of situation, absent switching to a completely different scanner/parser (which would be a lot of work for me to learn and rewrite code) or changing the language (which I can't do)? Is there an easy resolution (somehow defining with %dprec?) using GLR or a tweak to the grammar?


Classically the dangling-else problem is solved by insisting that an else is attached to the nearest if, which (conceptually) resolves the ambiguity. Somehow you need to communicate that idea to the parser generator to make the ambiguity really go away.

Most parser generators (including YACC and Bison) have some way of saying that when there is shift-vs-reduce conflict for a token, prefer the "shift", which can be used to acheives exactly this effect for the else keyword. I don't know what the idiom is for YACC or Bison, but it should be easy to find in the grammar description information.

(I use my own GLR parser, and it is still useful to say this, because it gets rid of the ambiguous parse in an easy way).

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号