Why does the Java compiler only report one kind of error at a time?

Developer https://www.devze.com 2023-04-08 09:18 Source: Internet
I have this snippet:

class T{
    int y;
    public static void main(String... s){
        int x;
        System.out.println(x);
        System.out.println(y);
    }
}

There are two errors here, so why does compilation show only one?

The error shown is:

non-static variable y cannot be referenced from a static context
    System.out.println(y);
                       ^

But what about this error?

variable x might not have been initialized
    System.out.println(x);
                       ^


The Java compiler compiles your code in several passes. In each pass, certain kinds of errors are detected. In your example, javac doesn't check whether x has been initialised until the rest of the code has passed the earlier compiler passes.


@Greg Hewgill has nailed it.

In particular, the checks for variables being initialized, exceptions being declared, unreachable code and a few other things occur in a later pass. This pass doesn't run if there were errors in earlier passes.

And there's a good reason for that. The earlier passes construct a decorated parse tree that represents the program as the compiler understands it. If there were errors earlier on, that tree will not be an accurate representation of the program as the developer understands it. (It can't be!) If the compiler then went on to run the later pass to produce more error messages, chances are that a lot of those messages would be misleading artifacts of the incorrect parse tree. This would only confuse the developer.

Anyway, that's the way that most compilers for most programming languages work. Fixing some compilation errors can cause other (previously unreported) errors to surface.
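If you want to watch this happen, here is a rough sketch that feeds the question's class to the compiler through the standard javax.tools API and collects the errors. The names PassDemo, StringSource, ORIGINAL and Y_MADE_STATIC are my own, made up for illustration, and it assumes you run it on a JDK (not a bare JRE).

```java
import javax.tools.Diagnostic;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class PassDemo {
    /** Hypothetical helper: a source file held in memory instead of on disk. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    // The class from the question, unchanged.
    static final String ORIGINAL =
          "class T {\n"
        + "    int y;\n"
        + "    public static void main(String... s) {\n"
        + "        int x;\n"
        + "        System.out.println(x);\n"
        + "        System.out.println(y);\n"
        + "    }\n"
        + "}\n";

    // Same class with the first reported error fixed (y is now static).
    static final String Y_MADE_STATIC = ORIGINAL.replace("int y;", "static int y;");

    /** Compiles the source in memory and returns the error messages javac reports. */
    static List<String> errors(String source) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system compiler found; run this on a JDK");
        }
        DiagnosticCollector<JavaFileObject> collector = new DiagnosticCollector<>();
        compiler.getTask(null, null, collector, null, null,
                List.of(new StringSource("T", source))).call();

        List<String> messages = new ArrayList<>();
        for (Diagnostic<? extends JavaFileObject> d : collector.getDiagnostics()) {
            if (d.getKind() == Diagnostic.Kind.ERROR) {
                messages.add(d.getMessage(Locale.ENGLISH));
            }
        }
        return messages;
    }

    public static void main(String[] args) {
        System.out.println("original:      " + errors(ORIGINAL));
        System.out.println("y made static: " + errors(Y_MADE_STATIC));
    }
}
```

With a stock javac I would expect each call to report exactly one error: the static-context complaint about y for the original, and the uninitialised x only once y has been fixed. Other compilers, or javac with non-default options, may behave differently.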


The Dragon Book ("Compilers: Principles, Techniques, and Tools" by Aho, Sethi, and Ullman) describes several methods that compiler writers can employ to try to improve error detection and reporting when given input files that don't conform to the language specification.

Some of the techniques they give:

  • Panic mode recovery: skip tokens in the input stream until you find a "synchronizing token": a statement or block terminator or separator such as ;, }, or end, or a keyword such as do, while, for, or function that can mark the intended start of a new code block. Hopefully the input from that point forward will make enough sense to parse and return useful error messages.

  • Phrase level recovery: guess at what a phrase might have meant: insert semicolons, change commas to semicolons, insert = assignment operators, etc., and hope the result makes enough sense to parse and return useful error messages.

  • Error productions: in addition to the "legitimate" productions that recognize allowed grammar constructs, include "error productions" that recognize input that violates the grammar but that the compiler authors know to be a very likely mistake. The error messages for these can be very good, but they can drastically grow the size of the parser just to handle mistakes in the input (which should be the exception, not the common case).

  • Global correction: try to make minimal modifications to the input string to try to bring it back to a program that can be parsed. Since the corrections can be made in many different ways (inserting, changing, and deleting any number of characters), try to generate them all and discover the "least cost" change that makes the program parse well enough to continue parsing.
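To make the first technique concrete, here is a minimal sketch of panic mode recovery. It is not taken from the Dragon Book: the toy grammar (statements of the form NAME = NUMBER ;, with tokens separated by whitespace), the class name PanicModeDemo and the error message format are all assumptions of mine. On the first mismatch in a statement it records one error, throws away tokens up to the next ; and then carries on parsing.

```java
import java.util.ArrayList;
import java.util.List;

public class PanicModeDemo {
    // Toy grammar: every statement is  NAME = NUMBER ;
    static final String[] PATTERNS = {"[A-Za-z_]\\w*", "=", "\\d+", ";"};
    static final String[] EXPECTED = {"a name", "'='", "a number", "';'"};

    /** Parses whitespace-separated tokens and returns every error found. */
    static List<String> parse(String input) {
        String trimmed = input.trim();
        String[] tokens = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        List<String> errors = new ArrayList<>();
        int pos = 0;

        while (pos < tokens.length) {
            boolean ok = true;
            // Try to match the four parts of one statement in order.
            for (int i = 0; i < PATTERNS.length && ok; i++) {
                if (pos < tokens.length && tokens[pos].matches(PATTERNS[i])) {
                    pos++;
                } else {
                    String found = pos < tokens.length
                            ? "'" + tokens[pos] + "'" : "end of input";
                    errors.add("token " + pos + ": expected " + EXPECTED[i]
                            + " but found " + found);
                    ok = false;
                }
            }
            if (!ok) {
                // Panic mode: discard tokens until the synchronizing token ';'.
                while (pos < tokens.length && !tokens[pos].equals(";")) {
                    pos++;
                }
                pos++; // step past the ';' so parsing resumes at the next statement
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        for (String error : parse("x = 1 ; y 2 ; z = ; w = 4 ;")) {
            System.out.println(error);
        }
    }
}
```

For that input the sketch reports two errors in one run (the missing = after y and the missing number after z =) and still accepts the final statement. The cost is visible too: if a ; is missing, the whole following statement is swallowed while the parser looks for the next one.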

These options are presented in the chapter on parsing grammars (page 161 in my edition). Obviously, some errors are only discovered after the input file has been parsed, while it is being converted into basic blocks for optimization and code generation. Any errors in the earlier syntactic levels will prevent the typical compiler from starting the optimization and code generation phases, so any errors that would be detected in those phases must wait for fixed input before they can be reported.

I strongly recommend finding a copy of The Dragon Book; it'll give you some sympathy and hopefully new-found respect for our friendly compiler authors.


The whole point of an error is that the compiler doesn't know how to interpret your code correctly. That's why you should always ignore errors after the first one, because the compiler's confusion may result in extra/missing errors.
