Tuesday, 18 September 2012

Puzzle 44: Cutting Class


Consider these two classes:
public class Strange1 {
    public static void main(String[] args) {
        try {
            Missing m = new Missing();
        } catch (java.lang.NoClassDefFoundError ex) {
            System.out.println("Got it!");
        }
    }
}



public class Strange2 {
    public static void main(String[] args) {
        Missing m;
        try {
            m = new Missing();
        } catch (java.lang.NoClassDefFoundError ex) {
            System.out.println("Got it!");
        }
    }
}


Both Strange1 and Strange2 use this class:
class Missing {
    Missing() { }
}


If you were to compile all three classes and then delete the file Missing.class before running Strange1 and Strange2, you'd find that the two programs behave differently. One throws an uncaught NoClassDefFoundError, whereas the other prints Got it! Which is which, and how can you explain the difference in behavior?

Solution 44: Cutting Class

The Strange1 program mentions the missing type only within its try block, so you might expect it to catch the NoClassDefFoundError and print Got it! The Strange2 program, on the other hand, declares a variable of the missing type outside the try block, so you might expect the NoClassDefFoundError generated there to be uncaught. If you tried running the programs, you saw exactly the opposite behavior: Strange1 throws an uncaught NoClassDefFoundError, and Strange2 prints Got it! What could explain this strange behavior?
If you look to the Java language specification to find out where the NoClassDefFoundError should be thrown, you don't get much guidance. It says that the error may be thrown "at any point in the program that (directly or indirectly) uses the type" [JLS 12.2.1]. When the VM invokes the main method of Strange1 or Strange2, the program is using class Missing indirectly, so either program would be within its rights to throw the error at this point.
The answer to the puzzle, then, is that either program may exhibit either behavior, depending on the implementation. But that doesn't explain why in practice these programs behave exactly opposite to what you would naturally expect, on all Java implementations we know of. To find out why this is so, we need to study the compiler-generated bytecode for these programs.
If you compare the bytecode for Strange1 and Strange2, you'll find them nearly identical. Aside from the class name, the only difference is the mapping of the catch parameter ex to a VM local variable. Although the details of which program variables are assigned to which VM variables can vary from compiler to compiler, they are unlikely to vary much for programs as simple as these. Here is the code for Strange1.main as displayed by javap -c Strange1:
 0: new            #2; // class Missing
 3: dup
 4: invokespecial  #3; // Method Missing."<init>":()V
 7: astore_1
 8: goto 20
11: astore_1
12: getstatic      #5; // Field System.out:Ljava/io/PrintStream;
15: ldc            #6; // String "Got it!"
17: invokevirtual  #7; // Method PrintStream.println:(String;)V
20: return
Exception table:
 from   to  target  type
    0    8      11  Class java/lang/NoClassDefFoundError

The corresponding code for Strange2.main differs in only one instruction:
11: astore_2


This is the instruction that stores the caught exception of the catch block into the catch parameter ex. In Strange1, this parameter is stored in VM variable 1; in Strange2, it is stored in VM variable 2. That is the only difference between these two classes, but what a difference it makes in their behavior!
To run a program, the VM loads and initializes the class containing its main method. In between loading and initialization, the VM must link the class [JLS 12.3]. The first phase of linking is verification. Verification ensures that a class is well formed and obeys the semantic requirements of the language. Verification is critical to maintaining the guarantees that distinguish a safe language like Java from an unsafe language like C or C++.
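The gap between loading and initialization can be observed from ordinary code. The following is a minimal sketch, not part of the original puzzle: the class names LoadVsInit and Lazy are hypothetical, and it assumes the three-argument form of Class.forName, whose boolean argument controls whether the class is initialized after it is loaded.

```java
// Hypothetical demo class; the names are illustrative assumptions.
final class LoadVsInit {
    // Set by Lazy's static initializer, which runs only at initialization.
    static boolean initialized = false;

    static class Lazy {
        static { LoadVsInit.initialized = true; }
    }

    // Returns { flag after loading only, flag after initialization }.
    static boolean[] probe() throws ClassNotFoundException {
        ClassLoader loader = LoadVsInit.class.getClassLoader();
        String name = LoadVsInit.class.getName() + "$Lazy";

        // Load the class but ask the VM not to initialize it.
        Class.forName(name, false, loader);
        boolean afterLoad = initialized;

        // Now request initialization; the static initializer runs here.
        Class.forName(name, true, loader);
        boolean afterInit = initialized;

        return new boolean[] { afterLoad, afterInit };
    }

    public static void main(String[] args) throws Exception {
        boolean[] r = probe();
        System.out.println("after loading:        " + r[0]); // false on a fresh VM
        System.out.println("after initialization: " + r[1]); // true
    }
}
```

Linking, including verification, happens somewhere between those two calls, which is why an error raised by the verifier can surface before any static initializer or main method has run.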
In classes Strange1 and Strange2, the local variable m happens to be stored in VM variable 1. Both versions of main also have a join point, where the flow of control from two different places converges. The join point is instruction 20, which is the instruction to return from main. Instruction 20 can be reached either by completing the try block normally, in which case we goto 20 at instruction 8, or by completing the catch block and falling through from instruction 17 to instruction 20.
The existence of the join point causes an exception during the verification of class Strange1 but not class Strange2. When it performs flow analysis [JLS 12.3.1] of Strange1.main, the verifier must merge the types contained in variable 1 when instruction 20 is reached by the two different paths. Two types are merged by computing their first common superclass [JVMS 4.9.2]. The first common superclass of two classes is the most specific superclass they share.
When instruction 20 of Strange1.main is reached from instruction 8, VM variable 1 contains an instance of the class Missing. When it is reached from instruction 17, variable 1 contains an instance of the class NoClassDefFoundError. To compute the first common superclass, the verifier must load the class Missing to determine its superclass. Because Missing.class has been deleted, the verifier can't load it and throws a NoClassDefFoundError. In Strange2.main, the catch parameter lives in VM variable 2, so variable 1 never holds two different class types at the join point and the verifier has no need to load Missing. Note that in Strange1 the error is thrown during verification, before class initialization and long before the main method begins execution. This explains why there is no stack trace printed for the uncaught exception.
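To make the merge step concrete, here is a rough sketch of a first-common-superclass computation written with reflection. It is an illustration only, not the verifier's actual code: the class and method names are hypothetical, and it handles ordinary classes only (for an interface or primitive argument it may return null). Like the verifier, it needs both classes to be loadable before it can walk their superclass chains.

```java
// Hypothetical helper; not how any real VM implements verification.
final class CommonSuperclass {
    // Walks up from a until it reaches a class that is b or one of b's superclasses.
    static Class<?> firstCommonSuperclass(Class<?> a, Class<?> b) {
        Class<?> c = a;
        while (c != null && !c.isAssignableFrom(b)) {
            c = c.getSuperclass();
        }
        return c;
    }

    public static void main(String[] args) {
        // Both are Throwables, but on different branches of the hierarchy.
        System.out.println(firstCommonSuperclass(
                NoClassDefFoundError.class, IllegalStateException.class).getName());
        // prints java.lang.Throwable

        // Unrelated classes share only Object.
        System.out.println(firstCommonSuperclass(
                NoClassDefFoundError.class, String.class).getName());
        // prints java.lang.Object
    }
}
```

In Strange1 the second argument would be Missing, and there is no Class object to start from once Missing.class has been deleted.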
To write a program that can detect when a class is missing, use reflection to refer to the class rather than the usual language constructs [EJ Item 35]. Here is how the program looks when rewritten to use this technique:
public class Strange {
    public static void main(String[] args) throws Exception {
        try {
            Object m = Class.forName("Missing").newInstance();
        } catch (ClassNotFoundException ex) {
            System.err.println("Got it!");
        }
    }
}
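If all you need is to know whether a class is available, the same reflective idea can be wrapped in a small helper. This is a sketch under assumptions rather than a recommended API: the names ClassProbe and isPresent are hypothetical, and it passes false to the three-argument Class.forName so that the probed class is loaded but not initialized.

```java
// Hypothetical helper illustrating a reflective presence check.
final class ClassProbe {
    // Returns true if the named class can be loaded by this class's loader.
    static boolean isPresent(String className) {
        try {
            // false: load the class without running its static initializers.
            Class.forName(className, false, ClassProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException ex) {
            // A checked exception, reported at a well-defined point in the program.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isPresent("java.lang.String"));    // prints true
        System.out.println(isPresent("no.such.pkg.Missing")); // prints false
    }
}
```

Because the class is named by a string, the calling class carries no reference to it that the verifier would have to resolve, so the failure is reported where the code asks for the class rather than at an unpredictable earlier point.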


In summary, do not depend on catching NoClassDefFoundError. The language specification carefully describes when class initialization occurs [JLS 12.4.1], but class loading is far less predictable. More generally, it is rarely appropriate to catch Error or its subclasses. These exceptions are reserved for failures from which recovery is not feasible.
