
May 08

The JVM – Java Virtual Machine


Theoretical Background:

 

The Java Virtual Machine is specific to each machine platform and is responsible for running the Java application. It provides several of Java's features, such as garbage collection, security and more.

 

JIT – Just In Time Compiler:

The Just In Time Compiler is part of the JVM and is responsible for transforming byte code into the machine code specific to the platform we run on.

During execution (and sometimes during startup – resulting in a slow startup but fast run time) the JVM analyzes the running code and decides which parts are the “hot spots” – the code segments that are executed most frequently (or it uses some other, more sophisticated heuristic to pick them). It then uses the JIT to compile those segments, so the next time they run they can be executed natively: the code is already compiled and the JVM no longer needs to interpret the byte code for those segments.

The JIT is more complex than what I described and performs optimizations such as caching compiled code and in-lining functions.

In general, the JIT improves JVM performance by dynamically compiling Java byte code into native machine code. The JVM then does not have to interpret that code; the compiled code runs natively, which is faster.
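As a small illustration (a sketch only – the class and the numbers are made up), the following program calls one small method many times. Run it with the standard HotSpot flag -XX:+PrintCompilation and the JVM prints a line for each method the JIT decides to compile; after enough iterations JitDemo::sum should appear among them:

    // JitDemo.java – run with: java -XX:+PrintCompilation JitDemo
    public class JitDemo {

        // A small, frequently called method – a typical "hot spot" candidate.
        static long sum(int n) {
            long total = 0;
            for (int i = 0; i < n; i++) {
                total += i;
            }
            return total;
        }

        public static void main(String[] args) {
            long result = 0;
            // Call the method many times so the JVM marks it as hot.
            for (int i = 0; i < 100_000; i++) {
                result += sum(1_000);
            }
            System.out.println(result);
        }
    }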

 

Java Class Loader:

The Java class loader is responsible for loading classes, including the classes contained in referenced JAR files.

When the JVM is started, three class loaders are used:

Bootstrap class loader – Loads the Java core libraries from the JRE\lib folder.

Extensions class loader – Loads the Java extension libraries from the JRE\lib\ext folder.

System class loader – Loads the application classes found on the classpath defined by the CLASSPATH environment variable.
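A quick way to see this hierarchy from code (a minimal sketch; the exact loader names printed vary between JDK versions) is to ask a few classes which loader loaded them and then walk the parent chain:

    // ClassLoaderDemo.java – shows which loader loaded a core class
    // and walks the parent chain of the system class loader.
    public class ClassLoaderDemo {
        public static void main(String[] args) {
            // Core classes such as java.lang.String come from the bootstrap
            // class loader, which is reported as null.
            System.out.println("String loaded by: " + String.class.getClassLoader());

            // Our own class is loaded by the system class loader.
            System.out.println("Demo loaded by: " + ClassLoaderDemo.class.getClassLoader());

            // Parent chain: system -> extensions -> bootstrap (null).
            ClassLoader loader = ClassLoader.getSystemClassLoader();
            while (loader != null) {
                System.out.println("Loader: " + loader);
                loader = loader.getParent();
            }
        }
    }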

 

External JARs that we add to the project are loaded dynamically at run time, when a call to one of their classes is first made.
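This lazy behavior is easy to observe with a static initializer (a minimal sketch; Helper is a hypothetical class standing in for a class that lives in an external JAR):

    // LazyLoadingDemo.java – a class is initialized only when it is
    // first actually used, not when the program starts.
    public class LazyLoadingDemo {

        // Stands in for a class packaged in a referenced JAR.
        static class Helper {
            static {
                System.out.println("Helper loaded and initialized");
            }
            static void doWork() {
                System.out.println("Helper.doWork() called");
            }
        }

        public static void main(String[] args) {
            System.out.println("Program started – Helper not touched yet");
            // Only this first use triggers the static initializer above.
            Helper.doWork();
        }
    }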

 

Jar:

JAR files are archive files built on the ZIP file format and are used to package many .class files together.
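For example, the standard java.util.jar.JarFile API can list the entries packed inside such an archive (a small sketch – myapp.jar is just a placeholder for any JAR you have locally):

    import java.io.IOException;
    import java.util.jar.JarEntry;
    import java.util.jar.JarFile;

    // JarListDemo.java – prints every entry (mostly .class files) in a JAR.
    public class JarListDemo {
        public static void main(String[] args) throws IOException {
            try (JarFile jar = new JarFile("myapp.jar")) {
                jar.stream()
                   .map(JarEntry::getName)
                   .forEach(System.out::println);
            }
        }
    }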

 

Compilation and Interpretation:

When we write software containing the line:

System.out.println("");

 

What actually happens is:

 

1. The file structure:

System is a Java framework class found in JRE\lib\rt.jar.

The System class is already compiled to byte code inside rt.jar, so the compiler does not have to compile it.

At compile time we compile only our own code – and our code is simply a call to a static member of the System class.

As we can see in the source code of System.java at: <JDK Path>\src.zip\java\lang\System.java

 

public final static PrintStream out = null;

out is a reference to an instance of another class: PrintStream.

And indeed, if we look at the PrintStream.java class at: <JDK Path>\src.zip\java\io\PrintStream.java we can see its implementation of the println function:

    /**
     * Prints a String and then terminate the line.  This method behaves as
     * though it invokes <code>{@link #print(String)}</code> and then
     * <code>{@link #println()}</code>.
     *
     * @param x  The <code>String</code> to be printed.
     */
    public void println(String x) {
        synchronized (this) {
            print(x);
            newLine();
        }
    }

 

2. Compilation and Runtime:

 

Compiling:

The Java compiler – javac – compiles our *.java files into *.class files containing byte code. The compiler does not need to compile the framework files as they are already compiled; the line System.out refers to the already compiled System.class found in rt.jar.
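The same step can also be driven from Java itself through the standard javax.tools API, which exposes javac programmatically (a minimal sketch; Hello.java is a hypothetical source file, and this only works on a JDK, since a plain JRE ships without a compiler):

    import javax.tools.JavaCompiler;
    import javax.tools.ToolProvider;

    // CompileDemo.java – invokes the system Java compiler (javac) to turn
    // a .java source file into a .class file containing byte code.
    public class CompileDemo {
        public static void main(String[] args) {
            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
            // run() returns 0 on success.
            int result = compiler.run(null, null, null, "Hello.java");
            System.out.println(result == 0 ? "Compiled Hello.class" : "Compilation failed");
        }
    }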

 

Running:

1. During the startup of the JVM, the JVM uses the Java class loader to load our own compiled *.class files. In this example it also uses the bootstrap class loader to load the required core library classes from the JRE\lib folder – in our case, the System class located in JRE\lib\rt.jar. (Just to clarify – it loads only the classes that are actually referenced, not ALL the classes in the core libraries.)

 

Note: if we reference an external JAR file, its classes are loaded by the Java class loader only at run time, and then either the JIT compiles them to machine code during execution or, if they are not “hot spot” code, the JVM simply runs their byte code using the interpreter – reducing performance compared to a case where the external JAR's code had already been compiled during the startup of the JVM.

 

2. Now the JVM can run our code. It does so by using the interpreter to interpret the byte code.

However, if the JVM decides that a certain code segment is a “hot spot” (usually because that segment runs frequently during the application's life), it uses the JIT compiler to compile our code, or the framework's code, into platform-specific machine code. The native code is then stored in a region called the “Code Cache”, which is non-heap memory.

The next time the JVM has to run this code segment, instead of interpreting the byte code it runs the platform-specific machine code already compiled by the JIT and stored in the “Code Cache”.

 

Question:

Every Java application uses the framework code, which means every JVM always needs to load framework classes. So during JVM startup the class loader has a lot of work to do, and every Java program would be slow to start.

Isn’t there some optimization that can be done?

 

Answer:

There is an optimization for this:

 

Class Data Sharing:

During the installation of the JRE/JDK, the installer loads a set of classes from the core libraries and converts them into a private internal representation specific to the machine it is installed on.

Meaning, when we downloaded the JRE it came with the framework classes already processed into this platform-specific internal representation, and the installer dumps that representation to a file referred to as the “Shared Archive”.

When multiple JVMs start, the shared archive is memory-mapped into each of them, saving each JVM the cost of loading and processing its required core classes. This allows the core library data to be shared among multiple JVMs: the work is done once, for all JVMs, instead of being repeated by each one.

So in my description above, in step 1, the bootstrap class loader only needs to load the necessary classes from the core library JRE\lib\rt.jar if they are not already found in the Shared Archive.

With the Shared Archive, JVM startup improves in proportion to the size of the application and how much of the “cached” core libraries it uses: the smaller the application and the more core libraries it uses, the better the JVM startup time we get.

 

The Shared Archive file can be found in: JRE\bin\server\classes.jsa
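As a small sanity check (a sketch only – the archive location differs between JVM versions, platforms and VM variants such as client/server), we can look for the file from Java using the java.home system property:

    import java.io.File;

    // SharedArchiveCheck.java – looks for the Class Data Sharing archive
    // at the conventional HotSpot server-VM location under java.home.
    public class SharedArchiveCheck {
        public static void main(String[] args) {
            String javaHome = System.getProperty("java.home");
            // On other setups the file may live under bin\client or lib/server.
            File archive = new File(javaHome,
                    "bin" + File.separator + "server" + File.separator + "classes.jsa");
            System.out.println(archive + (archive.exists() ? " exists" : " not found"));
        }
    }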

 

Note:

The JVM operations are:

Class Loading -> Linking [verification, preparation, optional resolution] -> Byte code Interpretation

I have not gone into detail about the verifier, linker and interpreter in this article, but in general:

Linking is the process of taking a class or interface and verifying and preparing that type, its direct superclass and its direct superinterfaces.

The verifier validates the byte code and ensures that the binary representation of a class or interface is structurally correct.

Preparation is the part responsible for allocating the memory the class needs (for example, creating static fields and initializing them to their default values).

Resolution is an optional stage that checks the symbolic references by loading the referenced classes or interfaces and verifying that the references are correct (this is referred to as “eager” or static resolution). It is optional because the JVM can also resolve lazily, meaning the symbols are resolved only at run time, when a byte code instruction that uses them is executed.

 

The interpreter also performs checks on the running code, such as array bounds and type checks.
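For example, both of the checks below are enforced while the byte code is running, not at compile time (a minimal sketch):

    // RuntimeChecksDemo.java – the JVM checks array bounds and reference
    // types at run time and throws exceptions when a check fails.
    public class RuntimeChecksDemo {
        public static void main(String[] args) {
            int[] numbers = new int[3];
            try {
                numbers[5] = 1; // index out of bounds – caught at run time
            } catch (ArrayIndexOutOfBoundsException e) {
                System.out.println("Array bounds check failed: " + e.getMessage());
            }

            Object value = "a string";
            try {
                Integer boxed = (Integer) value; // invalid cast – caught at run time
            } catch (ClassCastException e) {
                System.out.println("Type check failed: " + e.getMessage());
            }
        }
    }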

 

Note 2:

Just for clarification, as this issue is sometimes unclear, I will repeat the difference between the JIT and the JVM interpreter.

When you write a = b + c; it gets translated into byte code in a .class file.

When the interpreter sees this instruction it does not produce machine code for it; it effectively executes the byte code itself: it gets the value of b, gets the value of c, adds them together and puts the result in a.

The JIT, in contrast, produces machine code from the byte code.

The Java compiler creates byte code, which is machine code for a machine that does not exist. A JRE without JIT compilation interprets the byte code. A JRE with a JIT can do that, plus decide to compile certain code segments to machine code using the JIT.
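To make this concrete, here is roughly the byte code such an addition compiles to (a sketch – the exact local-variable slots depend on the surrounding method; the listing is the kind of output a disassembler like javap -c shows):

    public class AddDemo {
        // int a = b + c; compiles to byte code along these lines
        // (for this static method, b is slot 0, c is slot 1, a is slot 2):
        //
        //   iload_0    // push b onto the operand stack
        //   iload_1    // push c onto the operand stack
        //   iadd       // pop both, push b + c
        //   istore_2   // pop the result into a
        static int add(int b, int c) {
            int a = b + c;
            return a;
        }

        public static void main(String[] args) {
            System.out.println(add(2, 3));
        }
    }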

 

As an analogy, the interpreter works roughly like code such as:

String command = input.readLine();
if(command.equals("add")) { executeAdd(); }

The command string plays the role of a byte code instruction, and executeAdd is carried out by the VM, which knows how to actually execute an “add” on the specific platform it was built for.

 

Just to clarify – HotSpot is an implementation of the JVM, and the HotSpot JVM uses the JIT.

 


 

Good luck!
