Java bytecode to machine code

Is it possible to compile Java into machine code? (Not bytecode)

Can you have Java compiled straight into machine code? I want to do this so I have control over what platforms it’s used on, and I don’t know C, C++, etc.


It appears that the GNU Compiler for Java can convert Java source code into either Java bytecode or machine code. It can also convert existing Java bytecode into machine code. However, the last news is from 2009, so I’m not sure how current it is and if it can handle the latest features of the Java language.

If Java bytecode hasn’t changed since 2009, this should still work without developers waving «Hey! I’m still here» flags over the software.

@RobertHarvey I believe Java 7 introduced some new language concepts, so it might fail at converting source files into machine code. If the bytecode also changed with these new features, then converting bytecode would fail as well.

Excelsior JET seems to be still active (commercial). There was Fujitsu TowerJ, but that seems to have died a decade ago and was pretty pointless back then.


The GCJ website says it did not fully support even Java 1.5. See this thread: stackoverflow.com/a/4040404/1257384

Not quite directly answering the OP, but perhaps an interesting aside. Java can be run in three modes:

  1. Mixed (default) — A combination of Interpreted and Machine compiled code (machine compiled == compiled by JIT at runtime)
  2. With -Xint flag — Interpreted — Byte code only
  3. With -Xcomp flag — Compiled — machine compiled
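To compare the three modes, a small timing program can be run once per mode. This is only a sketch: the class name ModeTiming and the loop sizes are made up for illustration, and the timings will vary by machine and JVM version. Note that -Xint and -Xcomp are options of the java launcher (HotSpot), not of javac, and neither writes a native binary to disk.

```java
// Hypothetical micro-benchmark for comparing JVM execution modes.
class ModeTiming {
    // Sums the integers 1..n with a plain loop.
    static long sumTo(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        long checksum = 0;
        long start = System.nanoTime();
        // Repeat the work so the JIT has a chance to compile sumTo in mixed mode.
        for (int round = 0; round < 200; round++) {
            checksum += sumTo(100_000);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("checksum=" + checksum + " elapsed=" + elapsedMs + " ms");
    }
}
```

Compile once with javac ModeTiming.java, then run java ModeTiming, java -Xint ModeTiming and java -Xcomp ModeTiming and compare the elapsed times. The checksum should be identical in all three runs; only the speed differs.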

@Martjin Are these for HotSpot? Do you happen to have a reference for -Xcomp? I could not find it in the JDK 7 documentation or the HotSpot options documentation, and I’m not sure if you have some hidden mailing list secrets that we, the mere mortals, are not aware of 🙂

It is a deliberately hidden option yes :-). There are various openjdk (openjdk.java.net) mailing lists you can glean this sort of info from — or read the source code 🙂

Please note that according to my tests, -Xcomp can have poorer performance (by a factor of two) in some instances.

Well yeah. Just in time optimizations are hard to do when the JIT isn’t even running. Can have != will have.

Ummm, so these are options to javac? When using -Xcomp, does it by default output a single binary file?


Understanding Java Compilation: From Bytecodes to Machine Code in the JVM

When I was at university, one of my favourite Computer Science courses was compiler theory. Something about how you convert from a human-readable programming language to machine and operating system specific instructions seems particularly intriguing.

For the Java platform, compilation is different to many other languages because of the Java Virtual Machine (JVM). To run an application with the JVM, Java code is compiled into a set of class files that contain instructions for the JVM, not the operating system and hardware on which the JVM is installed. This provides the Write Once, Run Anywhere capability for which Java has been famous.

How does this conversion from virtual machine instructions to native instructions happen?

This is not a simple question to answer, so I’ve decided to write a series of blog posts exploring the different aspects of interpreting and adaptive compilation within the JVM.

At Azul, we’re constantly looking at ways to improve the performance of JVM-based applications, and I’ll include plenty of information about the work we’ve done in the following areas:

  • Replacing parts of the OpenJDK compilation system to generate more heavily optimized native code.
  • Reducing application warmup time using profiles recorded from previous application runs and compiled code stashing.
  • Separating the JVM internal compiler so you can use a shared Compiler-as-a-Service, which is particularly useful in a cloud environment.

Since we’re not the only people with ideas in this space, I’ll also look at alternative approaches like the Graal compiler (not to be confused with the Graal VM).

Let’s start with some fundamental concepts that we’ll build on in the rest of the blog series.

Source Code

What is Source Code?

Source code is high-level statements and expressions developers write to define the application’s instructions. We call this high-level because these types of programming languages provide strong abstractions of the operating system and hardware used to run the application.

Source Code Example

As a simple example, if we want to sum the numbers from one to ten, we could write this in Java using a loop, one of the fundamental constructs in many languages:

public class Sum {
  public static void main(String[] args) {
    int sum = 0;

    for (int i = 1; i <= 10; i++) {
      sum += i;
    }

    System.out.println(sum);
  }
}

This hides the complexity of how an operating system and processor work for developers. For example, we can declare a local integer variable and give it a meaningful name, sum. This is simpler for us to work with than using an explicit memory address. Similarly, we can call a method in the PrintStream core library class via a reference through the System class that will print a string on whatever the standard output is for our application. How this magically appears as characters in a terminal, which is controlled by a window manager and gets drawn on the screen via a graphics card, is not our concern.

However, our high-level code needs to be converted into a set of numeric instructions and operands that can be understood by the machine on which we run the application.

To better understand what’s involved in that conversion, we could rewrite our Sum.java example in a low-level language. Unlike a high-level language, this does not provide abstractions but allows us to control the operating system and processor directly using instructions they understand.

For this example, we’ll assume we’re going to run our application on a Linux machine with an x64 processor.

One way we could write the loop part of our application in assembly language is shown below. (As we’ll see later, just like in Java, there are multiple ways we could write this code to do the same thing).

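section .text
global _start
_start:
    mov eax, 1      ; eax = current number, starting at 1
    mov ecx, 10     ; ecx = loop counter (10 iterations)
    xor edx, edx    ; edx = running sum, cleared to 0
L:
    add edx, eax    ; sum += current number
    inc eax         ; move to the next number
    dec ecx         ; one fewer iteration left
    jnz L           ; repeat until the counter reaches zero
EL:
    mov eax, 1      ; system call number 1 (exit)
    mov ebx, 0      ; exit status 0
    int 0x80        ; invoke the Linux system call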

In this code, I’ve left out the part that prints the result at the end; doing this in assembler requires a lot more code than for the loop.

As you can see, this is considerably less readable than it is in Java. But even this is still somewhat human-readable. If you understand basic computer architecture and the instruction set being used, you can see that most of the work involves manipulating registers and performing basic calculations. More complex tasks can be achieved through interrupt calls such as the one at the end where we use the Linux interrupt 80H to invoke a system call to terminate the application (without which, as I learnt in writing this article, you get a segmentation fault).

Even this is too high-level for the computer hardware. The computer just needs a stream of multi-byte words to understand which instruction to execute with which operands.

Using an assembler and linker, we can convert the assembly code to object code and an executable. This is generated chiefly by mapping from textual instructions like JNZ to the appropriate value (in this case, 0x75). Finally, we end up with a file that the operating system can execute.

Our executable file looks like this when dumped as a series of hexadecimal values:


0000160 0000 0020 0000 0000 0000 0000 0000 0000
0000200 01b8 0000 b900 000a 0000 d231 c201 c0ff
0000220 c9ff f875 01b8 0000 bb00 0000 0000 80cd
0000240 2e00 6873 7473 7472 6261 2e00 6574 7478
0000260 0000 0000 0000 0000 0000 0000 0000 0000

Note: this is not the complete file, just the loop execution part.
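To match the dump against the assembly, it helps to know that the dump groups bytes into 16-bit little-endian words, so each pair of bytes appears swapped. The sketch below (the class name DumpBytes is made up for illustration) puts two rows of the dump back into memory order; it assumes the rows are copied exactly as printed above.

```java
// Hypothetical helper: turns 16-bit little-endian words from the dump
// back into the byte order in which they sit in the file.
class DumpBytes {
    static String toByteOrder(String words) {
        StringBuilder sb = new StringBuilder();
        for (String w : words.trim().split("\\s+")) {
            // A word printed as "01b8" holds byte b8 first, then byte 01.
            sb.append(w, 2, 4).append(' ').append(w, 0, 2).append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // prints: b8 01 00 00 00 b9 0a 00 00 00 31 d2 01 c2 ff c0
        System.out.println(toByteOrder("01b8 0000 b900 000a 0000 d231 c201 c0ff"));
        // prints: ff c9 75 f8 b8 01 00 00 00 bb 00 00 00 00 cd 80
        System.out.println(toByteOrder("c9ff f875 01b8 0000 bb00 0000 0000 80cd"));
    }
}
```

Read in this order, the instructions are recognisable: b8 01 00 00 00 is mov eax, 1; b9 0a 00 00 00 is mov ecx, 10; 75 f8 is the JNZ opcode followed by a jump offset of -8; and cd 80 is int 0x80.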

However, for our high-level Java code, we cannot map directly from the statements and expressions we use to machine instructions.

For this, we must use compilation.

Java Compilation

Generically, compilation is the process of translating source code into target code using a compiler.

As we know, the Java platform uses the JVM to run Java applications. However, the JVM is an abstract computer. The JVM specification, which is part of the Java SE specification, defines features every JVM must have (what the JVM must do). However, it does not specify details of the implementation of these features (how the JVM does those things). This is the reason, for example, why there are such a variety of garbage collection algorithms available in different JVM implementations.


How does JVM convert bytecode into machine code?

Java is known for its platform independence, which is based on the principle of “Write Once, Run Anywhere”.

This platform independence comes from the combination of bytecode and the JVM (Java Virtual Machine): the Java compiler produces bytecode, and that bytecode can run on any machine where a JVM is installed.

However, execution can be somewhat slower than in languages that compile directly to platform-specific native code, such as C and C++.

The JVM makes Java code portable and more secure, because it provides a sandbox in which programs run and untrusted code is restricted.

Key Terms:

Bytecode: The intermediate, platform-neutral set of instructions that the Java compiler generates from source code.

JVM (Java Virtual Machine): The program responsible for executing bytecode. It provides the Java runtime environment in which a Java program runs.

How does a Java Program execute?


  • A Java source file is always saved with the .java extension.
  • The Java compiler (javac) compiles the source into intermediate code called bytecode, stored in files with the .class extension. These class files are often packaged in a JAR (Java Archive) file.
  • The JVM (Java Virtual Machine) loads this bytecode and converts it to native code.
  • The native code is executed directly by the processor of the host system.
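As a small illustration of the second step, every compiled .class file starts with the same four-byte marker, 0xCAFEBABE. The sketch below reads that marker from the class file of java.lang.String; the class name ClassFileHeader and the helper magicOf are made up for this example, and it assumes the class file is reachable as a resource, which is the case for JDK classes.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper that inspects the start of a compiled class file.
class ClassFileHeader {
    // Returns the first four bytes (the "magic number") of the class file.
    static int magicOf(Class<?> type) {
        String resource = type.getSimpleName() + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            // DataInputStream reads big-endian, matching the class file format.
            return new DataInputStream(in).readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // prints: magic = 0xCAFEBABE
        System.out.printf("magic = 0x%08X%n", magicOf(String.class));
    }
}
```

The rest of the file holds the constant pool, field and method tables, and the bytecode of each method, which is what the JVM later loads and executes.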

JVM Converts Bytecode to Machine Code


The JVM (Java Virtual Machine) receives the bytecode generated by the Java compiler. Two main components of the JVM do the work of turning that bytecode into native code: the Classloader and the Execution Engine.

Classloader

As the name suggests, it reads .class files (bytecode) and loads the necessary classes and resources into the JVM. It is responsible for three phases: loading the class, linking it (verification, preparation, and resolution), and initializing it.
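A minimal sketch of two observable effects of class loading follows. The names ClassLoadingDemo and Lazy are made up for illustration. It shows that core classes such as String are loaded by the bootstrap loader (reported as null), and that a class is initialized only on first use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo of class loaders and lazy class initialization.
class ClassLoadingDemo {
    static final List<String> events = new ArrayList<>();

    static class Lazy {
        // Runs once, when the JVM initializes the class on first use.
        static {
            events.add("Lazy initialized");
        }

        static void touch() {
            events.add("Lazy.touch called");
        }
    }

    static List<String> run() {
        events.add("before first use");
        Lazy.touch(); // first use triggers initialization of Lazy
        return events;
    }

    public static void main(String[] args) {
        // Core classes come from the bootstrap loader, which is reported as null.
        System.out.println("String loader: " + String.class.getClassLoader());
        // Application classes are loaded by a regular ClassLoader instance.
        System.out.println("Demo loader:   " + ClassLoadingDemo.class.getClassLoader());
        // prints: [before first use, Lazy initialized, Lazy.touch called]
        System.out.println(run());
    }
}
```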

The JVM uses memory to run a program. This memory is referred to as the Runtime Data Area, and it consists of the Method Area, the Heap, the Java Stacks, the PC Registers, and the Native Method Stacks.
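Some of these areas can be observed from inside a running program through the standard java.lang.management API. The sketch below is only illustrative (the class name MemoryAreas is made up): it reports heap usage and non-heap usage, where non-heap memory includes the method area. The stacks and PC registers are per-thread and are not reported by this API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper that prints a snapshot of JVM memory usage.
class MemoryAreas {
    // Bytes of heap currently in use (objects and arrays live here).
    static long heapUsedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();

        System.out.println("heap used:     " + heap.getUsed() + " bytes");
        System.out.println("heap max:      " + heap.getMax() + " bytes (-1 if undefined)");
        // Non-heap memory includes the method area, where class data is kept.
        System.out.println("non-heap used: " + nonHeap.getUsed() + " bytes");
    }
}
```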

Execution Engine

After the code has been loaded successfully, the main task, executing the code, is done by the Execution Engine.

The Execution Engine is the second component; it comprises the Interpreter, the JIT compiler, and the Garbage Collector.

  • Interpreter: It reads Java bytecode one instruction at a time and executes it. Interpreted execution is therefore slower than running compiled native code.
  • JIT Compiler: It compiles frequently executed portions of bytecode into native code at run time, because compiling the entire program up front would be too costly. Only these hot parts are compiled; the remaining code is still interpreted. When execution reaches a compiled portion again, the JVM runs the native code directly instead of reinterpreting the bytecode, which boosts performance.
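JIT activity can be observed from inside a program with the standard CompilationMXBean. The sketch below is illustrative only: the class name JitActivity and the workload sizes are made up, and the reported compiler name and times depend on the JVM in use. The bean may be absent (null) on a JVM without a compilation system.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical demo that runs a hot loop and reports JIT compilation time.
class JitActivity {
    static long sumTo(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    // Calls sumTo often enough that a JIT compiler is likely to compile it.
    static long workload() {
        long checksum = 0;
        for (int round = 0; round < 5_000; round++) {
            checksum += sumTo(10_000);
        }
        return checksum;
    }

    public static void main(String[] args) {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        long checksum = workload();
        System.out.println("checksum = " + checksum);

        if (jit == null) {
            System.out.println("No JIT compiler reported by this JVM");
        } else {
            System.out.println("JIT compiler: " + jit.getName());
            if (jit.isCompilationTimeMonitoringSupported()) {
                System.out.println("Total compilation time: "
                        + jit.getTotalCompilationTime() + " ms");
            }
        }
    }
}
```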
