First of all, we said a very well deserved HAPPY BIRTHDAY to Java! The day of this vJUG session (May 20th) marks the 20th anniversary of Java’s first public release. A combination of Sun, Oracle, and the first-class Java community has helped to create the amazing and vibrant Java ecosystem, enabling Java to have a huge impact on the world.
Our speaker for this session, Rafael Winterhalter, lives in Oslo and is the main author and contributor to ByteBuddy, a runtime code generation library. He is also part of the excellent community that helps put together JavaZone, a great conference (2nd largest in Europe actually) that happens each year in Oslo. This will be JavaZone’s 14th year.
Rafael took time to talk to us about bytecode; the full session can be watched below.
Also, you can see Rafael’s slides here:
It all starts with why Java bytecode makes a perfect medium for the rich ecosystem of JVM languages and tools that we have now. The reason is simple enough: bytecode is a simple language that is easy to manipulate. Java bytecode is the universal compilation target for all the JVM languages, and that is why Java libraries and tools, like ByteBuddy, are so proficient at working with almost any codebase you can throw at them.
The session format that Rafael chose was excellent. He showed the audience pieces of Java code and the corresponding bytecode with all the moving parts highlighted: bytecode operations, JVM stack, local variable table, constant pool. That is everything you have to know or take care of to write a piece of bytecode on your own without using a compiler, or simply worth knowing because knowledge is power.
Now the next bit is important: all the bytecode operations are represented as bytes, but are referred to by their mnemonic codes: iload, fadd, astore_2. The first character indicates the data type the operation handles: i stands for int, f for float, and a for a reference type.
On the image below you can see how Java types are represented in the bytecode. The JVM type is exactly what would be the prefix of the operation mnemonic. For example, int operations are called i___: iadd, imul.
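To make the prefix rule concrete, here is a minimal sketch. The class and method names are made up for illustration and are not from Rafael's slides. The comments show the instructions a typical javac emits for each method body; running javap -c on the compiled class should let you check them yourself.

```java
// Hypothetical example class; the names are illustrative only.
class TypePrefixes {
    // int operands use the i prefix
    static int addInts(int a, int b) {
        return a + b;            // iload_0, iload_1, iadd, ireturn
    }

    // float operands use the f prefix
    static float addFloats(float a, float b) {
        return a + b;            // fload_0, fload_1, fadd, freturn
    }

    // long operands use the l prefix; a long occupies two local
    // variable slots, so the second parameter lives in slot 2
    static long addLongs(long a, long b) {
        return a + b;            // lload_0, lload_2, ladd, lreturn
    }

    // references use the a prefix
    static String same(String s) {
        return s;                // aload_0, areturn
    }
}
```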
Rafael then continued by showing us pieces of Java source code and the respective bytecode they compile to. The greatest thing about this is that with such small examples, one can really start to understand how the constant pool, the runtime stack, and the local variable table operate.
While Rafael is explaining these things, everything about bytecode seems nice and easy, until you get to method invocations. This basically means that bytecode is easy to understand until you look at the bytecode of any non-trivial program.
How do methods get invoked? Naturally, there’s a set of bytecode instructions to handle it, and they operate in the same way as all the others: you put the this instance and the parameters on the stack and call, say, invokevirtual, whose operand refers to the method's name and descriptor in the constant pool.
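As a small illustration of how the stack, the local variable table and the constant pool interact, consider the sketch below. The class is hypothetical and not taken from the session. The comments list what a typical javac produces; the exact constant pool index depends on the class file, so it is left symbolic here.

```java
// Hypothetical example class; the names are illustrative only.
class StackWalkthrough {
    // Instance method: local variable slot 0 holds `this`,
    // so x and y land in slots 1 and 2.
    int sum() {
        int x = 1;          // iconst_1 pushes 1; istore_1 pops it into slot 1
        int y = 2;          // iconst_2 pushes 2; istore_2 pops it into slot 2
        return x + y;       // iload_1, iload_2 push both; iadd pops two and
                            // pushes the sum; ireturn returns the top of stack
    }

    // Static method: there is no `this`, so s lands in slot 0.
    static String greeting() {
        String s = "hello"; // ldc loads the constant pool entry for "hello";
                            // astore_0 stores the reference in slot 0
        return s;           // aload_0, areturn
    }
}
```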
There’s a bit of complexity, because there are multiple invoke instructions: invokestatic, invokespecial, invokevirtual, invokeinterface, and invokedynamic. Here, you can see the differences between their functionality and usage.
Ignoring their differences, the main takeaway is that normally methods are called with invokevirtual.
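The following sketch shows where each of the five invoke instructions typically appears when ordinary Java source is compiled. The class is made up for illustration, and the comments describe what a typical javac emits rather than anything shown in the session.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical example class; the names are illustrative only.
class InvokeKinds {
    static int calls() {
        // Constructors are called with invokespecial (after new and dup).
        StringBuilder sb = new StringBuilder();
        // Instance method on a class type: invokevirtual.
        sb.append("a");
        List<String> list = new ArrayList<>();
        // sb.toString() is invokevirtual; List.add is called through an
        // interface type, so it is invokeinterface.
        list.add(sb.toString());
        // Static method: invokestatic.
        int max = Math.max(1, 2);
        // Method references and lambdas are created with invokedynamic.
        Supplier<Integer> size = list::size;
        // Supplier.get is invokeinterface; unboxing the Integer result
        // is an invokevirtual call to Integer.intValue.
        return max + size.get();
    }
}
```

Here calls() returns 3: the larger of 1 and 2, plus the size of the one-element list.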
Now, let’s say we were to write bytecode by hand, which is necessary for some applications, for example when writing a secure application or a javaagent that makes classes in the JVM reloadable (like JRebel, which makes your classes reloadable so you can avoid restarting your server or redeploying your application and just reload the code changes instantly). This is when things can get really complicated really quickly.
There are a number of libraries for bytecode manipulation: cglib, javassist, asm, bytebuddy. They all have their strengths and weaknesses. For example, javassist allows you to provide Java source code inside a Java String. This is nice and easy, until you realize that a strong type checker has its benefits and that writing Java code in a string, without any compiler support until runtime, can be worrisome.
ByteBuddy, which Rafael created and maintains, on the other hand, is a type-safe code generation library for Java bytecode.
ByteBuddy provides you with a simple DSL for bytecode and uses the power of the Java compiler to help you. On the next image, you can see an example of how one would intercept a method call using ByteBuddy.
The code creates a dynamic class that has the toString() method modified to call the MyInterceptor.intercept() method.
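ByteBuddy is not part of the JDK, so the sketch below does not use its API. Instead it swaps in the JDK's own java.lang.reflect.Proxy to illustrate the same idea: a class generated at runtime whose toString() call is routed to MyInterceptor.intercept(). The returned string and the choice of Runnable as the proxied interface are assumptions made only for this illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical sketch using a JDK dynamic proxy as a stand-in for ByteBuddy.
class InterceptSketch {
    // Plays the role of the interceptor named in the text;
    // the returned string is an assumption for illustration.
    static class MyInterceptor {
        static String intercept() {
            return "Hello from the interceptor";
        }
    }

    static Object createIntercepted() {
        InvocationHandler handler = (proxy, method, args) -> {
            // Route toString() to the interceptor.
            if (method.getName().equals("toString") && method.getParameterCount() == 0) {
                return MyInterceptor.intercept();
            }
            // Keep hashCode and equals well behaved.
            if (method.getName().equals("hashCode")) {
                return System.identityHashCode(proxy);
            }
            if (method.getName().equals("equals")) {
                return proxy == args[0];
            }
            throw new UnsupportedOperationException(method.getName());
        };
        // JDK proxies are interface-based; Runnable is an arbitrary choice here.
        return Proxy.newProxyInstance(
                InterceptSketch.class.getClassLoader(),
                new Class<?>[] { Runnable.class },
                handler);
    }
}
```

Calling createIntercepted().toString() returns the interceptor's string. Unlike this interface-based stand-in, ByteBuddy can subclass arbitrary classes and expresses the matching and delegation through its DSL, as the session shows.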
The power of ByteBuddy comes from a well-thought-out API and the fact that it supports dependency injection through annotations.
In the session, Rafael showed a simple way to programmatically modify the code of a class already loaded into the JVM, limited to the changes HotSwap allows: changing only method bodies. Even if this is probably not something you want to do yourself, the example is a really inspired argument for the ease of use of ByteBuddy.
Check out the full session recording; it’s available at the beginning of this post. It was an excellent educational session. Rafael is a world-class speaker, and I encourage anyone who deals with Java code to watch it and become more familiar with such an important part of the Java ecosystem: Java bytecode.
Final thoughts and Interview
As usual we interviewed our vJUG speaker, to hear more about them and to hear their thoughts on the wider Java space. Watch Oleg and Rafael chat about life in general below.