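Java Virtual Machine (Microsoft VM) problem

On Windows XP without Service Pack 1, Internet Explorer sometimes shows a dialog asking to download the Microsoft VM, but the download never completes, so web pages that rely on a Java Virtual Machine cannot be used. This is a consequence of the lawsuit between Microsoft and Sun Microsystems over the Java VM, after which Microsoft stopped offering the Microsoft VM for download. Besides getting it through Windows XP SP1, you can also download and install a Java VM from the Sun Microsystems website.
Note that after installation a "Java Web Start" shortcut appears on the desktop; if you do not need it, you can simply delete it. Whenever a Java applet starts, a Java icon also appears in the system tray; this is normal.

Reference:
http://java.com/zh_tw/download/help/win_manual.jsp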

Behind the scenes: How do lambda expressions really work in Java?

Look into the bytecode to see how Java handles lambdas.

September 25, 2020

What does a lambda expression look like inside Java code and inside the JVM? It is obviously some type of value, and Java permits only two sorts of values: primitive types and object references. Lambdas are obviously not primitive types, so a lambda expression must therefore be some sort of expression that returns an object reference.

Let’s look at an example:

public class LambdaExample {
    private static final String HELLO = "Hello World!";

    public static void main(String[] args) throws Exception {
        Runnable r = () -> System.out.println(HELLO);
        Thread t = new Thread(r);
        t.start();
        t.join();
    }
}

Programmers who are familiar with inner classes might guess that the lambda is really just syntactic sugar for an anonymous implementation of Runnable. However, compiling the above class generates a single file: LambdaExample.class. There is no additional class file for the inner class.
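To see the contrast at runtime, here is a minimal sketch that places an anonymous Runnable next to a lambda. The class and method names are illustrative and not part of the original example. Compiling it should produce LambdaVsAnonymous$1.class for the anonymous class and no extra file for the lambda; the runtime name printed for the lambda's class is an implementation detail and should not be relied on.

```java
class LambdaVsAnonymous {
    // An anonymous inner class: javac emits a separate class file for it.
    static Runnable anonymous() {
        return new Runnable() {
            @Override
            public void run() { }
        };
    }

    // A lambda: no additional class file is emitted at compile time.
    static Runnable lambda() {
        return () -> { };
    }

    public static void main(String[] args) {
        // The anonymous class has a stable, compiler-assigned binary name.
        System.out.println(anonymous().getClass().getName());
        // The lambda's class is created at runtime; its name is unspecified.
        System.out.println(lambda().getClass().getName());
    }
}
```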

This means that lambdas are not inner classes; rather, they must be some other mechanism. In fact, decompiling the bytecode via javap -c -p reveals two things. First is the fact that the lambda body has been compiled into a private static method that appears in the main class:

private static void lambda$main$0();
    Code:
       0: getstatic     #7                  // Field java/lang/System.out:Ljava/io/PrintStream;
       3: ldc           #9                  // String Hello World!
       5: invokevirtual #10                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
       8: return

You might guess that the signature of the private body method matches that of the lambda, and indeed this is the case. A lambda such as this

import java.util.function.Function;

public class StringFunction {
    public static final Function<String, Integer> fn = s -> s.length();
}

will produce a body method such as this, which takes a string and returns an integer, matching the signature of the interface method

private static java.lang.Integer lambda$static$0(java.lang.String);
    Code:
       0: aload_0
       1: invokevirtual #2                  // Method java/lang/String.length:()I
       4: invokestatic  #3                  // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
       7: areturn

The second thing to notice about the bytecode is the form of the main method:

public static void main(java.lang.String[]) throws java.lang.Exception;
    Code:
       0: invokedynamic #2,  0              // InvokeDynamic #0:run:()Ljava/lang/Runnable;
       5: astore_1
       6: new           #3                  // class java/lang/Thread
       9: dup
      10: aload_1
      11: invokespecial #4                  // Method java/lang/Thread."<init>":(Ljava/lang/Runnable;)V
      14: astore_2
      15: aload_2
      16: invokevirtual #5                  // Method java/lang/Thread.start:()V
      19: aload_2
      20: invokevirtual #6                  // Method java/lang/Thread.join:()V
      23: return

Notice that the bytecode begins with an invokedynamic call. This opcode was added to Java with version 7 (and it is the only opcode ever added to JVM bytecode). I discussed method invocation in “Real-world bytecode Handling with ASM” and in “Understanding Java method invocation with invokedynamic,” which you can read as companions to this article.

The most straightforward way to understand the invokedynamic call in this code is to think of it as a call to an unusual form of the factory method. The method call returns an instance of some type that implements Runnable. The exact type is not specified in the bytecode and it fundamentally does not matter.

The actual type does not exist at compile time and will be created on demand at runtime. To better explain this, I’ll discuss three mechanisms that work together to produce this capability: call sites, method handles, and bootstrapping.

Call sites

A location in the bytecode where a method invocation instruction occurs is known as a call site.

Java bytecode has traditionally had four opcodes that handle different cases of method invocation: static methods, “normal” invocation (a virtual call that may involve method overriding), interface lookup, and “special” invocation (for cases where override resolution is not required, such as superclass calls and private methods).

Dynamic invocation goes much further than that by offering a mechanism through which the decision about which method is actually called is made by the programmer, on a per-call site basis.

Here, invokedynamic call sites are represented as CallSite objects in the Java heap. This isn’t strange: Java has been doing similar things with the Reflection API since Java 1.1 with types such as Method and, for that matter, Class. Java has many dynamic behaviors at runtime, so there should be nothing surprising about the idea that Java is now modeling call sites as well as other runtime type information.

When the invokedynamic instruction is reached, the JVM locates the corresponding call site object (or it creates one, if this call site has never been reached before). The call site object contains a method handle, which is an object that represents the method that I actually want to invoke.

The call site object is a necessary level of indirection, allowing the associated invocation target (that is, the method handle) to change over time.

There are three available subclasses of CallSite (which is abstract): ConstantCallSite, MutableCallSite, and VolatileCallSite. The base class has only package-private constructors, while the three subtypes have public constructors. This means that CallSite cannot be directly subclassed by user code, but it is possible to subclass the subtypes. For example, the JRuby language uses invokedynamic as part of its implementation and subclasses MutableCallSite.

Note: Some invokedynamic call sites are effectively just lazily computed, and the method they target will never change after they have been executed the first time. This is a very common use case for ConstantCallSite, and this includes lambda expressions.

This means that a nonconstant call site can have many different method handles as its target over the lifetime of a program.
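The retargeting of a nonconstant call site can be sketched directly with the java.lang.invoke API. The class below is illustrative (its name and the hello/goodbye methods are assumptions made for this example); it creates a MutableCallSite by hand rather than through an invokedynamic instruction, then changes its target between two calls.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

class CallSiteRetargeting {
    static String hello() { return "hello"; }
    static String goodbye() { return "goodbye"; }

    // Invokes the same call site twice, changing its target in between.
    static String[] demo() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(String.class);
            MethodHandle hello = lookup.findStatic(CallSiteRetargeting.class, "hello", type);
            MethodHandle goodbye = lookup.findStatic(CallSiteRetargeting.class, "goodbye", type);

            MutableCallSite site = new MutableCallSite(hello);
            // The dynamic invoker always delegates to the site's current target.
            MethodHandle invoker = site.dynamicInvoker();

            String first = (String) invoker.invokeExact();
            site.setTarget(goodbye);
            String second = (String) invoker.invokeExact();
            return new String[] { first, second };
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        String[] results = demo();
        System.out.println(results[0] + ", then " + results[1]);
    }
}
```

A ConstantCallSite would reject the setTarget call, which is what makes it a natural fit for call sites whose target never changes after linkage.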

Method handles

Reflection is a powerful technique for doing runtime tricks, but it has a number of design flaws (hindsight is 20/20, of course), and it is definitely showing its age now. One key problem with reflection is performance, especially since reflective calls are difficult for the just-in-time (JIT) compiler to inline.

This is bad, because inlining is very important to JIT compilation in several ways, not the least of which is because it’s usually the first optimization applied and it opens the door to other techniques (such as escape analysis and dead code elimination).

A second problem is that reflective calls are linked every time the call site of Method.invoke() is encountered. That means, for example, that security access checks are performed. This is very wasteful because the check will typically either succeed or fail on the first call, and if it succeeds, it will continue to do so for the life of the program. Yet, reflection does this linking over and over again. Thus, reflection incurs a lot of unnecessary cost by relinking and wasting CPU time.

To solve these problems (and others), Java 7 introduced a new API, java.lang.invoke, which is often casually called method handles due to the name of the main class it introduced.

A method handle (MH) is Java’s version of a type-safe function pointer. It’s a way of referring to a method that the code might want to call, similar to a Method object from Java reflection. The MH has an invoke() method that actually executes the underlying method, in just the same way as reflection.

At one level, MHs are really just a more efficient reflection mechanism that’s closer to the metal; anything represented by an object from the Reflection API can be converted to an equivalent MH. For example, a reflective Method object can be converted to an MH using Lookup.unreflect(). The MHs that are created are usually a more efficient way to access the underlying methods.
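As a minimal sketch of both routes to a method handle, the following class (whose name is illustrative) obtains a handle for String.length() once through a direct lookup and once by converting a reflective Method with Lookup.unreflect(). It uses invokeExact(), the strict sibling of invoke(), and a MethodType to describe the method being looked up.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

class MethodHandleBasics {
    // Looks up String.length() directly as a method handle.
    static int lengthViaLookup(String s) {
        try {
            MethodHandle mh = MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
            return (int) mh.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Starts from a reflective Method and converts it to a method handle.
    static int lengthViaUnreflect(String s) {
        try {
            Method m = String.class.getMethod("length");
            MethodHandle mh = MethodHandles.lookup().unreflect(m);
            return (int) mh.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(lengthViaLookup("method handle"));
        System.out.println(lengthViaUnreflect("method handle"));
    }
}
```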

MHs can be adapted, via helper methods in the MethodHandles class, in a number of ways such as by composition and the partial binding of method arguments (currying).
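A short sketch of adaptation, using an illustrative class name: MethodHandles.insertArguments() binds one argument of String.concat() in advance, and MethodHandles.filterReturnValue() composes the result with String.length().

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class MethodHandleAdaptation {
    // Computes (s + "!").length() purely by combining method handles.
    static int exclaimedLength(String s) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            // Type (String, String) -> String: the receiver, then concat's argument.
            MethodHandle concat = lookup.findVirtual(String.class, "concat",
                    MethodType.methodType(String.class, String.class));
            // Partially bind the second argument, giving (String) -> String.
            MethodHandle exclaim = MethodHandles.insertArguments(concat, 1, "!");
            // Type (String) -> int.
            MethodHandle length = lookup.findVirtual(String.class, "length",
                    MethodType.methodType(int.class));
            // Feed the result of exclaim into length, giving (String) -> int.
            MethodHandle composed = MethodHandles.filterReturnValue(exclaim, length);
            return (int) composed.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(exclaimedLength("hi"));
    }
}
```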

Normally, method linkage requires exact matching of type descriptors. However, the invoke() method on an MH has a special polymorphic signature that allows linkage to proceed regardless of the signature of the method being called.

At runtime, the signature at the invoke() call site should look like you are calling the referenced method directly, which avoids type conversions and autoboxing costs that are typical with reflected calls.

Because Java is a statically typed language, the question arises as to how much type-safety can be preserved when such a fundamentally dynamic mechanism is used. The MH API addresses this by use of a type called MethodType, which is an immutable representation of the arguments that a method takes: the signature of the method.
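The role of MethodType as a runtime type check can be sketched as follows (the class name is illustrative). The handle for String.length() has type (String)int. Calling invokeExact() from a call site whose shape is (String)Object is rejected with a WrongMethodTypeException, whereas invoke() adapts the call by boxing the result.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.WrongMethodTypeException;

class MethodTypeChecks {
    static MethodHandle lengthHandle() throws ReflectiveOperationException {
        return MethodHandles.lookup()
                .findVirtual(String.class, "length", MethodType.methodType(int.class));
    }

    // The handle's type includes the receiver as the first parameter.
    static String lengthType() {
        try {
            return lengthHandle().type().toString();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // invokeExact() requires the call site to match the handle's type exactly.
    static boolean exactCallRejectsWrongType() {
        try {
            // No cast, so this call site has type (String)Object, not (String)int.
            Object boxed = lengthHandle().invokeExact("abc");
            return false;
        } catch (WrongMethodTypeException expected) {
            return true;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // invoke() converts between the call site type and the handle's type.
    static Object genericCall() {
        try {
            Object result = lengthHandle().invoke("abc");
            return result;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```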

The internal implementation of MHs was changed during the lifetime of Java 8. The new implementation is called lambda forms, and it provided a dramatic performance improvement with MHs now being better than reflection for many use cases.

Bootstrapping

The first time each specific invokedynamic call site is encountered in the bytecode instruction stream, the JVM doesn’t know which method it targets. In fact, there is no call site object associated with the instruction.

The call site needs to be bootstrapped, and the JVM achieves this by running a bootstrap method (BSM) to generate and return a call site object.

Each invokedynamic call site has a BSM associated with it, which is stored in a separate area of the class file. These methods allow user code to programmatically determine linkage at runtime.

Decompiling an invokedynamic call, such as that from my original example of a Runnable, shows that it has this form:

0: invokedynamic #2,  0

And in the class file’s constant pool, notice that entry #2 is a constant of type CONSTANT_InvokeDynamic. The relevant parts of the constant pool are

#2 = InvokeDynamic      #0:#31
   ...
  #31 = NameAndType        #46:#47        // run:()Ljava/lang/Runnable;
  #46 = Utf8               run
  #47 = Utf8               ()Ljava/lang/Runnable;

The presence of 0 in the constant is a clue. Constant pool entries are numbered from 1, so the 0 reminds you that the actual BSM is located in another part of the class file.

For lambdas, the NameAndType entry takes on a special form. The name is arbitrary, but the type signature contains some useful information.

The return type corresponds to the return type of the invokedynamic factory; it is the target type of the lambda expression. Also, the argument list consists of the types of elements that are being captured by the lambda. In the case of a stateless lambda, the argument list will always be empty. Only a Java closure will have arguments present.
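The two shapes can be compared with a small sketch (the class and method names are illustrative). The descriptors in the comments are what javap -c -p is expected to show for the invokedynamic instruction in each method: no arguments for the stateless lambda, and one String argument for the lambda that captures msg.

```java
import java.util.function.Supplier;

class CaptureShapes {
    // Captures nothing. Expected factory type:
    //   ()Ljava/util/function/Supplier;
    static Supplier<String> stateless() {
        return () -> "constant";
    }

    // Captures the parameter msg. Expected factory type:
    //   (Ljava/lang/String;)Ljava/util/function/Supplier;
    static Supplier<String> capturing(String msg) {
        return () -> msg;
    }

    public static void main(String[] args) {
        System.out.println(stateless().get());
        System.out.println(capturing("captured").get());
    }
}
```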

A BSM takes at least three arguments and returns a CallSite. The standard arguments are of these types:

  • MethodHandles.Lookup: A lookup object on the class in which the call site occurs
  • String: The name mentioned in the NameAndType
  • MethodType: The resolved type descriptor of the NameAndType

Following these arguments are any additional arguments that are needed by the BSM. These are referred to as additional static arguments in the documentation.

The general case of BSMs allows an extremely flexible mechanism, and non-Java language implementers use this. However, the Java language does not provide a language-level construct for producing arbitrary invokedynamic call sites.
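Because javac will not emit an arbitrary invokedynamic instruction, the following sketch calls a bootstrap-shaped method by hand, only to show the three standard arguments and the returned CallSite. The class, the greet method, and the bootstrap method are all hypothetical; in real use the JVM, not user code, would invoke the BSM when the call site is first reached.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class ManualBootstrap {
    static String greet(String name) {
        return "Hello, " + name;
    }

    // A hypothetical BSM: it links the call site to the static method
    // in this class whose name and type match the NameAndType.
    static CallSite bootstrap(MethodHandles.Lookup lookup, String name, MethodType type)
            throws ReflectiveOperationException {
        MethodHandle target = lookup.findStatic(ManualBootstrap.class, name, type);
        return new ConstantCallSite(target);
    }

    // Simulates what the JVM does on first execution of an invokedynamic.
    static String demo() {
        try {
            CallSite site = bootstrap(MethodHandles.lookup(), "greet",
                    MethodType.methodType(String.class, String.class));
            return (String) site.dynamicInvoker().invokeExact("Duke");
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```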

For lambda expressions, the BSM takes a special form and to fully understand how the mechanism works, I will examine it more closely.

Decoding the lambda’s bootstrap method

Use the -v argument to javap to see the bootstrap methods. This is necessary because the bootstrap methods live in a special part of the class file and make references back into the main constant pool. For this simple Runnable example, there is a single bootstrap method in that section:

BootstrapMethods:
  0: #28 REF_invokeStatic java/lang/invoke/LambdaMetafactory.metafactory:
        (Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;
         Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodType;
         Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
    Method arguments:
      #29 ()V
      #30 REF_invokeStatic LambdaExample.lambda$main$0:()V
      #29 ()V

That is a bit hard to read, so let’s decode it.

The bootstrap method for this call site is entry #28 in the constant pool. This is an entry of type MethodHandle (a constant pool type that was added to the standard in Java 7). Now let’s compare it to the case of the string function example:

0: #27 REF_invokeStatic java/lang/invoke/LambdaMetafactory.metafactory:
        (Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;
         Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodType;
         Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
    Method arguments:
      #28 (Ljava/lang/Object;)Ljava/lang/Object;
      #29 REF_invokeStatic StringFunction.lambda$static$0:(Ljava/lang/String;)Ljava/lang/Integer;
      #30 (Ljava/lang/String;)Ljava/lang/Integer;

The method handle that will be used as the BSM is the same static method LambdaMetafactory.metafactory( ... ).

The part that has changed is the method arguments. These are the additional static arguments for lambda expressions, and there are three of them. They represent the lambda’s signature and the method handle for the actual final invocation target of the lambda: the lambda body. The first static argument is the erased form of the signature (that of the interface method), and the third is the signature as specialized for this particular lambda.

Let’s follow the code into java.lang.invoke and see how the platform uses metafactories to dynamically spin the classes that actually implement the target types for the lambda expressions.

The lambda metafactories

The BSM makes a call to this static method, which ultimately returns a call site object. When the invokedynamic instruction is executed, the method handle contained in the call site will return an instance of a class that implements the lambda’s target type.

The source code for the metafactory method is relatively simple:

public static CallSite metafactory(MethodHandles.Lookup caller,
                                   String invokedName,
                                   MethodType invokedType,
                                   MethodType samMethodType,
                                   MethodHandle implMethod,
                                   MethodType instantiatedMethodType)
        throws LambdaConversionException {
    AbstractValidatingLambdaMetafactory mf;
    mf = new InnerClassLambdaMetafactory(caller, invokedType,
                                         invokedName, samMethodType,
                                         implMethod, instantiatedMethodType,
                                         false, EMPTY_CLASS_ARRAY, EMPTY_MT_ARRAY);
    mf.validateMetafactoryArgs();
    return mf.buildCallSite();
}

The lookup object corresponds to the context where the invokedynamic instruction lives. In this case, that is the same class where the lambda was defined, so the lookup context will have the correct permissions to access the private method that the lambda body was compiled into.

The invoked name and type are provided by the VM and are implementation details. The final three parameters are the additional static arguments from the BSM.
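The metafactory is a public API, so its arguments can be explored by calling it directly. The sketch below is an illustration rather than something application code would normally do; the class name and the length helper are assumptions. It builds the equivalent of s -> s.length() for Function<String, Integer>, passing the factory type, the erased interface method type, the implementation handle, and the specialized type in the same order as the static arguments shown earlier.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.Function;

class MetafactoryByHand {
    // Plays the role of the compiler-generated lambda body method.
    private static Integer length(String s) {
        return s.length();
    }

    @SuppressWarnings("unchecked")
    static Function<String, Integer> lengthFunction() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType specialized = MethodType.methodType(Integer.class, String.class);
            MethodHandle body = lookup.findStatic(MetafactoryByHand.class, "length", specialized);

            CallSite site = LambdaMetafactory.metafactory(
                    lookup,
                    "apply",                                            // interface method name
                    MethodType.methodType(Function.class),              // factory type: nothing captured
                    MethodType.methodType(Object.class, Object.class),  // erased Function.apply
                    body,                                               // the lambda body
                    specialized);                                       // (String)Integer

            // Invoking the factory yields an object implementing Function.
            return (Function<String, Integer>) site.getTarget().invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(lengthFunction().apply("lambda"));
    }
}
```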

In the current implementation, the metafactory delegates to code that uses an internal, shaded copy of the ASM bytecode libraries to spin up an inner class that implements the target type.

If the lambda does not capture any variables from its enclosing scope, the resulting object is stateless, so the implementation optimizes by precomputing a single instance, effectively making the lambda’s implementation class a singleton:

jshell> Function<String, Integer> makeFn() {
   ...>   return s -> s.length();
   ...> }
|  created method makeFn()

jshell> var f1 = makeFn();
f1 ==> $Lambda$27/0x0000000800b8f440@533ddba

jshell> var f2 = makeFn();
f2 ==> $Lambda$27/0x0000000800b8f440@533ddba

jshell> var f3 = makeFn();
f3 ==> $Lambda$27/0x0000000800b8f440@533ddba

This is one reason why the documentation strongly discourages Java programmers from relying upon any form of identity semantics for lambdas.

Conclusion

This article explored the fine-grained details of exactly how the JVM implements support for lambda expressions. This is one of the more complex platform features you’ll encounter, because it is deep into language implementer territory.

Along the way, I’ve discussed invokedynamic and the method handles API. These are two key techniques that are major parts of the modern JVM platform. Both of these mechanisms are seeing increased use across the ecosystem; for example, invokedynamic has been used to implement a new form of string concatenation in Java 9 and above.

Understanding these features gives you key insight into the innermost workings of the platform and the modern frameworks upon which Java applications rely.

Ben Evans

Ben Evans (@kittylyst) is a Java Champion and Principal Engineer at New Relic. He has written five books on programming, including Optimizing Java (O’Reilly). Previously he was a founder of jClarity (acquired by Microsoft) and a member of the Java SE/EE Executive Committee.

Understanding the JDK’s New Superfast Garbage Collectors

ZGC, Shenandoah, and improvements to G1 get developers closer than ever to pauseless Java.

November 21, 2019

Some of the most exciting developments that have occurred in the last six months have been under the hood in the JDK’s garbage collectors (GCs). This article covers a range of different improvements, many of which first appeared in JDK 12 and continued in JDK 13. First, we’ll describe Shenandoah, a low-latency GC that operates mostly concurrently with the application. We will also cover recent improvements to ZGC (a low-latency concurrent GC introduced in Java 11) that were released as part of JDK 12. And we’ll explain in detail two improvements to the Garbage First (G1) GC, which has been the default GC from Java 9 onwards.

Overview of GCs

One of Java’s greatest productivity benefits for developers compared to older languages such as C and C++ is the use of garbage collection. As a Java developer, you mostly don’t need to worry about leaking memory if you don’t explicitly free memory locations, and you don’t need to worry about crashing your application if you free memory before you’re done using it. Garbage collection is a big productivity win, but time and time again, developers have been concerned with its performance implications. Will it slow your application down? Will it cause individual pauses to the application that will cause a poor experience for your users?

Many garbage collection algorithms have been tried and tested over the years, iteratively improving their performance. There are two common areas of performance for such algorithms. The first is garbage collection throughput: How much of your application’s CPU time is spent performing garbage collection work rather than running application code? The second is the delay created—that is, the latency of individual pauses.
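As a rough way to observe the first of these measures from inside a running application, the standard management beans report accumulated collection time per collector. The sketch below (the class name is illustrative) divides that total by JVM uptime. Note that getCollectionTime() reports approximate elapsed time rather than CPU time, so this is only a proxy for the throughput definition above.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcTimeFraction {
    // Sums the approximate accumulated collection time over all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the value is undefined
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // Fraction of JVM uptime attributed to garbage collection so far.
    static double gcFraction() {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : (double) totalGcMillis() / uptimeMillis;
    }

    public static void main(String[] args) {
        System.out.printf("GC time: %d ms, fraction of uptime: %.4f%n",
                totalGcMillis(), gcFraction());
    }
}
```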

For many pausing GCs (for example, Parallel GC, which was the default GC before Java 9), increasing the heap size of the application improves throughput but makes worst-case pauses longer. For GCs with this profile, larger heaps mean that your garbage collection cycles run less frequently and, thus, amortize their collection work more effectively, but the individual pause times take longer because there’s more work to do in an individual cycle. Using the Parallel GC on a large heap can result in significant pauses, because the time it takes to collect the old generation of allocated objects scales with the size of the generation and, thus, the heap. But if you’re running something such as a non-interactive batch job, the Parallel GC can be an efficient collector.

Since Java 9, the G1 collector has been the default GC in OpenJDK and Oracle JDK. G1’s overall approach to garbage collection is to slice up GC pauses according to a user-supplied time target. This means that if you want shorter pauses, set a lower target, and if you want less of the CPU used by your GC and more used by your application, set a larger target. Whereas Parallel GC was a throughput-oriented collector, G1 tries to be a jack of all trades: It offers lesser throughput but better pause times.

However, G1 isn’t a master of pause times. As the amount of work to be done during a garbage collection cycle increases, either due to a very large heap or to rapidly allocating lots of objects, the time-slicing approach starts to hit a wall. By analogy, chopping a big piece of food into small pieces makes those pieces easier to digest, but if you’ve just got too much food on your plate, it’s going to take you ages to eat dinner. Garbage collection works the same way.

This is the problem space that JDK 12’s Shenandoah GC attacks: It’s a latency specialist. It can consistently achieve low pause times, even on large heaps. It might spend a bit more CPU time performing garbage collection work than the Parallel GC, but the pause times are greatly reduced. That’s great for low-latency systems in the finance, gambling, or advertising industries or even for interactive websites where users can be frustrated by long pauses.

In this article, we explain the latest versions of these GCs as well as the recent updates to G1 and, we hope, help guide you to the balance of features that works best for your applications.

Shenandoah

Shenandoah is a new GC that was released as part of JDK 12. In fact, the Shenandoah development effort backports improvements to JDK 8u and 11u releases as well, which is great if you haven’t had the opportunity to upgrade to JDK 12.

Let’s look at who should think about switching over to it and why. We won’t be going into too much detail about how Shenandoah works under the hood, but if you’re interested in the technology, you should look at the accompanying article and also at the Shenandoah page on the OpenJDK wiki.

Shenandoah’s key advance over G1 is to do more of its garbage collection cycle work concurrently with the application threads. G1 can evacuate its heap regions, that is, move objects, only when the application is paused, while Shenandoah can relocate objects concurrently with the application. To achieve the concurrent relocation, it uses what’s known as a Brooks pointer. This pointer is an additional field that each object in the Shenandoah heap has and which points back to the object itself.

Shenandoah does this because when it moves an object, it also needs to fix up all the objects in the heap that have references to that object. When Shenandoah moves an object to a new location, it leaves the old Brooks pointer in place, forwarding references to the new location of the object. When an object is referenced, the application follows the forwarding pointer to the new location. Eventually the old object with the forwarding pointer needs to be cleaned up, but by decoupling the cleanup operation from the step of moving the object itself, Shenandoah can more easily accomplish the concurrent relocation of objects.
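The forwarding idea can be illustrated with a toy model in plain Java. This is a single-threaded sketch with made-up names, not Shenandoah's implementation, which works inside the JVM and has to coordinate with running application threads. Each object carries a forwardee field that initially points to itself, and every access goes through it.

```java
class BrooksPointerToy {
    static final class Obj {
        Obj forwardee = this; // initially points back to the object itself
        int value;

        Obj(int value) {
            this.value = value;
        }
    }

    // Every access follows the forwarding pointer first.
    static Obj resolve(Obj ref) {
        return ref.forwardee;
    }

    // "Moves" the object: copies it and leaves a forwarding pointer behind.
    static Obj relocate(Obj old) {
        Obj copy = new Obj(old.value);
        old.forwardee = copy;
        return copy;
    }

    static int demo() {
        Obj original = new Obj(41);
        Obj staleRef = original;        // a reference that has not been fixed up yet
        relocate(original);
        resolve(staleRef).value++;      // the write lands in the new copy
        return resolve(staleRef).value;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the stale reference still resolves to the new copy, fixing up references and reclaiming the old copy can happen later, which is the decoupling described above.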

To use Shenandoah in your application from Java 12 onwards, enable it with the following options:

-XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC
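One way to check which collector a JVM actually selected is to list its garbage collector management beans at startup. This is a minimal sketch with an illustrative class name; the bean names are implementation-specific, so they are printed rather than compared against fixed strings.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    // Returns the names of the collector beans registered in this JVM.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(names());
    }
}
```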

If you can’t yet make the jump to Java 12, but you are interested in trying out Shenandoah, backports to Java 8 and Java 11 are available. It’s worth noting that Shenandoah isn’t enabled in the JDK builds that Oracle ships, but other OpenJDK distributors enable Shenandoah by default. More details on Shenandoah can be found in JEP 189.

Shenandoah isn’t the only option when it comes to concurrent GCs. ZGC is another GC that is shipped with OpenJDK (including with Oracle’s builds), and it has been improved in JDK 12. So if you have an app that suffers from garbage collection pause problems and you are thinking about trying Shenandoah, you should also look at ZGC, which we describe next.

ZGC with Concurrent Class Unloading

The primary goals of ZGC are low latency, scalability, and ease of use. To achieve this, ZGC allows a Java application to continue running while it performs all garbage collection operations except thread stack scanning. It scales from a few hundred MB to TB-size Java heaps, while consistently maintaining very low pause times—typically within 2 ms.

The implications of predictably low pause times could be profound for both application developers and system architects. Developers will no longer need to worry about designing elaborate ways to avoid garbage collection pauses. And system architects will not require specialized GC performance tuning expertise to achieve the dependably low pause times that are very important for so many use cases. This makes ZGC a good fit for applications that require large amounts of memory, such as with big data. However, ZGC is also a good candidate for smaller heaps that require predictable and extremely low pause times.

ZGC was added to JDK 11 as an experimental feature. In JDK 12, ZGC added support for concurrent class unloading, allowing Java applications to continue running during the unloading of unused classes instead of pausing execution.

Performing concurrent class unloading is complicated and, therefore, class unloading has traditionally been done in a stop-the-world pause. Determining the set of classes that are no longer used requires performing reference processing first. Then there’s the processing of finalizers, which is how we refer to implementations of the Object.finalize() method. As part of reference processing, the set of objects reachable from finalizers must be traversed, because a finalizer could transitively keep a class alive through an unbounded chain of links. Unfortunately, visiting all objects reachable from finalizers could take a very long time. In the worst-case scenario, the whole Java heap could be reachable from a single finalizer. ZGC has run reference processing concurrently with the Java application since its introduction in JDK 11.

After reference processing has finished, ZGC knows which classes are no longer needed. The next step is to clean all data structures containing stale and invalid data as a result of these classes dying. Links from data structures that are alive to data structures that have become invalid or dead are cleared. The data structures that need to be walked for this unlinking operation include several internal JVM data structures, such as the code cache (containing all JIT-compiled code), class loader data graph, string table, symbol table, profile data, and so forth. After unlinking the dead data structures is finished, those dead data structures are walked again to delete them, so that memory is finally reclaimed.

Until now, all JDK GCs have done all of this in a stop-the-world operation, causing latency issues for Java applications. For a low-latency GC, this is problematic. Therefore, ZGC now runs all of this concurrently with the Java application and, hence, pays no latency penalty for supporting class unloading. In fact, the mechanisms introduced to perform concurrent class unloading improved latencies even further. The time spent inside of stop-the-world pauses for garbage collection is now proportional only to the number of threads in the application. The significant effect this approach has on pause times is shown in Figure 1.

Figure 1. The pause times of ZGC compared with other GCs

ZGC is currently available as an experimental GC for the Linux/x86 64-bit platform and, as of Java 13, on Linux/AArch64. You can enable it with the following command-line options:

-XX:+UnlockExperimentalVMOptions -XX:+UseZGC

More information on ZGC can be found on the OpenJDK wiki.

G1 Improvements

Some organizations cannot change their runtime systems to use experimental GCs. They will be happy to know that G1 has enjoyed several improvements. The G1 collector time-slices its garbage collection cycles into multiple different pauses.

Objects are initially considered to be part of the “young” generation after they are allocated. As they stay alive over multiple garbage collection cycles, they eventually “tenure” and are then considered “old.” Different regions within G1 contain objects from only one generation and can thus be referred to as young regions or old regions.

For G1 to meet the pause-time goals, it needs to be able to identify a chunk of work that can be done within the pause time goal and finish that work by the time the pause goal expires. G1 has a complicated set of heuristics for identifying the right size of work, and these heuristics are good at predicting the required work time, but they are not always accurate. Complicating the picture still further is the fact that G1 can’t collect only parts of young regions; it collects all the young regions in one garbage collection pass.

In Java 12, this situation is improved by adding the ability to abort G1 collection pauses. G1 keeps track of how accurately its heuristics are predicting the number of regions to collect and proceeds only with abortable garbage collections if it needs to. It proceeds by splitting up the collection set (the set of regions that will be garbage collected in a cycle) into two groups: mandatory regions and optional regions.

Mandatory regions are always collected within a GC cycle. Optional regions are collected as time allows, and the collection pass is aborted if it runs out of time without collecting the optional regions. The mandatory regions are all the young regions and potentially some old regions. Old-generation regions are added to this set to respond to two criteria. Some are added to ensure that the evacuation of objects can proceed and some in order to use up the expected pause time.
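The split between the two groups can be sketched as a toy model. The class below is not G1 code: the names are made up, and fixed per-region cost estimates stand in for the time measurements a real collector would take. Mandatory regions are always processed, and optional regions are processed only while the pause goal would not be exceeded.

```java
class AbortableCollectionToy {
    // Returns how many regions were collected within one simulated pause.
    static int collect(long[] mandatoryCostsMs, long[] optionalCostsMs, long pauseGoalMs) {
        long spentMs = 0;
        int collected = 0;

        // Mandatory regions are collected regardless of the pause goal.
        for (long cost : mandatoryCostsMs) {
            spentMs += cost;
            collected++;
        }

        // Optional regions are collected only as time allows.
        for (long cost : optionalCostsMs) {
            if (spentMs + cost > pauseGoalMs) {
                break; // abort: the remaining optional regions wait for a later pass
            }
            spentMs += cost;
            collected++;
        }
        return collected;
    }

    public static void main(String[] args) {
        System.out.println(collect(new long[] {5, 5}, new long[] {3, 3, 3}, 15));
    }
}
```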

The heuristic to calculate how many regions to add proceeds by dividing the number of regions in the collection set candidates by the value of -XX:G1MixedGCCountTarget. If G1 predicts there will be time left to collect more old-generation regions, then it also adds more regions to the mandatory region set until it expects to use up 80% of the available pause time.
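The division in this heuristic is simple arithmetic, sketched below with an illustrative class name. Rounding up is an assumption of this sketch, and the count target of 8 in the usage example is only a sample value: with 100 candidate regions it gives 13 old regions per mixed collection.

```java
class MixedGcRegionCount {
    // Candidate regions divided by the mixed GC count target, rounded up
    // (the rounding direction is an assumption of this sketch).
    static int minOldRegions(int candidateRegions, int mixedGcCountTarget) {
        if (candidateRegions <= 0) {
            return 0;
        }
        return (candidateRegions + mixedGcCountTarget - 1) / mixedGcCountTarget;
    }

    public static void main(String[] args) {
        System.out.println(minOldRegions(100, 8));
    }
}
```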

As a result, G1 is able to abort, or end, its mixed GC cycles early. This lowers GC pause latency and makes it more likely that G1 meets its pause-time target. This improvement is detailed in JEP 344.

Prompt Return of Unused, Committed Memory

One of the most common criticisms leveled at Java is that it’s a memory hog—well, not anymore! Sometimes, JVMs are allocated more memory than they need through command-line options; and if no memory-related command-line options are provided, the JVM may allocate more memory than needed. Allocating RAM that goes unused wastes money, especially in cloud environments where all resources are metered and costed appropriately. But what can be done to solve this situation, and can Java be improved in terms of resource consumption?

A common situation is that the workload that a JVM must handle changes over time: Sometimes it needs more memory and sometimes less. In practice, this is often irrelevant because JVMs tend to allocate a large quantity of memory on startup and greedily hold onto it even when they don’t need it. In an ideal world, unused memory could be returned from the JVM back to the operating system so other applications or the container would be able to use it. As of Java 12, this return of unused memory is now possible.

G1 already has the capability to free unused memory, but it does so only during full garbage collection passes. Full garbage collection passes are often infrequent and an undesirable occurrence, because they can entail a long stop-the-world application pause. In JDK 12, G1 gained the ability to free unused memory during concurrent garbage collection passes. This feature is especially useful for mostly empty heaps. When heaps are mostly empty, it can take a while for a GC cycle to scoop up the memory and return it to the operating system. To ensure that memory is promptly returned to the operating system, as of Java 12, G1 will try to trigger concurrent garbage collection cycles if a garbage collection cycle hasn’t happened for the period specified on the command line by the G1PeriodicGCInterval argument. This concurrent garbage collection cycle will then release memory to the operating system at the end of the cycle.

To ensure that these periodic concurrent garbage collection passes don’t add unnecessary CPU overhead, they are run only when the system is partially idle. The measurement used to trigger whether the concurrent cycle runs or not is the average one-minute system load value, which has to be below the value specified by G1PeriodicGCSystemLoadThreshold.
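The two conditions described above can be sketched as a small predicate. This is only a model of the decision as the text states it, not the JVM's logic. The class and method names are hypothetical, the interval is assumed to be in milliseconds, and any special meaning of particular flag values is ignored.

```java
import java.lang.management.ManagementFactory;

// A model of the periodic-GC decision as described in the text; not JVM code.
class PeriodicGcCheck {

    // True when the interval has elapsed without a collection and the
    // one-minute system load is below the configured threshold.
    static boolean shouldStartConcurrentCycle(long millisSinceLastGc,
                                              long periodicGcIntervalMillis,
                                              double oneMinuteLoad,
                                              double loadThreshold) {
        return millisSinceLastGc >= periodicGcIntervalMillis
                && oneMinuteLoad < loadThreshold;
    }

    public static void main(String[] args) {
        // getSystemLoadAverage() reports the load average for the last minute,
        // or a negative value on platforms where it is not available.
        double load = ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
        // Hypothetical settings: a 60-second interval and a load threshold of 2.0.
        System.out.println(shouldStartConcurrentCycle(90_000, 60_000, load, 2.0));
    }
}
```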

More details can be found in JEP 346.

Conclusion

This article presented several ways in which you can stop worrying about GC-induced pause times in your applications. While G1 continues to improve, it’s good to know that as heap sizes increase and the acceptability of pause times is reduced, new GCs such as Shenandoah and ZGC offer a scalable, low-pause future.

Raoul-Gabriel Urma

Raoul-Gabriel Urma (@raoulUK) is the CEO and cofounder of Cambridge Spark, a leading learning community for data scientists and developers in the UK. He is also chairman and cofounder of Cambridge Coding Academy, a community of young coders and students. Urma is coauthor of the best-selling programming book Java 8 in Action (Manning Publications, 2015). He holds a PhD in computer science from the University of Cambridge.

Richard Warburton

Richard Warburton (@richardwarburto) is a software engineer, teacher, author, and Java Champion. He is the author of the best-selling Java 8 Lambdas (O’Reilly Media, 2014) and helps developers learn via Iteratr Learning and at Pluralsight. Warburton has delivered hundreds of talks and training courses. He holds a PhD from the University of Warwick.

12 recipes for using the Optional class as it’s meant to be used

Follow these dozen best practices to protect your applications against ugly null pointer exceptions—and make your code more readable and concise.

June 22, 2020

Every serious Java developer or architect has heard about or experienced the nuisance of NullPointerException exceptions.

What can you do? Often programmers use the null reference to denote the absence of a value when they return values from methods, but that is a significant source of many problems.

To have good insight into the null reference problem, consider reading Raoul-Gabriel Urma’s article, “Tired of Null Pointer Exceptions? Consider Using Java SE 8’s ‘Optional’!” That’ll bring you up to speed and introduce you to the Optional class.

Let’s build on Urma’s work by seeing how to use Optional the way it should be used. From the experience and hands-on point of view I gained when I was reviewing developers’ code, I realized developers are using the Optional class in their day-to-day code. That led me to come up with these 12 best practices that will help you improve your skills—and avoid antipatterns.

This article and a follow-up article that will appear in Java Magazine soon will go through all the Optional class methods released through Java 14. Since Java 15 is nearing completion, that will be covered too.

The origin of the Optional class

Java engineers have been working on the null problem for a long time and they tried to ship a solution with Java 7, but that solution wasn’t added to the release. Let’s think back, though, and imagine the language designers’ thoughts about the Stream API. There is a logical return type (such as 0) to some methods such as count() and sum() when the value is absent. A zero makes sense there.

But what about the findFirst() and findAny() methods? It doesn’t make sense to return null values if there aren’t any inputs. There should be a value type that represents the presence (or absence) of those inputs.

Therefore, in Java 8, a new type was added called Optional<T>, which indicates the presence or absence of a value of type T. Optional was intended to be a return type and for use when it is combined with streams (or methods that return Optional) to build fluent APIs. Additionally, it was intended to help developers deal with null references properly.

By the way, here is how Optional is described in the Java SE 11 documentation: “Optional is primarily intended for use as a method return type where there is a clear need to represent ‘no result,’ and where using null is likely to cause errors. A variable whose type is Optional should never itself be null; it should always point to an Optional instance.”

My own definition, as you are going to see in my code recipes, is this: The Optional class is a container type for a value that may be absent.

Moreover, several corner cases and temptations can be considered traps that could downgrade the quality of your code or even cause unexpected behaviors. I am going to show those in this series of articles.

You might now wonder, “Where is the code?” So, let’s jump in. My approach is to answer typical developer questions that categorize all the uses of the Optional class. In this article, I will cover the following three big categories with 12 recipes:

  • Why am I getting null even when I use Optional?
  • What should I return or set when no value is present?
  • How do I consume Optional values effectively?

In an upcoming article, I will cover two more categories:

  • How do I avoid Optional antipatterns?
  • I like Optional; what can I do more professionally?

Why am I getting null even when I use Optional?

Usually, this question applies to the creation of an Optional class and how to get the data.

Recipe 1: Never assign null to an optional variable. Sometimes when developers are dealing with a database to search for an employee, they design a method to return Optional<Employee>. I have found developers still returning null if no result is returned from the database, for example:

1 public Optional<Employee> getEmployee(int id) {
2    // perform a search for employee 
3    Optional<Employee> employee = null; // in case no employee
4    return employee; 
5 }

The code above is not correct, and you should avoid it completely. To correct it, you should replace line 3 with the following line, which initializes Optional with an empty Optional:

Optional<Employee> employee = Optional.empty();

Optional is a container that may hold a value, and it is useless to initialize it with null.

API note: The empty() method has existed since Java 8.
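Here is a minimal runnable sketch of the corrected pattern. The in-memory map standing in for the database, the use of a String name instead of an Employee type, and all identifiers are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class EmployeeLookup {
    // Hypothetical stand-in for a database table (id -> employee name).
    private static final Map<Integer, String> EMPLOYEES = new HashMap<>();
    static {
        EMPLOYEES.put(1, "Alice");
    }

    static Optional<String> getEmployeeName(int id) {
        String name = EMPLOYEES.get(id);
        if (name == null) {
            // No result: return an empty Optional, never a null reference.
            return Optional.empty();
        }
        return Optional.of(name);
    }

    public static void main(String[] args) {
        System.out.println(getEmployeeName(1).isPresent());  // prints true
        System.out.println(getEmployeeName(99).isPresent()); // prints false
    }
}
```

Because the method never returns null, callers can chain Optional methods on the result without a null check.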

Recipe 2: Don’t call get() directly. Consider the following piece of code. What is wrong with it?

Optional<Employee> employee = HRService.getEmployee();
Employee myEmployee = employee.get();

Did you guess that the “employee” Optional is prone to being empty, so calling get() directly throws a java.util.NoSuchElementException? If so, you are right. Otherwise, if you think calling get() is going to make your day, you are mistaken. You should always check first for the presence of a value by using the isPresent() method, as in the following:

if (employee.isPresent()) {
    Employee myEmployee = employee.get();
    ... // do something with "myEmployee"
} else {
    ... // do something that doesn't call employee.get()
}

Note that the code above is boilerplate and is not preferable. Next, you are going to see a lot of elegant alternatives to calling isPresent()/get() pairs.

API note: The isPresent() and get() methods have existed since Java 8.
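To see the failure mode concretely, the short sketch below calls get() on both a populated and an empty Optional and reports what happens. The class and method names are hypothetical.

```java
import java.util.NoSuchElementException;
import java.util.Optional;

class GetOnEmpty {

    // Calls get() without checking first and reports the outcome.
    static String describe(Optional<String> employee) {
        try {
            return "got " + employee.get();
        } catch (NoSuchElementException e) {
            return "get() threw NoSuchElementException";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(Optional.of("Alice"))); // prints got Alice
        System.out.println(describe(Optional.empty()));     // prints get() threw NoSuchElementException
    }
}
```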

Recipe 3: Don’t use null directly to get a null reference when you have an Optional. In some cases, you need to have a null reference. If you have an Optional, don’t use null directly. Instead, use orElse(null).

Consider the following example of calling the invoke() method of the Reflection API’s Method class, which invokes a method at runtime. The first argument is null if the called method is static; otherwise, it is the instance of the class that contains the method.

1 public void callDynamicMethod(MyClass clazz, String methodName) throws ... {
2    Optional<MyClass> myClass = clazz.getInstance();
3    Method method = MyClass.class.getDeclaredMethod(methodName, String.class);
4    if (myClass.isPresent()) {
5        method.invoke(myClass.get(), "Test");
6    } else {
7        method.invoke(null, "Test");
8    }
9 }

Generally, you should avoid using orElse(null), although in such a case, using orElse(null) is preferable to using the code above. So, you can replace lines 4 through 8 with the following concise line of code:

4    method.invoke(myClass.orElse(null), "Test");

API note: The orElse() method has existed since Java 8.
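The following self-contained sketch shows the same idea end to end. It invokes a static method reflectively, so the receiver may legitimately be null. The class, its greet() method, and the call() helper are hypothetical names introduced only for this example.

```java
import java.lang.reflect.Method;
import java.util.Optional;

class ReflectionOrElseNull {

    // Hypothetical static target; a static method needs no receiver instance.
    public static String greet(String name) {
        return "Hello, " + name;
    }

    static Object call(Optional<ReflectionOrElseNull> instance, String methodName)
            throws ReflectiveOperationException {
        Method method = ReflectionOrElseNull.class.getDeclaredMethod(methodName, String.class);
        // orElse(null) supplies the instance when present and null otherwise.
        return method.invoke(instance.orElse(null), "Test");
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        System.out.println(call(Optional.empty(), "greet")); // prints Hello, Test
    }
}
```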

What should I return or set when no value is present?

The previous section covered how to avoid null reference problems even when you have Optional. Now it’s time to explore different ways of setting and returning data using Optional.

Recipe 4: Avoid using an isPresent() and get() pair for setting and returning a value. Consider the following code. What can you change to make it more elegant and effective?

public static final String DEFAULT_STATUS = "Unknown";
...
public String getEmployeeStatus(long id) {
    Optional<String> empStatus = ... ;
    if (empStatus.isPresent()) {
        return empStatus.get();
    } else {
        return DEFAULT_STATUS;
    }
}

Similar to recipe #3, just replace the isPresent() and get() pair with orElse(), as in the following:

public String getEmployeeStatus(long id) {
    Optional<String> empStatus = ... ;
    return empStatus.orElse(DEFAULT_STATUS); 
}

A very important note to consider here is a probable performance penalty: The argument passed to orElse() is always evaluated, regardless of whether the Optional holds a value. So the rule here is to use orElse() when you already have a preconstructed value, and not when the fallback value is expensive to compute.

API note: The orElse() method has existed since Java 8.

Recipe 5: Don’t use orElse() for returning a computed value. As mentioned in recipe #4, avoid using orElse() to return a computed value because there is a performance penalty. Consider the following code snippet:

Optional<Employee> getFromCache(int id) {
    System.out.println("Search in cache with Id: " + id);
    // get value from cache
}

Optional<Employee> getFromDB(int id) {
    System.out.println("Search in Database with Id: " + id);
    // get value from database
}

public Employee findEmployee(int id) {
    return getFromCache(id)
            .orElse(getFromDB(id)
                    .orElseThrow(() -> new NotFoundException("Employee not found with id " + id)));
}

First, the code tries to get an employee with a given ID from the cache, and if it is not in the cache, it tries to get it from the database. Then, if the employee is not in the cache or the database, the code throws a NotFoundException. If you run this code and the employee is in the cache, the following is printed:

Search in cache with Id: 1
Search in Database with Id: 1

Even though the employee will be returned from the cache, the database query is still called. That is very expensive, right? Instead, I will use orElseGet(Supplier<? extends T> supplier), which is like orElse() but with one difference: If the Optional is empty, orElse() returns a default value directly, whereas orElseGet() allows you to pass a Supplier function that is invoked only when the Optional is empty. That’s great for performance.

Now, consider that you rerun the previous code with the same assumption that the employee is in the cache but with the following change to orElseGet():

public Employee findEmployee(int id) {        
    return getFromCache(id)
        .orElseGet(() -> getFromDB(id)
            .orElseThrow(() -> {
                return new NotFoundException("Employee not found with id " + id);
            }));
}

This time, you will get what you want and a performance improvement: The code will print only the following:

Search in cache with Id: 1

Finally, don’t even think of using isPresent() and get() pairs because they are not elegant.

API note: The orElseGet() method has existed since Java 8.
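The difference can be measured with a small, self-contained sketch that counts how often the fallback lookup runs. The fixed cache and database contents, the String stand-in for Employee, and the "none" default are all assumptions made for illustration.

```java
import java.util.Optional;

class LazyFallback {
    // Counts how many times the (hypothetical) database lookup runs.
    static int dbCalls = 0;

    static Optional<String> getFromCache(int id) {
        return id == 1 ? Optional.of("Alice (cache)") : Optional.<String>empty();
    }

    static Optional<String> getFromDB(int id) {
        dbCalls++;
        return id == 2 ? Optional.of("Bob (db)") : Optional.<String>empty();
    }

    // orElse(): the argument expression is evaluated before orElse() is called.
    static String eager(int id) {
        return getFromCache(id).orElse(getFromDB(id).orElse("none"));
    }

    // orElseGet(): the supplier runs only when the cache lookup is empty.
    static String lazy(int id) {
        return getFromCache(id).orElseGet(() -> getFromDB(id).orElse("none"));
    }

    public static void main(String[] args) {
        dbCalls = 0;
        eager(1);
        System.out.println("orElse db calls: " + dbCalls);    // prints orElse db calls: 1
        dbCalls = 0;
        lazy(1);
        System.out.println("orElseGet db calls: " + dbCalls); // prints orElseGet db calls: 0
    }
}
```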

Recipe 6: Throw an exception in the absence of a value. There are cases when you want to throw an exception to indicate a value doesn’t exist. Usually this happens when you develop a service interacting with a database or other resources. With Optional, it is easy to do this. Consider the following example:

public Employee findEmployee(int id) {        
    var employee = getFromDB(id);
    if(employee.isPresent())
        return employee.get();
    else
        throw new NoSuchElementException();
}

The code above could be avoided and the task could be expressed elegantly, as follows:

public Employee findEmployee(int id) {        
    return getFromDB(id).orElseThrow();
}

The code above will throw java.util.NoSuchElementException to the caller method.

API note: The orElseThrow() method has existed since Java 10. If you’re still stuck on Java 8 or 9, consider recipe #7.

Recipe 7: How can I throw explicit exceptions when no value is present? In recipe #6, you were limited to throwing one kind of implicit exception: NoSuchElementException. Such an exception is not enough to report the problem with more-descriptive and relevant information to the client.

If you recall, recipe #5 used the method orElseThrow(Supplier<? extends X> exceptionSupplier), which is an elegant alternative to isPresent() and get() pairs. Therefore, try to avoid this:

@GetMapping("/employees/{id}")
public Employee getEmployee(@PathVariable("id") String id) {
    Optional<Employee> foundEmployee = HrRepository.findByEmployeeId(id);
    if(foundEmployee.isPresent())
        return foundEmployee.get();
    else
        throw new NotFoundException("Employee not found with id " + id);
}

The orElseThrow() method simply throws the explicit exception you pass to it if there is no value present with the Optional. So, let’s change the previous method to be more elegant, like this:

@GetMapping("/employees/{id}")
public Employee getEmployee(@PathVariable("id") String id) {
    return HrRepository
    .findByEmployeeId(id)
    .orElseThrow(
        () -> new NotFoundException("Employee not found with id " + id));
}

Additionally, if you are only interested in throwing an empty exception, here it is:

return status.orElseThrow(NotFoundException::new);

API note: If you pass a null to the orElseThrow() method, it will throw a NullPointerException if no value is present. The orElseThrow(Supplier<? extends X> exceptionSupplier) method has existed since Java 8.
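For readers who want to run the pattern outside a web framework, here is a minimal sketch. The NotFoundException class, the findByEmployeeId() lookup, and its hard-coded data are hypothetical stand-ins, not the types used in the snippets above.

```java
import java.util.Optional;

class ExplicitException {

    // Hypothetical unchecked exception carrying a descriptive message.
    static class NotFoundException extends RuntimeException {
        NotFoundException(String message) {
            super(message);
        }
    }

    // Hypothetical repository lookup: only id "42" exists.
    static Optional<String> findByEmployeeId(String id) {
        return "42".equals(id) ? Optional.of("Alice") : Optional.<String>empty();
    }

    static String getEmployee(String id) {
        // The supplier runs, and the exception is created, only when no value is present.
        return findByEmployeeId(id)
                .orElseThrow(() -> new NotFoundException("Employee not found with id " + id));
    }

    public static void main(String[] args) {
        System.out.println(getEmployee("42")); // prints Alice
        try {
            getEmployee("7");
        } catch (NotFoundException e) {
            System.out.println(e.getMessage()); // prints Employee not found with id 7
        }
    }
}
```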

How do I consume Optional values effectively?

Recipe 8: Don’t use isPresent()-get() if you want to perform an action only when an Optional value is present. Sometimes you want to perform an action only if an Optional value is present and do nothing if it is not. That is the job of the ifPresent(Consumer<? super T> action) method, which takes a consumer action as an argument. For the record, avoid the following:

1 Optional<String> confName = Optional.of("CodeOne");
2 if(confName.isPresent())
3    System.out.println(confName.get().length());

It is perfectly fine to use ifPresent(); just replace lines 2 and 3 above with one line, as follows:

confName.ifPresent( s -> System.out.println(s.length()));

API note: The ifPresent() method returns nothing, and it has existed since Java 8.
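A small sketch makes the "do nothing when empty" behavior visible by applying ifPresent() to a mix of present and empty values. The class name, method name, and sample data are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

class IfPresentDemo {

    // Collects the length of each present name; empty Optionals are skipped.
    static List<Integer> lengths(List<Optional<String>> names) {
        List<Integer> result = new ArrayList<>();
        for (Optional<String> name : names) {
            name.ifPresent(s -> result.add(s.length()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Optional<String>> names = new ArrayList<>();
        names.add(Optional.of("CodeOne"));
        names.add(Optional.<String>empty());
        names.add(Optional.of("Java"));
        System.out.println(lengths(names)); // prints [7, 4]
    }
}
```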

Recipe 9: Don’t use isPresent()-get() to execute an empty-based action if a value is not present. Developers sometimes write code that does something if an Optional value is present but executes an empty-based action if it is not, like this:

1 Optional<Employee> employee = ... ;
2 if(employee.isPresent()) {
3    log.debug("Found Employee: {}" , employee.get().getName());
4 } else {
5    log.error("Employee not found");
6 }

Note that ifPresentOrElse() is like ifPresent() with the only difference being that it covers the else branch as well. Therefore, you can replace lines 2 through 6 with this:

employee.ifPresentOrElse(
    emp -> log.debug("Found Employee: {}", emp.getName()),
    () -> log.error("Employee not found"));

API note: The ifPresentOrElse() method has existed since Java 9.
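The sketch below exercises both branches without a logging framework by building the message in a StringBuilder. The class name, method name, and the String stand-in for Employee are hypothetical.

```java
import java.util.Optional;

class IfPresentOrElseDemo {

    // Builds a message from either branch instead of writing to a logger.
    static String report(Optional<String> employee) {
        StringBuilder message = new StringBuilder();
        employee.ifPresentOrElse(
                name -> message.append("Found Employee: ").append(name),
                () -> message.append("Employee not found"));
        return message.toString();
    }

    public static void main(String[] args) {
        System.out.println(report(Optional.of("Alice"))); // prints Found Employee: Alice
        System.out.println(report(Optional.empty()));     // prints Employee not found
    }
}
```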

Recipe 10: Return another Optional when no value is present. In some cases, if the value of the Optional at hand is present, return an Optional describing the value; otherwise, return an Optional produced by the supplying function. Avoid doing the following:

Optional<String> defaultJobStatus = Optional.of("Not started yet.");
public Optional<String> fetchJobStatus(int jobId) {
    Optional<String> foundStatus = ... ; // fetch declared job status by id
    if (foundStatus.isPresent())
        return foundStatus;
    else
        return defaultJobStatus; 
}

Don’t overuse the orElse() or orElseGet() methods to accomplish this because both methods return an unwrapped value. So avoid doing something like this also:

public Optional<String> fetchJobStatus(int jobId) {
    Optional<String> foundStatus = ... ; // fetch declared job status by id
    return foundStatus.orElseGet(() -> Optional.<String>of("Not started yet."));
}

The perfect and elegant solution is to use the or(Supplier<? extends Optional<? extends T>> supplier) method, as follows:

1 public Optional<String> fetchJobStatus(int jobId) {
2    Optional<String> foundStatus = ... ; // fetch declared job status by id
3    return foundStatus.or(() -> defaultJobStatus);
4 }

Or, even without defining the defaultJobStatus optional at the beginning, you could replace the code in line 3 with this:

return foundStatus.or(() -> Optional.of("Not started yet."));

API note: The or() throws NullPointerException if the supplying function is null or produces a null result. This method has existed since Java 9.
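As a runnable illustration, the sketch below fills in the elided lookup with a hypothetical rule (only job 1 has a declared status) so that both outcomes of or() can be observed. The lookup rule and the "Running" value are assumptions.

```java
import java.util.Optional;

class OrDemo {

    static Optional<String> fetchJobStatus(int jobId) {
        // Hypothetical lookup: only job 1 has a declared status.
        Optional<String> foundStatus =
                jobId == 1 ? Optional.of("Running") : Optional.<String>empty();
        // or() keeps the result wrapped in an Optional, unlike orElse() and orElseGet().
        return foundStatus.or(() -> Optional.of("Not started yet."));
    }

    public static void main(String[] args) {
        System.out.println(fetchJobStatus(1)); // prints Optional[Running]
        System.out.println(fetchJobStatus(2)); // prints Optional[Not started yet.]
    }
}
```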

Recipe 11: Get an Optional’s status regardless of whether it is empty. Since Java 11, you can directly check if an Optional is empty by using the isEmpty() method, which returns true if the Optional is empty. Consider this code:

1 public boolean isMovieListEmpty(int id){
2    Optional<MovieList> movieList = ... ;
3    return !movieList.isPresent();
4 }

You can replace line 3 with the following line to make the code more readable:

return movieList.isEmpty();

API note: The isEmpty() method has existed since Java 11.

Recipe 12: Don’t overuse Optional. Sometimes developers tend to overuse things they like and the Optional class is one of them. I have found that developers can see a use case for Optional everywhere, by chaining its methods just for the single purpose of getting a value, and they forget about clarity, memory footprint, and being straightforward. So, avoid this:

1 public String fetchJobStatus(int jobId) {
2    String status = ... ; // fetch declared job status by id
3    return Optional.ofNullable(status).orElse("Not started yet.");
4 }

Be straightforward by replacing line 3 with this clearer line of code:

return status == null ? "Not started yet." : status;

Conclusion

Just as with any other Java language feature, Optional can be used correctly or abused. To know the best way to use the Optional class, you need to understand what has been explored in this article and keep the recipes handy to strengthen your toolkit.

I like this saying from Oliver Wendell Holmes, Sr.: “The young man knows the rules, but the old man knows the exceptions.”

I have only scratched the surface of the Optional class here. In a forthcoming article, I will dive into the Optional antipatterns I have observed from developers’ code and even my own code. I’ll then go through more-advanced recipes that deal with the Stream API, transformations, and many more cases to cover the use of all the remaining Optional class methods.

To learn more, see the Java SE 15 API documentation for Optional and see Java 8 in Action by Raoul-Gabriel Urma, Mario Fusco, and Alan Mycroft (Manning, 2014).

Mohamed Taman

Mohamed Taman (@_tamanm) is the CEO of SiriusXI Innovations and a Chief Solutions Architect for Effortel Telecommunications. He is based in Belgrade, Serbia, and is a Java Champion, an Oracle Groundbreaker, a JCP member, and a member of the Adopt-a-Spec program for Jakarta EE and Adopt-a-JSR for OpenJDK.