
Easy Searching with Elasticsearch

Using Elasticsearch’s high- and low-level APIs to search synchronously and asynchronously

January 10, 2020


Elasticsearch is an open source search engine built on top of a full-text search library called Apache Lucene. Apache Lucene is a Java library that provides indexing and search technology, spell-checking, and advanced analysis/tokenization capabilities. Lucene dates back to 1999 as a SourceForge project and joined the Apache Software Foundation in 2001. It is the backbone for at least two popular search engines: Solr and Elasticsearch. Both of these search engines are powerful and have their own strengths and weaknesses. Although Solr has been around longer, and historically has better documentation, Elasticsearch is a better choice for applications that require not only text search but also time series search and aggregations. This article concentrates on Elasticsearch only. To learn more about comparing the Solr and Elasticsearch engines, refer to this article.

Learning about Elasticsearch is a large topic. It encompasses search-optimized document design, query and analysis, mappings, cluster management, data ingestion, and security. In this article, I introduce core concepts of Elasticsearch and then explore in depth how to use the Elasticsearch Java API to create, update, delete, and search a document in an index. I describe both the low-level API and the high-level API for performing these operations as well as how to execute these tasks synchronously and asynchronously. I will also discuss how to stream data into an Elasticsearch cluster, which is necessary if you are reading data from a stream, a queue, or another source that is too large to be loaded into memory.

Getting Started

First, download Elasticsearch. Then start it by navigating to the installation bin directory and running elasticsearch.bat. Once the Elasticsearch engine has started, you will see “started” in the log output. Then you can open http://localhost:9200/ and you will receive a JSON response letting you know that your single-node cluster is up (see Figure 1).

Figure 1. JSON response showing an Elasticsearch cluster is running
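
If the cluster is up, the response is a small JSON document identifying the node and the Elasticsearch version. It looks roughly like this (the values will differ on your machine):

{
  "name" : "my-node",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "7.4.2"
  },
  "tagline" : "You Know, for Search"
}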

Core Concepts

Now that you have Elasticsearch running, let’s examine a few core concepts so you can familiarize yourself with the terminology. To make things easier, in Table 1, I compare each concept with a similar concept used in database technology. The comparisons are not entirely accurate, but they make learning the new terms a little easier.

Table 1. Comparison of Elasticsearch and database concepts

In this article, I work a lot with catalog items. For my purposes, catalog items have an ID, description, price, and sales rank (a number representing how popular the item is). A catalog item will belong to a category and will be produced by a manufacturer. A category will have a name and a parent category, while a manufacturer has a name and an address.
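
For reference, a minimal sketch of the CatalogItem model used throughout the examples could look like the following. The field types are my assumption, and the Category and Manufacturer classes (name and parent category, name and address, respectively) follow the same pattern:

public class CatalogItem {
    private Integer id;
    private String description;
    private double price;
    private int salesRank;
    private Category category;
    private Manufacturer manufacturer;
    // getters, setters, equals, hashCode, and toString omitted
}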

Low-Level Synchronous CRUD API

Now let’s explore the create, read, update, and delete (CRUD) API of the low-level client. The low-level client requires a minimal number of Elasticsearch dependencies, and it mirrors the REST endpoint API provided by Elasticsearch. As such, new releases of Elasticsearch should remain backward compatible with the low-level client dependencies. The client is called “low-level” because you need to do all the work of building the JSON request and manually parsing the response. In an environment where memory is limited, this might be the only solution available to you. The high-level client API is built on top of the low-level API, so it makes sense to start with the low-level API.

To get the low-level Elasticsearch libraries, all you need to do is to import the REST client as shown below for Maven. In my code, I also import libraries that will help serialize and deserialize JSON into the model classes, which are not shown here.

<dependency>
  <groupId>org.elasticsearch.client</groupId>
  <artifactId>elasticsearch-rest-client</artifactId>
  <version>7.0.0</version>
</dependency>

To build the REST client, I use the REST client builder and point it to the host (or hosts) the client will communicate with. The client is thread-safe, so it can be used for the entire lifecycle of the application. To release the underlying HTTP client resources, the client needs to be closed when the application is done with it. In this example, I use try-with-resources to initialize the client and, once it is initialized, pass it to the CrudMethodsSynchronous constructor. I use the CrudMethodsSynchronous class as the wrapper to call Elasticsearch’s create/update/delete API:

public static void main(String[] args) {
  try (RestClient client = RestClient.builder(
          new HttpHost("localhost", 9200, "http")).build()){
              CrudMethodsSynchronous scm = 
             new CrudMethodsSynchronous(
                 "catalog_item_low_level", client);
  }
}

To insert the document into an Elasticsearch index, I create a PUT request and ask the client to execute the request. The important part of the document creation is the HTTP method you use to create it: I chose PUT. You can use a POST method as well; in fact, POST is the preferred method for creating a record while PUT is the preferred method for updating a record. But in my case, because I run this sample program several times and sometimes I don’t clean the index, PUT works better. The PUT method is used here as an upsert (that is, an insert or update).

The URI is important as well. Let’s look at an example:

http://localhost:9200/catalog_item_low_level/_doc/1

The first part of it is the index that is similar to a database where documents are stored, which in this case is catalog_item_low_level. The second part is _doc, which indicates that you are dealing with a document. Before Elasticsearch 7, you would have specified the type here, for example, catalogitem. But types are no longer used. The last part is the ID of the document. Note that the index name needs to be in lowercase. In this example, I am creating documents one by one; later in this article, I will show how to create several items in bulk.

public void createCatalogItem(List<CatalogItem> items) {
  items.stream().forEach(e-> {
      
    Request request = new Request("PUT", 
            String.format("/%s/_doc/%d", 
                 getIndex(), e.getId()));
    try {
        request.setJsonEntity(
            getObjectMapper().writeValueAsString(e));
      
        getRestClient().performRequest(request);
      } catch (IOException ex) {
        LOG.warn("Could not post {} to ES", e, ex);
      }
  });
}

Once the items are created, I would like to find one item via a full-text search—in this example, I’ll look for a flashlight. This search will consider all the fields in the document, and it will return records in which any field contains flashlight as a token. One thing to note here is that Elasticsearch has the concept of both filters and searches. Filters are faster queries that are intended to return results but not rate their relevance, whereas searches return results and rate each result with a relevance score. (In this article, I look at searches only.)

List<CatalogItem> items = 
   scm.findCatalogItem("flashlight");
LOG.info("Found {} items: {}", items.size(), items);

To run a search with a low-level client, I need to issue a GET request that will run against my index with the following URI: /<indexname>/_search. Because the low-level API uses the Elasticsearch REST interface, I need to construct the REST query object by hand, which in my case is { "query" : {"query_string" : { "query": "flashlight" } } }.

After sending the request to Elasticsearch, I will receive a result in a Response object. The result contains the return status as well as an entity that represents the JSON response. To get CatalogItem results, I need to navigate through the response structure.

public List<CatalogItem> findCatalogItem(String text) {
    Request request = new Request("GET",
            String.format("/%s/_search", getIndex()));

    request.setJsonEntity(String.format(SEARCH, text));
    try {
        Response response = client.performRequest(request);
        if (response.getStatusLine().getStatusCode() == OK) {
            List<CatalogItem> catalogItems =
                parseResultsFromFullSearch(response);
            return catalogItems;
        }
    } catch (IOException ex) {
        LOG.warn("Could not post {} to ES", text, ex);
    }
    return Collections.emptyList();
}
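
The SEARCH format string and the parseResultsFromFullSearch helper used above are members of the sample class and are not shown in the listing. A minimal sketch of what they might look like, assuming Jackson is the JSON library (the hits sit under hits.hits[]._source in the Elasticsearch response):

private static final String SEARCH =
    "{ \"query\" : { \"query_string\" : { \"query\": \"%s\" } } }";

private List<CatalogItem> parseResultsFromFullSearch(Response response)
        throws IOException {
    String body = EntityUtils.toString(response.getEntity());
    List<CatalogItem> result = new ArrayList<>();
    // Walk the standard response layout: hits.hits[]._source
    JsonNode hits = getObjectMapper().readTree(body).path("hits").path("hits");
    for (JsonNode hit : hits) {
        result.add(getObjectMapper()
            .treeToValue(hit.path("_source"), CatalogItem.class));
    }
    return result;
}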

I first find the documents my search returned, and then I convert the returned JSON documents into my model. As you can see, I have to rely heavily on the Elasticsearch REST documentation to both create requests and parse responses. The easiest way to see how to form a request and test what Elasticsearch will return is to use the Advanced REST Client (ARC) plugin for Chrome or the Postman app, or to install Kibana.

To change this search so it looks only at a particular field (for example, the category name of a catalog item), all you need to do is to change the previous query to { "query" : { "match" : { "category.category_name" : "Home" } } }. Then use the same process to submit the request and parse the results.
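
For illustration, the field-specific search could look like this with the same low-level client; the method name is hypothetical, and it reuses the client, getIndex(), and parseResultsFromFullSearch members of the class shown above:

public List<CatalogItem> findCatalogItemsByCategoryName(String categoryName) {
    Request request = new Request("GET",
            String.format("/%s/_search", getIndex()));
    request.setJsonEntity(String.format(
        "{ \"query\" : { \"match\" : { \"category.category_name\" : \"%s\" } } }",
        categoryName));
    try {
        Response response = client.performRequest(request);
        if (response.getStatusLine().getStatusCode() == OK) {
            return parseResultsFromFullSearch(response);
        }
    } catch (IOException ex) {
        LOG.warn("Could not search ES for category {}", categoryName, ex);
    }
    return Collections.emptyList();
}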

Retrieving an item by its ID is different from these two searches. Here, I only need to issue a GET request to the index with the document ID, for example /<indexname>/_doc/5. Parsing the returned object is also different, because I am not getting an array of matching items but a single item, if it exists. Here is the code that shows how to use the low-level API to search by ID. As with all low-level API calls, I need to parse the JSON response, skipping over the metadata and extracting the CatalogItem information:

public Optional<CatalogItem> getItemById(Integer id) {
  Request request = new Request("GET", 
         String.format("/%s/_doc/%d", getIndex(), id));
  try {
    Response response = client.performRequest(request);
    if (response.getStatusLine().getStatusCode() == 200) {
      String rBody = 
          EntityUtils.toString(response.getEntity());
      LOG.debug("find by item id response: {}", rBody);
      int start = rBody.indexOf(_SOURCE);
      int end = rBody.indexOf("}}");
      String json = rBody.substring(
                     start + _SOURCE.length(), end+2); 
      LOG.debug(json);
      CatalogItem item = 
          jsonMapper.readValue(json, CatalogItem.class);
      return Optional.of(item);
    } 
  } catch (IOException ex) {
      LOG.warn("Could not post {} to ES", id, ex);
  }
  return Optional.empty();
}

Updating a document. There are a couple of ways to update a document: You can issue an update to an entire document or issue an update that modifies a particular field only. Because CatalogItem is a rather small document, I will update it fully:

public void updateCatalogItem(CatalogItem item) {
  Request request = 
      new Request("POST", 
                  String.format("/%s/_update/%d", 
                  getIndex(), item.getId()));
  try {
    request.setJsonEntity("{ \"doc\" :" + 
               jsonMapper.writeValueAsString(item)+"}");
      
    Response response = client.performRequest(request);
    LOG.debug("update response: {}", response);
  } catch (IOException ex) {
    LOG.warn("Could not post {} to ES", item, ex);
  }    
}

To update only a particular field, you issue a POST request to the same URI, but instead of sending an object in the request, you send just the field that you need to update, as shown next:

public void updateDescription(Integer id, String desc) {
  Request request = new Request("POST", 
           String.format("/%s/_update/%d", index,  id));
  try {

    request.setJsonEntity(
         String.format(
            "{ \"doc\" : { \"description\" : \"%s\" }}", 
      desc));
      
    Response response = client.performRequest(request);
    LOG.debug("update response: {}", response);
  } catch (IOException ex) {
    LOG.warn("Could not post {} to ES", id, ex);
  }
}

Deleting an item. Deleting an item is also straightforward: All you need to do is send a DELETE request with the index and the document ID, for example, /<indexname>/_doc/5.

public void deleteCatalogItem(Integer id) {
  Request request = new Request("DELETE", 
        String.format("/%s/_doc/%d", getIndex(),  id));
  try {
    Response response = client.performRequest(request);
    LOG.debug("delete response: {}", response);
  } catch (IOException ex) {
    LOG.warn("Could not post {} to ES", id, ex);
  }
}

This completes the overview of CRUD methods in the low-level synchronous client.

Asynchronous calls. To make an asynchronous call using the low-level client, you just need to call the performRequestAsync method instead of the performRequest method. You must supply a response listener to the asynchronous call. The response listener needs to implement two methods: onSuccess and onFailure; both appear in the following code. In this example, I am upserting several items into an Elasticsearch index asynchronously.

A CountDownLatch is part of the java.util.concurrent package, and it is used here as a thread synchronization mechanism. The latch is initialized with an integer count, and it blocks any thread that calls its await() method until countDown() has been called by other threads as many times as the count it was initialized with.

To do that, I create a countdown latch and use it to make sure the createCatalogItem method does not return until all the items have been sent to and processed by Elasticsearch. In my response listener implementation, I call the latch’s countDown() method upon both success and failure to indicate that Elasticsearch has processed an item.

public void createCatalogItem(List<CatalogItem> items) {
  CountDownLatch latch = new CountDownLatch(items.size());
  ResponseListener listener = new ResponseListener() {
    @Override
    public void onSuccess(Response response) {
      latch.countDown();
    }
    @Override
    public void onFailure(Exception exception) {
      latch.countDown();
      LOG.error(
        "Could not process ES request. ", exception);
    }
  };
      
  items.stream().forEach(e -> {
    Request request = new Request("PUT",
        String.format("/%s/_doc/%d", index, e.getId()));
    try {
      request.setJsonEntity(
          jsonMapper().writeValueAsString(e));
      client.performRequestAsync(request, listener);
    } catch (IOException ex) {
      LOG.warn("Could not post {} to ES", e, ex);
    }
  });
  try {
    latch.await(); //wait for all the threads to finish
    LOG.info("Done inserting all the records to the index");
  } catch (InterruptedException e1) {
    LOG.warn("Got interrupted.",e1);
  }
}

Using an asynchronous client is much easier with the high-level REST client.

High-Level REST Client

The high-level REST client is built on top of the low-level client. It adds a few Elasticsearch dependencies to the project, but as you will see, it makes coding much easier and more enjoyable for both the synchronous and asynchronous APIs. One thing to keep in mind when choosing the high-level API is that it is recommended to upgrade the client dependencies with each major update to the Elasticsearch cluster. This dependency upgrade is not needed when using the low-level API, but you might have to adjust your implementation to compensate for any underlying Elasticsearch API changes. While the high-level client makes coding easier, the low-level client gives you more control and has a smaller binary footprint.

To get the high-level Elasticsearch libraries, all you need to do is import the REST client as shown below for a Maven project.

<dependency>
  <groupId>org.elasticsearch.client</groupId>
  <artifactId>
    elasticsearch-rest-high-level-client
  </artifactId>
  <version>7.4.2</version>
</dependency>

Building a high-level REST client is very similar to building a low-level REST client. The only difference is that I need to wrap my low-level client in a high-level client API:

try(RestHighLevelClient client = 
    new RestHighLevelClient(
        RestClient.builder(
            new HttpHost("localhost", 9200, "http")))) {

    CrudMethodsSynchronous scm = 
        new CrudMethodsSynchronous(
            "catalog_item_high_level", client);
    // ... call the CRUD methods on scm here ...
}

In the attached code, I have all the same methods the low-level client implemented, but because the high-level model is so much easier to work with, I will describe only two methods here: how to create a document and how to search for a document.

To create a document in the Elasticsearch high-level API, you need to use IndexRequest and initialize it with the name of the desired index. Then set the ID on the request and add JSON as a source. Calling the high-level client index API with the request synchronously will return the index response, which could then be used to see if a document was created or updated.

public void createCatalogItem(List<CatalogItem> items) {
  items.stream().forEach(e -> {
    IndexRequest request = new IndexRequest(index);
    try {
      request.id("" + e.getId());
      request.source(jsonMapper.writeValueAsString(e),
                     XContentType.JSON);
      request.timeout(TimeValue.timeValueSeconds(10));
      IndexResponse response = client.index(request,
                                            RequestOptions.DEFAULT);
      if (response.getResult() ==
              DocWriteResponse.Result.CREATED) {
        LOG.info("Added catalog item with id {} "
                 + "to ES index {}",
                 e.getId(), response.getIndex());
      } else if (response.getResult() ==
              DocWriteResponse.Result.UPDATED) {
        LOG.info("Updated catalog item with id {} "
                 + "to ES index {}, version of the "
                 + "object is {} ",
                 e.getId(), response.getIndex(),
                 response.getVersion());
      }
    } catch (IOException ex) {
      LOG.warn("Could not post {} to ES", e, ex);
    }
  });
}

Similarly, a full-text search is much easier to read. Here, I create a search request by passing an index and then use a query builder to construct a full-text search. The search response encapsulates the JSON navigation and gives you easy access to the resulting documents via the SearchHits array in the following code:

public List<CatalogItem> findCatalogItem(String text) {
    try {
        SearchRequest request = new SearchRequest(index); 
        SearchSourceBuilder scb = new SearchSourceBuilder();
        SimpleQueryStringBuilder mcb = 
            QueryBuilders.simpleQueryStringQuery(text);
        scb.query(mcb); 
        request.source(scb);
         
        SearchResponse response = 
            client.search(request, RequestOptions.DEFAULT);
        SearchHits hits = response.getHits();
        SearchHit[] searchHits = hits.getHits();
        List<CatalogItem> catalogItems = 
            Arrays.stream(searchHits)
                  .filter(Objects::nonNull)
                  .map(e -> toJson(e.getSourceAsString()))
                  .collect(Collectors.toList());
         
        return catalogItems;
    } catch (IOException ex) {
        LOG.warn("Could not post {} to ES", text, ex);
    }
    return Collections.emptyList();
}
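
The toJson helper used in the stream above is not shown in the listing; it simply deserializes the _source JSON of a hit into a CatalogItem. A minimal sketch, assuming the same Jackson jsonMapper field as in the other listings:

private CatalogItem toJson(String source) {
  try {
    return jsonMapper.readValue(source, CatalogItem.class);
  } catch (IOException ex) {
    LOG.warn("Could not parse search hit {}", source, ex);
    return null;
  }
}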

In the code above, I searched a specific index in Elasticsearch. To search all indexes, I would need to create a SearchRequest without any parameters. What you can see from these two examples is a pattern that spans the rest of the CRUD methods: You first create a specific request, passing it an index and a document ID. Such a request could be an IndexRequest to create a document, a GetRequest to get a document by ID, an UpdateRequest to update the document, and so on. Then, you issue the appropriate request to Elasticsearch, for example, to get, update, or delete and you receive a response that has the status and source objects, if applicable.
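
As an illustration of that pattern, here is a sketch of retrieving a single document by ID with the high-level client. The method body is my own, but GetRequest and GetResponse are the standard high-level client classes:

public Optional<CatalogItem> getItemById(Integer id) {
  try {
    GetRequest request = new GetRequest(index, String.valueOf(id));
    GetResponse response = client.get(request, RequestOptions.DEFAULT);
    if (response.isExists()) {
      return Optional.of(jsonMapper.readValue(
          response.getSourceAsString(), CatalogItem.class));
    }
  } catch (IOException ex) {
    LOG.warn("Could not get item {} from ES", id, ex);
  }
  return Optional.empty();
}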

Asynchronous calls. Asynchronous calls are a bit less painful to write with the high-level client. To make one, you call the same method as the synchronous variant, with an Async suffix, and supply either an Elasticsearch ActionListener or a higher-level object, such as a PlainActionFuture, as the last argument. A PlainActionFuture is an Elasticsearch class that comes with the high-level API dependencies. It implements both Elasticsearch’s ActionListener interface and Java’s Future interface, making it an ideal choice for response processing.

The following sample code implements all the methods asynchronously: It is identical to the synchronous example, aside from the fact that I create a PlainActionFuture, which will hold a search response and which I pass to the searchAsync API of the high-level REST client. The caller of this method will then inspect the future and when the search completes, parsing the search response will be done in exactly the same way as with the synchronous API.

The biggest advantage of asynchronous APIs is that you can perform other operations in the working thread until you need the results of the search.

public PlainActionFuture<SearchResponse> 
                                 findItem(String text) {
                 
    SearchRequest request = new SearchRequest(getIndex()); 
    SearchSourceBuilder ssb = new SearchSourceBuilder();
    SimpleQueryStringBuilder mqb = 
             QueryBuilders.simpleQueryStringQuery(text);
    ssb.query(mqb); 
    request.source(ssb);
    
    PlainActionFuture<SearchResponse> future = 
                              new PlainActionFuture<>();
    client.searchAsync(request, 
                       RequestOptions.DEFAULT, future);
    return future;
}
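
A caller might then use the returned future as follows. This is only a sketch (asyncMethods stands for an instance of the class containing findItem), and actionGet() blocks, so in practice you would do other useful work before calling it:

PlainActionFuture<SearchResponse> future = asyncMethods.findItem("flashlight");
// ... do other work on this thread while the search runs ...
SearchResponse response = future.actionGet();
SearchHit[] hits = response.getHits().getHits();
LOG.info("Asynchronous search returned {} hits", hits.length);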

Streaming Data into Elasticsearch

As the last part of this article, I want to show how to stream data into Elasticsearch, and I want to introduce bulk operations. Bulk operations allow you to execute multiple index, update, or delete operations in a single request. The advantage of a bulk request is that you do everything in one round trip to the Elasticsearch server instead of a round trip per document. Bulk operations are also a very good fit for streaming data into Elasticsearch. The only caveat is that you need to size each batch sensibly: large enough to avoid the latency of many small round trips, but not so large that the request times out before the work completes.

A batch request can have different operations in it, but in the following example, it will just have an index request to insert data into an Elasticsearch index. The following routine creates one bulk request that adds an index request for each item passed in a batch. It relies on the caller to make sure that the batch size is reasonable. Once all the items are added, a synchronous client bulk request is submitted.

The bulk request call returns a bulk response that contains bulk response items—each of which corresponds to an item in the request, indicating what operation was requested, whether it was successful or not, and so on. I am not using a bulk response in this example for the sake of simplicity.

private void sendBatchToElasticSearch(
        List<LineFromShakespeare> linesInBatch,
        RestHighLevelClient client,
        String indexName) throws IOException {

    BulkRequest request = new BulkRequest();
    linesInBatch.stream().forEach(l -> {
        try {
            request.add(new IndexRequest(indexName)
                .id(l.getId())
                .source(jsonMapper.writeValueAsString(l),
                        XContentType.JSON));
        } catch (JsonProcessingException e) {
            LOG.error("Problem mapping object {}", l, e);
        }
    });
    LOG.info("Sending data to ES");
    client.bulk(request, RequestOptions.DEFAULT);
}
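
If you do want to inspect the outcome of each operation, the BulkResponse returned by client.bulk can be examined. A minimal sketch:

BulkResponse bulkResponse = client.bulk(request, RequestOptions.DEFAULT);
if (bulkResponse.hasFailures()) {
  // Each BulkItemResponse corresponds to one operation in the bulk request.
  for (BulkItemResponse item : bulkResponse) {
    if (item.isFailed()) {
      LOG.warn("Indexing document {} failed: {}",
               item.getId(), item.getFailureMessage());
    }
  }
}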

The call above is invoked by a procedure that streams a file and makes a batch of 1,000 documents per load into Elasticsearch.

public void loadData(String file, String index) 
               throws IOException, URISyntaxException {
  Path filePath = Paths.get(
           ClassLoader.getSystemResource(file).toURI());

  List<String> errors = new ArrayList<>();
  List<LineFromShakespeare> linesInBatch = new ArrayList<>();
  final int maxLinesInBatch = 1000;
  try(RestHighLevelClient client = 
    new RestHighLevelClient(
      RestClient.builder(
       new HttpHost("localhost", 9200, "http")))) {
          
      Files.lines(filePath).forEach(e -> {
        try {
            LineFromShakespeare line = 
                jsonMapper.readValue(e, LineFromShakespeare.class);
            //enrich...
            linesInBatch.add(line);
            if (linesInBatch.size() >= maxLinesInBatch) {
                sendBatchToElasticSearch(linesInBatch, 
                                         client, index);
               linesInBatch.clear();
            }
        } catch (IOException ex) {
          errors.add(e);
          linesInBatch.clear();
        }});

    if (linesInBatch.size() != 0) {
        sendBatchToElasticSearch(linesInBatch, 
                                 client, index);
        linesInBatch.clear();
    }
  }
       
  LOG.info("Errors found in {} batches", errors.size());
}

Here I have used the Java Files.lines API to stream a file line by line, convert the text to an object, enrich it, and add it to the batch to be sent to Elasticsearch. Once the batch size reaches 1,000, I ship the batch to Elasticsearch by using the sendBatchToElasticSearch method.

As you can see, using the high-level API simplifies your code and makes it much more readable. So, if the binary footprint is not an issue, and you can live with upgrading dependencies with each major upgrade to the Elasticsearch cluster, I would highly recommend sticking with the high-level API.

Conclusion

In this article, I introduced Elasticsearch and focused on a Java CRUD API used in both a low-level and high-level client, showing most of the needed functions for CRUD applications. The APIs, along with streaming data into Elasticsearch, make up the basic knowledge you need before embarking on an Elasticsearch adventure that includes document design, data analyzers, advanced searching including a multifield search, proximity matching, paging, suggestions, highlighting, result scoring, and different types of data aggregations, geolocation, security, and cluster management. The playing field here is really vast. Enjoy!

Henry Naftulin

Henry Naftulin has been designing Java EE distributed systems for more than 15 years. He is currently a lead developer on a proprietary award-winning fixed-income trading platform for one of the largest financial companies in the United States.

Five Code Review Antipatterns

Everyone cares about best practices, but worst practices can sometimes be more illuminating.

May 4, 2020


Code reviews are essential, but they are not always done correctly. This article points out, and rants about, particular antipatterns all developers have probably experienced while their code is reviewed or when they submit pull requests.

Antipattern: Nit-Picking

Imagine the following scenario. The code authors put hours, if not days, of effort into creating the solution they thought would work best. They considered multiple design options and took the path that seemed most relevant. They considered the architecture of the existing application and made changes in the appropriate places. Then they submitted their solution as a pull request or they started the code review process, and the expert feedback they received is

  • “You should be using tabs, not spaces.”
  • “I don’t like where the braces are in this section.”
  • “You do not have a blank line at the end of your file.”
  • “Your enums are in uppercase and they should be in sentence case.”

Although it’s important that new code be consistent with the style of existing code, these are hardly things that require a human reviewer. Human reviewers are expensive and can accomplish things computers cannot. Checking that style standards have been met is something a computer can easily do and distracts from the real purpose of a code review.

If developers see many of these types of comments during code review, it suggests the team either does not have a style guide or it has one but checking the style has not been automated. The solution is to use tools such as Checkstyle to make sure style guidelines have been followed or to use SonarQube to identify common quality and security issues. Rather than relying on human reviewers to warn of such problems, the continuous integration environment can do that.

Sometimes, it might be difficult to automate such checks, for example, if no code guidelines exist or if the in-house code style has evolved over time and differs in various sections. There are approaches to applying automated checks in these situations. For example, a team can agree to do a single commit that applies an agreed-upon code style and contains no other changes. Or a team can agree that when a file is changed for a bug or feature, the file will also be updated to the new style and the automated tool can be configured to check only changed files.

If a team has a variety of code styles and it does not have a way to automate checking the style, it’s also prone to falling into the next trap.

Antipattern: Inconsistent Feedback

For every developer invited to review the code, at least one more opinion is invited—and probably more. Everyone can hold more than one opinion at the same time. Sometimes a code review can descend into an argument between reviewers about different approaches, such as whether streams or a classic for loop would be best. How is a developer supposed to make changes, close the review, and push the code to production if team members have opposite views on the same code?

Even a single reviewer’s mind can easily change, either during one review or over a series of reviews. On one review, a reviewer might be pushing the author to make sure a data structure with O(1) read operations is used, whereas in the next review, the reviewer might ask why there are several data structures for different use cases and suggest the code should be simplified by doing a linear search through a single structure.

This scenario happens when a team doesn’t have a clear idea of what its “best practices” look like and when it hasn’t figured out what its priorities are, for example:

  • Should the code be moving towards a more modern style of Java? Or is it more important that the code is consistent and, therefore, continues to use “classic” constructs everywhere?
  • Is it important to have O(1) read operations on data structures in all parts of the system? Or are there sections where O(n) is acceptable?

Almost all design questions can be answered with “it depends.” To have a better idea of the answer, developers need to understand the priorities for their applications and for their teams.

Antipattern: Last-Minute Design Changes

The most demoralizing feedback a developer can get during code review is when a reviewer fundamentally disagrees with the design or architecture of the solution and forces a complete rewrite of the code, either gradually through a series of reviews (see the following section) or by brutally rejecting the code and making the author start again.

Code review is not the right time to review the design. If the team is following a classic “gateway” code review, the code should be working and all the tests should be passing before the final step of having another developer look at the code. At this point, hours, days, or possibly weeks (although I really hope not weeks; code reviews should be small, but that’s another subject) of effort have gone into the code under review. Stating during code review that the underlying design is wrong is a waste of everyone’s time.

Code reviews can be used as design reviews, but if this is the intention of the code review, the review should be happening at the start of implementation. Then, before developers get too far down the road, they can sketch out their ideas with maybe some stub classes and methods and some tests with meaningful names and steps, and perhaps submit some text or diagrams, in order to have team members give feedback on the approach to be taken.

If team members are finding genuinely showstopping design problems in a gateway review (that is, when the code is complete and working), the team should update its process to locate these problems much earlier. This might mean doing alternative types of reviews such as the one suggested in the previous paragraph, whiteboarding ideas, pair programming, or talking through the proposed solution with a tech lead. Finding design problems in a final code review is a waste of development time and enormously demoralizing for code authors.

Antipattern: Ping-Pong Reviews

In an ideal world, authors would submit their code for review, the reviewers would make a couple of comments with clear solutions, the authors would make the suggested changes and resubmit the code, the review would be complete, and the code would be pushed. But if that happened regularly, who could justify the code review process at all?

In real life, what often happens is this:

  1. A code review is started.
  2. A number of reviewers make several suggestions: some small and easy, some fluffy without obvious solutions, and some complicated.
  3. The author makes some changes: at least the easy fixes or maybe several changes in an effort to make the reviewers happy. The author might ask the reviewers questions to clarify things, or the author might make comments to explain why particular changes weren’t made.
  4. The reviewers come back, accept some of the updates, make further comments on others, find other things they don’t like about the original code, respond to questions, and argue with other reviewers or the author over their comments in the review.
  5. The code author makes more changes, adds more comments and questions, and so on.
  6. The reviewers check the changes, make more comments and suggestions, and so on.
  7. Steps 5 and 6 are repeated, perhaps forever.

During the process, in theory, changes and comments should decline towards zero until the code is ready. The most depressing case is when each iteration brings at least as many new issues as old ones that were closed. In such a case, the team has entered the “infinite loop of code review.” This happens for a number of reasons:

  • It happens if reviewers nitpick and if they give inconsistent feedback. For reviewers who fall into these habits, there’s an infinite number of problems to identify and an infinite number of comments to make.
  • It happens when there’s no clear purpose to the review or no guidelines to follow during reviews, because every reviewer then feels every possible problem must be found.
  • It happens when it’s unclear what a reviewer’s comments require from the code author. Does each comment imply a change that must be made? Are all questions an implication that the code isn’t self-documenting and needs to be improved? Or are some comments made simply to educate the code author for next time, and are questions posed just to help the reviewer understand and learn?

Comments should be understood either as showstoppers or not showstoppers, and if reviewers decide the code needs to change before they can sign off on it, they need to be explicit about exactly what actions the code author should take.

It’s also important to understand who is responsible for deciding the review is “done.” This can be achieved through a task list of items that have been checked and have passed, or it can be accomplished by an individual who is empowered to say, “good enough.” There usually needs to be someone who can break stalemates and resolve disagreements. This might be a senior developer, a lead, or an architect—or it might even be the code author in teams in which there is a high degree of trust. But at some point, someone needs to say either “the review is finished” or “when these steps are complete, the review is finished.”

Antipattern: Ghost Reviewer

Here’s where I admit the antipattern that I am most guilty of committing: ghosting. Whether I’m a reviewer or a code author, there comes a point in the code review (sometimes right at the start!) where I simply don’t respond during the review. Maybe there’s an important or interesting feature that I’ve been asked to review, so I decide to leave it until “a better time” when I can “really look at it properly.” Or maybe the review is large, and I want to set aside plenty of time. Or perhaps I’m the author, and after an iteration (or twenty) I just can’t face reading and answering the comments anymore, so I decide to wait “until my head is in the right place.”

Sound familiar?

Whatever the cause, sometimes someone in the review process simply doesn’t respond. This can mean the review is dead in the water until this person looks at the code. This is a waste: Even though someone has invested time in creating an asset (the new code), until it’s in production it’s not adding value. In fact, it’s probably rotting as it gets further and further behind the rest of the codebase.

Several factors can cause ghost reviews. Large code reviews are one factor, because who wants to wade through tens or hundreds of changed files? Not valuing code reviews as real work or part of the deliverables is another factor. Difficult or demoralizing code review experiences are another major factor: No one wants to stop coding (something developers generally enjoy) to participate in a time-consuming and soul-destroying activity.

Here are suggestions for addressing ghost reviews:

  • Make sure code reviews are small. Each team has to work out its definition of this, but it’ll be in the region of hours or days of work to review, not weeks.
  • Make sure the purpose of the code review is clear and what reviewers should be looking for is clear. It’s hard to motivate yourself to do something when the scope is “find any possible problem with the code.”
  • Allow time in the development process for doing code reviews.

This last point might require team discipline, or the team might want to encourage allowing time by (for example) rewarding good code review behavior via objectives or whatever mechanism is used to determine a developer’s productivity.

What Can Your Team Do?

Focus on creating a solid code review process. I’ve written about it on my blog but would like to share part of that process here.

There are many things to consider when doing a code review, and if developers worried about all of them for every code review, it would be nearly impossible for any code to pass the review process. The best way to implement a code review process that works for everyone is to consider the following questions:

  • Why is the team doing reviews? Reviewers have an easier job when there’s a clearly defined purpose, and code authors will have fewer nasty surprises from the review process.
  • What are team members looking for? When there’s a purpose, developers can create a more focused set of things to check when reviewing code.
  • Who is involved? Who does the reviews, who is responsible for resolving conflicts of opinion, and who ultimately decides if the code is good to go?
  • When does the team do reviews, and when is a review complete? Reviews could happen iteratively while developers are working on the code or at the end of the process. A review could go on forever if there isn’t clear guidance on when the code is finally good to go.
  • Where does the team do reviews? Code reviews don’t require a specific tool, so reviews could be as simple as authors walking a colleague through their code at their desk.

Once these questions are answered, your team should be able to create a code review process that works well. Remember, the goal of a review should be to get the code into production, not to prove how clever the developers are.

Conclusion

Code review antipatterns can be eliminated, or at least mitigated, by having a clear code review process. Many teams believe they should be doing code reviews, but they don’t have clear guidelines on why they’re doing them.

Different teams need different types of code review, in the same way that different applications have different business and performance requirements. The first step is to figure out why the team needs to review code, and then the team can work on

  • Automating easy checks (for example, checking code style, identifying common bugs, and finding security issues)
  • Creating clear guidelines on when a review happens, what to look for, and who decides when reviews are finished
  • Making code reviews a key part of the development process

Focusing on why code reviews are done will help teams create best practices for a code review process so it will be much easier to avoid code review antipatterns.

Trisha Gee

Trisha Gee has developed Java applications for a range of industries, including finance, manufacturing, software and non-profit, for companies of all sizes. She has expertise in Java high performance systems, is passionate about enabling developer productivity, and dabbles with Open Source development. Based in Spain, Trisha is a leader of the Sevilla Java User Group and a Java Champion. She believes healthy communities and sharing ideas help us to learn from mistakes and build on successes. As a Developer Advocate for JetBrains, she gets to share all the interesting things she’s constantly discovering. Follow her on Twitter at @trisha_gee.

Programming the GPU in Java

Accessing the GPU from Java unleashes remarkable firepower. Here’s how the GPU works and how to access it from Java.

January 10, 2020


Programming a graphics processing unit (GPU) seems like a distant world from Java programming. This is understandable, because most of the use cases for Java are not applicable to GPUs. Nonetheless, GPUs offer teraflops of performance, so let’s explore their possibilities.

To make the topic approachable, I’ll spend some time explaining GPU architecture along with a little history, which will make it easier to dive into programming the hardware. Once I’ve shown how the GPU differs from CPU computing, I’ll show how to use GPUs in the Java world. Finally, I will describe the leading frameworks and libraries available for writing Java code and running it on GPUs, and I’ll provide some code samples.

A Little Background

The GPU was first popularized by Nvidia in 1999. It is a special processor designed to process graphical data before it is transferred to the display. In most cases, it enables some of the computation to be offloaded from the CPU, thus freeing CPU resources while speeding up those offloaded computations. The result is that more input data can be processed and presented at much higher output resolutions, making the visual representation more attractive and the frame rate more fluid.

The nature of 2D/3D processing is mostly matrix manipulation, so it can be handled with a massively parallel approach. What would be an effective approach for image processing? To answer this, let’s compare the architecture of standard CPUs (shown in Figure 1) and GPUs.

Figure 1. Block architecture of a CPU

In the CPU, the actual processing elements—the fetchers, the arithmetic logic unit (ALU), and the execution contexts—are just a small part of the whole system. To speed up the irregular calculations arriving in unpredictable order, there are a large, fast, and expensive cache; different kinds of prefetchers; and branch predictors.

You don’t need all of this on a GPU, because the data is received in a predictable manner and the GPU performs a very limited set of operations on the data. Thus, it is possible to make a very small and inexpensive processor with a block architecture similar to that in Figure 2.

Figure 2. Block architecture for a simple GPU core

Because these kinds of processors are cheap and they process data in parallel chunks, it’s easy to put many of them to work in parallel. This design is referred to as multiple instruction, multiple data or MIMD (pronounced “mim-dee”).

A second approach focuses on the fact that often a single instruction is applied to multiple data items. This is known as single instruction, multiple data or SIMD (pronounced “sim-dee”). In this design, a single GPU contains multiple ALUs and execution contexts, with a small area dedicated to shared context data, as shown in Figure 3.

Figure 3. Comparing a MIMD-style GPU block architecture (left) with a SIMD design (right)

Combining SIMD and MIMD processing provides the maximal processing throughput, which I’ll discuss shortly. In such a design, you have multiple SIMD processors running in parallel, as shown in Figure 4.

Figure 4. Running multiple SIMD processors in parallel; here, 16 cores with 128 total ALUs

Because you have a bunch of small, simple processors, you can program them to achieve special effects in the output.

Running Programs on the GPU

Most of the early visual effects in games were actually hardcoded small programs running on a GPU and applied to the data stream from the CPU.

It was obvious, even then, that hardcoded algorithms were insufficient, especially in game design, where visual representation is actually one of the main selling points. In response, the big vendors opened access to GPUs, and then third-party developers could code for them.

The typical approach was to write small programs, called shaders, in a special language (usually a subset of C) and compile them with a special compiler for the corresponding architecture. The term shaders was chosen because shaders were often used to control lighting and shading effects, but there’s no reason they can’t handle other special effects.

Each GPU vendor had its own specific programming language and infrastructure for creating shaders for its hardware. From these efforts, several platforms have been created. The major ones include

  • DirectCompute: A proprietary shader language/API from Microsoft that is part of Direct3D, starting with DirectX 10
  • AMD FireStream: An ATI/Radeon proprietary technology, which was discontinued by AMD
  • OpenACC: A multivendor-consortium parallel computing solution
  • C++ AMP: A Microsoft proprietary library for data parallelism in C++
  • CUDA: Nvidia’s proprietary platform, which uses a subset of the C language
  • OpenCL: A common standard originally designed by Apple but now managed by the consortium Khronos Group

Most of the time, working with GPUs is low-level programming. To make it a little more approachable for developers, several abstractions are provided. The most famous are DirectX, from Microsoft, and OpenGL, from the Khronos Group. These APIs let you write high-level code, which can then be offloaded to the GPU mostly seamlessly.

As far as I know, there is no Java infrastructure that supports DirectX, but there is a nice binding for OpenGL. JSR 231 was started in 2002 to address GPU programming, but it was abandoned in 2008 and supported only OpenGL 2.0. Support of OpenGL has been continued in an independent project called JOCL (which also supports OpenCL), and it’s publicly available. By the way, the famous Minecraft game was written with JOCL underneath.

Advent of the GPGPU

Still, Java and GPUs are not a seamless fit, although they should be. Java is heavily used in enterprises, data science, and the financial sector, where many computations and a lot of processing power are needed. This is how the idea of the general-purpose GPU (GPGPU) came about.

The idea to use the GPU this way started when the vendors of video adapters started to open the frame buffer programmatically, enabling developers to read the contents. Some hackers recognized that they could then use the full power of the GPU for general-purpose computations. The recipe was straightforward:

  1. Encode the data as a bitmap array.
  2. Write a shader to process it.
  3. Submit them both to the video card.
  4. Retrieve the result from the frame buffer.
  5. Decode the data from the bitmap array.

This is a very simplified explanation. I’m not sure this process was ever heavily used in production, but it did work.

Then several researchers from Stanford University began looking for a way to make using a GPGPU easier. In 2005 they released BrookGPU, which was a small ecosystem that included a language, a compiler, and a runtime.

BrookGPU compiled programs written in the Brook stream programming language, which is a variant of ANSI C. It could target OpenGL v1.3+, DirectX v9+, or AMD’s Close to Metal for the computational back end, and it ran on both Microsoft Windows and Linux. For debugging, BrookGPU could also simulate a virtual graphics card on the CPU.

However, it did not take off, because of the hardware available at the time. In the GPGPU world, you need to copy the data to the device (in this context, device refers to the GPU and the board on which it is situated), wait for the GPU to process the data, and then copy the data back to the main runtime. This creates a lot of latency. And in the mid-2000s, when the project was under active development, this latency almost precluded extensive use of GPUs for general computing.

Nevertheless, many companies saw a future in this technology. Several video adapter vendors started providing GPGPUs with their proprietary technologies, and others formed alliances to provide more-general, versatile programming models to run a larger variety of hardware devices.

Now that I’ve shared this background, let’s examine the two most successful technologies for GPU computing—OpenCL and CUDA—and see how Java works with them.

OpenCL and Java

Like many other infrastructure packages, OpenCL provides a base implementation in C. It is technically accessible via Java Native Interface (JNI) or Java Native Access (JNA), but such access would be a bit too much work for most developers. Fortunately, this work has already been done by several libraries: JOCL, JogAmp, and JavaCL. Unfortunately, JavaCL is a dead project. But the JOCL project is alive and quite up to date. I will use it in the following examples.

But first I should explain what OpenCL is. As I mentioned earlier, OpenCL provides a very general model, suitable for programming all sorts of devices—not only GPUs and CPUs but even digital signal processors (DSPs) and field-programmable gate arrays (FPGAs) as well.

Let’s explore the simplest example: vector addition, probably the most representative example. You have two arrays of numbers you’re adding and one result array. You take an element from the first array and an element from the second array, and then you put the sum of them in the result array, as shown in Figure 5.

Figure 5. Adding the contents of two arrays and storing the sums in a result array

As you can see, the operation is highly concurrent and thus very parallelizable. You can push each of the add operations to a separate GPU core. This means that if you have 2,048 cores, as on an Nvidia 1080 graphics card, you can perform 2,048 simultaneous add operations. That means there are potentially teraflops of computing power waiting for you. Here is the code for arrays with 10 million elements, taken from the JOCL site:

public class ArrayGPU {
    /**
     * The source code of the OpenCL program 
     */
    private static String programSource =
        "__kernel void "+
        "sampleKernel(__global const float *a,"+
        "             __global const float *b,"+
        "             __global float *c)"+
        "{"+
        "    int gid = get_global_id(0);"+
        "    c[gid] = a[gid] + b[gid];"+
        "}";
    
    public static void main(String args[])
    {
        int n = 10_000_000;
        float srcArrayA[] = new float[n];
        float srcArrayB[] = new float[n];
        float dstArray[] = new float[n];
        for (int i=0; i<n; i++)
        {
            srcArrayA[i] = i;
            srcArrayB[i] = i;
        }
        Pointer srcA = Pointer.to(srcArrayA);
        Pointer srcB = Pointer.to(srcArrayB);
        Pointer dst = Pointer.to(dstArray);


        // The platform, device type and device number
        // that will be used
        final int platformIndex = 0;
        final long deviceType = CL.CL_DEVICE_TYPE_ALL;
        final int deviceIndex = 0;

        // Enable exceptions and subsequently omit error checks in this sample
        CL.setExceptionsEnabled(true);

        // Obtain the number of platforms
        int numPlatformsArray[] = new int[1];
        CL.clGetPlatformIDs(0, null, numPlatformsArray);
        int numPlatforms = numPlatformsArray[0];

        // Obtain a platform ID
        cl_platform_id platforms[] = new cl_platform_id[numPlatforms];
        CL.clGetPlatformIDs(platforms.length, platforms, null);
        cl_platform_id platform = platforms[platformIndex];

        // Initialize the context properties
        cl_context_properties contextProperties = new cl_context_properties();
        contextProperties.addProperty(CL.CL_CONTEXT_PLATFORM, platform);
        
        // Obtain the number of devices for the platform
        int numDevicesArray[] = new int[1];
        CL.clGetDeviceIDs(platform, deviceType, 0, null, numDevicesArray);
        int numDevices = numDevicesArray[0];
        
        // Obtain a device ID 
        cl_device_id devices[] = new cl_device_id[numDevices];
        CL.clGetDeviceIDs(platform, deviceType, numDevices, devices, null);
        cl_device_id device = devices[deviceIndex];

        // Create a context for the selected device
        cl_context context = CL.clCreateContext(
            contextProperties, 1, new cl_device_id[]{device}, 
            null, null, null);
        
        // Create a command-queue for the selected device
        cl_command_queue commandQueue = 
            CL.clCreateCommandQueue(context, device, 0, null);

        // Allocate the memory objects for the input and output data
        cl_mem memObjects[] = new cl_mem[3];
        memObjects[0] = CL.clCreateBuffer(context,
            CL.CL_MEM_READ_ONLY | CL.CL_MEM_COPY_HOST_PTR,
            Sizeof.cl_float * n, srcA, null);
        memObjects[1] = CL.clCreateBuffer(context,
            CL.CL_MEM_READ_ONLY | CL.CL_MEM_COPY_HOST_PTR,
            Sizeof.cl_float * n, srcB, null);
        memObjects[2] = CL.clCreateBuffer(context,
            CL.CL_MEM_READ_WRITE,
            Sizeof.cl_float * n, null, null);
        
        // Create the program from the source code
        cl_program program = CL.clCreateProgramWithSource(context,
            1, new String[]{ programSource }, null, null);
        
        // Build the program
        CL.clBuildProgram(program, 0, null, null, null, null);
        
        // Create the kernel
        cl_kernel kernel = CL.clCreateKernel(program, "sampleKernel", null);
        
        // Set the arguments for the kernel
        CL.clSetKernelArg(kernel, 0,
            Sizeof.cl_mem, Pointer.to(memObjects[0]));
        CL.clSetKernelArg(kernel, 1,
            Sizeof.cl_mem, Pointer.to(memObjects[1]));
        CL.clSetKernelArg(kernel, 2,
            Sizeof.cl_mem, Pointer.to(memObjects[2]));
        
        // Set the work-item dimensions
        long global_work_size[] = new long[]{n};
        long local_work_size[] = new long[]{1};
        
        // Execute the kernel
        CL.clEnqueueNDRangeKernel(commandQueue, kernel, 1, null,
            global_work_size, local_work_size, 0, null, null);
        
        // Read the output data
        CL.clEnqueueReadBuffer(commandQueue, memObjects[2], CL.CL_TRUE, 0,
            n * Sizeof.cl_float, dst, 0, null, null);
        
        // Release kernel, program, and memory objects
        CL.clReleaseMemObject(memObjects[0]);
        CL.clReleaseMemObject(memObjects[1]);
        CL.clReleaseMemObject(memObjects[2]);
        CL.clReleaseKernel(kernel);
        CL.clReleaseProgram(program);
        CL.clReleaseCommandQueue(commandQueue);
        CL.clReleaseContext(context);

    }

    private static String getString(cl_device_id device, int paramName) {
        // Obtain the length of the string that will be queried
        long size[] = new long[1];
        CL.clGetDeviceInfo(device, paramName, 0, null, size);

        // Create a buffer of the appropriate size and fill it with the info
        byte buffer[] = new byte[(int)size[0]];
        CL.clGetDeviceInfo(device, paramName, buffer.length, Pointer.to(buffer), null);

        // Create a string from the buffer (excluding the trailing \0 byte)
        return new String(buffer, 0, buffer.length-1);
    }
}

This code doesn’t look like Java, but it actually is. I’ll explain the code next; don’t spend a lot of time on it now, because I will shortly discuss less complicated solutions.

The code is well documented, but let’s do a small walk-through. As you can see, the code is very C-like. This is quite normal, because JOCL is just the binding to OpenCL. At the start, there is some code inside a string, and this code is actually the most important part: It gets compiled by OpenCL and then sent to the video card and executed there. This code is called a kernel. Do not confuse this term with an OS kernel; this is the device code. This kernel code is written in a subset of C.

After the kernel comes the Java binding code to set up and orchestrate the device, to chunk the data, and to create proper memory buffers on the device where the data is going to be stored as well as the memory buffers for the resulting data.

To summarize: There is “host code,” which is usually a language binding (in this case, Java), and the “device code.” You always distinguish what runs on the host and what should run on the device, because the host controls the device.

The preceding code should be viewed as the GPU equivalent of “Hello World!” As you see, the amount of ceremony is vast.

Let’s not forget the SIMD capabilities. If your hardware supports SIMD extensions, you can make arithmetic code run much faster. For example, let’s look at the matrix multiplication kernel code. This is the code in the raw string of your Java application.

__kernel void MatrixMul_kernel_basic(int dim,
                  __global float *A,
                  __global float *B,
                  __global float *C){

    int iCol = get_global_id(0);
    int iRow = get_global_id(1);
    float result = 0.0;
    for(int i=0; i< dim; ++i)
    {
          result +=
          A[iRow*dim + i]*B[i*dim + iCol];
    }
    C[iRow*dim + iCol] = result;
}

Technically, this code will work on a chunk of data that was set up for you by the OpenCL framework, with the instructions you supply in the preparation ceremony.

If your video card supports SIMD instructions and is able to process vectors of four floats, a small optimization may turn the previous code into the following code:

#define VECTOR_SIZE 4
__kernel void MatrixMul_kernel_basic_vector4(
    int dim, // dimension is in single floats
    __global const float4 *A,
    __global const float4 *B,
    __global float4 *C)
{
    size_t globalIdx = get_global_id(0);
    size_t globalIdy = get_global_id(1);
    float4 resultVec = (float4)(0.0f, 0.0f, 0.0f, 0.0f);
    size_t dimVec = dim / 4;
    for(size_t i = 0; i < dimVec; ++i) {
        float4 Avector = A[dimVec * globalIdy + i];
        float4 Bvector[4];
        Bvector[0] = B[dimVec * (i * 4 + 0) + globalIdx];
        Bvector[1] = B[dimVec * (i * 4 + 1) + globalIdx];
        Bvector[2] = B[dimVec * (i * 4 + 2) + globalIdx];
        Bvector[3] = B[dimVec * (i * 4 + 3) + globalIdx];
        resultVec += Avector.s0 * Bvector[0];
        resultVec += Avector.s1 * Bvector[1];
        resultVec += Avector.s2 * Bvector[2];
        resultVec += Avector.s3 * Bvector[3];
    }

    C[dimVec * globalIdy + globalIdx] = resultVec;
}

With this vectorized kernel, you can roughly double the performance.

Cool. You have unlocked the GPU for the Java world! But as a Java developer, do you really want to do all of this binding, write C code, and work with such low-level details? I certainly don’t. But now that you have some knowledge of how the GPU architecture is used, let’s look at other solutions beyond the JOCL code I’ve just presented.

CUDA and Java

CUDA is Nvidia’s solution to these coding issues. CUDA provides many more ready-to-use libraries for standard GPU operations, such as matrices, histograms, and even deep neural networks. The emerging library list already contains many useful bindings. These are from the JCuda project:

  • JCublas: all about matrices
  • JCufft: fast Fourier transforms
  • JCurand: all about random numbers
  • JCusparse: sparse matrices
  • JCusolver: factorization
  • JNvgraph: all about graphs
  • JCudpp: CUDA Data Parallel Primitives Library and some sorting algorithms
  • JNpp: image processing on a GPU
  • JCudnn: a deep neural network library

I’ll describe using JCurand, which generates random numbers. You can use it directly from Java code without having to write any kernel code in a separate language. For example:

...
// Number of random values to generate
int n = 100;

// A JCurand generator handle and a host array for the results
curandGenerator generator = new curandGenerator();
float hostData[] = new float[n];

// Allocate device memory for n floats
Pointer deviceData = new Pointer();
cudaMalloc(deviceData, n * Sizeof.FLOAT);

// Create and seed the pseudorandom generator
curandCreateGenerator(generator, CURAND_RNG_PSEUDO_DEFAULT);
curandSetPseudoRandomGeneratorSeed(generator, 1234);

// Generate n uniformly distributed floats directly on the device
curandGenerateUniform(generator, deviceData, n);

// Copy the results back to the host and print them
cudaMemcpy(Pointer.to(hostData), deviceData,
        n * Sizeof.FLOAT, cudaMemcpyDeviceToHost);
System.out.println(Arrays.toString(hostData));

// Release the generator and the device memory
curandDestroyGenerator(generator);
cudaFree(deviceData);
...

Here the GPU is used to generate a large quantity of high-quality random numbers, based on some very strong mathematics.

In JCuda you can also write generic CUDA code and call it from Java by just adding some JAR files to your classpath. See the JCuda documentation for more examples.
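
To give a feel for that, here is a minimal, hedged sketch of launching your own CUDA kernel through JCuda’s driver API. It assumes the static imports of jcuda.driver.JCudaDriver used by the JCuda samples, plus a hypothetical VectorAdd.ptx file containing a kernel named "add" that was compiled separately (for example, with nvcc --ptx); the file name, kernel name, and sizes are placeholders, and error handling is omitted.

// Sketch only: assumes "import static jcuda.driver.JCudaDriver.*;" and a
// precompiled, hypothetical VectorAdd.ptx with a kernel "add" computing c[i] = a[i] + b[i]
int n = 1024;
float hostA[] = new float[n], hostB[] = new float[n], hostC[] = new float[n];

// Initialize the driver API and create a context on the first device
cuInit(0);
CUdevice device = new CUdevice();
cuDeviceGet(device, 0);
CUcontext context = new CUcontext();
cuCtxCreate(context, 0, device);

// Load the precompiled module and look up the kernel function
CUmodule module = new CUmodule();
cuModuleLoad(module, "VectorAdd.ptx");
CUfunction add = new CUfunction();
cuModuleGetFunction(add, module, "add");

// Allocate device memory and copy the input data to the device
CUdeviceptr devA = new CUdeviceptr(), devB = new CUdeviceptr(), devC = new CUdeviceptr();
cuMemAlloc(devA, n * Sizeof.FLOAT);
cuMemAlloc(devB, n * Sizeof.FLOAT);
cuMemAlloc(devC, n * Sizeof.FLOAT);
cuMemcpyHtoD(devA, Pointer.to(hostA), n * Sizeof.FLOAT);
cuMemcpyHtoD(devB, Pointer.to(hostB), n * Sizeof.FLOAT);

// Launch the kernel: one work item (thread) per element
Pointer kernelParams = Pointer.to(
        Pointer.to(new int[]{n}),
        Pointer.to(devA), Pointer.to(devB), Pointer.to(devC));
int blockSize = 256;
cuLaunchKernel(add,
        (n + blockSize - 1) / blockSize, 1, 1,   // grid dimensions
        blockSize, 1, 1,                         // block dimensions
        0, null, kernelParams, null);
cuCtxSynchronize();

// Copy the result back and release the device memory
cuMemcpyDtoH(Pointer.to(hostC), devC, n * Sizeof.FLOAT);
cuMemFree(devA);
cuMemFree(devB);
cuMemFree(devC);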

Staying Above Low-Level Code

This all looks great, but there is too much ceremony, too much setup, and too many different languages to get this running. Is there a way to use a GPU at least partially?

What if you don’t want to think about all of this OpenCL, CUDA, and other internal stuff? What if you just want to code in Java and not think about the internals? The Aparapi project can help. Aparapi stands for “a parallel API.” I think of it as a kind of Hibernate for GPU programming that uses OpenCL under the hood. Let’s look at an example of vector addition.

public static void main(String[] _args) {
    final int size = 512;
    final float[] a = new float[size];
    final float[] b = new float[size];

    /* fill the arrays with random values */
    for (int i = 0; i < size; i++){
        a[i] = (float) (Math.random() * 100);
        b[i] = (float) (Math.random() * 100);
    }
    final float[] sum = new float[size];

    Kernel kernel = new Kernel(){
        @Override public void run() {
            int gid = getGlobalId();
            sum[gid] = a[gid] + b[gid];
        }
    };

    kernel.execute(Range.create(size));
    for(int i = 0; i < size; i++) {
        System.out.printf("%6.2f + %6.2f = %8.2f\n", a[i], b[i], sum[i]);
    }
    kernel.dispose();
}

This is pure Java code (taken from the Aparapi documentation), although here and there, you can spot some GPU domain-specific terms such as Kernel and getGlobalId. You still need to understand how the GPU is programmed, but you can approach GPGPU in a more Java-friendly way. Moreover, Aparapi provides an easy way to bind OpenGL contexts to the OpenCL layer underneath—thus enabling the data to stay entirely on the video card—and thereby avoid memory latency issues.

If many independent computations need to be done, consider Aparapi. This rich set of examples covers use cases that are perfect for massively parallel computations.

In addition, there are several projects such as TornadoVM that automatically offload suitable calculations from the CPU to the GPU, thus enabling massive optimizations out of the box.

Conclusion

Although there are many applications where GPUs can bring game-changing benefits, there are admittedly still some obstacles. However, Java and GPUs can do great things together. In this article, I have only scratched the surface of this vast topic. My intention was to show various high- and low-level options for accessing a GPU from Java. Exploring this area can deliver huge performance benefits, especially for complex problems that require numerous calculations that can be performed in parallel.

Dmitry Aleksandrov

Dmitry Aleksandrov (@bercut2000) is a chief architect at T-Systems, a Java Champion, Oracle Groundbreaker, and blogger. He has more than a decade of experience, mainly in enterprise Java in banking/telecom, but he is also interested in dynamic languages on the JVM and features such as massively parallel computations on GPUs. He is a colead of the Bulgarian Java User Group and co-organizer of jPrime Conf. Dmitry is also a frequent speaker at local events as well as at conferences such as JavaOne/Code One, Devoxx, JavaZone, and Joker/JPoint.

Modern Java toys that boost productivity, from type inference to text blocks

Developers using older versions of the Java platform are missing out.

October 23, 2020

Download a PDF of this article

Although Java is one of the industry’s most widely used programming languages, it has gotten an undeserved bad reputation over the years as being verbose and stagnant. Yes, sometimes you have to write a lot of code to do the most basic things. And yes, the releases of Java 7, 8, and 9 were each three years apart—and that’s an eternity in software development.

Fortunately, the powers that be have heard us loud and clear: Java has received a much-needed makeover, and now new versions of the language are being released every six months, with the most recent version being Java 15, which was released in September 2020.

With so many features, it may be hard to keep up, especially when it comes to identifying the parts of the platform that can make applications faster and easier to write. In this article, I’ll demonstrate several of the newer features of Java that I find most useful.

Local variable type inference

The following example follows Java conventions and uses good names for both the class and the object:

AccountsOverviewPage accountsOverviewPage = page.login(username, password);

However, many times, as you see here, these two names are the same, which is redundant and makes for a lot of typing.

In version 10, Java introduced local variable type inference. What this means is that instead of explicitly declaring an object or a variable’s type, you can instead use the keyword var, and Java will infer what the type is based on what is being assigned to it, for example:

var accountsOverviewPage = page.login(username, password);

This feature saves developers some keystrokes and addresses some of the verbosity of the language.

Java is still a statically typed language. The use of type inference does not make Java a dynamically typed language such as JavaScript or Python. The type is still there; it’s just inferred from the right-hand side of the statement, which means you can use var only if you’re actually initializing the variable. Otherwise, Java will not be able to infer what the type is, as in the following example:

var accountsOverviewPage; //gives compilation error

Type inference cannot be used for class-level variables. As the name of the feature implies, it works only for local variables. You can use var inside of methods, loops, and decision structures; however, you cannot use var for fields declared at the class level, even if you are initializing them. The following code produces an error:

public class MyClass {
    var accountsOverviewPage = page.login(username, password); //gives compilation error
}

Type inference is not allowed in headers. While local variable type inference can be used within the body of local constructs, it cannot be used in the headers of methods or constructors, as shown in the following example. This is because the caller needs to know the data type of the arguments to send.

public class MyTests {
    
    public MyTests(var data) {} //gives compilation error
}

Type inference means that naming is even more important now. Given the following variable name and the following method name, I have no idea what the inferred data type of x would be:

var x = getX();

Java will know because it can infer the type based on what’s returned from getX(). However, as someone reading the code, I can’t easily tell what it is. That makes it difficult to work with this variable.

You should always use good variable names, but it’s even more important if you’re going to use var because some of the context is removed.

Not everything needs to be a var. Once you start using type inference, you’re going to love it! However, please don’t overdo it.

For example, using var as shown below doesn’t do you any favors, because it removes context for no good reason:

var numberOfAccounts = 5;

Be careful with cases that lead to potential ambiguity. In the following declaration, what would you guess the inferred type would be?

var expectedAccountIdsList = new ArrayList();

If you guessed an ArrayList of Objects, you’re correct!

While this may be OK in many cases, if you want to use the dot operator (.) on any of the elements in this collection, you’ll be limited to the methods available in the Object class.

For more specific inference, provide as much information as you can on the right-hand side of the assignment. For example, explicitly specifying the type argument String, as shown below, ensures that expectedAccountIdsList is defined as an ArrayList of Strings.

var expectedAccountIdsList = new ArrayList<String>();

New operations can improve stream efficiency

Java 9 introduced two new operations in the Stream API: takeWhile and dropWhile.

The takeWhile() operation processes the items of a collection and keeps each one while a given condition (known as a predicate) is true. The dropWhile() operator does the opposite: It disregards the items of a collection while the predicate is true.

In the example below, I get a list of accounts, and then I use takeWhile() to keep all the accounts that have a type of CHECKING but only until the code gets to an account that does not have this type:

var accountsList = APIUtil.getAccounts(customerId);
var checkingAccountsList = accountsList
        .stream()
        .takeWhile(account -> account.type().equals("CHECKING"))
        .collect(Collectors.toList());

Given the list of accounts shown in Figure 1, calling takeWhile() with a predicate of type equals CHECKING would lead to the first three entries being kept. Although there are additional entries here that match the predicate, the stream ends when the predicate is not met. Since the fourth element is of type SAVINGS, the stream is stopped when this element is reached.

A list of bank accounts

Figure 1. A list of bank accounts

Similarly (but in the opposite direction), if you invoke dropWhile() on this stream, the elements at indexes 3 through 10 are kept: dropWhile() drops the first three entries because they match the predicate, and once it reaches the SAVINGS type on the fourth element, it stops dropping and keeps the rest of the stream.

var accountsList = APIUtil.getAccounts(customerId);
var checkingAccountsList = accountsList
        .stream()
        .dropWhile(account -> account.type().equals("CHECKING"))
        .collect(Collectors.toList());

Sort collections for deterministic results. If you’re interested in collecting or dropping all the elements that match the predicate, be sure to sort the stream before calling takeWhile() or dropWhile(), for example:

var accountsList = APIUtil.getAccounts(customerId);
var checkingAccountsList = accountsList
        .stream()
        .sorted(Comparator.comparing(Account::type))
        .takeWhile(account -> account.type().equals("CHECKING"))
        .collect(Collectors.toList());

Sorting the collection, as seen on line 4 above, guarantees that all elements that match the predicate are accepted or dropped as expected.

The difference between takeWhile and filter. A common question is “What’s the difference between takeWhile and filter?” Both use a predicate to narrow a stream.

The difference is that the filter() operation looks through the entire collection and gathers all elements that match the predicate, whereas takeWhile() short-circuits the processing by stopping as soon as it encounters an element that does not match the predicate, which can make takeWhile() faster.
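
Here is a minimal sketch that makes the difference concrete. It assumes the usual java.util.List and java.util.stream.Collectors imports, and the literal account types are just placeholders.

var types = List.of("CHECKING", "CHECKING", "SAVINGS", "CHECKING");

// filter() examines every element and keeps all matches: [CHECKING, CHECKING, CHECKING]
var filtered = types.stream()
        .filter(type -> type.equals("CHECKING"))
        .collect(Collectors.toList());

// takeWhile() stops at the first non-match: [CHECKING, CHECKING]
var taken = types.stream()
        .takeWhile(type -> type.equals("CHECKING"))
        .collect(Collectors.toList());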

Using takeWhile or dropWhile on parallel streams. Performance can suffer if takeWhile() or dropWhile() is used on parallel streams, even when the streams are ordered.

It's recommended that you use these operations on sequential streams for optimal performance.

Switch expressions

Java 12 introduced switch expressions, which enable you to use switch to directly assign a value to a variable. In the following example, notice I am using switch on the right side of a statement to initialize the variable id.

String id = switch(name) {
        case "john" -> "12212";
        case "mary" -> "4847474";
        case "tom" -> "293743";
        default -> "";
};

This code is saying if the name is john, then assign 12212 to the variable id.

The case statements don’t need a colon in switch expressions, but instead they use an arrow.

Fall-through in switch expressions. You don’t need a break statement in switch expressions because there is no fall-through with switch expressions. This is one of the benefits of using switch expressions, because a common error is to forget a break statement in switch statements, which results in unexpected behavior. This error can be avoided with switch expressions.

However, there are times where you may want to address multiple cases with a single block. You can do so in switch expressions by specifying each case in a comma-delimited list, as shown below:

return switch(name) {
        case "john", "demo" -> "12212";
        case "mary" -> "4847474";
        case "tom" -> "293743";
        default -> "";
    };

Notice that in the first case, if the name is john or demo, then 12212 will be returned.

Executing additional logic in switch expressions. While the primary purpose of switch expressions is to assign a value, additional logic may be required to determine that value.

To accomplish this, you may implement a block of code within the case statements of switch expressions by enclosing the statements inside a set of curly braces.

However, the final statement of such a block must be a yield statement, which provides the value for the assignment, as seen in the case statement for john below:

return switch(name) {
    case "john" -> {
        System.out.println("Hi John");
        yield "12212";
    }
    case "mary" -> "4847474";
    case "tom" -> "293743";
    default -> "";
};

Throwing exceptions from switch expressions. You can use any of the case statements to throw an exception.

return switch(name){
    case "john" -> "12212";
    case "mary" -> "4847474";
    case "tom" -> "293743";
    default -> throw new InvalidNameException();
};

Of course, in the default case, no value is being returned because the entire flow is interrupted by the exception.

Throwing exceptions is not limited to the default case. An exception can be thrown from any of the case statements, as shown below:

return switch(name){
    case "john" -> "12212";
    case "mary" -> throw new AccountClosedException();
    case "tom" -> "293743";
    default -> throw new InvalidNameException();
};

When to use switch expressions. Switch expressions are not a replacement for switch statements; they are an addition to the language. You certainly can still use switch statements, and in some cases, that may be the more favorable option.

As a rule of thumb, use switch expressions when you are using this construct to assign a value; and use switch statements when you’re not assigning a value, but you just need to conditionally invoke statements.
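
For example, a plain switch statement (sketched here with hypothetical notification methods) remains the natural fit when each case only triggers a side effect rather than producing a value:

// Sketch only: notifyJohn(), notifyMary(), notifyTom(), and logUnknownUser() are hypothetical helpers
switch (name) {
    case "john" -> notifyJohn();
    case "mary" -> notifyMary();
    case "tom" -> notifyTom();
    default -> logUnknownUser(name);
}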

Records

Records are a new type of class introduced in Java 14 as a preview feature. Records are great for simple classes that only need to contain fields and access to those fields. Here is a record that can serve as a model for an Account:

public record Account(
       int id,
       int customerId,
       String type,
       double balance) {}

Notice that instead of the word class, I used record. Also, the fields are defined in the class declaration within a set of parentheses, followed by a set of curly braces.

That’s it! This simple declaration creates a record with these fields. You don’t need to create any getters or setters. You don’t need to override the inherited equals(), hashCode(), or toString() methods. All that is done for you.

However, if you want to override anything or add additional methods, you can do so within the curly braces, for example:

public record Account(
       int id,
       int customerId,
       String type,
       double balance
) {
    @Override
    public String toString(){
        return "I've overridden this!";
    }
}

Instantiating records. Records can be instantiated just like classes. In the following example, Account is the name of my record, and I use the new keyword and call the constructor passing in all of the values:

Account account = new Account(13344, 12212, "CHECKING", 4033.93);

Records are immutable. The fields of a record are final, so no setter methods are generated for records. You could declare a setter-like method within the curly braces of the record, but because the fields are final and cannot be reassigned, there’s no good reason to do so.

Account account = new Account(13344, 12212, "CHECKING", 4033.93);
account.setType("SAVINGS"); //gives compilation error

For the same reason, you cannot directly use records as builder classes. Attempting to modify the final fields of a record results in a compilation error, for example:

public record Account(
        int id,
        int customerId,
        String type,
        double balance)
{
    //gives compilation error
    public Account withId(int id){
        this.id = id;
    }
}
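
If you do want a builder-style "wither," a common workaround (shown here as a hedged sketch) is to return a brand-new record instance instead of trying to reassign a final field:

public record Account(
        int id,
        int customerId,
        String type,
        double balance)
{
    // Returns a copy with a different id; the original record is left untouched
    public Account withId(int newId) {
        return new Account(newId, customerId, type, balance);
    }
}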

Accessor methods. With records, you do have accessor methods; however, they do not start with the word get. Instead, the accessor method name is the same as the field name. Notice below that account.balance() is called rather than account.getBalance().

Account account = new Account(13344, 12212, "CHECKING", 4033.93);
double balance = account.balance();

Inheritance is not supported. A record cannot extend another class, because every record already implicitly extends java.lang.Record; and because records are implicitly final, they cannot be extended either. Attempting to use the extends clause in a record declaration results in a compilation error, as shown below:

public record CheckingAccount() extends Account {} //gives compilation error

Records can implement interfaces. Records can, however, implement interfaces. Just like classes, records use the implements keyword in their declaration to specify their intent, and the methods can be implemented within the records’ curly braces, for example:

public interface AccountInterface {
    
    void someMethod();
}

public record Account(
        int id,
        int customerId,
        String type,
        double balance) implements AccountInterface
{
    public void someMethod(){
        
    }
}

Text blocks

Representing big blocks of complex text within a Java string can be very tedious. In the following example, notice how all of the quotation marks need to be escaped, new line characters are needed for each line break, and plus signs are needed to join each line:

String response =
    "[\n" +
    "  {\n" +
    "    \"id\": 13344,\n" +
    "    \"customerId\": 12212,\n" +
    "    \"type\": \"CHECKING\",\n" +
    "    \"balance\": 4022.93\n" +
    "  },\n" +
    "  {\n" +
    "    \"id\": 13455,\n" +
    "    \"customerId\": 12212,\n" +
    "    \"type\": \"CHECKING\",\n" +
    "    \"balance\": 1000\n" +
    "  }\n" +
    "]";

Text blocks, introduced in Java 13, allow you to use three quotation marks to open and close a big block of text, for example:

return """
        [
          {
            "id": 13344,
            "customerId": 12212,
            "type": "CHECKING",
            "balance": 3821.93
          },
          {
            "id": 13455,
            "customerId": 12212,
            "type": "LOAN",
            "balance": 989
          }
        ]
       """;

Notice that you don’t need to escape anything. The individual quotation marks are still there on the fields and the line breaks are respected.

Text cannot begin on the same line as the opening quote. You cannot include the entire text block on the same line. If you do, you’ll get a compilation error, as shown below:

return """ Hey y'all! """; //gives compilation error

A new line must come after the opening quotes, as shown in the next example. This is legal, but it is not the preferred style.

return """ 
             Hey y'all!""";

The preferred way is to have both the opening and the closing quotes aligned on their own respective lines with the text block in between, for example:

return """ 
       Hey y'all!
       """;

Conclusion

These are just a few of my favorite new features from recent versions of Java. As you can see, the language has certainly improved in the areas of verbosity and adopting modern programming trends. Cheers to beloved Java!


Angie Jones

Angie Jones is a Java Champion who specializes in test automation strategies and techniques. She shares her wealth of knowledge by speaking and teaching at software conferences all over the world, and she leads the online learning platform Test Automation University. As a master inventor, Jones is known for her innovative and out-of-the-box thinking style, which has resulted in more than 25 patented inventions in the US and China. In her spare time, Jones volunteers with Black Girls Code to teach coding workshops to young girls in an effort to attract more women and minorities to tech.

Inside Java 15: Fourteen JEPs in five buckets

Hidden classes, sealed classes, text blocks, records, and EdDSA: There’s lots of goodness in JDK 15.

August 28, 2020

Download a PDF of this article

As one of my favorite expressions says, there’s lots of rich chocolaty goodness in Java 15. There are 14 important JDK Enhancement Proposals (JEPs) in the September 15, 2020, release. This article provides a quick overview of what’s new, based on information in the JEPs themselves.

The 14 JEPs can be lumped into five buckets. See each JEP’s documentation for a more in-depth look.

Fun exciting new features:

  • JEP 339: Edwards-Curve Digital Signature Algorithm (EdDSA)
  • JEP 371: Hidden Classes

Additions to existing Java SE standards:

  • JEP 378: Text Blocks
  • JEP 377: Z Garbage Collector (ZGC)
  • JEP 379: Shenandoah: A Low-Pause-Time Garbage Collector (Production)

Modernization of a legacy Java SE feature:

  • JEP 373: Reimplement the Legacy DatagramSocket API

A look forward to new stuff:

  • JEP 360: Sealed Classes (Preview)
  • JEP 375: Pattern Matching for instanceof (Second Preview)
  • JEP 384: Records (Second Preview)
  • JEP 383: Foreign-Memory Access API (Second Incubator)

Removals and deprecations:

  • JEP 372: Remove the Nashorn JavaScript Engine
  • JEP 374: Disable and Deprecate Biased Locking
  • JEP 381: Remove the Solaris and SPARC Ports
  • JEP 385: Deprecate RMI Activation for Removal

Fun exciting new features

I’ll be the first to admit that the Edwards-Curve Digital Signature Algorithm (EdDSA) covered in JEP 339 is a bit beyond my knowledge of encryption. Okay; it’s entirely beyond my knowledge. However, this JEP provides a platform-independent implementation of EdDSA with better performance than the existing ECDSA implementation, which uses native C code. A key goal is to avoid side-channel attacks.

According to the JDK documentation,

EdDSA is a modern elliptic curve signature scheme that has several advantages over the existing signature schemes in the JDK. The primary goal of this JEP is an implementation of this scheme as standardized in RFC 8032. This new signature scheme does not replace ECDSA.

Additional implementation goals:

Develop a platform-independent implementation of EdDSA with better performance than the existing ECDSA implementation (which uses native C code) at the same security strength. For example, EdDSA using Curve25519 at ~126 bits of security should be as fast as ECDSA using curve secp256r1 at ~128 bits of security.

In addition, the implementation will not branch on secrets. These properties are valuable for preventing side-channel attacks.

Now you know more than I do. You can look forward to a Java Magazine article explaining EdDSA soon.

Hidden classes (JEP 371) are classes that cannot be used directly by the bytecode of other classes. They are intended for use by frameworks that dynamically generate classes at runtime and use them indirectly, via reflection. Dynamically generated classes might be needed only for a limited time, so retaining them for the lifetime of the statically generated class might unnecessarily increase the memory footprint.

The dynamically generated classes are also nondiscoverable. Being independently discoverable by name would be harmful, since it undermines the goal that the dynamically generated class is merely an implementation detail of the statically generated class.

The release of hidden classes lays the groundwork for developers to stop using the nonstandard API sun.misc.Unsafe::defineAnonymousClass. Oracle intends to deprecate and remove that class in the future.
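
As a rough illustration, the following hedged sketch shows how a framework might define a hidden class with MethodHandles.Lookup.defineHiddenClass. It assumes imports of java.lang.invoke.MethodHandles, java.nio.file.Files, and java.nio.file.Path. The GeneratedHandler.class file is hypothetical; in practice the bytes would be generated in memory with a bytecode library, and they must describe a class in the same package as the lookup class.

// Sketch only: the class bytes and their source are hypothetical
byte[] classBytes = Files.readAllBytes(Path.of("GeneratedHandler.class"));

Class<?> hidden = MethodHandles.lookup()
        .defineHiddenClass(classBytes, true, MethodHandles.Lookup.ClassOption.NESTMATE)
        .lookupClass();

// The name carries a suffix such as /0x0001 and cannot be used to look the class up by name
System.out.println(hidden.getName());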

Additions to existing Java SE standards

Text blocks (JEP 378) continue to evolve after being previewed in JDK 13 and JDK 14. Text blocks—which come from Project Amber—are multiline string literals that avoid the need for most escape sequences. Text blocks automatically format strings in a predictable way, but if that’s not good enough, the developer can take charge of the formatting. The second preview, in JDK 14, introduced two new escape sequences for controlling new lines and white space, and both are part of the now-standard feature. For example, the \<line-terminator> escape sequence explicitly suppresses the insertion of a newline character.

Before, to indicate one long line of text, you would have needed this:

String literal = "Lorem ipsum dolor sit amet, " +
                 "consectetur adipiscing elit, " +
                 "sed do eiusmod tempor incididunt";

But now, using \<line-terminator> makes the code easier to read:

String literal = """
                Lorem ipsum dolor sit amet, \
                consectetur adipiscing elit, \
                sed do eiusmod tempor incididunt\
                """;

The \s escape sequence can prevent the stripping of trailing white spaces. So the following text represents three lines that are each exactly five characters long. (The middle line, for green, doesn’t need the \s because it’s already five characters long.)

String colors = """
        red \s
        green
        blue\s\
        """;

The Z Garbage Collector (JEP 377) was introduced in JDK 11 as an experimental feature. Now it’s an official, nonexperimental product feature. ZGC is a concurrent, NUMA-aware, scalable low-latency garbage collector, geared to deliver garbage-collection pauses of less than 10 milliseconds—even on multiterabyte heaps. The average pause time, according to Oracle’s tests, is less than 1 millisecond, and the maximum pause time is less than 2 milliseconds. Figure 1 shows a comparison of Java’s Parallel garbage collector, the G1 garbage collector, and ZGC—with the ZGC pause times expanded by a factor of 10.

Figure 1. Comparison of garbage collector pause times

That said, on many workloads, G1 (which is still the default) might be a little bit faster than ZGC. And for very small heaps, such as those that are only a few hundred megabytes, G1 might also be faster. So, you should do your own tests, on your own workloads, to see which garbage collector to use.

Important: Because ZGC is no longer experimental, you don’t need -XX:+UnlockExperimentalVMOptions to use it.
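
For example, on JDK 15 a single flag is enough to try ZGC on a service; the jar name and heap size here are placeholders:

java -XX:+UseZGC -Xmx16g -jar my-service.jar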

ZGC is included in Oracle’s OpenJDK build and in the Oracle JDK. Shenandoah (JEP 379) is another garbage collector option, and is available in some OpenJDK builds.

Modernization of a legacy Java SE feature

JEP 373 reimplements the legacy DatagramSocket API. Consider this to be mainly the refactoring of some Jurassic code, because this JEP replaces the old, hard-to-maintain implementations of the java.net.DatagramSocket and java.net.MulticastSocket APIs with simpler and more modern implementations that are easy to maintain and debug—and which will work with Project Loom’s virtual threads.

Because there’s so much existing code using the old API introduced with JDK 1.0, the legacy implementation will not be removed. In fact, a new JDK-specific system property, jdk.net.usePlainDatagramSocketImpl, configures the JDK to use the legacy implementation if the refactored APIs cause problems on regression tests or in some corner cases.
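
For instance, if a regression did show up, you could fall back to the legacy implementation at launch time; the application name below is a placeholder:

java -Djdk.net.usePlainDatagramSocketImpl=true -jar legacy-udp-app.jar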

A look forward to new stuff

JDK 15 introduces the first preview of sealed classes (JEP 360), which comes from Project Amber. Sealed classes and interfaces restrict which other classes or interfaces may extend or implement them. Why is that important? The developer might want to control the code that’s responsible for implementing a specific class or interface. Sealed classes also provide a more declarative way than access modifiers to restrict the use of a superclass. Here’s an example:

package com.example.geometry;

public sealed class Shape
        permits com.example.polar.Circle,
                com.example.quad.Rectangle,
                com.example.quad.simple.Square {...}

The purpose of sealing a class is to let client code understand all permitted subclasses. After all, there may be use cases where the original class definition is expected to be fully comprehensive—and where the developer wants to allow that class (or interface) to be extended only where explicitly permitted.

There are some constraints on sealed classes:

  • The sealed class and its permitted subclasses must belong to the same module, and, if they are declared in an unnamed module, they must exist in the same package.
  • Every permitted subclass must directly extend the sealed class.
  • Every permitted subclass must choose a modifier to describe how it continues the sealing initiated by its superclass—final, sealed, or non-sealed (a sealed class cannot prevent its permitted subclasses from doing this); an example is sketched just after this list.
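
Here is a hedged sketch of what the permitted subclasses of the earlier Shape example might look like, each in its own file; the FilledRectangle type and the simplified package layout are illustrative only, and (like the Shape example) this needs --enable-preview on JDK 15.

public final class Circle extends Shape { }        // this branch of the hierarchy ends here

public sealed class Rectangle extends Shape
        permits FilledRectangle { }                // sealing continues one level down

public non-sealed class Square extends Shape { }   // this branch is open to any subclass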

Also in JDK 15 is the second preview of pattern matching for instanceof (JEP 375), another Project Amber development. The first preview was in Java 14, and there are no changes relative to that preview.

The goal here is to enhance Java with pattern matching for the instanceof operator. Pattern matching allows common logic in a program, namely the conditional extraction of components from objects, to be expressed more concisely and safely. Let me refer you to Mala Gupta’s excellent article, “Pattern Matching for instanceof in Java 14,” for a primer.
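
In short, the preview lets you test and bind in a single step instead of the usual instanceof-then-cast dance. Here is a minimal sketch; it must be compiled and run with --enable-preview on JDK 15, and nextMessage() is a hypothetical source of an untyped payload.

Object payload = nextMessage();
if (payload instanceof String text) {
    // text is already typed as String inside this block; no explicit cast needed
    System.out.println(text.toUpperCase());
}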

A popular feature is records (JEP 384), which is in its second preview in Java 15. Records are classes that act as transparent carriers for immutable data. The new JEP incorporates refinements based on community feedback, and it supports a few new additional forms of local classes and interfaces. Records also come from Project Amber.

The record classes are an object-oriented construct that expresses a simple aggregation of values. By doing so, the record classes help programmers focus on modeling immutable data rather than extensible behavior. Records automatically implement data-driven methods such as the equals method and accessor methods, and records preserve long-standing Java principles such as nominal typing and migration compatibility. In other words, records make classes that contain immutable data easier to code and read.
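
One of those refinements, local records, can be sketched as follows; the raw row format and the method are purely illustrative, and a java.util.List import is assumed. The record is declared inside the method that uses it, keeping the intermediate data model out of the enclosing class.

static double totalCheckingBalance(List<Object[]> rawRows) {
    // Local record: one of the new forms allowed by the second preview
    record Row(String type, double balance) {}

    return rawRows.stream()
            .map(raw -> new Row((String) raw[0], (double) raw[1]))
            .filter(row -> row.type().equals("CHECKING"))
            .mapToDouble(Row::balance)
            .sum();
}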

The final new stuff comes in the second incubator release of the Foreign-Memory Access API (JEP 383), which lets Java programs safely and efficiently access foreign memory outside of the Java heap. The objective is to begin replacing java.nio.ByteBuffer and sun.misc.Unsafe. This is part of Project Panama, which improves connections between Java and non-Java APIs.

The JEP documentation aptly describes the need for this innovation, as follows:

When it comes to accessing foreign memory, developers are faced with a dilemma: Should they choose a safe but limited (and possibly less efficient) path, such as the ByteBuffer API, or should they abandon safety guarantees and embrace the dangerous and unsupported Unsafe API?

This JEP introduces a safe, supported, and efficient API for foreign memory access. By providing a targeted solution to the problem of accessing foreign memory, developers will be freed of the limitations and dangers of existing APIs. They will also enjoy improved performance, since the new API will be designed from the ground up with JIT optimizations in mind.

Removals and deprecations

None of these should be controversial.

JEP 372 concerns removing the Nashorn JavaScript engine. The Nashorn JavaScript engine, along with its APIs and the jjs tool, was deprecated back in Java 11. Now it’s time to say goodbye.

Disable and Deprecate Biased Locking (JEP 374) starts to get rid of an old optimization technique used in the HotSpot JVM to reduce the overhead of uncontended locking. Biasing the locks has historically led to significant performance improvements compared to regular locking techniques, but the performance gains seen in the past are far less evident today. The cost of executing atomic instructions has decreased on modern processors.

Biased locking introduced a lot of complex code, and this complexity is an impediment to the Java team’s ability to make significant design changes within the synchronization subsystem. By disabling biased locking by default, while leaving in the hands of the developer the option of re-enabling it, the Java team hopes to determine whether it would be reasonable to remove it entirely in a future release.
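
If you want to measure the impact on your own workload, you can still re-enable biased locking for now; the JVM will print a deprecation warning, and the jar name here is a placeholder:

java -XX:+UseBiasedLocking -jar my-app.jar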

JEP 381, Remove the Solaris and SPARC Ports, eliminates all the source code specific to the Solaris operating system and the SPARC architecture. There’s not much else to say.

JEP 385, Deprecate RMI Activation for Removal, eases Java away from an obsolete part of remote method invocation that has been optional since Java 8.

There is a low and decreasing amount of use of RMI activation. The Java team has seen no evidence of any new applications being written to use RMI activation, and there is evidence that very few existing applications use RMI activation. A search of various open source codebases revealed barely any mention of any of the activation-related APIs. No externally generated bug reports on RMI activation have been received for several years.

Maintaining RMI activation as part of the Java platform incurs continuing maintenance costs. It adds complexity to RMI. RMI activation can be removed without affecting the rest of RMI. The removal of RMI activation does not reduce Java’s value to developers, but it does decrease the JDK’s long-term maintenance costs. And so, it’s time for it to begin going away.

Conclusion

Java 15 continues the six-month release cadence for the JDK and introduces a solid set of new features, feature revisions, and previews/incubators. Java 15 delivers lots of goodness for most developers. Please let me know what you think of this new release at javamag_us@oracle.com or on Twitter with hashtag #java15.

Finally, I’d like to thank Aurelio Garcia-Ribeyro, senior director of project management for Oracle’s Java Platform Group, for gathering up a lot of this information for his mid-August webcast on JDK 15 (available for replay here).


Alan Zeichick

Alan Zeichick is editor in chief of Java Magazine and editor at large of Oracle’s Content Central group. A former mainframe software developer and technology analyst, Alan has previously been the editor of AI Expert, Network Magazine, Software Development Times, Eclipse Review, and Software Test & Performance. Follow him on Twitter @zeichick.