Calling Elasticsearch APIs using JAX-RS Client (Jersey Client), Jackson

Motivation:

There are plenty of full-featured client libraries for Elasticsearch that either hide the complexities of the REST API or add structure and helper methods so you can avoid dealing with big JSON strings. Among the Java-based ones, there is the Java Elasticsearch Client API, which makes use of the Node or Transport clients, and the JEST library, which uses HTTP under the hood and has an API similar to the Elasticsearch Java Client API. Although these are great and offer various advantages, they can in some cases bring pitfalls such as:

  • Having to bundle the elasticsearch jar and its many dependencies (lucene, guice, etc) and having to deal with the mess of classpath problems if there are common libraries that the client application itself uses
  • Errors and exceptions due to serialization or API changes if the elasticsearch jar version does not match the Elasticsearch server version (there have been issues even when the major version matches)
  • Having to redeploy application code even if the application is not using any changed Elasticsearch features/APIs in an upgraded server
  • Having to context switch between the REST API in the documentation and the client library specific APIs to achieve the same thing

The main issue is that the “official” Java library is the same jar as the Elasticsearch server jar. The other client libraries all use the HTTP APIs under the hood, and while they may sacrifice some features such as cluster knowledge or sniffing out data nodes, they remain sufficiently decoupled from the Elasticsearch server version, whereas the main Java jar does not. This might sound irrelevant, but it is a subtle and crucial difference that affects deployments (and re-deployments) just as much as ease of development.

Another viable approach is to develop client applications directly against the REST APIs. Some might dread all the boilerplate and JSON processing code in Java and still want to use types for entities or responses, but that’s a different problem, and one for which there are some pretty good solutions.

JAX-RS is a very common Java API for accessing REST/HTTP based services and has many available implementations (Jersey being one of them), which provide a lot of different options for serialization/deserialization as well as things like SSL, connection pooling, etc. In a polyglot or “microservices” environment, you may already be using it to access other HTTP based services you developed yourself or other databases. Even if you don’t, the library is very easy to use while at the same time being very flexible for your style of programming. For the rest of this article, I will show how to set up and use JAX-RS (with the Jersey implementation) and Jackson (the JSON serialization/deserialization framework) with Elasticsearch.

Dependencies:

I’m only interested in the Jersey Client features (it is also used to make REST server applications, but we don’t want that here) and the Jackson based serialization/deserialization. So the dependencies are:

<dependency>
    <groupId>org.glassfish.jersey.core</groupId>
    <artifactId>jersey-client</artifactId>
    <version>2.22.1</version>
</dependency>
<dependency>
    <groupId>com.fasterxml.jackson.jaxrs</groupId>
    <artifactId>jackson-jaxrs-json-provider</artifactId>
    <version>2.5.4</version>
</dependency>

Code:

Let’s get started with some code. Most people hit the root endpoint http://localhost:9200/ to check that the server is running, so we can start with that too in lieu of a “hello world!”.

import javax.ws.rs.client.Client;
import javax.ws.rs.client.ClientBuilder;
import javax.ws.rs.client.Entity;
import javax.ws.rs.client.WebTarget;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

import com.fasterxml.jackson.jaxrs.json.JacksonJsonProvider;

public class Main {

	public static void main(String[] args) {
		// 1. Set up the JAX-RS Client and register the JacksonJsonProvider so that we can
		// marshall/unmarshall POJO <-> JSON
		Client client = ClientBuilder.newClient().register(
				JacksonJsonProvider.class);

		// 2. Create reference to an endpoint (we can do all the calls using the fluent API too)
		WebTarget rootTarget = client.target("http://localhost:9200");

		// 3. Build and call default endpoint
		Response response = rootTarget
				.request() // Build request
				.get();    // Call get method

		System.out.println("Default endpoint");
		System.out.println("Response code: " + response.getStatus());
		System.out.println("Response: \n" + response.readEntity(String.class));
	}
}

The above code is quite simple to understand but it is doing quite a lot under the hood:

1. The first step is to create a Client object which encapsulates all the HTTP calls under the hood with a Connector provider implementation. The default provider uses HttpURLConnection, but there are other providers which use other classes/libraries under the hood, such as Apache HttpClient. There are many configuration settings for these Connector implementations which can be tuned separately, such as Keep-Alive settings, connection pooling, maximum number of concurrent connections, etc.

2. A JacksonJsonProvider is registered with the client. We will use this later to make use of POJOs in place of JSON Strings in our API calls and responses.

3. We create a target and make a call to it. Although we can just use the Client fluent API to do it all at once, it’s sometimes useful to break it up to get references to the specific endpoints using WebTarget.

4. The returned response object from the GET call is then queried for the HTTP status code and the body converted to a String for inspection.
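One of the settings mentioned in step 1 that is worth tuning early is the timeout, because the default Jersey connector waits indefinitely. Below is a minimal sketch, assuming the Jersey 2.x `ClientProperties` constants; the class name `TimeoutClientExample` and the timeout values are only illustrative.

```java
import javax.ws.rs.client.Client;
import javax.ws.rs.client.ClientBuilder;

import org.glassfish.jersey.client.ClientProperties;

import com.fasterxml.jackson.jaxrs.json.JacksonJsonProvider;

// Hypothetical helper class, not part of the article's Main.
class TimeoutClientExample {

    // Builds a client like the one in Main and adds connect/read timeouts.
    // Both properties are Jersey-specific and are given in milliseconds;
    // a value of 0 means "wait forever", which is the default.
    static Client newClientWithTimeouts(int connectMillis, int readMillis) {
        return ClientBuilder.newClient()
                .register(JacksonJsonProvider.class)
                .property(ClientProperties.CONNECT_TIMEOUT, connectMillis)
                .property(ClientProperties.READ_TIMEOUT, readMillis);
    }

    public static void main(String[] args) {
        // Arbitrary example values: 2s to connect, 5s to read.
        Client client = newClientWithTimeouts(2000, 5000);
        System.out.println("Read timeout (ms): "
                + client.getConfiguration().getProperty(ClientProperties.READ_TIMEOUT));
        client.close();
    }
}
```

With timeouts set, a call that cannot connect or read in time fails with a `ProcessingException` rather than hanging.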

Ok, so that’s good for warm-up but we want to do some more things.

Let’s say we want to create an index called “articles” using the POST API.

    // let's create an index called articles
    Response createIndexResponse = rootTarget.path("articles")
            .request(MediaType.APPLICATION_JSON) // don't really need to do this, but showing as example
            .post(null);

    System.out.println("Creating articles index");
    System.out.println("Response code: " + createIndexResponse.getStatus());

So what did we do here? Similar to the first bit of code, we created another target from the root endpoint (calling /articles now) and passed an empty body to it using the POST method. We could have used PUT with a proper body as well here.
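For the PUT variant, here is a rough sketch of what “a proper body” could look like. PUT is the method the Elasticsearch create-index documentation uses, and `number_of_shards` / `number_of_replicas` are standard index settings. The class and method names (`CreateIndexWithPut`, `indexBody`, `createIndex`) are made up for this example, and the shard and replica counts are arbitrary.

```java
import java.util.LinkedHashMap;
import java.util.Map;

import javax.ws.rs.client.Entity;
import javax.ws.rs.client.WebTarget;
import javax.ws.rs.core.Response;

// Hypothetical helper class for illustration only.
class CreateIndexWithPut {

    // Builds {"settings":{"number_of_shards":..,"number_of_replicas":..}} as nested maps.
    // The JacksonJsonProvider registered on the client turns the map into JSON when sent.
    static Map<String, Object> indexBody(int shards, int replicas) {
        Map<String, Object> settings = new LinkedHashMap<String, Object>();
        settings.put("number_of_shards", shards);
        settings.put("number_of_replicas", replicas);

        Map<String, Object> body = new LinkedHashMap<String, Object>();
        body.put("settings", settings);
        return body;
    }

    // Sends PUT /{indexName} with the given body.
    // rootTarget is assumed to be the WebTarget created in Main.
    static Response createIndex(WebTarget rootTarget, String indexName, Map<String, Object> body) {
        return rootTarget.path(indexName)
                .request()
                .put(Entity.json(body));
    }
}
```

Usage would be something like `CreateIndexWithPut.createIndex(rootTarget, "articles", CreateIndexWithPut.indexBody(1, 0))`, followed by the same status check as above.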

So how about creating an article? We can concoct a simple JSON string and POST that to the /articles/article endpoint (this creates a document of type “article” in the “articles” index).

    String newArticleJSONString = "{ \"title\":\"New Article\","
            + "\"body\":\"Lorem Ipsum\"," + "\"tags\":[\"some-tag\"]}";

    // let's create an article
    Response createArticleResponse1 = rootTarget.path("articles/article")
            .request(MediaType.APPLICATION_JSON).post(Entity.json(newArticleJSONString));

    System.out.println("Creating new article using JSON String");
    System.out.println("Response code: " + createArticleResponse1.getStatus());
    System.out.println("Response: " + createArticleResponse1.readEntity(String.class));

It works, but the string construction is ugly and error-prone (it took me a few tries to get the escaping of the double quotes right, for example).

Can we do one better? Suppose we had an Article POJO class that we wanted to use instead?

import java.util.ArrayList;
import java.util.List;

public class Article {

	private String title;
	private String body;
	private List<String> tags = new ArrayList<String>();
	
	public Article() {}
	
	public Article(String title, String body) {
		this.title = title;
		this.body = body;
	}
	
	public void addTag(String tag) {
		tags.add(tag);
	}
	
	public String getTitle() {
		return title;
	}
	public void setTitle(String title) {
		this.title = title;
	}
	public String getBody() {
		return body;
	}
	public void setBody(String body) {
		this.body = body;
	}
	public List<String> getTags() {
		return tags;
	}
	public void setTags(List<String> tags) {
		this.tags = tags;
	}

}

Now, we can do the previous create API call like this:

    Article newArticle = new Article("New Article", "Lorem Ipsum");
    newArticle.addTag("some-tag");

    // let's create an article
    Response createArticleResponse = rootTarget.path("articles/article")
            .request().post(Entity.json(newArticle));

    System.out.println("Creating new article");
    System.out.println("Response code: " + createArticleResponse.getStatus());
    System.out.println("Response: " + createArticleResponse.readEntity(String.class));

Ok, so that looks a lot cleaner. How did that work? Remember that JacksonJsonProvider class we registered? That’s what comes into play here: it converts the Article object to its JSON equivalent, which is then POSTed in the body of the HTTP call.
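If you want to see what actually goes over the wire, you can run the same conversion by hand. This is a small sketch, assuming a default Jackson `ObjectMapper`, which is roughly what the provider falls back to when no custom mapper is configured. `ArticleJsonPreview` is a made-up name, and the nested `Article` is a trimmed copy of the class above.

```java
import java.util.ArrayList;
import java.util.List;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

// Hypothetical class for illustration only.
class ArticleJsonPreview {

    // Trimmed copy of the Article POJO: getters are all that serialization needs.
    static class Article {
        private final String title;
        private final String body;
        private final List<String> tags = new ArrayList<String>();

        Article(String title, String body) {
            this.title = title;
            this.body = body;
        }

        void addTag(String tag) { tags.add(tag); }

        public String getTitle() { return title; }
        public String getBody() { return body; }
        public List<String> getTags() { return tags; }
    }

    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Converts the POJO to a JSON tree, one property per getter.
    static JsonNode toTree(Article article) {
        return MAPPER.valueToTree(article);
    }

    // Converts the POJO to the JSON text that would be sent as the request body.
    static String toJson(Article article) {
        try {
            return MAPPER.writeValueAsString(article);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException("Could not serialize article", e);
        }
    }

    public static void main(String[] args) {
        Article article = new Article("New Article", "Lorem Ipsum");
        article.addTag("some-tag");
        System.out.println(toJson(article));
    }
}
```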

Next, can we do something about the response object? It’s got a lot of generic methods to query the status, headers, and read the body as a String. How do we read what comes back from Elasticsearch without explicitly parsing the String?

Since we’re using Jackson, we can create a POJO like the following, which represents the bits and pieces of the response we get when we create an article:

import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.databind.JsonNode;

@JsonIgnoreProperties(ignoreUnknown=true)
public class ESDocumentResponse {

	@JsonProperty("_id")
	private String id;
	
	@JsonProperty("_type")
	private String type;
	
	@JsonProperty("_version")
	private String version;
	
	@JsonProperty("_index")
	private String index;
	
	@JsonProperty("created")
	private boolean isCreated;

	@JsonProperty("_source")
	private JsonNode source;
	
	
	public JsonNode getSource() {
		return source;
	}

	public String getId() {
		return id;
	}

	public String getType() {
		return type;
	}

	public String getVersion() {
		return version;
	}

	public String getIndex() {
		return index;
	}

	public boolean isCreated() {
		return isCreated;
	}	
}

So this is making use of Jackson annotations to say we’d like to convert a JSON string which looks like

{
 "_index":"articles", 
 "_type":"article", 
 "_id":"AVHvLWl2ouJtLoXzZ5I3", 
 "_version":1, 
 "_shards":{"total":2,"successful":1,"failed":0}, 
 "created":true
}

into an ESDocumentResponse object. The “_id”, “_type”, “_version”, “_index”, “created” attributes at the top level of the JSON String are mapped directly. The “_source” attribute we will look at later. Anything else (e.g. “_shards”) we are ignoring for now.
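You can check this mapping without a running server by feeding the sample JSON straight to Jackson. The sketch below uses a cut-down, hypothetical `IndexResponse` class (package-private fields instead of getters, to keep it short) rather than the full ESDocumentResponse; the annotations are the same ones used above.

```java
import java.io.IOException;

import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.databind.ObjectMapper;

// Hypothetical class for illustration only.
class ResponseMappingCheck {

    // Cut-down stand-in for ESDocumentResponse.
    // ignoreUnknown = true is what lets "_type" and "_shards" be skipped silently here.
    @JsonIgnoreProperties(ignoreUnknown = true)
    static class IndexResponse {
        @JsonProperty("_id")
        String id;

        @JsonProperty("_index")
        String index;

        // Declared as String like in ESDocumentResponse; Jackson's default
        // settings accept the JSON number and store its text form.
        @JsonProperty("_version")
        String version;

        @JsonProperty("created")
        boolean created;
    }

    // Parses a response body into the POJO, wrapping Jackson's checked exception.
    static IndexResponse parse(String json) {
        try {
            return new ObjectMapper().readValue(json, IndexResponse.class);
        } catch (IOException e) {
            throw new IllegalArgumentException("Not a valid response document", e);
        }
    }

    public static void main(String[] args) {
        String sample = "{\"_index\":\"articles\",\"_type\":\"article\","
                + "\"_id\":\"AVHvLWl2ouJtLoXzZ5I3\",\"_version\":1,"
                + "\"_shards\":{\"total\":2,\"successful\":1,\"failed\":0},"
                + "\"created\":true}";
        IndexResponse response = parse(sample);
        System.out.println(response.id + " created=" + response.created);
    }
}
```

Without `ignoreUnknown = true`, the same call would fail on the first unmapped attribute, which is why the annotation is on the class.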

So how do we get the response to convert to the object? It doesn’t take much:

    ESDocumentResponse createArticleResponse2 = rootTarget
            .path("articles/article").request()
            .post(Entity.json(newArticle), ESDocumentResponse.class);

    System.out.println("Creating new article");
    System.out.println("Created: " + createArticleResponse2.isCreated());
    System.out.println("Article ID: " + createArticleResponse2.getId());

What if we wanted to GET the object we just created? We can make use of the same class:

    // let's get the article we just created and marshall to
    // ESDocumentResponse
    ESDocumentResponse getArticleResponse = rootTarget
            .path("articles/article").path(createArticleResponse2.getId())
            .request().get(ESDocumentResponse.class);

    System.out.println("Getting the article we just created");
    System.out.println("ID: " + getArticleResponse.getId());
    System.out.println("SOURCE TITLE: " + getArticleResponse.getSource().get("title"));

Here the only difference is the call to .get(ESDocumentResponse.class). The “_source” field, which contains the full object, has been mapped to JsonNode, so we can get each individual field using the .get(fieldName) call. We could just as easily have mapped it to the Article class, but that wouldn’t make ESDocumentResponse very generic. It’s quite easy to convert a JsonNode to an Article using Jackson anyway.
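As a sketch of that last conversion, Jackson’s `ObjectMapper.treeToValue` turns a JsonNode into a typed object. `SourceToArticle` and `fromSource` are names I made up for this example, and the nested `Article` is a condensed copy of the POJO from earlier.

```java
import java.util.ArrayList;
import java.util.List;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

// Hypothetical class for illustration only.
class SourceToArticle {

    // Condensed copy of the Article POJO; Jackson fills it through the setters.
    static class Article {
        private String title;
        private String body;
        private List<String> tags = new ArrayList<String>();

        public String getTitle() { return title; }
        public void setTitle(String title) { this.title = title; }
        public String getBody() { return body; }
        public void setBody(String body) { this.body = body; }
        public List<String> getTags() { return tags; }
        public void setTags(List<String> tags) { this.tags = tags; }
    }

    // ObjectMapper is thread-safe once configured, so one shared instance is enough.
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Converts the "_source" tree (e.g. getArticleResponse.getSource()) to an Article.
    // With default settings this fails if the source has fields Article does not declare.
    static Article fromSource(JsonNode source) {
        try {
            return MAPPER.treeToValue(source, Article.class);
        } catch (JsonProcessingException e) {
            throw new IllegalArgumentException("_source does not look like an Article", e);
        }
    }
}
```

In the GET example above this would be used as `SourceToArticle.fromSource(getArticleResponse.getSource())`, which keeps ESDocumentResponse generic while still giving typed access to the document.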

So there we have it: a very simple usage of JAX-RS (with the Jersey implementation) to interact with Elasticsearch. I ran the above calls against Elasticsearch master (3.0) but am pretty sure this would work against any Elasticsearch version. I had no dependency on the elasticsearch jar or its dependencies. Also, because I only made use of some CRUD operations, I only needed to write code for those bits, so even though there were API changes in the Elasticsearch Java Client API, this code was not affected.

In a future post, we could get into:

  • Bulk APIs
  • Search API
  • Aggregations
  • Connection Keep Alive
  • Load balancing across multiple data nodes

Enjoy!

– Sarwar Bhuiyan


12 thoughts on “Calling Elasticsearch APIs using JAX-RS Client (Jersey Client), Jackson”

  1. Could you please help with an example that uses a match query and returns a list of results?

    For example,

    ESDocumentResponse getArticleResponse = rootTarget
    .path("index/type/_search").request()
    .post(Entity.json("{\"query\":{\"match_phrase\":{\"content\":\"" + sFilter + "\"}}}"),ESDocumentResponse.class);

    returns null,

    but the code below works well:

    Response getDocResponse = rootTarget
    .path("darsapp/elsearchdocuments/_search").request()
    .post(Entity.json("{\"query\":{\"match_phrase\":{\"content\":\"" + sFilter + "\"}}}"));

  2. Thanks for your quick response.
    Please can you assist me to archive these:-

    ESDocumentResponse getArticleResponse = rootTarget
    .path("index/type/_search").request()
    .post(Entity.json("{\"query\":{\"match_phrase\":{\"content\":\"" + sFilter + "\"}}}"),ESDocumentResponse.class);

    Also, when are you going to discuss these issues?

    Bulk APIs
    Search API
    Aggregations
    Connection Keep Alive
    Load balancing across multiple data nodes

    1. archive? You mean marshall into a list of domain objects? Look at how to use Jackson and ObjectMapper and you will see how to convert to your own POJOs or JsonNode objects.

      I’m not sure when I’ll get the time to discuss the other issues. Not hard to do given the examples I gave. You can look at this for some inspiration: https://github.com/sarwarbhuiyan/camel-elasticsearch-http/blob/master/src/main/java/com/sarwarbhuiyan/camel/component/elasticsearch/http/ElasticsearchHTTPClient.java

  3. Dirk says:

    Hi, I would like to give it a try, but the client gets stuck when it is trying to verify the root target. WebTarget rootTarget = client.target("http://localhost:9200"); No response at all, no timeout, nothing… Any ideas?

  4. Dirk says:

    When I type the address in the browser I get a response:

    {
    "name" : "Toxin",
    "cluster_name" : "test_cluster",
    "version" : {
    "number" : "2.3.3",
    "build_hash" : "218bdf10790eef486ff2c41a3df5cfa32dadcfde",
    "build_timestamp" : "2016-05-17T15:40:04Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.0"
    },
    "tagline" : "You Know, for Search"
    }

  5. Ganesh says:

    Hi, I admire this blog. My Elasticsearch server uses authentication. Could you let me know how to use rootTarget to access an authenticated ES server?

    1. Ganesh says:

      I used HttpAuthenticationFeature:
      HttpAuthenticationFeature feature = HttpAuthenticationFeature.basicBuilder()
      .nonPreemptive()
      .credentials("username", "password")
      .build();
      client.register(feature);
