Hazelcast in Spring Boot Running on Kubernetes


Although Kubernetes is the target execution environment, you will not find here how to declare the required services or deployments in the most optimal way. Some basic Kubernetes configurations are enough for demonstration purposes. Still, you can use these configurations as a starting point and improve them with the help of the Kubernetes documentation.

The theme of Spring's caching is not fully covered either. So, if you are interested in cache abstractions, please refer to the corresponding part of the Spring Framework documentation.

What You Will Learn

The main focus of the article is to show how to build a Spring Boot application with the components and configurations necessary:

  • to work with Spring Data JPA based on Hibernate
  • to use Hazelcast IMDG (in-memory data grid) in embedded mode
  • to deploy an application on the Kubernetes cluster

What You Need

  • JDK 11 (I think the code will work with Java 8, though it will require changes in pom.xml)
  • Apache Maven
  • Minikube to set up a local Kubernetes cluster.
  • Docker CLI to build an application Docker image.

The code for this guide can be found in the repository Spring Data JPA application on Kubernetes with caching on Hazelcast. The application was built and run on Microsoft Windows, but I believe it will work on other platforms as well after some tweaking of slashes.

Application Structure

First of all, the application should work with the database via the JPA abstraction. As we are going to introduce the second-level cache, this implies that Hibernate is used as the ORM framework. To spare some time, I have chosen the ready-to-use code of Spring's guide Accessing data with MySQL. It implements a simple controller-repository flow with MySQL as storage. The original code is almost unchanged except for small tweaks, such as updated versions of Spring and the MySQL JDBC connector in pom.xml and refactored packages.

Also, I added a couple of methods to the repository and the controller. These methods will be helpful later for showing the results of caching.

    @Query(value = "SELECT name FROM user WHERE email = :email", nativeQuery = true)
    Optional<String> findNameByEmail(String email);

    @GetMapping(path = "{id}")
    public @ResponseBody ResponseEntity<User> getUser(@PathVariable Integer id) {
        return userRepository.findById(id)
                .map(user -> ResponseEntity.ok().body(user))
                .orElseGet(() -> ResponseEntity.notFound().build());
    }

In general, the application provides the same functionality as described in the guide Accessing data with MySQL. A new endpoint /users/{id} returns user information by identifier, and /users/email/{email} returns a user's name by their email.

Spring Boot With Hazelcast

Spring Framework supports plenty of caching platforms and libraries, and Spring Boot makes it easy to use them in an application. Spring Boot finds caching providers on the classpath and auto-configures them using default settings. See Spring Boot's documentation about the auto-configuration process and the Hazelcast Cache Provider configuration.

As we are going to use Hazelcast, set up the following properties explicitly:
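The original property block did not survive extraction. A likely minimal fragment of application.yaml, assuming the Hazelcast settings live in cache.yaml as described later in this article, would be:

```yaml
spring:
  cache:
    # explicitly select Hazelcast as the caching provider
    type: hazelcast
  hazelcast:
    # point Spring Boot at the Hazelcast configuration file
    config: classpath:cache.yaml
```

Both spring.cache.type and spring.hazelcast.config are standard Spring Boot properties; the file name cache.yaml matches the one used throughout this guide.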


And add Hazelcast into the application’s dependencies:
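The dependency snippet is also missing here. A typical Maven declaration would be the following (the version can be omitted because it is managed by the Spring Boot dependency BOM):

```xml
<dependency>
  <groupId>com.hazelcast</groupId>
  <artifactId>hazelcast</artifactId>
</dependency>
```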


With this, Spring Boot finds and injects a bean of HazelcastInstance into the application context. Hazelcast will be configured with the settings from cache.yaml.

After that, HazelcastInstance can be used in other beans, i.e., it can be used in the constructor of our UserController:

private static final String CACHED_NAMES = "nameByEmail";
private UserRepository userRepository;
private ConcurrentMap<String, String> nameByEmailMap;

public UserController(UserRepository userRepository, HazelcastInstance hazelcastInstance) {
  this.userRepository = userRepository;
  nameByEmailMap = hazelcastInstance.getMap(CACHED_NAMES);
}

In the fragment above, we are using Hazelcast's Distributed Map, which is exposed and used in the code as a ConcurrentMap. This gives a simple and flexible way to work with cached data as with a plain map. Hazelcast performs all the heavy lifting of distributing cached data throughout the cluster behind the scenes, so the usage of the cache as a map is fully transparent for developers.
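To illustrate why the ConcurrentMap view is convenient, here is a minimal, self-contained sketch of the same caching pattern. It is an assumption-laden stand-in: a plain ConcurrentHashMap plays the role of hazelcastInstance.getMap("nameByEmail"), and findNameByEmail simulates the repository call; in the real application, Hazelcast distributes the map across the cluster.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

public class NameCacheDemo {
    // Stand-in for the distributed map returned by hazelcastInstance.getMap(...)
    static final ConcurrentMap<String, String> nameByEmail = new ConcurrentHashMap<>();
    // Counts how many times the "database" is actually queried
    static final AtomicInteger dbHits = new AtomicInteger();

    // Simulates userRepository.findNameByEmail(email) hitting the database
    static Optional<String> findNameByEmail(String email) {
        dbHits.incrementAndGet();
        return "alice@example.com".equals(email) ? Optional.of("Alice") : Optional.empty();
    }

    // Same pattern as in UserController: query the database only on a cache miss
    static Optional<String> getName(String email) {
        return Optional.ofNullable(nameByEmail.computeIfAbsent(email,
                e -> findNameByEmail(e).orElse(null)));
    }

    public static void main(String[] args) {
        getName("alice@example.com"); // miss: queries the "database" and caches the result
        getName("alice@example.com"); // hit: served from the map, no database call
        System.out.println("db hits: " + dbHits.get());
    }
}
```

Because computeIfAbsent does not store a mapping when the function returns null, unknown emails are not cached as empty entries; repeated requests for a known email reach the database only once.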

Hibernate Second Level Cache

Next, we will enable the second-level (L2) cache for Hibernate. In our application, this is done by setting up the following properties:
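The property block is missing from the extract. Based on the description that follows, a likely application.yaml fragment (the exact layout in the repository may differ) is:

```yaml
spring:
  jpa:
    properties:
      hibernate:
        cache:
          # enable the general second-level cache
          use_second_level_cache: true
          # enable the Query cache as well (for demonstration purposes)
          use_query_cache: true
          region:
            # use Hazelcast's region factory for the distributed setup
            factory_class: com.hazelcast.hibernate.HazelcastCacheRegionFactory
```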

Here, along with the general second-level cache, we also enable the Query cache. As we are using embedded Hazelcast in a distributed environment, the factory class is set to com.hazelcast.hibernate.HazelcastCacheRegionFactory. A detailed description of the properties and settings for Hibernate caching and the Hazelcast implementation of the second-level cache can be found in the corresponding documentation: Hibernate Caching and Hibernate Second Level Cache.

To make the application use Hazelcast as the second-level cache, we add the dependency to pom.xml:
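The snippet did not survive extraction. A typical declaration would be the following; the version property name here is a placeholder of mine, since this artifact is not managed by the Spring Boot BOM and a version matching your Hibernate release must be specified:

```xml
<dependency>
  <groupId>com.hazelcast</groupId>
  <artifactId>hazelcast-hibernate53</artifactId>
  <!-- not managed by the Spring Boot BOM; pick a release compatible with your Hibernate version -->
  <version>${hazelcast-hibernate.version}</version>
</dependency>
```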


Let me draw attention to several points. The second-level cache for Hibernate entities is not used by default, so you need to explicitly define which entities you want to be cached. In our application, it is done in the User entity like this:

import org.hibernate.annotations.Cache;
import org.hibernate.annotations.CacheConcurrencyStrategy;

import javax.persistence.Cacheable;
import javax.persistence.Entity;

@Entity
@Cacheable
@Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
public class User {
  // skipped
}

Also, Hibernate warns against using the Query cache, as it brings overhead into the transaction flow and generally does not give much benefit in most cases. Here is a quote from the documentation:

Caching of query results introduces some overhead in terms of your application’s normal transactional processing. For example, if you cache the results of a query against a Person, Hibernate will need to keep track of when those results should be invalidated because changes have been committed against any Person entity.

That, coupled with the fact that most applications simply gain no benefit from caching query results, leads Hibernate to disable caching of query results by default.

Though, for demonstration purposes, the Query cache is enabled in our application, and UserRepository overrides findAll() in the following way to cache the results of the query:

import javax.persistence.QueryHint;

import org.springframework.data.jpa.repository.QueryHints;

import static org.hibernate.jpa.QueryHints.HINT_CACHEABLE;
import static org.hibernate.jpa.QueryHints.HINT_CACHE_REGION;

// some code is skipped

  @QueryHints({
      @QueryHint(name = HINT_CACHEABLE, value = "true"),
      @QueryHint(name = HINT_CACHE_REGION, value = "query-cache-users")
  })
  @Override
  Iterable<User> findAll();

Finally, it is possible to use Hazelcast as a pure caching framework separately from the Hibernate L2 cache support. By default, HazelcastCacheRegionFactory creates a new Hazelcast instance if the property hibernate.cache.hazelcast.instance_name is not set to the name of an existing instance. To reuse the same Hazelcast instance created by Spring, we set up the property like this:
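The property fragment is missing from the extract; based on the surrounding text, it most likely looks like this in application.yaml:

```yaml
spring:
  jpa:
    properties:
      hibernate:
        cache:
          hazelcast:
            # reuse the Hazelcast instance created by Spring Boot
            instance_name: users-app
```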

The name of the instance should be the same as the one defined in the Hazelcast configuration file cache.yaml:

hazelcast:
  instance-name: users-app

This avoids the creation of multiple Hazelcast instances in the application, and the Hibernate L2 cache will be set up with the same configuration from cache.yaml.

Deploying Hazelcast on Kubernetes

Hazelcast provides different means of auto-discovering cluster members. For instance, for a local (development) environment, the multicast mechanism over UDP allows members to find each other. So, it is possible to run several instances of the application locally, and Hazelcast will be able to build a cluster using multicast auto-discovery. This type of discovery is enabled as follows:

hazelcast:
  network:
    join:
      multicast:
        enabled: true

For production environments, the usage of UDP is not the best choice. So, among others, Hazelcast supports clusters deployed in a Kubernetes environment without multicasting. For detailed information, look into the documentation on Configuring Kubernetes. To activate the discovery on Kubernetes, set the properties as follows:

hazelcast:
  instance-name: users-app
  cluster-name: users-app
  network:
    join:
      multicast:
        enabled: false
      kubernetes:
        enabled: true
        service-name: hazelcast

This disables multicast, enables Kubernetes discovery, and defines the name of the service, hazelcast, which will be used to search for cluster members via the Kubernetes API. Then, we need to create this service, which can be declared as follows (see also k8s/app-hazelcast-service.yaml):

apiVersion: v1
kind: Service
metadata:
  name: hazelcast
  labels:
    app: hazelcast
spec:
  ports:
    - port: 5701
      protocol: TCP
  selector:
    app: app-users
  type: ClusterIP

The key point here is to use the same service name, hazelcast, that is set in the Hazelcast configuration.

To allow Hazelcast to use the Kubernetes API for discovery, we also need to grant certain permissions. An example of an RBAC configuration for the default namespace can be found in the Hazelcast documentation. The same YAML file is included in the code, see k8s\app-hazelcast-rbac.yaml.
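For reference, a sketch of such an RBAC configuration, modeled on Hazelcast's published example for the default namespace. The resource names below are my placeholders; the file in the repository is authoritative:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: hazelcast-cluster-role
rules:
  # Hazelcast needs read access to discover cluster members via the Kubernetes API
  - apiGroups: [""]
    resources: ["endpoints", "pods", "nodes", "services"]
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: hazelcast-cluster-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: hazelcast-cluster-role
subjects:
  # grant the role to the default service account used by the application pods
  - kind: ServiceAccount
    name: default
    namespace: default
```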

Run the following commands to create the service and to grant the roles:

kubectl apply -f k8s\app-hazelcast-service.yaml
kubectl apply -f k8s\app-hazelcast-rbac.yaml

Then, build the application, and after that, build a Docker image, which will be stored into your local Docker repository:

mvn clean verify
docker build -f .\docker\Dockerfile -t rapp/appusers:1.0 .

After that, you can check the availability of the image by running docker image ls, which should return something like this:

REPOSITORY              TAG                    IMAGE ID       CREATED         SIZE
rapp/appusers           1.0                    9c92dfd38bbf   24 hours ago    419MB

This means that you can use the image rapp/appusers:1.0 for deployment into the Kubernetes cluster. To do this, run the following commands, which will deploy three pods with the application and create a load balancer for the pods:

kubectl apply -f k8s\app-k8s-deployment.yaml
kubectl apply -f k8s\app-k8s-service.yaml

You can check the result of the deployment by running kubectl get pods, which should show the state of the application pods similar to the following:

NAME                            READY   STATUS    RESTARTS   AGE
app-users-55d9b89d86-gstwz      1/1     Running   0          23h
app-users-55d9b89d86-l67dx      1/1     Running   0          23h
app-users-55d9b89d86-l766r      1/1     Running   0          23h

This means that all the pods are created, ready, and running. Also, you can check the logs of any pod by running kubectl logs <name-of-pod>. The logs will contain the usual information about the start of a Spring Boot application (creation of the context, Tomcat running, Hibernate instantiating, etc.). Apart from that, you can also see the activities of Hazelcast: which discovery mechanism is activated and how the discovery and cluster-building process is performed. Finally, you can see the result of the Hazelcast cluster construction similar to this:

Members {size:3, ver:3} [
        Member []:5701 - 8d7d80e4-d0e2-4bd6-8116-ab3a7b494a3f
        Member []:5701 - 9aa6f3f8-7da7-4037-8e36-761d8432ceea
        Member []:5701 - b21112b7-35a7-49fb-ad06-0f921b9823be this
]

So, the cache cluster of three members is built, i.e., all three pods are found and added to the cluster.


As Kubernetes deploys pods into its own IP range and exposes services on the declared ports, which are mapped to different external ones, you need either to determine Minikube's IP and the external ports, or to run Minikube in tunnel mode, which allows using the internal IPs and ports. To get Minikube's IP and the application port, run the following commands:

> minikube ip
> kubectl get service app-users
NAME        TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
app-users   LoadBalancer   <pending>     8080:31434/TCP   24h

So you can access the application at Minikube's IP and the mapped node port (31434 in the example above).

Another way to access the application is to run the minikube tunnel command in a separate console. Then, check which external IP is assigned to the application service:

> kubectl get service app-users
NAME        TYPE           CLUSTER-IP       EXTERNAL-IP      PORT(S)          AGE
app-users   LoadBalancer   8080:31434/TCP   24h

This means that you can open the application using the assigned external IP and port 8080.

Using any of these approaches, you can test the endpoints declared in UserController in the same way as suggested in the guide Accessing data with MySQL. Additionally, two endpoints, /users/{id} and /users/email/{email}, were created to check how entities are cached and how the Hazelcast map works.

The first endpoint returns information about a user by their identifier. As set in the application properties, the logs will contain the SQL statements generated by Hibernate. With the L2 cache enabled, repeated requests for the same user will not produce queries to the database in the logs (of course, within time-to-live-seconds: 300).

The second added endpoint shows the usage of the Hazelcast map structure. As the endpoint triggers a custom (non-cached) query to the database, each SQL query could be logged.

@Query(value = "SELECT name FROM user WHERE email = :email", nativeQuery = true)
Optional<String> findNameByEmail(String email);

But the results of the query are stored in the Hazelcast map after the data is retrieved from the database. So, if the map returns a non-empty value for a given email, no additional request to the database is performed:

private Optional<String> getFromCache(String email) {
  return Optional.ofNullable(nameByEmailMap.computeIfAbsent(email,
              (e) -> userRepository.findNameByEmail(e).orElse(null)));
}


As was mentioned in the foreword, this guide is not a definitive tutorial for the mentioned technologies and tools. Here are some suggestions for improvements.

It is possible to use the "out-of-the-box" Spring support of caching, which, in our case, is based on the fact that the HazelcastInstance is discovered by Spring and wrapped into a CacheManager. This allows using Spring's @EnableCaching and @Cacheable annotations.

In our simplified Kubernetes configurations, it is possible to replace service types like NodePort with ClusterIP, which would allow the applications in the cluster to discover services by name. This can help to avoid juggling with hardcoded IPs. Additionally, it is highly recommended to avoid using the default namespace, so consider using a dedicated namespace in the application and Hazelcast configuration.

Finally, never ever store your secrets in the code. In our code, the database credentials are encoded and written directly into the YAML configuration files for demonstration and simplicity only. Do not repeat this in your applications. A well-adopted practice is to store secrets in environment variables or third-party vaults, which are retrieved and substituted by CI/CD pipelines during deployment.
