Apache Kafka SSL Security (Part one) — Certificate Authority, Keystore, and Truststore

Blog by Vinod Chelladurai

Are you running Apache Kafka to handle rapidly growing volumes of data in your company? Then it is high time to secure your Kafka ecosystem. Why? Because even a small security lapse could let employees or outsiders who are not authorised to access Kafka gain complete control over the data inside it. The damage to a company could range from several million in losses to the complete loss of the customer trust it built over years of hard work and excellence.

There are several ways to secure the Apache Kafka ecosystem. In this two-part article, I will discuss one of the most commonly used security mechanisms: SSL security, more commonly referred to as the SSL handshake mechanism.

Assumption

I assume the readers of this post have a basic understanding of Apache Kafka, its purpose and its high-level architecture.

What is the goal of this article?

This is a two-part article that serves the following purposes:

  1. After reading the first part, readers will understand what the SSL handshake mechanism is and how it works in the background. They will also have a strong foundation in the 3 important concepts that form its backbone: Certificate Authority (CA), Keystore and Truststore.

  2. After reading the second part, readers will understand how to actually implement the SSL handshake mechanism between the Kafka brokers (I will skip ZooKeeper security here).

Why do we need security in Apache Kafka?

Apache Kafka is a distributed publish-subscribe system in which messages are distributed across a set of nodes called brokers, so that producers can send messages that the relevant consumers consume either concurrently or at a later time. Together, these brokers constitute a single Kafka cluster.

The nodes of the Kafka cluster, i.e., the brokers and another Apache application called ZooKeeper, often communicate with each other as part of the distributed coordination of the cluster components. For example, the brokers need to communicate with each other for several purposes, such as replicating messages, electing partition leaders, and serving producer/consumer requests. The brokers and the ZooKeeper nodes can be considered individual servers running on separate host machines. Hence, it is important to secure the communication between the brokers (inter-broker communication) and also between the brokers and ZooKeeper.

What is the SSL handshake mechanism?

SSL (Secure Sockets Layer), whose modern successor is TLS (Transport Layer Security), is a protocol that lets two communicating parties (for example, two Kafka brokers, or a client and a server) exchange messages securely. The two names are often used interchangeably. In the context of Kafka, the following events take place between any two brokers during the lifecycle of an SSL handshake mechanism:

  • The two brokers authenticate each other using the certificate authority, truststore, and keystore.

  • An SSL connection is established between them once each has accepted the other's authenticity.

  • Each broker encrypts its messages using an agreed protocol version (for example, TLSv1.2).

  • Finally, the brokers communicate over this connection, with the integrity of their messages ensured.
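The settings behind these steps can be sketched with Python's standard ssl module. This is only an illustration of the general idea, not how Kafka is configured (broker configuration is the subject of the second part). The function names and the file names in the comments are hypothetical.

```python
import ssl


def make_server_context(require_client_cert: bool = True) -> ssl.SSLContext:
    """Server-side context: TLSv1.2 or newer, optionally demanding a client certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED makes the server ask for and verify the peer's certificate,
    # which is what turns a one-way handshake into a two-way one.
    ctx.verify_mode = ssl.CERT_REQUIRED if require_client_cert else ssl.CERT_NONE
    # In a real setup (hypothetical file names):
    #   ctx.load_cert_chain(certfile="server.pem", keyfile="server.key")  # keystore role
    #   ctx.load_verify_locations(cafile="ca.pem")                        # truststore role
    return ctx


def make_client_context() -> ssl.SSLContext:
    """Client-side context: verifies the server's certificate and hostname by default."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

No certificates are loaded here, so these contexts cannot complete a real handshake; the point is only to show where authentication and the protocol version are decided.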

Certificate Authority (CA), Keystore, and Truststore

To implement an SSL handshake mechanism among the Kafka brokers, one has to understand the basic concepts that form its backbone: certificate authority, keystore, and truststore. Assume that a client is making a request for a connection to the server. In this context,

  • A certificate authority (CA) can be considered a trusted entity that signs the applications/requests of the client (in practice, certificate signing requests). In other words, the CA vouches for the identity of the client.

  • A keystore can be considered the personal identity of the client. It holds various pieces of information, the most important being the client’s ID, an application signed by a CA (the client’s certificate), and the identity of the CA that signed it. In practice, it also holds the client’s private key.

  • A truststore can be considered an archive on the server that contains a list of the certificate authorities (CAs) it trusts. The server authenticates any client whose keystore contains an application signed by a CA that is listed in its truststore.
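As a rough mental model, the three concepts can be written down as plain data structures. This is a deliberately simplified sketch: the class and field names are made up for illustration, and the check compares CA names only, whereas a real truststore check verifies a cryptographic signature.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Certificate:
    subject: str  # whose identity the certificate describes
    issuer: str   # name of the CA that signed it


@dataclass
class Keystore:
    owner: str                # the client's ID
    certificate: Certificate  # the application signed by a CA


@dataclass
class Truststore:
    trusted_cas: set = field(default_factory=set)  # names of trusted CAs

    def trusts(self, certificate: Certificate) -> bool:
        # Toy check: accept the certificate if its issuer is a listed CA.
        return certificate.issuer in self.trusted_cas
```

For example, a server whose truststore lists a hypothetical "example-ca" would accept a client keystore whose certificate has that issuer and reject one signed by any other CA.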

To understand more deeply how certificate authority, keystore, and truststore work together, let us consider the following simple scenario.

Assume that a person from India wants to travel to Germany. In this process, the following events, as shown in the figure below, would take place.

  1. An applicant from India goes to the German Embassy in India, provides his passport, and fills in an application requesting a German visa.

  2. The German Embassy in India verifies his passport, signs his application, and stamps the German visa inside his passport with its embassy signature.

  3. The applicant arrives at the German airport and presents his passport, containing the stamped German visa, to passport control.

  4. The German passport control verifies the applicant’s passport and then checks its computer system to see whether the German Embassy in India is listed among the trusted embassies that are allowed to issue a German visa.

  5. The German passport control finds that the German Embassy in India is a trusted legal authority and hence allows the applicant into Germany.

In the above example,

  • The applicant from India is the client.

  • The German Passport Control is the server.

  • The German Embassy in India is the certificate authority (CA).

  • The applicant’s passport is the keystore of the client.

  • The stamped German visa inside the passport is the signed application held in the keystore.

  • The Passport Control System of Germany is the truststore of the server.

In the above client-server example, we saw the server authenticating the client. However, in a typical SSL handshake mechanism, the whole process is repeated in the other direction as well (two-way SSL handshake mechanism), i.e., the client also verifies the authenticity of the server using its own truststore and the server’s keystore. After that, the client and the server agree on a valid protocol version (for example, TLSv1.2) to encrypt their messages, and only then is the SSL handshake mechanism considered complete.

Please note that in a typical two-way client-server SSL handshake process, the client verifies the server first, and then the server verifies the client.
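The order of the two checks can be sketched as a toy function. The names below are hypothetical, and trust is again reduced to comparing CA names rather than verifying real signatures.

```python
from typing import FrozenSet, List, NamedTuple


class Party(NamedTuple):
    name: str
    cert_issuer: str             # CA that signed the certificate in this party's keystore
    trusted_cas: FrozenSet[str]  # this party's truststore


def two_way_handshake(client: Party, server: Party) -> List[str]:
    """Return a log of the checks in order; raise PermissionError on the first failure."""
    log = []
    # Step 1: the client checks the server's certificate against its own truststore.
    if server.cert_issuer not in client.trusted_cas:
        raise PermissionError(f"{client.name} does not trust {server.name}")
    log.append(f"{client.name} verified {server.name}")
    # Step 2: the server checks the client's certificate against its own truststore.
    if client.cert_issuer not in server.trusted_cas:
        raise PermissionError(f"{server.name} does not trust {client.name}")
    log.append(f"{server.name} verified {client.name}")
    return log
```

If either side presents a certificate from a CA the other does not list, the function raises and no connection is considered established.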

I hope the readers of this post now have a clear understanding of the key terms (certificate authority, keystore and truststore) that are required to implement the SSL handshake mechanism.

In the second part, I will explain how to actually implement the SSL handshake mechanism between Kafka brokers via these key terms with a practical example.

Thanks for reading!


நன்றி (Thank you) _/\_
