The Fundamentals of Apache Kafka Architecture

Apache Kafka Architecture Deep Dive


Overview of Apache Kafka Architecture

The storage layer is designed to store data efficiently. It is a distributed system, so if your storage needs grow over time you can easily scale out to accommodate the growth.

The compute layer consists of four core components: the producer, consumer, Kafka Streams, and Connect APIs, which allow Kafka to scale applications across distributed systems.


An event is a record of something that happened, along with information describing what happened.

Examples of events are customer orders, payments, clicks on a website, or sensor readings.

Kafka Event

In a Kafka-based architecture, an event record consists of a timestamp, a key, a value, and optional headers.
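To make the record structure concrete, here is a minimal sketch of such an event record as a plain data class. This is an illustration only, not the actual Kafka client classes; the field names are assumptions chosen to mirror the four parts listed above.

```python
from dataclasses import dataclass, field
from typing import Optional

# Illustrative sketch of a Kafka-style event record:
# a timestamp, a key, a value, and optional headers.
@dataclass
class EventRecord:
    timestamp_ms: int                  # when the event occurred or was appended
    key: Optional[bytes]               # used for partitioning; may be None
    value: bytes                       # the event payload
    headers: dict = field(default_factory=dict)  # optional metadata

# Example: a customer-order event.
order = EventRecord(
    timestamp_ms=1_700_000_000_000,
    key=b"customer-42",
    value=b'{"order_id": 1001, "amount": 19.99}',
    headers={"source": b"web-checkout"},
)
print(order.key)
```

The key is optional in Kafka; records without a key can be spread across partitions, while keyed records are routed by key.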


Kafka Topics

Topics are append-only, immutable logs of events. Typically, events of the same type, or events that are in some way related, would go into the same topic.

In order to distribute the storage and processing of events in a topic, Kafka uses the concept of partitions. A topic is made up of one or more partitions, and these partitions can reside on different nodes in the Kafka cluster.
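A common way to assign keyed events to partitions is to hash the key and take the result modulo the partition count, so all events with the same key land in the same partition. The sketch below illustrates the idea; it uses CRC-32 to stay dependency-free, whereas Kafka's default partitioner uses a different hash (murmur2), so the exact partition numbers will differ.

```python
import zlib

def partition_for(key: bytes, num_partitions: int) -> int:
    # Hash the key, then map it onto one of the partitions.
    # CRC-32 is a stand-in here; Kafka's default partitioner uses murmur2.
    return zlib.crc32(key) % num_partitions

# Events with the same key always map to the same partition,
# which preserves per-key ordering within that partition.
p1 = partition_for(b"customer-42", num_partitions=6)
p2 = partition_for(b"customer-42", num_partitions=6)
print(p1 == p2)  # True
```

Because the assignment depends only on the key and the partition count, ordering is guaranteed per key within a partition, but not across partitions.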

Kafka Topic Partitions

Within a partition, each event is given a unique identifier called an offset. The offset for a given partition continues to grow as events are added, and offsets are never reused.
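The append-and-offset behavior can be sketched as a toy in-memory log. This is not how Kafka stores data on disk; it only demonstrates that each appended event receives the next monotonically increasing offset, and that earlier offsets remain readable and are never reassigned.

```python
class PartitionLog:
    """Toy append-only log for one partition: each appended event
    gets the next offset; offsets grow monotonically and are never reused."""

    def __init__(self):
        self._events = []

    def append(self, event) -> int:
        offset = len(self._events)   # next offset = current end of the log
        self._events.append(event)
        return offset

    def read(self, offset):
        # Existing events stay addressable by their original offset.
        return self._events[offset]

log = PartitionLog()
print(log.append("order-created"))     # 0
print(log.append("payment-received"))  # 1
print(log.read(0))                     # order-created
```

Consumers track their position in a partition by remembering the last offset they have processed, which is what makes replaying a topic from an earlier point possible.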

