Apache Kafka on Kubernetes with Strimzi — Part 2: Creating Producer and Consumer apps using Go and Scala and deploying on Kubernetes

The Producer

For creating the producer, I would like to use the Go language. We’re going to add below dependencies to our module file:

github.com/segmentio/kafka-go // Kafkago.uber.org/zap // Logginggoogle.golang.org/protobuf // Message Serializationgithub.com/urfave/cli/v2 // CLI
  • Kafka Bootstrap servers
  • Topic name
  • Buffer size (for batching messages)
  • Sleep time between producing messages
$ docker build -t nrsina/strimzi-producer:v1 .
Deploying Strimzi Producer app in Kubernetes

The Consumer

For our consumer, we’re going to build a consumer using the Scala language with Alpakka Kafka library which is an impressive Kafka library by Lightbend. By scaling our consumer, we can see how Consumer Group works. But before jumping into the code, I’d like to talk a little bit about Delivery Semantics (Delivery Guarantees) since it’s a very important issue to consider when using any stream processing framework.

Delivery Semantics

Kafka uses a dumb broker / smart consumer design. The broker only keeps the messages for a certain amount of time and consumers should keep track of their offset in partitions of the topic. So the developer should configure the consumer based on the business needs. We have three delivery semantics:

  • At most once. This is the most relaxed semantic. With this semantic, message might get processed one time or not at all (message can get lost and it cannot be redelivered). This semantic happens when the commitment of the offsets has nothing to to with the processing of the message.
    By setting the enable.auto.commit property of the consumer to true, the consumer offsets will be periodically committed in the background. So we have no control on the offsets and if the process of a message fails, the consumer can’t request the message from the topic again since the offset has been committed before and moved passed the failed message.
  • At least once. This semantic ensures that a message may never get lost but may get redelivered. So the message might get processed more than one time (but never lost). When this delivery semantic is needed, the consumer has to take control of the offset committing.
    The commitment should happen after the processing of the message is done. So if the process fails, no offset will be committed. And on the next pull from the topic, the failed message will get redelivered again.
    Bear in mind that with this approach, the processing of a message may be successful but the committing fails for any reason. So then again the message will be redelivered. This is because our processing (for example, persisting in a database) and committing the offsets in Kafka are not in a single transaction. (The enable.auto.commit property should be false since we need to take control of the offset committing).
  • Exactly once. This semantic guarantees that the processing of the message happens only once. This is the most strong semantic. When we want to have exactly once semantics, the process and committing of the message should happen in a single transaction.
    For example, if we are persisting the message in PostgreSQL, we should also persist the offsets in the same database so we can easily wrap both of them in a single transaction. If the persisting of the offset fails, the persisting of the message will also roll backs and vice versa.

Creating the Consumer

We are using Akka for this project. Akka is a toolkit consisting of many open-source libraries for designing concurrent, distributed and resilient applications by leveraging Actor Model. Akka uses Actor model which provides a high level of abstraction for building concurrent applications and alleviates the developer from dealing with low-level thread management and locking. Actors are single and independent units that encapsulate state and behavior and they communicate with other actors via message passing. A hierarchy of actors can be formed in Akka. Refer to Akka’s documentation for more information.

akka {
loglevel = DEBUG
}
akka.management {
http {
hostname = "127.0.0.1"
hostname = ${?AKKA_MANAGEMENT_HTTP_HOST}
bind-hostname = "0.0.0.0"
port = 8558
port = ${?AKKA_MANAGEMENT_HTTP_PORT}
}
}
strimzi-consumer {
bootstrap-servers = "localhost:9092"
bootstrap-servers = ${?SC_BOOTSTRAP_SERVER}
group-id = "my-group"
group-id = ${?SC_GROUP_ID}
enable-auto-commit = "false" // false for at-least-once semantics
enable-auto-commit = ${?SC_AUTO_COMMIT}
auto-offset-reset = "latest" // other value: earliest
auto-offset-reset = ${?SC_OFFSET_RESET}
topic = "my-topic"
topic = ${?SC_TOPIC_NAME}
parallelism = 10 // batch processing of consumed messages
parallelism = ${?SC_PARALLELISM_NUM}
sleep-time-ms = 100 // simulates computation of a message
sleep-time-ms = ${?SC_SLEEP_TIME}
}
addSbtPlugin("com.typesafe.sbt" % "sbt-native-packager" % "1.7.6")
$ sbt docker:publishLocal
$ sbt docker:stage
Deploying Strimzi Consumer app in Kubernetes

Conclusion

After deploying our Kafka Cluster on Kubernetes (Part 1), we have built a producer app in Go language and a consumer app in Scala language. After publishing our applications and watching the logs, it looks like everything is fine and running well. But is it? Are we sure that everything is ok? What is the rate of producing / consuming messages? Do we have a “Consumer Lag” here? By the look of the logs everything is fine and running well. But we are not really sure how our apps and Kafka are performing. We need some monitoring tools to monitor the Kafka’s performance. In the next part of the series, we are going to use Prometheus and Grafana to monitor our Cluster.

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Sina Nourian

Sina Nourian

CTO at Daneshgar Technology Co. Ltd | Interested in Distributed Systems, Cloud Computing, Serverless computing, Data Stream Processing and Microservices