Working with unbounded and fast-moving data streams has historically been difficult. But with Kafka Streams and ksqlDB, building stream processing applications is easy and fun. This practical guide shows data engineers how to use these tools to build highly scalable stream processing applications for moving, enriching, and transforming large amounts of data in real time.
Mitch Seymour, data services engineer at Mailchimp, explains important stream processing concepts against a backdrop of several interesting business problems. You'll learn the strengths of both Kafka Streams and ksqlDB to help you choose the best tool for each unique stream processing project. Non-Java developers will find the ksqlDB path to be an especially gentle introduction to stream processing.
- Learn the basics of Kafka and the pub/sub communication pattern
- Build stateless and stateful stream processing applications using Kafka Streams and ksqlDB
- Perform advanced stateful operations, including windowed joins and aggregations
- Understand how stateful processing works under the hood
- Learn about ksqlDB's data integration features, powered by Kafka Connect
- Work with different types of collections in ksqlDB and perform push and pull queries
- Deploy your Kafka Streams and ksqlDB applications to production
If you're eager to delve deep into the Kafka Low-Level API, High-Level DSL, and ksqlDB, this book is a perfect fit. It provides a comprehensive understanding with excellent examples and complete code available on GitHub. I often find myself using this book, exploring its detailed explanations of Kafka's functionality.
For a book that has "Mastering" in its title, this one sure was beginner-level, especially where ksqlDB is concerned.
While it was an interesting read with decent examples, the book rarely dealt with some of the more advanced topics of Kafka Streams (what to look out for in cloud native deployments? storage type considerations? non-trivial discussions of how to handle large datasets?) and on the rare occasions when it did decide to dive deep it offered less detailed discussion than what you could find in the official docs and on the Confluent blog.
On the plus side, I do think this would be a great intro for someone who's never had any contact with Kafka Streams or ksqlDB before - it's definitely easier to navigate than the official docs.