Kafka: The Definitive Guide: Real-Time Data and Stream Processing at Scale 2/e
Description
Every enterprise application creates data, whether it consists of log messages, metrics, user activity, or outgoing messages. Moving all this data is just as important as the data itself. With this updated edition, application architects, developers, and production engineers new to the Kafka streaming platform will learn how to handle data in motion. Additional chapters cover Kafka's AdminClient API, transactions, new security features, and tooling changes.
Engineers from Confluent and LinkedIn responsible for developing Kafka explain how to deploy production Kafka clusters, write reliable event-driven microservices, and build scalable stream processing applications with this platform. Through detailed examples, you'll learn Kafka's design principles, reliability guarantees, key APIs, and architecture details, including the replication protocol, the controller, and the storage layer.
You'll examine:
Best practices for deploying and configuring Kafka
Kafka producers and consumers for writing and reading messages
Patterns and use-case requirements to ensure reliable data delivery
Best practices for building data pipelines and applications with Kafka
How to perform monitoring, tuning, and maintenance tasks with Kafka in production
The most critical metrics among Kafka's operational measurements
Kafka's delivery capabilities for stream processing systems
About the Authors
Gwen Shapira is a system architect at Confluent helping customers achieve success with their Apache Kafka implementations. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. She currently specializes in building real-time reliable data processing pipelines using Apache Kafka. Gwen is an Oracle Ace Director, an author of Hadoop Application Architectures, and a frequent presenter at data-driven conferences. Gwen is also a committer on the Apache Kafka and Apache Sqoop projects.
Todd Palino is a Staff Site Reliability Engineer at LinkedIn, tasked with keeping the largest deployment of Apache Kafka, Zookeeper, and Samza fed and watered. He is responsible for architecture, day-to-day operations, and tools development, including the creation of an advanced monitoring and notification system. Todd is the developer of the open source project Burrow, a Kafka consumer monitoring tool, and can be found sharing his experience on Apache Kafka at industry conferences and tech talks. Todd has spent over 20 years in the technology industry running infrastructure services, most recently as a Systems Engineer at Verisign, developing service management automation for DNS, networking, and hardware management, as well as managing hardware and software standards across the company.
Rajini Sivaram is a Software Engineer at Confluent designing and developing security features for Kafka. She is an Apache Kafka Committer and member of the Apache Kafka Project Management Committee. Prior to joining Confluent, she was at Pivotal working on a high-performance reactive API for Kafka based on Project Reactor. Earlier, Rajini was a key developer on IBM Message Hub, which provides Kafka-as-a-Service on the IBM Bluemix platform. Her experience ranges from parallel and distributed systems to Java virtual machines and messaging systems.
Krit Petty is the Site Reliability Engineering Manager for Kafka at LinkedIn. Before becoming a manager, he worked as an SRE on the team, expanding Kafka and overcoming the hurdles associated with scaling it to never-before-seen heights, including taking the first steps toward moving LinkedIn's large-scale Kafka deployments into Microsoft's Azure cloud. Krit has a master's degree in Computer Science and previously worked managing Linux systems and as a Software Engineer developing software for high-performance computing projects in the oil and gas industry.