Mastering Hadoop 3: Big Data processing at scale to unlock unique business insights

Authors: Chanchal Singh; Manish Kumar; Dr. Timothy Wong
Publisher: Packt Publishing
Published: 2019-02-28
ISBN-13: 9781788620444
ISBN-10: 1788620445
Binding: Paperback
Pages: 797





Description


Your guide to mastering the most advanced concepts of Hadoop 3
Key Features

Master the newly introduced features and capabilities of Hadoop 3 - the world's most popular Big Data ecosystem
Crunch and process your data with ease using MapReduce, YARN and a whole host of other tools within the Hadoop ecosystem
A highly practical book with real-world case studies and easy-to-understand code to help you master Hadoop

Book Description
Apache Hadoop is one of the most popular Big Data solutions for distributed storage and processing of large chunks of data. With Hadoop 3, Apache promises to bring a high-performance, more fault-tolerant, and more efficient Big Data processing platform, with a focus on better scalability and efficiency.
This is a comprehensive guide to the advanced concepts of the tools in the Hadoop ecosystem. You will learn how Hadoop works internally, study advanced concepts of different ecosystem tools, discover solutions to real-world use cases, and learn how to secure your cluster. The book then walks you through advanced concepts of HDFS, YARN, MapReduce, and Hadoop 3. It addresses common challenges such as how to use Kafka efficiently, how to design low-latency, reliable message delivery systems with Kafka, how to handle high data volumes, how to address the top-level concerns of building an enterprise-grade messaging system, and how to use different stream processing systems along with Kafka to fulfill enterprise goals.
By the end of this book, you will understand how components in the Hadoop ecosystem are effectively integrated to implement a fast and reliable data pipeline, and how to tackle the real-world problems that occur in data pipelines.
What you will learn

Get an in-depth understanding of distributed computing using Hadoop 3
Develop enterprise-grade applications using Apache Spark, Flink, and more
Build distributed, scalable, reliable, and high-performance Hadoop data pipelines with security, monitoring, and data governance in place
Learn best practices for enterprises using or planning to use Hadoop 3 as a data platform

Who This Book Is For
If you want to become a Big Data professional by mastering the advanced concepts in Hadoop, this book is for you. If you're a Hadoop professional looking to strengthen your knowledge of the Hadoop ecosystem, this book will also help you. A fundamental knowledge of the Java programming language and some basics of Hadoop are required to get started with this book.




Related Books

JavaScript: The Good Parts (JavaScript-優良部份)

Author: Douglas Crockford; translated by 莊惠淳

2019-02-28

Advanced Kafka (Kafka 進階)

Author: 趙渝強

2019-02-28

Building Distributed Applications in Gin: A hands-on guide for Go developers to build and deploy distributed web apps with the Gin framework

Author: Labouardy Mohamed

2019-02-28