Apache Spark 2.x Cookbook

Author: Rishi Yadav
Publisher: Packt Publishing
Published: 2017-05-31
ISBN-13: 9781787127265
ISBN-10: 1787127265
Binding: Paperback
Pages: 294





Description


Key Features

Contains recipes on solving real-time data-processing problems with Apache Spark
Utilize core Spark modules such as Spark SQL, Spark MLlib, Spark Streaming, and GraphX processing
A practical guide to help you master Apache Spark as your single big data computing platform

Book Description
While Apache Spark 1.x gained a lot of traction and adoption in its early years, Spark 2.0 delivers notable improvements in its APIs, performance, and Structured Streaming, and simplifies the building blocks for better, faster, smarter, and more accessible big data applications. This book covers these features as structured recipes for analyzing and refining large and complex data sets.
Starting with installing and configuring Apache Spark with various cluster managers, you will learn to set up development environments. You will then be introduced to working with RDDs, with DataFrames for operating on data with schemas, and with real-time streaming from sources such as the Twitter stream and Apache Kafka. You will also work through recipes on machine learning, including supervised learning, unsupervised learning, recommendation engines, deep learning algorithms, and GPU implementations on Spark.
Last but not least, the final few chapters will help you delve more deeply into graph processing with GraphX, securing your implementations, cluster optimization, and troubleshooting.
What you will learn

Install and configure Apache Spark with various cluster managers
Set up a development environment for Apache Spark
Learn to operate on data in Spark with schemas
Get to grips with real-time streaming analytics using Spark Streaming
Master supervised learning and unsupervised learning using MLlib
Build a recommendation engine using MLlib
Use TensorFrames to manipulate Spark's DataFrames with TensorFlow programs for deep learning
Develop a set of common applications or project types, and solutions that solve complex big data problems




Related Books

Vue.js 2.x 實踐指南 (Vue.js 2.x Practical Guide)

Author: 鄒瓊俊

2017-05-31

Hadoop 2 Quick-Start Guide: Learn the Essentials of Big Data Computing in the Apache Hadoop 2 Ecosystem (Paperback)

Author: Douglas Eadline

2017-05-31

Data Versus Democracy: How Big Data Algorithms Shape Opinions and Alter the Course of History (Paperback)

Author: Kris Shaffer

2017-05-31