Agile Data Science 2.0: Building Full-Stack Data Analytics Applications with Spark

Author: Russell Jurney
Publisher: O'Reilly
Published: 2017-06-23
ISBN-13: 9781491960110
ISBN-10: 1491960116
Binding: Paperback
Pages: 352





Description


Agile Data Science 2.0 covers the theory and practice of applying agile methods to data science, the discipline of applied analytics research. The book takes the stance that data products are the preferred output format for data science teams seeking to effect change in an organization. Accordingly, we show how to "get meta" and build applications that describe the applied research process itself, which is what makes agility possible. Then we show how to use big data tools to iteratively build, deploy, and refine analytics applications. Tracking data-product development through the five stages of the "data value pyramid", we show you how to take an application from conception through development and deployment to iterative improvement. Application development is a fundamental skill for a data scientist, and we show you how publishing your data science work as a web application lets you effect maximal change within your organization.

Technologies covered include Python, Apache Spark (Spark MLlib, Spark Streaming), Apache Kafka, MongoDB, Elasticsearch, D3.js, scikit-learn, and Apache Airflow. More important than any one technology, we show you how to compose a data platform that makes you a productive application developer.




Related Books

Data Mining and Business Analytics with R (Hardcover)

Author: Johannes Ledolter

2017-06-23

Practical Machine Learning in R

Authors: Fred Nwanganga, Mike Chapple

2017-06-23

Hands-On Programming with R: Write Your Own Functions and Simulations (Paperback)

Author: Garrett Grolemund

2017-06-23