Description
Data science teams looking to turn research into useful analytics applications require not only the right tools, but also the right approach if they’re to succeed. With the revised second edition of this hands-on guide, up-and-coming data scientists will learn how to use the Agile Data Science development methodology to build data applications with Python, Apache Spark, Kafka, and other tools.
Author Russell Jurney demonstrates how to compose a data platform for building, deploying, and refining analytics applications with Apache Kafka, MongoDB, ElasticSearch, d3.js, scikit-learn, and Apache Airflow. You’ll learn an iterative approach that lets you quickly change the type of analysis you’re doing, depending on what the data is telling you. Publish data science work as a web application, and effect meaningful change in your organization.
- Build value from your data in a series of agile sprints, using the data-value pyramid
- Extract features for statistical models from a single dataset
- Visualize data with charts, and expose different aspects through interactive reports
- Use historical data to predict the future through classification and regression
- Translate predictions into actions
- Get feedback from users after each sprint to keep your project on track