Analyzing Wine Types and Quality

Dipanjan Sarkar; Raghav Bali; Tushar Sharma

Book Chapter

Apress, (2018), 407-446

DOI: 10.1007/978-1-4842-3207-1_9

Abstract

In the last chapter, we looked at specific case studies leveraging unsupervised Machine Learning techniques like clustering and rule-mining frameworks. In this chapter, we focus on further case studies relevant to supervised Machine Learning algorithms and predictive analytics. We looked at classification-based problems in Chapter 7, where we built classifiers to predict the sentiment of movie reviews from their text. In this chapter, the problem at hand is to analyze, model, and predict the type and quality of wine using physicochemical attributes.

Wine is a pleasant-tasting alcoholic beverage, loved by millions across the globe. Indeed, many of us love to celebrate our achievements, or even unwind at the end of a tough day, with a glass of wine! The following quote from Francis Bacon should whet your appetite about wine and its significance.

"Age appears best in four things: old wood to burn, old wine to drink, old friends to trust, and old authors to read." - Francis Bacon

Regardless of whether you like and consume wine or not, it will be interesting to analyze the physicochemical attributes of wine and understand their relationships with, and significance for, wine quality and type. Since we will be trying to predict wine types and quality, the supervised Machine Learning task involved here is classification. In this chapter, we look at various ways to analyze and visualize wine data attributes and features, covering univariate as well as multivariate analyses. For predicting wine types and quality, we will build classifiers based on state-of-the-art supervised Machine Learning techniques, including logistic regression, deep neural networks, decision trees, and ensemble models like random forests and gradient boosting, to name a few. Special emphasis is placed on analyzing, visualizing, and modeling data so that you can apply similar principles to your own classification-based real-world problems in the future.

We would like to thank the UC Irvine ML repository for the dataset. A special mention also goes to DataCamp and Karlijn Willems, notable Data Science journalist, who has done some excellent work in analyzing the wine quality dataset and has written an article on her findings at https://www.datacamp.com/community/tutorials/deep-learning-python, which you can check out for more details. We have taken a couple of analyses and explanations from this article as inspiration for our chapter, and Karlijn has been more than helpful in sharing it with us.

Problem Statement

"Given a dataset, or in this case two datasets that deal with physicochemical properties of wine, can you guess the wine type and quality?" This is the main objective of this chapter. Of course, this doesn't mean the entire focus will be only on leveraging Machine Learning to build predictive models. We will process, analyze, visualize, and model our dataset based on standard Machine Learning and data mining workflow models like the CRISP-DM model.
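To make the problem statement concrete, here is a minimal sketch of the wine-type task in plain Python. It is not the chapter's code: the two "datasets" are small, randomly generated stand-ins with two made-up standardized features (not the real UCI data), and the helper names (`make_synthetic_wines`, `fit_logistic`, `predict`, `accuracy`) are hypothetical. It only illustrates the general idea of combining two datasets, labeling each row with its type, and fitting a logistic regression classifier by gradient descent.

```python
import math
import random


def make_synthetic_wines(n_per_type=50, seed=0):
    """Build two made-up datasets and combine them with a type label.

    ASSUMPTION: these rows are synthetic placeholders, NOT real wine data.
    Label 0 stands in for one wine type, label 1 for the other.
    """
    rng = random.Random(seed)
    red_rows = [[rng.gauss(-1.0, 0.5), rng.gauss(-1.0, 0.5)] for _ in range(n_per_type)]
    white_rows = [[rng.gauss(1.0, 0.5), rng.gauss(1.0, 0.5)] for _ in range(n_per_type)]
    rows = red_rows + white_rows
    labels = [0] * len(red_rows) + [1] * len(white_rows)
    return rows, labels


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(rows, labels, lr=0.1, epochs=200):
    """Fit weights and bias with batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # Prediction error for this row: p(type = 1 | x) - y
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return the predicted type label (0 or 1) for one feature row."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0


def accuracy(weights, bias, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    correct = sum(predict(weights, bias, x) == y for x, y in zip(rows, labels))
    return correct / len(rows)


rows, labels = make_synthetic_wines()
weights, bias = fit_logistic(rows, labels)
train_accuracy = accuracy(weights, bias, rows, labels)
print(f"training accuracy on synthetic data: {train_accuracy:.2f}")
```

With real data, the rows would instead hold the physicochemical attributes of each wine, and a held-out test set would be needed before reading anything into the accuracy figure.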

Citation (APA)

Sarkar, D., Bali, R., & Sharma, T. (2018). Analyzing Wine Types and Quality. In Practical Machine Learning with Python (pp. 407–446). Apress. https://doi.org/10.1007/978-1-4842-3207-1_9
