Successful Data Science Projects: Lessons Learned from Kaggle Competition

Mohammed Zuhair Al-Taie; Naomie Salim; Adekunle Isiaka Obasa

Journal Article, Open Access

Kurdistan Journal of Applied Research (2017) 2(3) 40-49

DOI: 10.24017/science.2017.3.18

Abstract

In a data science project, the workflow from data understanding to deployment of an analytical model begins with framing the problem at hand, a task that is typically business-oriented and requires human-to-human interaction. The next three steps in the pipeline, however, are the key to a successful project: data understanding, feature extraction, and model building. Failing to fully understand the requirements of each of these three steps can degrade the performance of the resulting system. The current study therefore asks: “What are the requirements of a successful data science project?” To answer this question, we use the solution that we built to measure the relevance of local search results for small online e-businesses and submitted to the Kaggle data science platform, and we examine why it did not achieve a top position among the competitors. We evaluate the design that we submitted in light of the three winning submissions. Our results show that thorough data preprocessing, well-defined features, and model ensembling are critical for building successful data science projects. This analysis offers insight into specific aspects of model design that can help others, including Kagglers, avoid common mistakes in their own data science projects.
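The three factors named in the abstract (data preprocessing, feature definition, and model ensembling) can be illustrated with a small sketch. The abstract does not describe the authors' submission or the winning solutions in any detail, so everything below is hypothetical: the function names, the two overlap features, and the plain averaging of two toy scorers are assumptions chosen only to show how the stages fit together for a search-relevance task.

```python
import re
from statistics import mean


def preprocess(text):
    """Preprocessing stage: lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_features(query, title):
    """Feature stage: two hand-made overlap features (hypothetical choices)."""
    q, t = set(preprocess(query)), set(preprocess(title))
    if not q or not t:
        return {"coverage": 0.0, "jaccard": 0.0}
    shared = q & t
    return {
        "coverage": len(shared) / len(q),     # share of query terms found in the title
        "jaccard": len(shared) / len(q | t),  # overlap relative to all distinct terms
    }


def model_coverage(features):
    """Toy scorer 1: relevance equals query-term coverage."""
    return features["coverage"]


def model_jaccard(features):
    """Toy scorer 2: relevance equals Jaccard similarity."""
    return features["jaccard"]


def ensemble_score(query, title, models=(model_coverage, model_jaccard)):
    """Ensembling stage: average the predictions of the individual scorers."""
    features = extract_features(query, title)
    return mean(model(features) for model in models)


if __name__ == "__main__":
    print(ensemble_score("LED light bulb", "Philips LED Light Bulb, 60W"))
```

In a real competition entry each scorer would be a trained model and the features would be far richer. The sketch only shows the shape of the pipeline: clean the text once, derive features from the cleaned text, and combine several models rather than relying on one.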

Citation (APA)

Al-Taie, M. Z., Salim, N., & Obasa, A. I. (2017). Successful Data Science Projects: Lessons Learned from Kaggle Competition. Kurdistan Journal of Applied Research, 2(3), 40–49. https://doi.org/10.24017/science.2017.3.18
