Exploratory Data Analysis and ETL with SAS on Hadoop Ecosystem with Cervical Cancer Dataset

Abstract

Objective: The main objective of this project is to explore and analyse a secondary dataset collected at the “Hospital Universitario de Caracas” in Caracas, Venezuela. Methods: The dataset comprises records for 858 patients, covering demographic information and medical history. Many fields were left blank, possibly because patients declined to answer out of privacy considerations. SAS Studio was used for data exploration and data pre-processing. Data cleaning and data transformation were carried out based on the knowledge gathered during data exploration. Afterwards, the dataset was exported from SAS Studio and uploaded to the Hortonworks Hadoop platform for analysis. Lastly, five hypotheses were explored using the Tableau visualization tool.
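As an illustration of the missing-value exploration step described in the abstract, here is a minimal Python sketch that reports the share of blank fields per column. It is not the authors' SAS Studio workflow. The column names, the sample rows, and the set of missing-value markers are all hypothetical and chosen only for demonstration.

```python
import csv
import io

# Hypothetical missing-value markers; the abstract only says fields were left blank.
MISSING_MARKERS = {"", "?", "NA"}


def missing_share(rows, markers=MISSING_MARKERS):
    """Return {column: fraction of rows whose value counts as missing}."""
    rows = list(rows)
    if not rows:
        return {}
    counts = {col: 0 for col in rows[0]}
    for row in rows:
        for col in counts:
            value = row.get(col)
            if value is None or value.strip() in markers:
                counts[col] += 1
    return {col: counts[col] / len(rows) for col in counts}


# Tiny made-up sample, not taken from the real dataset.
SAMPLE = """age,num_pregnancies,smokes
18,1,0
34,,1
52,4,
46,,0
"""

report = missing_share(csv.DictReader(io.StringIO(SAMPLE)))

# List columns from most to least incomplete.
for column, share in sorted(report.items(), key=lambda kv: -kv[1]):
    print(f"{column}: {share:.0%} missing")
```

A per-column report like this could inform which fields to impute and which to drop before the data is exported for downstream analysis.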

Cite

Xiaotian, C., Thiruchelvam, V., & Vistro, D. M. (2020). Exploratory data analysis and ETL with SAS on Hadoop eco-system with cervical cancer dataset. International Journal of Current Research and Review, 12(19), 88–104. https://doi.org/10.31782/IJCRR.2020.121924
