Using Pyspark Environment for Solving a Big Data Problem: Searching for Supersymmetric Particles

Abstract

Supersymmetry theory predicts that every particle in the Standard Model has a superpartner particle with a different mass. Classifying supersymmetric particles in high-energy physics is a major challenge for physicists. This paper addresses this big data classification problem using the Apache Spark environment with the "MLlib" library. It explores the performance of machine learning methods on large data, namely the "SUSY" dataset obtained from the UCI Machine Learning Repository. Performance is measured using three metrics: accuracy, area under the curve (AUC), and training computation time (CT). The results are promising: the Gradient Boosted Tree (GBT) classifier achieves a high accuracy score (79%), while the Logistic Regression (LR) algorithm achieves a good AUC score (86%).
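The three metrics named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the paper relies on Spark's MLlib, while the helper names (`accuracy`, `auc`, `timed`), the 0.5 decision threshold, and the toy labels and scores below are assumptions made only to show how each quantity is defined.

```python
import time


def accuracy(labels, predictions):
    """Fraction of predictions that match the true 0/1 labels."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a
    randomly chosen positive example gets a higher score than a randomly
    chosen negative one (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds). Wrapping a
    training call this way gives a computation time (CT) measurement."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Hypothetical toy data, not taken from the SUSY dataset.
labels = [1, 0, 1, 1, 0, 0]            # 1 = signal, 0 = background
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]  # classifier scores
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

print("accuracy:", accuracy(labels, predictions))
print("AUC:", auc(labels, scores))
_, elapsed = timed(sorted, scores)     # stand-in for a training call
print("elapsed seconds:", elapsed)
```

Accuracy depends on the chosen decision threshold, whereas AUC depends only on how the scores rank positives against negatives, which is one reason two classifiers can rank differently under the two metrics. The pairwise AUC loop is quadratic and only suitable for small examples; on data at the scale of SUSY, a distributed evaluator would be used instead.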

Cite

Azhari, M. … Dakkon, M. (2020). Using Pyspark Environment for Solving a Big Data Problem: Searching for Supersymmetric Particles. International Journal of Innovative Technology and Exploring Engineering, 9(7), 541–546. https://doi.org/10.35940/ijitee.g5308.059720
