Artificial intelligence and machine learning: Practical aspects of overfitting and regularization

Daniel Vasicek

Journal Article, Open Access

Information Services and Use (2020) 39(4) 281-289

DOI: 10.3233/ISU-190059

Abstract

Neural networks can be used to fit complex models to high-dimensional data. High dimensionality often obscures the fact that a model overfits the data. It arises frequently in the publication industry because we are usually interested in a large number of concepts; for example, a moderately sized thesaurus contains thousands of concepts. In addition, the discovery of ideas, sentiments, tendencies, and context requires that our modelling algorithms be aware of many different features, such as the words themselves, the lengths of sentences and paragraphs, word frequency counts, phrases, punctuation, the number of references, and links. Overfitting can be counterbalanced by regularization, but regularization can also cause problems. This paper attempts to clarify the concepts of 'overfitting' and 'regularization' using two-dimensional graphs that demonstrate overfitting and show how regularization can force a smoother fit to noisy data.
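
The contrast the abstract describes, an overfitted curve versus a smoother regularized fit to noisy two-dimensional data, can be illustrated with a minimal sketch. The sketch makes several assumptions that are not taken from the paper: it uses a polynomial model instead of a neural network, L2 (ridge) regularization as the penalty, a noisy sine curve as the data, and hypothetical names such as `fit_ridge` and `lam`. It is meant only to show the general effect.

```python
import numpy as np

def design_matrix(x, degree):
    """Polynomial features: columns are x**0, x**1, ..., x**degree."""
    return np.vander(x, degree + 1, increasing=True)

def fit_ridge(x, y, degree, lam):
    """Minimise ||A w - y||^2 + lam * ||w||^2 (lam = 0 is plain least squares).

    The penalty is folded into an augmented least-squares problem, which
    avoids forming A.T @ A explicitly.
    """
    A = design_matrix(x, degree)
    n_coef = A.shape[1]
    A_aug = np.vstack([A, np.sqrt(lam) * np.eye(n_coef)])
    y_aug = np.concatenate([y, np.zeros(n_coef)])
    w, *_ = np.linalg.lstsq(A_aug, y_aug, rcond=None)
    return w

def rmse(x, y, w):
    """Root-mean-square error of the polynomial with coefficients w."""
    pred = design_matrix(x, len(w) - 1) @ w
    return float(np.sqrt(np.mean((pred - y) ** 2)))

# Assumed toy data: 12 noisy samples of a sine curve (not from the paper).
rng = np.random.default_rng(0)
x_train = np.linspace(-1.0, 1.0, 12)
y_train = np.sin(np.pi * x_train) + rng.normal(scale=0.3, size=x_train.size)

# Noise-free curve on a dense grid, used to judge how well each fit generalises.
x_test = np.linspace(-1.0, 1.0, 200)
y_test = np.sin(np.pi * x_test)

degree = 11  # 12 coefficients for 12 points: the unregularized fit hits every noisy point
w_plain = fit_ridge(x_train, y_train, degree, lam=0.0)
w_ridge = fit_ridge(x_train, y_train, degree, lam=1e-2)

for name, w in [("no regularization", w_plain), ("ridge, lam=1e-2", w_ridge)]:
    print(f"{name:>18}: train RMSE={rmse(x_train, y_train, w):.3g}  "
          f"test RMSE={rmse(x_test, y_test, w):.3g}  "
          f"||w||={np.linalg.norm(w):.3g}")
```

The unregularized fit drives the training error to essentially zero by passing through every noisy point, while the penalized fit accepts a larger training error in exchange for smaller coefficients and a smoother curve. A `lam` that is too large would flatten the curve and underfit, which is one way regularization "can also cause problems".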

Citation (APA)

Vasicek, D. (2020). Artificial intelligence and machine learning: Practical aspects of overfitting and regularization. Information Services and Use, 39(4), 281–289. https://doi.org/10.3233/ISU-190059

Cited by (Scopus)

Adaptive threshold optimisation for online feature selection using dynamic particle swarm optimisation in determining feature relevancy and redundancy

Coarse-Grained Neural Network Model of the Basal Ganglia to Simulate Reinforcement Learning Tasks

Identification of cotton leaf damage using soft computational techniques
