Empowering OCL research: a large-scale corpus of open-source data from GitHub

Abstract

Model-driven engineering (MDE) raises the level of abstraction in software and system design. In particular, metamodels become a central artifact in the process and are supported by various other artifacts, such as editors and transformations. To define constraints, invariants, and queries on model-driven artifacts, a generic language has been developed: the Object Constraint Language (OCL). In the literature, many studies of OCL have been performed on small collections of data, mostly originating from a single source (e.g., OMG standards). As such, generalization of results beyond the data studied is often mentioned as a threat to validity. Creation of a benchmark dataset has already been identified as a key enabler to address this threat. To facilitate further empirical studies in the field of OCL, we present the first large-scale dataset of 103,262 OCL expressions, systematically extracted from 671 GitHub repositories. In particular, the expressions were extracted from various types of files (among others, metamodels and model-to-text transformations). In this work we showcase a variety of studies performed using our dataset and describe several other types of studies that could be performed. We extend previous work with data and experiments regarding OCL in model-to-text (mtl) transformations.

Cite

Mengerink, J. G. M., Noten, J., & Serebrenik, A. (2019). Empowering OCL research: a large-scale corpus of open-source data from GitHub. Empirical Software Engineering, 24(3), 1574–1609. https://doi.org/10.1007/s10664-018-9641-6
