Collective mind: Towards practical and collaborative auto-tuning

Citations of this article: 16 · Mendeley readers: 68 · This article is free to access.

Abstract

Empirical auto-tuning and machine learning techniques have shown high potential to improve execution time, power consumption, code size, reliability and other important metrics of various applications for more than two decades. However, they are still far from widespread production use due to the lack of native support for auto-tuning in an ever-changing and complex software and hardware stack, large and multi-dimensional optimization spaces, excessively long exploration times, and the lack of unified mechanisms for preserving and sharing optimization knowledge and research material. We present a possible collaborative approach to solving the above problems using the Collective Mind knowledge management system. In contrast with the previous cTuning framework, this modular infrastructure makes it possible to preserve and share over the Internet whole auto-tuning setups with all related artifacts and their software and hardware dependencies, rather than just performance data. It also allows users to gradually structure, systematize and describe all available research material, including tools, benchmarks, data sets, search strategies and machine learning models. Researchers can take advantage of shared components and data with extensible meta-description to quickly and collaboratively validate and improve existing auto-tuning and benchmarking techniques or prototype new ones. The community can now gradually learn and improve the complex behavior of all existing computer systems while exposing behavior anomalies or model mispredictions to an interdisciplinary community in a reproducible way for further analysis. We present several practical, collaborative and model-driven auto-tuning scenarios. We also release all material at c-mind.org/repo to set an example of collaborative and reproducible research, as well as of our new publication model in computer engineering where experimental results are continuously shared and validated by the community.
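The abstract describes cM modules as tool wrappers that exchange unified, dictionary-based inputs and outputs with gradually exposed meta-description (see also Fig. 4 below). The following minimal Python sketch only illustrates that idea under assumed names; it is not the actual Collective Mind API, and the action name, dictionary keys and compiler invocation are hypothetical.

    # Illustrative sketch of a dictionary-based module action in the spirit of
    # cM tool wrappers. All names and keys here are hypothetical; this is NOT
    # the real Collective Mind API.
    import json
    import os
    import subprocess
    import tempfile
    import time

    def compile_program(i):
        """Action with a unified dictionary input 'i' and a dictionary output.

        Assumed keys in 'i':
          'source'   - path to a C source file
          'compiler' - compiler executable (default 'gcc')
          'flags'    - list of optimization flags to evaluate
        """
        binary = os.path.join(tempfile.mkdtemp(), 'a.out')
        cmd = [i.get('compiler', 'gcc')] + i.get('flags', []) + [i['source'], '-o', binary]

        start = time.time()
        r = subprocess.run(cmd, capture_output=True, text=True)
        elapsed = time.time() - start

        if r.returncode != 0:
            # Unified failure convention: non-zero 'return' plus an 'error' string.
            return {'return': 1, 'error': r.stderr}

        # Gradually exposed characteristics and meta-description of the setup,
        # kept together in one dictionary so it can be stored and shared as JSON.
        return {'return': 0,
                'characteristics': {'compile_time_s': elapsed,
                                    'binary_size_bytes': os.path.getsize(binary)},
                'meta': {'compiler': cmd[0], 'flags': i.get('flags', [])},
                'binary': binary}

    if __name__ == '__main__':
        print(json.dumps(compile_program({'source': 'test.c', 'flags': ['-O3']}), indent=2))

Keeping every input and output in one flat, JSON-serializable dictionary is what allows such wrappers to be chained into pipelines and their results to be archived and shared.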

Figures

  • Fig. 1. Rising number of optimization dimensions in GCC over the past 12 years (Boolean or parametric flags). Obtained by automatically parsing GCC manual pages, so a small variation is possible (script kindly shared by Yuriy Kashnikov).
  • Fig. 2. Number of distinct combinations of compiler optimizations for GCC 4.7.2 with a maximum achievable execution-time speedup over the -O3 optimization level on an Intel Xeon E5520 platform across 285 shared Collective Mind benchmarks after 5000 random iterations (top graph), together with the number of benchmarks where these combinations achieve more than 10% speedup (bottom graph).
  • Fig. 3. Execution time of a matrix–matrix multiply kernel when executed on CPU (Intel E6600) and on GPU (NVIDIA 8600 GTS) depending on the size N of the square matrix, as a motivation for online tuning and adaptive scheduling on heterogeneous architectures [46].
  • Fig. 4. Converting (a) continuously evolving, ad-hoc, hardwired and difficult-to-maintain experimental setups into (b) interconnected cM modules (tool wrappers) with unified, dictionary-based inputs and outputs, data meta-description, and gradually exposed characteristics, tuning choices, features and system state.
  • Fig. 5. Gradually categorizing all available user artifacts using cM modules while making them searchable through meta-description and reusable through unified cM module actions. All material from this paper is shared through the Collective Mind online live repository at c-mind.org/browse and c-mind.org/github-code-source.
  • Fig. 6. Unified build and run cM pipeline implemented as chained cM modules.
  • Fig. 7. Gradual and collaborative top-down decomposition of computer system software and hardware using cM modules (wrappers), similar to the methodology in physics. First, coarse-grain design and optimization choices and features are exposed and tuned; later, more fine-grain choices are exposed depending on the available tuning time budget and expected return on investment.
  • Fig. 8. Compiler flag auto-tuning to improve execution time and code size of a shared image corner detection program with a fixed data set on a Samsung Galaxy Series mobile phone using cM for Android. Highlighted points represent the frontier of optimal solutions as well as GCC with -O3 and -Os optimization flags versus LLVM with -O3 flag (c-mind.org/interactive-graph-demo); an illustrative sketch of such a random flag search and Pareto filter is shown after this list.

(All figure captions note that colors are visible in the online version of the article; http://dx.doi.org/10.3233/SPR-140396.)
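Figures 2 and 8 refer to randomly exploring combinations of compiler optimization flags and to keeping a frontier of optimal solutions that trade execution time against code size. The sketch below shows one possible way to implement such a random search with a Pareto filter; the flag pool, benchmark source file and iteration count are assumptions for illustration only and do not reproduce the setup evaluated in the paper.

    # Hedged sketch: random compiler-flag search with a Pareto filter over
    # (execution time, binary size). Flag pool and benchmark are assumptions.
    import os
    import random
    import subprocess
    import tempfile
    import time

    FLAG_POOL = ['-funroll-loops', '-ftree-vectorize', '-fomit-frame-pointer',
                 '-ffast-math', '-fno-inline']

    def evaluate(source, flags, repetitions=3):
        """Compile 'source' with '-O3' plus 'flags'; return (best_time_s, size_bytes)."""
        binary = os.path.join(tempfile.mkdtemp(), 'a.out')
        cmd = ['gcc', '-O3'] + list(flags) + [source, '-o', binary]
        if subprocess.run(cmd, capture_output=True).returncode != 0:
            return None
        best = float('inf')
        for _ in range(repetitions):
            start = time.time()
            subprocess.run([binary], capture_output=True)
            best = min(best, time.time() - start)
        return best, os.path.getsize(binary)

    def pareto_frontier(points):
        """Keep points not dominated in both execution time and code size."""
        frontier = []
        for flags, (t, s) in points:
            dominated = any(t2 <= t and s2 <= s and (t2, s2) != (t, s)
                            for _, (t2, s2) in points)
            if not dominated:
                frontier.append((flags, (t, s)))
        return frontier

    if __name__ == '__main__':
        results = []
        for _ in range(50):  # the experiments in the paper use far more iterations
            flags = tuple(random.sample(FLAG_POOL, k=random.randint(0, len(FLAG_POOL))))
            r = evaluate('benchmark.c', flags)  # hypothetical benchmark source
            if r:
                results.append((flags, r))
        for flags, (t, s) in pareto_frontier(results):
            print(f'{t:.4f} s  {s} B  {" ".join(flags) or "(baseline -O3)"}')

In a Collective Mind-style setup, each evaluated point would additionally be stored with its full meta-description (compiler version, platform, data set) so that the search can be replayed, validated and extended by others.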



Citation (APA)

Fursin, G., Miceli, R., Lokhmotov, A., Gerndt, M., Baboulin, M., Malony, A. D., … Del Vento, D. (2014). Collective mind: Towards practical and collaborative auto-tuning. Scientific Programming, 22(4), 309–329. https://doi.org/10.1155/2014/797348


Readers' Seniority

PhD / Post grad / Masters / Doc: 35 (74%)
Researcher: 6 (13%)
Lecturer / Post doc: 4 (9%)
Professor / Associate Prof.: 2 (4%)

Readers' Discipline

Computer Science: 29 (62%)
Engineering: 8 (17%)
Business, Management and Accounting: 8 (17%)
Social Sciences: 2 (4%)
