The past year has seen a steep rise in generative AI systems that claim to be open. But how open are they really? The question of what counts as open source in generative AI is poised to take on particular importance in light of the upcoming EU AI Act that regulates open source systems differently, creating an urgent need for practical openness assessment. Here we use an evidence-based framework that distinguishes 14 dimensions of openness, from training datasets to scientific and technical documentation and from licensing to access methods. Surveying over 45 generative AI systems (both text and text-to-image), we find that while the term open source is widely used, many models are 'open weight' at best and many providers seek to evade scientific, legal and regulatory scrutiny by withholding information on training and fine-tuning data. We argue that openness in generative AI is necessarily composite (consisting of multiple elements) and gradient (coming in degrees), and point out the risk of relying on single features like access or licensing to declare models open or not. Evidence-based openness assessment can help foster a generative AI landscape in which models can be effectively regulated, model providers can be held accountable, scientists can scrutinise generative AI, and end users can make informed decisions.
CITATION STYLE
Liesenfeld, A., & Dingemanse, M. (2024). Rethinking open source generative AI: open washing and the EU AI Act. In 2024 ACM Conference on Fairness, Accountability, and Transparency, FAccT 2024 (pp. 1774–1787). Association for Computing Machinery, Inc. https://doi.org/10.1145/3630106.3659005
Mendeley helps you to discover research relevant for your work.