Software tests require data that is realistic but not real. For example, banking applications cannot be tested with actual customer names and addresses. In these situations, developers rely on fake data generators, also known as fakers, to produce test data for automated tests. Fakers exist in all programming languages. For example, the faker gem and java-faker are popular faking libraries for Ruby and Java, respectively. Faking libraries usually include generators for names, phone numbers, and addresses. Developing test data generators is challenging because they must satisfy several constraints. For example, name generators must reflect the cultural sphere in which the system under test is deployed. In many Spanish-speaking countries, a family name generator must output two names separated by a space. Another constraint relates to humor, as fakers have been proven to be a strong vector of healthy humor for bonding software development teams [1]. For an English-speaking developer, character names from Star Trek or Seinfeld are more exciting test data than John Doe, and faking libraries support such data. Hence, the most advanced faking libraries contain data generators for specific languages, idioms, and cultures. These faking libraries are under constant evolution to stay in tune with testing constraints and the testing culture of the time.
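The Spanish family name constraint can be illustrated with a minimal sketch. It does not reproduce the API of the faker gem, java-faker, or any other library: the surname pool `SPANISH_SURNAMES` and the function `fake_family_name` are hypothetical names introduced here, and the pool is deliberately tiny.

```python
import random

# Hypothetical, deliberately small pool of common Spanish surnames.
# A real faking library would ship a much larger, locale-specific list.
SPANISH_SURNAMES = ["García", "Fernández", "López", "Martínez", "Rodríguez", "Sánchez"]


def fake_family_name(rng: random.Random) -> str:
    """Return two surnames separated by a single space."""
    # The two surnames are drawn independently, so they may coincide,
    # as they sometimes do for real people.
    first = rng.choice(SPANISH_SURNAMES)
    second = rng.choice(SPANISH_SURNAMES)
    return f"{first} {second}"


# A seeded generator keeps the fake data reproducible across test runs.
rng = random.Random(42)
print(fake_family_name(rng))
```

Passing the random generator in explicitly, rather than relying on global state, is one way to keep generated test data deterministic when a test needs to be replayed.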
Baudry, B., Etemadi, K., Fang, S., Gamage, Y., Liu, Y., Liu, Y., … Tiwari, D. (2024). Generative AI to Generate Test Data Generators. IEEE Software. https://doi.org/10.1109/MS.2024.3418570