Building Cantonese Dictionaries Using Crowdsourcing Strategies: The words.hk Project

  • Lau C

Abstract

The words.hk project is the first attempt to build a Cantonese-to-Cantonese dictionary using a lean start-up model (see Ries, The lean startup: How today’s entrepreneurs use continuous innovation to create radically successful businesses. New York: Crown Business, 2011) combined with crowdsourcing strategies. The goal is to produce a comprehensive dictionary written for Cantonese and in Cantonese. Existing resources are often (1) not available electronically, (2) out of date, or (3) too Anglo- or Sino-centric. Building large data sets from these existing resources requires a great deal of editing and ‘data-janitorial’ work, which a large group of less-experienced people can do far better than a handful of experts, so crowdsourcing strategies are particularly appropriate in these cases. We started with a small team of editors and software developers in 2014. In less than 3 years, we grew into an organisation with over 400 volunteers and gathered over 42,000 entries, of which more than 36,000 had been edited with Written Cantonese descriptions, examples, and translations as of June 2017. Given the nature of the project and the composition of its membership (a language with no authority to fall back on, and most members with no formal linguistics or lexicographical training), we adhere to two simple principles in order to keep the dictionary growing without introducing major issues in the core data: ‘usage over etymology’ and ‘decision problem avoidance’. I will discuss how these principles have shaped the architecture of the project and the editing workflow, as well as the technological difficulties that we face.

Cite

Lau, C. (2019). Building Cantonese Dictionaries Using Crowdsourcing Strategies: The words.hk Project (pp. 89–107). https://doi.org/10.1007/978-981-13-1277-9_6
