Dirty data exist in many systems, and efficient and effective management of such data is in demand. Because data cleaning may cause the loss of useful data and introduce new dirty data, we attempt to manage dirty data without cleaning them and to retrieve query results according to the quality requirements of users. Since the entity is the unit for understanding objects in the world, and a large portion of dirty data are caused by different descriptions of the same real-world entity, we propose EntityManager, a dirty data management system that uses the entity as its basic unit and keeps conflicts in the data as uncertain attributes. Although the query language is SQL, queries in our system have different semantics on dirty data. In the demonstration, we will show a new philosophy for managing dirty data around entities. We will present our prototype, which allows users to load dirty data and to query them according to their requirements. © Springer-Verlag 2013.
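To make the entity-centric idea concrete, the following is a minimal Python sketch of keeping conflicting values as uncertain attributes and filtering by a user-supplied quality threshold. It is not the EntityManager implementation: the function names `build_entities` and `select`, the `id` key, the sample records, and the use of relative frequency as a confidence score are all assumptions made for illustration, and the sketch assumes records have already been matched to entities.

```python
from collections import Counter, defaultdict


def build_entities(records, key):
    """Group records by an entity key (hypothetical) and keep conflicting
    values of each attribute as a mapping value -> relative frequency."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec[key]].append(rec)

    entities = {}
    for eid, recs in grouped.items():
        attrs = {}
        names = {name for rec in recs for name in rec if name != key}
        for name in names:
            values = [rec[name] for rec in recs if name in rec]
            counts = Counter(values)
            # Assumption: confidence is the share of records with this value.
            attrs[name] = {v: c / len(values) for v, c in counts.items()}
        entities[eid] = attrs
    return entities


def select(entities, attr, value, min_confidence):
    """Return ids of entities whose attribute equals `value` with at
    least `min_confidence` (the user's quality requirement)."""
    return sorted(
        eid
        for eid, attrs in entities.items()
        if attrs.get(attr, {}).get(value, 0.0) >= min_confidence
    )


# Illustrative dirty records: three descriptions of e1 disagree on "city".
records = [
    {"id": "e1", "city": "Harbin"},
    {"id": "e1", "city": "Harbin"},
    {"id": "e1", "city": "Haerbin"},
    {"id": "e2", "city": "Beijing"},
]
entities = build_entities(records, key="id")

print(select(entities, "city", "Harbin", 0.6))  # e1 qualifies (2 of 3 records)
print(select(entities, "city", "Harbin", 0.9))  # no entity is confident enough
```

In this sketch the same selection returns different answers depending on the quality threshold, which is one simple way to read the statement that queries have different semantics on dirty data.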
Wang, H., Liu, X., Li, J., Tong, X., Yang, L., & Li, Y. (2013). Entitymanager: An entity-based dirty data management system. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7826 LNCS, pp. 468–471). https://doi.org/10.1007/978-3-642-37450-0_38