Proceedings of the 1st international workshop on Contextualized attention metadata: collecting, managing and exploiting of rich usage information
It is our great pleasure to welcome you to CAMA06, the "1st International ACM Workshop on Contextualized Attention Metadata: Collecting, Managing and Exploiting of Rich Usage Information".

The focus of the workshop is the capture and analysis of user behavior, so as to enable more effective and efficient support for the tasks and activities that the user is engaged in. This research has direct implications in the real world: information systems that address users' needs in a more direct and transparent way are more valuable. Indeed, in some contexts this value translates into direct monetary metrics, e.g. revenue from targeted advertisements. In other contexts the monetary gain may be less direct, as in the case of fundamental research support or community-strengthening initiatives.

In order to increase the relevance of information and services further, usage data are increasingly used for "mass customization". Thus, the value and pertinence of data about the user and her interactions with systems and applications become increasingly obvious. As an example, tracking user interaction with and across websites is a major area for global services (including free ones like Google Analytics). Furthermore, several organizations have set out to collect data about user interactions across websites, e.g. AttentionTrust. Google's GMail illustrates how this kind of information can be used for targeted advertisements based on continuous analysis of incoming and outgoing email traffic.

The main concern in all these efforts is capturing and analyzing the interactions of users, trying to figure out why they do what they do when they do it. The data captured, called attention metadata, typically include some information about the user, as well as about her interactions with an application (understood in a broad sense here).
One issue is the ability to gather such data across different systems, which requires some form of user identification, with the associated concerns around security and privacy. The interactions depend on the type of tool, and can include following links (e.g. within a browser), activating user interface widgets (e.g. in a closed information system like SAP), following usage patterns (e.g. at specific websites), and submitting search terms and selecting results (e.g. in a search engine session).

There are two general approaches to capturing attention metadata: the data are captured either intrusively, by asking the user to complete (online) questionnaires, or through direct observation, which the user may or may not be aware of. Intrusive methods are generally less reliable. Data gathered by direct observation are more likely to be "true", as the user is less aware of the observation. More importantly, direct observation does not interfere with the primary goal of the user: after all, she is using an application to pursue her own interest, and any overhead involved in capturing what she is trying to do, why and how is just that: overhead that should be avoided if at all possible. Sources for attention metadata are log files of servers and applications (e.g. browsers, email clients, chat tools, office tools, etc.), which can be obtained without disturbing the user. In order to draw on the benefits of both methods, several systems employ a hybrid method using elements of observation and enquiry.

The analysis of attention metadata can focus either on the user (or group of users) or on the application. Examples of the outcome of such analysis are statistics about the information, e.g. the number of times it has been read or downloaded, or about the user, e.g. the information she read, downloaded, exchanged with friends, etc. The most advanced analysis allows for some conclusions on the user's goals, e.g.
the Personal Reader (http://www.personal-reader.de/), which tries to provide additional information for a webpage being read by a user. Such information can be fed back to the information system, in order to enable more targeted services.

In general, the collection and analysis methods developed so far have some shortcomings. Attention metadata is mostly captured in one application only. The formats used to represent and store attention are highly heterogeneous, which makes combining such information across several applications tedious and time-consuming. Furthermore, there are no standards for exchanging attention information across system boundaries, e.g. between the desktop and the service provider.

The combination of attention metadata from various sources provides a more complete view of user interactions. For example, such a combination makes it possible to capture the whole information flow on the user's desktop, from obtaining information, through processing activities like storing, reading, distributing and deleting, to the repurposing of the information in newly authored documents. We call the combination of attention metadata from heterogeneous sources the contextualized attention metadata of users. Contextualization of attention metadata refers to capturing data about the context in which a certain activity is carried out. For example, while writing a scientific paper, the author carries out certain tasks, e.g. writing the document, searching for information on the web and on his desktop, sharing his thoughts about the paper with his co-authors via chat and email, drawing related graphs, etc. Capturing all these activities requires a scheme that includes all the various types of user and activity metadata as well as their relations; hence contextualized attention metadata.

The recently introduced AttentionXML specification sets out to capture user interactions in terms of interoperable attention metadata.
The core of AttentionXML is a schema that describes metadata on how people use information, e.g. what people read, comment on or listen to. Typical applications of AttentionXML focus on the attention that users spend on web pages, news feeds and blogs.

Important as the AttentionXML specification is in bootstrapping an "attention ecology", we believe that it is somewhat limited in the scope of attention metadata that it includes. For example, the schema does not allow capturing information about downloading, viewing or editing documents, or about the applications and contexts in which objects were used. That is why we have proposed an extension to AttentionXML that allows us to significantly broaden the usage scenarios of attention metadata. We call this new schema the contextualized attention metadata schema - CAMs. Now that the schema is in place, various tools are emerging to capture attention metadata, and the focus is shifting to the analysis of collected attention metadata. Apart from the technical problems of dealing with large amounts of data, the open questions concern user identification, classification of users and activities, deep mining for behavioral patterns, etc. Another issue that urgently needs to be addressed better is security and privacy. Attention metadata includes highly personal information, so secure solutions have to be provided to ensure the integrity and privacy of the data, e.g. by enabling the user to have full control over his attention metadata at all times. Last but not least, the new possibilities arising from the use of attention metadata have to be integrated into existing systems to enhance their capabilities, e.g. in terms of highly targeted and personalized information provision.

The CAMA 2006 workshop addresses the issues outlined above in a variety of presentations. As an opening, Steve Gillmor will elaborate on the future of attention metadata in commercial settings.
The overview of the usage of attention metadata in real-world systems will include a brief discussion of the socio-technological perspective of attention metadata, outlining some of the more social problems like data ownership. Furthermore, we will discuss experiences with capturing and analyzing attention metadata, e.g. in a library environment, in support of the corporate decision-making process and in learning scenarios. Last but not least, we discuss several issues in real-world settings focusing on the capturing process, such as how an attention recorder allows capturing highly detailed attention metadata or how the user's file system can be mined for attention metadata.