ABSTRACT

Title of dissertation: COMPUTING AND APPLYING TRUST IN WEB-BASED SOCIAL NETWORKS
Jennifer Ann Golbeck, Doctor of Philosophy, 2005
Dissertation directed by: Professor James Hendler, Department of Computer Science

The proliferation of web-based social networks has led to new innovations in social networking, particularly by allowing users to describe their relationships beyond a basic connection. In this dissertation, I look specifically at trust in web-based social networks, how it can be computed, and how it can be used in applications. I begin with a definition of trust and a description of several properties that affect how it is used in algorithms. This is complemented by a survey of web-based social networks to gain an understanding of their scope, the types of relationship information available, and the current state of trust.

The computational problem of trust is to determine how much one person in the network should trust another person to whom they are not connected. I present two sets of algorithms for calculating these trust inferences: one for networks with binary trust ratings, and one for continuous ratings. For each rating scheme, the algorithms are built upon the defined notions of trust. Each is then analyzed theoretically and with respect to simulated and actual trust networks to determine how accurately they calculate the opinions of people in the system. I show that in both rating schemes the algorithms presented can be expected to be quite accurate.

These calculations are then put to use in two applications. FilmTrust is a website that combines trust, social networks, and movie ratings and reviews. Trust is used to personalize the website for each user, displaying recommended movie ratings, and ordering reviews by relevance. I show that, in the case where the user's opinion is divergent from the average, the trust-based recommended ratings are more accurate than several other common collaborative filtering techniques. The second application is TrustMail, an email client that uses the trust rating of each sender as a score for the message. Users can then sort messages according to their trust value. I conclude with a description of other applications where trust inferences can be used, and how the lessons from this dissertation can be applied to infer information about relationships in other complex systems.
COMPUTING AND APPLYING TRUST IN WEB-BASED SOCIAL NETWORKS

by Jennifer Ann Golbeck

Dissertation Submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy, 2005

Advisory Committee:
Professor James Hendler, Chair/Advisor
Professor Ashok Agrawala
Professor Mark Austin
Professor Benjamin Bederson
Professor Lise Getoor
Professor Ben Shneiderman
© Copyright by Jennifer Ann Golbeck 2005
ACKNOWLEDGEMENTS

First, I would like to thank James Hendler, my advisor. He gave me great intellectual freedom to pursue my interests and provided encouragement and guidance throughout this work's lifetime. Thanks also to my committee for their challenges, assistance, and support: Ben Bederson, Ben Shneiderman, Ashok Agrawala, Lise Getoor, and Mark Austin.

I received what seemed like endless help from members of MINDSWAP in many capacities. Thanks to Yarden Katz, Mike Grove, Aditya Kalyanpur, Evren Sirin, Ron Alford, Amy Alford, Debbie Heisler, Aaron Mannes, Denise Cross, and others I may have forgotten. Special thanks to Bijan Parsia, who has been a tireless advocate and supportive colleague for the life of this work, and who also co-authored the work that appears as section 10.3. Thanks also to the FOAF community for their support and participation.

Many colleagues around the world have helped me develop this work into what it is now. Thanks to Cai-Nicolas Ziegler, Paolo Massa, Matthew Richardson, Morten Frederiksen, Chris Bizer, and Sep Kamvar. Thanks also to Stuart Kurtz, my former advisor at the University of Chicago, who helped set me on my way toward this goal.

My family, of course, has been very supportive and encouraging. Thanks to my brother Tom Golbeck and his wife and my friend Michelle, Jeanne Mitchell, and the rest of my large Catholic family who would require several pages to be fully enumerated. As always, bones to �� and K who were present throughout the dissertation process.

Of course, a very special thanks to my husband, Dan Golbeck. He has endless wells of patience and support, and always knows the right moment to tell me that I'm brilliant so I'll keep going. He deserves some sort of degree for putting up with me while I completed this dissertation.

Finally, thanks to Irene and John Golbeck, my mom and dad. They have encouraged me every step of my life to be a strong, independent thinker, to work hard, and to keep trying at difficult things. I inherited a different kind of insanity from each of them, and the combination has served me well through all my years of education. I am eternally grateful to them for all the opportunities they made available to me, and for the support they have given me along the way.
TABLE OF CONTENTS

List of Tables
List of Figures

1 Introduction
  1.1 Contributions
  1.2 Organization

2 Web-Based Social Networks
  2.1 Introduction
  2.2 Previous Work
  2.3 Definitions
  2.4 A Survey of Web-Based Social Networks
    2.4.1 Size
    2.4.2 Categorization
    2.4.3 Relationship Data
  2.5 The Semantic Web and Friend of a Friend (FOAF)
    2.5.1 Background
    2.5.2 FOAF and Current WBSNs
    2.5.3 Extensions to FOAF
  2.6 Conclusions and Future Directions

3 Trust: Definition and Properties
  3.1 A Definition of Trust
  3.2 Properties of Trust
    3.2.1 Transitivity
    3.2.2 Composability
    3.2.3 Personalization and Asymmetry
  3.3 The Values of Trust
  3.4 Conclusions

4 Inferring Trust: Background and Related Work
  4.1 From Trust Properties to Trust Algorithms
  4.2 Previous Work
    4.2.1 Game Theory
    4.2.2 Peer-To-Peer Systems
    4.2.3 Calculating Trust on the Web
    4.2.4 Public Key Infrastructure
  4.3 Conclusions

5 Inferring Trust in Binary Networks
  5.1 Generating Social Networks
    5.1.1 Building Networks with Correct Topology
    5.1.2 Adding Trust Ratings to Graphs
  5.2 Making Trust Inferences
    5.2.1 A Rounding Algorithm
    5.2.2 A Non-Rounding Algorithm
    5.2.3 Analysis of the Algorithms
    5.2.4 Simulations
  5.3 Conclusions

6 Inferring Trust in Continuous Networks: TidalTrust
  6.1 Experimental Networks
  6.2 Patterns of Trust Values
    6.2.1 Distribution of Trust Values
    6.2.2 Correlation of Trust and Accuracy
    6.2.3 Path Length and Accuracy
  6.3 TidalTrust: An Algorithm for Inferring Trust
    6.3.1 Incorporating Path Length
    6.3.2 Incorporating Trust
    6.3.3 Full Algorithm for Inferring Trust
  6.4 Accuracy of TidalTrust
    6.4.1 Discussion of Trust and Accuracy
    6.4.2 Related Algorithms
  6.5 Conclusions

7 Trust Inferences in Application: FilmTrust
  7.1 Related Work
  7.2 The FilmTrust Website
  7.3 Site Personalization
    7.3.1 Recommended Movie Ratings
    7.3.2 Presenting Ordered Reviews
  7.4 User Study
  7.5 Conclusions and Discussion

8 TrustMail: Trust Networks for Email Filtering
  8.1 Background and Introduction
  8.2 The TrustMail Application
  8.3 Case Study: The Enron Email Corpus
  8.4 Conclusions

9 Conclusions

10 Future Work
  10.1 Validation of Current Results
  10.2 Extensions to Current Work
    10.2.1 Network Structure and Trust Inferences
    10.2.2 Recommendations with FilmTrust
    10.2.3 Privacy
  10.3 Filtering Semantic Web Statements with Trust
    10.3.1 From Trust Network Inferences to Accepting Claims
    10.3.2 Using Claim Ratings in Semantic Web Systems
    10.3.3 Filtering Inferences in Knowledge Bases with Trust Values
  10.4 Meal of a Meal: Inferring Trophic Relationships in Food Webs
  10.5 Conclusions and Vision

References
LIST OF FIGURES

2.1 WBSN membership for sites ranked by population
3.1 Network Paths for Discovering Trust
4.1 Finding Trusted Paths to the Sink
5.1 An illustration of how nodes are used in trust inferences
5.2 A map of how the initial accuracy in the system changes with g and pa
5.3 The increasing probability of a correct trust inference
5.4 A comparison of the initial accuracy of trust ratings with the accuracy of inferred ratings using the rounding algorithm
5.5 The accuracy of inferred ratings is shown for various initial percentages of good nodes
5.6 Accuracy of Recommendations Compared to Initial Accuracy Using Non-Rounding Algorithm
5.7 Accuracy of Recommendations Using Rounding Algorithm
5.8 A comparison of the accuracy of trust inferences made with the rounding and non-rounding algorithms
6.1 The structure of the Trust Project's network
6.2 The structure of the FilmTrust social network
6.3 The distribution of trust ratings in the Trust Project network
6.4 Finding points of comparison in the network
6.5 Distribution of trust ratings in the original network and experiments
6.6 The relationship between ��� and Trust Rating
6.7 Average ��� by trust value
6.8 Trust distributions in the original network and experiments
6.9 Distribution of trust ratings in the original network and among the pairs with common neighbors in the randomized networks
6.10 Paths from the source to sink of length two, three, and four
6.11 Paths of length 2, 3, 4, 5, and 7
6.12 An illustration of how ���s,n2 values are derived for a path length of four
6.13 Minimum average ��� from all paths of a fixed length containing a given trust value
6.14 The process of determining the trust threshold
6.15 A network illustrating when lines 25-28 allow for more children
7.1 A user's friend listing at the FilmTrust website
7.2 A user's movies page with titles, ratings, reviews, and options
7.3 The movie ratings and reviews page for Jaws
7.4 Average ���a and ���r values for an increasing minimum ���a threshold
7.5 A user's view of the page for "A Clockwork Orange"
7.6 The increase in ��� as the minimum ���a is increased
8.1 The TrustMail Interface
10.1 A sample social network
10.2 Inferring trophic relationships in food webs
Chapter 1

Introduction

The vast public interest in social networks has opened up many new areas of possible research in computing. This research adopts web-based social networks as the foundation for studying trust. The goal of this work is twofold: first, to find ways to utilize the structure of social networks and the trust relationships within them to accurately infer how much two people who are not directly connected might trust one another, and second, to show how those trust inferences can be integrated into applications. The ultimate goal is to create software that is intelligent with respect to the user's social preferences such that the user's experience is personalized, and the information presented to them is more useful.

Tens of millions of users participate in web-based social networking. The web-based nature of these networks means that the data is publicly available. The websites that are taking advantage of Semantic Web technologies, such as FOAF, have taken this a step further, making the social network information easily available to any system that wants to incorporate it. Similarly, the role of social trust in computing is becoming a prominent topic for research on the Semantic Web, within human-computer interaction, and in the larger computing community as a whole.

In this work, I look at instances where trust is integrated into a social network. The first step to facilitate that integration is to have a definition of trust that captures the social features while being narrow enough to function in the environment of a social network. Given two people, Alice and Bob, I define trust as follows: Alice trusts Bob if she commits to an action based on a belief that Bob's future actions will lead to a good outcome. From that definition, functional properties of trust can be extracted, including transitivity, composability, asymmetry, and personalization.

This definition has allowed for the development of two naturally-evolved trust networks that are used in this research. The first has nearly 2,000 members and is entirely based on the Semantic Web. Using an ontology I created to extend the Friend of a Friend (FOAF) vocabulary, the network is created by spidering files on the Semantic Web and building a centralized model. The second network is also available on the Semantic Web, but has a more typical web-based social network structure, with user accounts and a central website. This trust network backs the FilmTrust website, and has over 300 members.

Using these foundations of trust in web-based social networks, and real networks as testbeds, I move toward inferring trust within the network. If two individuals are not directly connected, a trust inference uses the paths that connect them in the social network, and the trust values along those paths, to come up with a recommendation about how much one person might trust the other. I present algorithms for inferring trust in