Semantic Analytics on Social Networks : Experiences in Addressing the Problem of Conflict of Interest Detection
- ISBN: 1595933239
- DOI: 10.1145/1135777.1135838
Boanerges Aleman-Meza¹, Meenakshi Nagarajan¹, Cartic Ramakrishnan¹, Li Ding²,
Pranam Kolari², Amit P. Sheth¹, I. Budak Arpinar¹, Anupam Joshi², Tim Finin²
¹ LSDIS Lab, Dept. of Computer Science
University of Georgia
Athens, GA 30602-7404
(boanerg, bala, cartic, amit, budak)@cs.uga.edu
² Department of Computer Science and Electrical Engineering
University of Maryland, Baltimore County
Baltimore, MD 21250
(dingli1, kolari1, joshi, finin)@cs.umbc.edu
ABSTRACT
In this paper, we describe a Semantic Web application that detects
Conflict of Interest (COI) relationships among potential reviewers
and authors of scientific papers. This application discovers
various ‘semantic associations’ between the reviewers and authors
in a populated ontology to determine a degree of Conflict of
Interest. This ontology was created by integrating entities and
relationships from two social networks, namely “knows,” from a
FOAF (Friend-of-a-Friend) social network and “co-author,” from
the underlying co-authorship network of the DBLP bibliography.
We describe our experiences developing this application in the
context of a class of Semantic Web applications, which have
important research and engineering challenges in common. In
addition, we present an evaluation of our approach for real-life
COI detection.
Categories and Subject Descriptors
H.4.m [Information Systems Applications]: Miscellaneous;
H.3.4 [Information Storage and Retrieval]: Systems and
Software - Information Networks
General Terms
Algorithms, Experimentation
Keywords
Semantic Web, Social Networks, Conflict of Interest, Peer
Review Process, Semantic Analytics, Entity Disambiguation, Data
Fusion, Semantic Associations, Ontologies, RDF
1. INTRODUCTION
A Conflict of Interest (COI) is typically understood as a situation
that may bias a decision. It can be caused by a variety of factors, such
as family ties, business [31] or friendship ties, and access to
confidential information. Detecting COI is necessary in many
situations, such as contract allocation, IPOs (Initial Public
Offerings) or company acquisitions, corporate law, and peer
review of scientific research papers or proposals. Besides ensuring
impartial decisions, detection of COI is also critical where ethical
and legal ramifications could be quite damaging to individuals or
organizations. The underlying technical challenge is also related
to the common connecting-the-dots applications that are found in
a broad variety of fields, including regulatory compliance,
intelligence and national security [18] and drug discovery [24].
In some cases, it can be difficult to detect COI because of the
lack of available information. However, in many other cases, there
exists implicit and/or explicit information in the form of social
networks, such as those on the Web. For example, the
LinkedIn.com social network, comprising a large number of
people from information technology areas, could be used to detect
COI in situations such as IPO or company acquisitions.
MySpace.com, Friendster and Hi5 contain social network data
that could substantiate COI in situations of friendship or personal
ties. The list keeps growing; for example, Facebook.com (targeted
towards college students) has recently begun expanding to include
high-school students. Club Nexus is an online community serving
over 2000 Stanford undergraduate and graduate students [1]. The
creation of Yahoo! 360° and the acquisition of Dodgeball.com by
Google are recent examples of the importance of social network
applications, which is evident not only in the millions of users
that some of them have but also in the millions (even hundreds of
millions) of dollars they are worth.
Although social networks can provide data to detect COI,
one important problem lies in the lack of integration among sites
hosting them. Moreover, privacy concerns prevent such sites from
openly sharing their data. Therefore, we chose publicly available
social network data to address the challenge of COI detection. We
selected public sources for two reasons. First, they provide an
opportunity to address the problem of integrating different social
networks. Second, we can demonstrate real-world examples of the
relevance of the problem of COI detection.
The data we used comes from bibliographic literature in
Computer Science research. The DBLP bibliography (dblp.uni-
trier.de/) provides collaboration network data by virtue of the
explicit co-author relationships among authors. We made the
assumption that this collaboration network represents an
underlying social network. As a second social network, we used a
multitude of FOAF documents from the Web where the “knows”
relationship is explicitly stated. The aggregation of such FOAF
documents by means of the “knows” relationship results in a
social network. Although we anticipated significant challenges
while integrating the two networks, the effort needed in
addressing this challenge surpassed our initial expectations. For
example, DBLP has different entries that in the real world refer to
the same person, such as the case of “R. Guha” and “Ramanathan
V. Guha.” Thus, the need for entity disambiguation (also called
[…] to be a fundamental challenge in developing Semantic Web
applications involving heterogeneous, real-world data. We believe
that this integration effort of two social networks provides an
example of how semantic technologies, such as FOAF, contribute
to enhancing the Web.
Copyright is held by the International World Wide Web Conference
Committee (IW3C2). Distribution of these papers is limited to classroom
use, and personal use by others.
WWW 2006, May 23–26, 2006, Edinburgh, Scotland.
ACM 1-59593-323-9/06/0005.
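To make the disambiguation problem concrete, the following minimal Python sketch flags two name strings, such as “R. Guha” and “Ramanathan V. Guha,” as candidates for the same person. It is an illustrative heuristic only, not the disambiguation method used in this work; the function names and the matching rule (identical last name, compatible first name or initial) are assumptions made for this example.

```python
import re


def name_tokens(name):
    """Lower-case a personal name and split it into tokens, dropping periods."""
    return [t for t in re.split(r"[\s.]+", name.lower()) if t]


def token_compatible(a, b):
    """Two tokens agree if they are equal or one is the initial of the other."""
    if a == b:
        return True
    short, long_ = sorted((a, b), key=len)
    return len(short) == 1 and long_.startswith(short)


def may_be_same_person(name_a, name_b):
    """Assumed heuristic: same last name and a compatible first name or initial.

    This only proposes candidate matches. Further evidence (affiliation,
    shared co-authors, homepage) would be needed before merging two entries.
    """
    a, b = name_tokens(name_a), name_tokens(name_b)
    if len(a) < 2 or len(b) < 2:
        return False  # too little information to compare
    if a[-1] != b[-1]:
        return False  # last names must match exactly
    return token_compatible(a[0], b[0])


print(may_be_same_person("R. Guha", "Ramanathan V. Guha"))
print(may_be_same_person("R. Guha", "S. Guha"))
```

A rule this permissive would also match distinct people who share a last name and first initial, which is one reason large-scale disambiguation requires more than string comparison.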
The contributions of this paper are as follows:
• We bring together a semantic and semi-structured social network
(FOAF) with a social network extracted from the collaborative
network in DBLP. We explain the challenges involved with
respect to large-scale entity disambiguation to achieve
integration of different social networks (together with our
results and findings for this task).
• We introduce semantic analytics techniques to address the
problem of COI detection.
• We describe our experiences in the context of a class of
Semantic Web applications, which have important challenges in
common. We illustrate how an application that we developed
for COI detection is a simple yet representative application of
this class. The application is built around the scenario of a peer-
review process. Thus, we not only demonstrate an application
for COI detection but also shed some light on what it takes to
develop this type of Semantic Web application.
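As a rough illustration of the idea behind the first two contributions, the Python sketch below merges “knows” and “co-author” edges into one graph and searches for the shortest labelled path between a reviewer and an author. Everything in it is an assumption made for illustration: the people and edges are invented, breadth-first search stands in for whatever association discovery technique is actually used, and the hop-based wording in `coi_hint` is a placeholder rather than the degrees of COI defined in this paper.

```python
from collections import deque

# Hypothetical toy data, invented for this example.
KNOWS = [("alice", "bob")]
CO_AUTHOR = [("bob", "carol"), ("carol", "dave")]


def build_graph(*labelled_edge_lists):
    """Merge (label, edges) pairs into one undirected adjacency map."""
    graph = {}
    for label, edges in labelled_edge_lists:
        for a, b in edges:
            graph.setdefault(a, []).append((b, label))
            graph.setdefault(b, []).append((a, label))
    return graph


def shortest_association(graph, source, target):
    """Breadth-first search; returns a list of (from, label, to) or None."""
    if source == target:
        return []
    parent = {source: None}  # node -> (previous node, edge label)
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour, label in graph.get(node, []):
            if neighbour in parent:
                continue
            parent[neighbour] = (node, label)
            if neighbour == target:
                path, cur = [], target
                while parent[cur] is not None:
                    prev, edge_label = parent[cur]
                    path.append((prev, edge_label, cur))
                    cur = prev
                path.reverse()
                return path
            queue.append(neighbour)
    return None


def coi_hint(path):
    """Placeholder summary of a path; not the paper's COI levels."""
    if path is None:
        return "no association found"
    if len(path) == 0:
        return "same person"
    if len(path) == 1:
        return "direct"
    return "indirect (%d hops)" % len(path)


graph = build_graph(("knows", KNOWS), ("co-author", CO_AUTHOR))
path = shortest_association(graph, "alice", "dave")
print(path, coi_hint(path))
```

Keeping the edge label on every hop lets a result be explained to the user in terms of the underlying relationships, not just reported as a distance.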
2. MOTIVATION AND BACKGROUND
This paper intends to characterize the common engineering and
research challenges of building practical Semantic Web
applications rather than contribute to the theoretical aspects of
Semantic Web. In fact, many of us in academia have seen multi-
faceted efforts towards realizing the Semantic Web vision. We
believe that the success of this vision will be measured by how
research in this field (i.e., theoretical work) can contribute to increasing
the deployment of Semantic Web applications [25]. In particular,
we refer to Semantic Web applications that have been built to
solve real-world commercial problems [26, 32, 33]. These include
Semantic Search [16, 37], large scale annotation of Web pages
[11], commercialized semantic annotation technology [17] and
applications for national security [34]. The engineering process
needed to develop such applications is similar to what we present in
this paper. The development of a Semantic Web application
typically involves a multi-step process:
1. Obtaining high quality data: Such data is often not available.
Additionally, there might be many sites from which data is to
be obtained. Thus, metadata extraction from multiple sources is
often needed [10, 23, 35].
2. Data preparation: Preparation typically follows data
acquisition. Cleanup and evaluation of the quality of the data are
part of data preparation.
3. Entity disambiguation: This continues to be a key research
aspect and often involves a demanding engineering effort.
Identifying the right entity is essential for semantic annotation
and data integration (e.g., [6]).
4. Metadata and ontology representation: Depending on the
application, it can be necessary to import or export data using
standards such as RDF/RDFS and OWL. Addressing
differences in modeling, representation and encodings can
require significant effort.
5. Querying and inference techniques: These are needed as a
foundation for more complex data processing and for enabling
semantic analytics and discovery (e.g., [4, 19, 21, 35]).
6. Visualization: The ranking and presentation of query or
discovery results are critical for the success of Semantic
Web applications. Users should be able to understand how
inference or discovery is justified by the data.
7. Evaluation: Often benchmarks or gold standards are not
available to measure the success of Semantic Web applications.
A frequently used method is comparing application output with
results from human subjects.
These challenges are discussed throughout this paper in the
context of developing an application that addresses the problem of
COI detection. Figure 1 illustrates the multi-step process of
building Semantic Web applications along with the steps involved
in our approach for COI detection.
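The evaluation step listed above, comparing application output with results from human subjects, can be sketched as a simple agreement measure. The Python snippet below is a minimal illustration under stated assumptions: the reviewer-author pairs, the labels "conflict" and "no conflict", and the function name `agreement` are all hypothetical and do not reproduce the evaluation reported in this paper.

```python
def agreement(system_labels, human_labels):
    """Fraction of commonly judged pairs where system and human labels match."""
    shared = system_labels.keys() & human_labels.keys()
    if not shared:
        raise ValueError("no pairs were judged by both the system and humans")
    matches = sum(1 for pair in shared
                  if system_labels[pair] == human_labels[pair])
    return matches / len(shared)


# Hypothetical judgments for four (reviewer, author) pairs.
system = {
    ("r1", "a1"): "conflict",
    ("r1", "a2"): "no conflict",
    ("r2", "a1"): "conflict",
    ("r2", "a2"): "no conflict",
}
human = {
    ("r1", "a1"): "conflict",
    ("r1", "a2"): "no conflict",
    ("r2", "a1"): "no conflict",
    ("r2", "a2"): "no conflict",
}

print(agreement(system, human))
```

Raw agreement is the simplest possible measure. When one label dominates, a chance-corrected statistic or per-label precision and recall would be more informative.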
Figure 1. Multi-step Process of Semantic Web Applications
2.1 Conflict of Interest Detection Problem
Conflict of interest situations should be identified to ensure
impartial decisions and, where applicable, compliance with laws. For example,
the National Institutes of Health (NIH), like many other
government and private organizations, has strict definitions of
what constitutes a COI. The NIH defines COI in the context of the
grant review process as: “A Conflict Of Interest (COI) in scientific
peer review exists when a reviewer has an interest in a grant or
cooperative agreement application or an R&D contract proposal
that is likely to bias his or her evaluation of it. A reviewer who
has a real conflict of interest with an application or proposal may
not participate in its review.” Thus, one major cause for bias is
professional or social relationships between potential reviewers
and authors of the material to be reviewed. In this paper, we