USEWOD2012 - 2nd International Workshop on Usage Analysis and the Web of Data
co-located with the 21st International World Wide Web Conference (WWW2012)
Lyon, France, April 17th, 2012, 14:00 - 19:00
USEWOD2012 is part of the USEWOD workshop series.
Accepted Papers and Program
When | What | Slides |
---|---|---|
14:00-14:15 | Welcome and Opening (Laura Hollink) | |
14:15-14:35 | History and Background of the USEWOD Data Challenge (Knud Möller) | slideshare |
14:35-15:10 | Keynote: Investigating the Semantic Gap through Query Log Analysis (Peter Mika) | slideshare |
15:10-15:30 |
Characterizing Machine Agent Behavior through SPARQL Query Mining
Winner of the Data Challenge, sponsored by LATC! Aravindan Raghuveer Abstract ...
Mining SPARQL queries to understand the behavior of automated programs (or machine agents) is an important step in designing systems for the semantic web. We present techniques that differ from state-of-the-art SPARQL mining techniques in two ways: (1) we move away from a one-query-at-a-time view to a SPARQL user session view, and (2) we look at the results of SPARQL queries in addition to the queries themselves. With these two approaches, we are able to find two new patterns in SPARQL queries that help us reason better about the underlying program that generated them. Through a variety of experiments, we show that the patterns found have significant support in all four datasets provided by the USEWOD committee.
|
slideshare |
15:30-16:00 | Tea and Coffee | |
16:00-16:20 |
Learning to Rank Query Recommendations by Semantic Similarities
Sumio Fujita, Georges Dupret, and Ricardo Baeza-Yates Abstract ...
Logs of the interactions with a search engine show that users often
reformulate their queries. Examining these reformulations shows that
recommendations that sharpen the focus of a query are helpful,
like those based on expansions of the original queries. But it also
shows that queries that express some topical shift with respect to
the original query can help users access more rapidly the information
they need.
We propose a method to identify, from the query logs of past users, queries that either focus or shift the initial query topic. This method combines various click-based, topic-based and session-based ranking strategies and uses supervised learning in order to maximize the semantic similarities between the query and the recommendations, while at the same time diversifying them. We evaluate our method using the query/click logs of a Japanese web search engine and we show that the combination of the three methods proposed is significantly better than any of them taken individually. |
|
16:20-16:40 |
Enabling Semantic Analysis of User Browsing Patterns in the Web of Data
Julia Hoxha, Martin Junghans, and Sudhir Agarwal Abstract ...
A useful step towards better interpretation and analysis of
the usage patterns is to formalize the semantics of the resources
that users are accessing in the Web. We focus on
this problem and present an approach for the semantic formalization
of usage logs, which lays the basis for effective
techniques of querying expressive usage patterns. We also
present a query answering approach, which is useful to find
in the logs expressive patterns of usage behavior via formulation
of semantic and temporal-based constraints.
We have processed over 30 thousand user browsing sessions
extracted from usage logs of DBpedia and Semantic Web
Dog Food. All these events are formalized semantically using
respective domain ontologies and RDF representations of the
Web resources being accessed. We show the effectiveness of
our approach through experimental results, providing in this
way an exploratory analysis of the way users browse the Web
of Data.
|
|
16:40-17:00 |
Collaboratively Patching Linked Data
Magnus Knuth, Johannes Hercher, and Harald Sack Abstract ...
Today’s Web of Data is noisy. Linked Data often needs extensive preprocessing to enable efficient use of heterogeneous resources. While consistent and valid data provides the key to efficient data processing and aggregation, we are facing two main challenges: (1) identification of erroneous facts and tracking their origins in dynamically connected datasets is a difficult task, and (2) efforts in the curation of deficient facts in Linked Data are rarely exchanged. Since erroneous data is often duplicated and (re-)distributed by mashup applications, it is not only the responsibility of a few original publishers to keep their data tidy, but becomes a mission for all distributors and consumers of Linked Data, too. We present a new approach to expose and to reuse patches on erroneous data to enhance and to add quality information to the Web of Data. The feasibility of our approach is demonstrated in the example of a collaborative game that patches statements in DBpedia data and provides notifications for relevant changes.
|
|
17:00-17:20 |
Leveraging Usage Data for Linked Data Movie Entity Summarization
Andreas Thalhammer, Ioan Toma, Antonio Roa-Valverde, and Dieter Fensel Abstract ...
Novel research in the field of Linked Data focuses on the problem of entity summarization. This field addresses the problem of ranking features according to their importance for the task of identifying a particular entity. Besides enabling a more human-friendly presentation, these summarizations can play a central role for semantic search engines and semantic recommender systems. Current approaches attempt entity summarization based on patterns that are inherent to the data under consideration.
The proposed approach of this paper focuses on the movie domain. It utilizes usage data in order to support measuring the similarity between movie entities. Using this similarity it is possible to determine the k-nearest neighbors of an entity. This leads to the idea that features that entities share with their nearest neighbors can be considered as significant or important for these entities. Additionally, we introduce a downgrading factor (similar to TF-IDF) in order to overcome the high number of commonly occurring features. We exemplify the approach based on a movie-ratings dataset that has been linked to Freebase entities. |
|
17:20-17:30 | Closing | |
Workshop Overview and Goals
The purpose of this workshop is to investigate new developments concerning the synergy between semantics and semantic-web technology on the one hand, and the analysis and mining of usage data on the other hand. As the first USEWOD workshop at WWW 2011 has shown, these two fields complement each other well. First, semantics can be used to enhance the analysis of usage data. Second, usage data analysis can enhance semantic resources as well as Semantic Web applications. Traces of users can be used to evaluate, adapt or personalise Semantic Web applications and logs can form valuable resources from which semantic knowledge can be extracted bottom-up.
The emerging Web of Data demands a re-evaluation of existing evaluation techniques: the Linked Data community is recognising that it needs to move beyond triple counts. Usage analysis is a key method for the evaluation of datasets and applications. New ways of accessing information enabled by the Web of Data require the development or adaptation of algorithms, methods, and techniques to analyse and interpret the usage of Web data instead of Web pages, a research endeavour that can profit from what has been learned in more than a decade of Web usage mining. The results can provide fine-grained insights into how semantic datasets and applications are being accessed and used by both humans and machines - insights that are needed for optimising the design and ultimately ensuring the success of semantic resources.
The primary goals of this workshop are to foster the emerging community of researchers from various fields sharing an interest in usage mining and semantics, to evaluate the developments of the past year, and to further develop a roadmap for future research in this direction.
Topics of Interest (not limited to)
We welcome work that shows how the research areas combine: how semantic resources and techniques can be used to strengthen usage data analysis and, vice versa, how usage data can enhance semantic tools and applications. Within these boundaries, we keep the scope broad. We welcome contributions using any form of semantic information, from formal ontologies to linked data and folksonomies. All records of user actions are considered usage logs; we do not limit ourselves to any format or method of collection of usage information. This ranges from traditional content-consumption logs to various forms of content-production logs, i.e. navigation, application-related transactions, queries, tagging, editing, and similar activities. We welcome both papers using the USEWOD data set (competing in the challenge) and papers on other relevant topics. Topics of interest include, but are not limited to:
- Analysis and mining of usage logs of semantic resources and applications.
- Inferring semantic information from usage logs.
- Methods and tools for semantic analysis of usage logs.
- Representing and enriching usage logs with semantic information.
- Usage-based evaluation methods and frameworks; gold standards for evaluation of web applications.
- Specifics and semantics of logs for content-consumption and content-creation.
- Using semantics for recommendation, personalisation and adaptation.
- Usage-based recommendation, personalisation and adaptation of semantic web applications.
- Exploiting usage logs for semantic search.
- Data sharing, privacy, and privacy-protecting policies and techniques.
Contributions
We invite regular paper submissions, as well as challenge papers (for more information about the challenge see here). Papers must consist of original, unpublished research and must not be under review by another conference, journal, or workshop. Authors of accepted submissions will be invited to present their work at the workshop, and at least one author of each paper must register for the workshop.
Format requirements for the submission of papers are:
- Regular Papers: max. 8 pages (however, we also welcome shorter papers)
All accepted papers will be included in the online workshop proceedings. All papers must be prepared in ACM format.
Metadata about all papers, including title, abstract, authors and author affiliations, will also be made available publicly at http://data.semanticweb.org.
To submit a paper, please log on to the USEWOD2012 page on EasyChair.
A copy of the call for papers is available here.
USEWOD 2012 Data Challenge
In addition to regular papers, we will release a large dataset (several GB) of usage data (server log files) from several major Linked Open Data sources, including DBpedia (dbpedia.org), SWDF (data.semanticweb.org), Bio2rdf and LinkedGeoData. Participants are invited to present interesting analyses, applications, alignments, etc. for these datasets, and to submit their findings as a Data Challenge paper. The best Data Challenge paper will get a prize. For more information, check out the Data Challenge Page.
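As a starting point for working with server log files of this kind, the following minimal sketch pulls a SPARQL query out of a single log line. It rests on assumptions that are not confirmed by this page: that the logs follow the Apache Combined Log Format, that queries arrive as a `query` URL parameter, and that the endpoint path is `/sparql`. The function name `extract_sparql_query` and the sample line are hypothetical; the actual challenge data may be structured differently.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Apache Combined Log Format (an assumption; the challenge logs may differ).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def extract_sparql_query(line):
    """Return the URL-decoded SPARQL query in a log line, or None.

    Hypothetical helper: assumes the endpoint path is /sparql and the
    query is passed in the 'query' URL parameter.
    """
    match = LOG_PATTERN.match(line)
    if match is None:
        return None  # line does not follow the assumed format
    target = urlsplit(match.group("target"))
    if target.path != "/sparql":
        return None  # plain resource lookup, not a SPARQL request
    values = parse_qs(target.query).get("query")
    return values[0] if values else None


# Made-up sample line for illustration only (not taken from the dataset).
sample = (
    '192.0.2.1 - - [01/Jan/2012:00:00:01 +0100] '
    '"GET /sparql?query=SELECT+%2A+WHERE+%7B%3Fs+%3Fp+%3Fo%7D+LIMIT+10 HTTP/1.1" '
    '200 1024 "-" "ExampleAgent/1.0"'
)
print(extract_sparql_query(sample))
```

Lines that do not match the assumed format, or that request something other than the assumed endpoint, yield `None`, so the function can be mapped over a whole log file and the results filtered afterwards.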
Important Dates
February 15th, 2012 | Submission deadline (original) |
February 22nd, 2012 | Extended submission deadline |
March 17th, 2012 | Acceptance notification |
March 3rd, 2012 | Acceptance notification |
April 17th, 2012 | Workshop |
Organisation
Organising Committee (alphabetically)
- Bettina Berendt, K.U. Leuven, Belgium
- Laura Hollink, TU Delft, Netherlands
- Vera Hollink, CWI Amsterdam, Netherlands
- Markus Luczak-Rösch, FU Berlin, Networked Information Systems, Germany
- Knud Möller, Kasabi, United Kingdom
- David Vallet, Universidad Autónoma de Madrid, Spain
Programme Committee (alphabetically)
- Chris Bizer, Freie Universität Berlin, DE
- Pablo Castells, Universidad Autónoma de Madrid, ES
- Marko Grobelnik, Jozef Stefan Institute, SI
- Paul Groth, VU Amsterdam, NL
- Christoph Guéret, VU Amsterdam, NL
- Geert-Jan Houben, Delft University of Technology, NL
- Eero Hyvönen, University of Helsinki, FI
- Hideo Joho, University of Tsukuba, JP
- Jaap Kamps, University of Amsterdam, NL
- Yiannis Kompatsiaris, Informatics and Telematics Institute, GR
- Ruben Lara, Telefónica I+D, ES
- Johan Oomen, Netherlands Institute for Sound and Vision, NL
- Jacco van Ossenbruggen, Centre for Mathematics and Computer Science (CWI), NL
- Maarten de Rijke, Universiteit van Amsterdam, NL
- Marta Sabou, MODUL University Vienna, AT
- Guus Schreiber, VU Amsterdam, NL
- Fabrizio Silvestri, Istituto di Scienze e Tecnologie dell'Informazione, IT
- Sarabjot Singh Anand, University of Warwick, UK
- Markus Strohmaier, Graz University of Technology, AT
- Theodora Tsikrika, Centre for Mathematics and Computer Science (CWI), NL
- Arjen de Vries, Delft University of Technology, NL