USEWOD2012 - 2nd International Workshop on Usage Analysis and the Web of Data

co-located with the 21st International World Wide Web Conference (WWW2012)

Lyon, France, April 17th, 2012, 14:00 - 19:00

USEWOD2012 is part of the USEWOD workshop series.

News

Data Challenge SPONSOR

27 April 2012: The winner of this year's USEWOD Data Challenge is Aravindan Raghuveer for his paper "Characterizing Machine Agent Behavior through SPARQL Query Mining" (see below). Congratulations! Also, a big thank you to the LATC project for sponsoring the prize. And finally, we want to thank everyone who contributed to or attended the workshop and made sure it was successful. We hope to see you again next year!
16 April 2012: Schedule changes - just to make sure there is no confusion: the workshop starts in the afternoon at 14:00, not in the morning, as indicated in the printed program.
13 April 2012: All accepted papers have been published in our online proceedings on archix.org. For the individual links, see below. Also, please note that we had to move the workshop to the afternoon, to accommodate the logistics of the conference. The new times are in the schedule!
15 March 2012: We are happy to announce the final list of accepted papers for USEWOD2012. And we're even a bit ahead of time!
29 February 2012: Thanks to all authors for your submissions! We have handed the papers over to the reviewers, and will send the acceptance notifications out on the 17th March, in time for the regular WWW registration.
14 February 2012: Due to popular demand, we have extended the deadline for submissions by one week until February 22nd. Good luck with your submissions!
16 December 2011: As an early Christmas present, we are happy to announce that the USEWOD challenge data is now available! Head over to the challenge page, download and sign the usage agreement, and get access to 800MB of compressed log files!
09 December 2011: The workshop date has now been set for April 17th, to avoid a clash with LDOW2012.
01 December 2011: We are very happy to announce that USEWOD has been accepted as one out of 8 workshops for this year's WWW (from a total of 37 submissions)!

Accepted Papers and Program

When	What	Slides
`14:00-14:15`	Welcome and Opening Laura Hollink	PDF
`14:15-14:35`	History and Background of the USEWOD Data Challenge Knud Möller	slideshare
`14:35-15:10`	Keynote: Investigating the Semantic Gap through Query Log Analysis Peter Mika	slideshare
`15:10-15:30`	Characterizing Machine Agent Behavior through SPARQL Query Mining Winner of the Data Challenge, sponsored by LATC! Aravindan Raghuveer Abstract ... Mining SPARQL queries to understand the behavior of automated programs (or machine agents) is an important step in designing systems for the semantic web. We present techniques that differ from state-of-the-art SPARQL mining techniques in two ways: 1. Move away from one SPARQL query at a time view to SPARQL user session view 2. Look at the results of SPARQL queries in addition to the query itself. Due to these two approaches, we are able to find two new patterns in SPARQL queries that help us reason better about the underlying program that generated the SPARQL queries. Through a variety of experiments, we show that the patterns found have significant support in all the four datasets provided by the USEWOD committee.	slideshare
`15:30-16:00`	Tea and Coffee
`16:00-16:20`	Learning to Rank Query Recommendations by Semantic Similarities Sumio Fujita, Georges Dupret, and Ricardo Baeza-Yates Abstract ... Logs of the interactions with a search engine show that users often reformulate their queries. Examining these reformulations shows that recommendations that precise the focus of a query are helpful, like those based on expansions of the original queries. But it also shows that queries that express some topical shift with respect to the original query can help user access more rapidly the information they need. We propose a method to identify from the query logs of past users queries that either focus or shift the initial query topic. This method combines various click-based, topic-based and session based ranking strategies and uses supervised learning in order to maximize the semantic similarities between the query and the recommendations, while at the same time diversifying them. We evaluate our method using the query/click logs of a Japanese web search engine and we show that the combination of the three methods proposed is significantly better than any of them taken individually.	PDF
`16:20-16:40`	Enabling Semantic Analysis of User Browsing Patterns in the Web of Data Julia Hoxha, Martin Junghans, and Sudhir Agarwal Abstract ... A useful step towards better interpretation and analysis of the usage patterns is to formalize the semantics of the resources that users are accessing in the Web. We focus on this problem and present an approach for the semantic formalization of usage logs, which lays the basis for effective techniques of querying expressive usage patterns. We also present a query answering approach, which is useful to find in the logs expressive patterns of usage behavior via formulation of semantic and temporal-based constraints. We have processed over 30 thousand user browsing sessions extracted from usage logs of DBPedia and Semantic Web Dog Food. All these events are formalized semantically using respective domain ontologies and RDF representations of the Web resources being accessed. We show the effectiveness of our approach through experimental results, providing in this way an exploratory analysis of the way users browse the Web of Data.
`16:40-17:00`	Collaboratively Patching Linked Data Magnus Knuth, Johannes Hercher, and Harald Sack Abstract ... Today’s Web of Data is noisy. Linked Data often needs extensive preprocessing to enable efficient use of heterogeneous resources. While consistent and valid data provides the key to efficient data processing and aggregation we are facing two main challenges: (1st) Identification of erroneous facts and tracking their origins in dynamically connected datasets is a difficult task, and (2nd) efforts in the curation of deficient facts in Linked Data are exchanged rather rarely. Since erroneous data is often duplicated and (re-)distributed by mashup applications it is not only the responsibility of a few original publishers to keep their data tidy, but progresses to become a mission for all distributers and consumers of Linked Data, too. We present a new approach to expose and to reuse patches on erroneous data to enhance and to add quality information to the Web of Data. The feasibility of our approach is demonstrated in the example of a collaborative game that patches statements in DBpedia data and provides notifications for relevant changes.	PDF
`17:00-17:20`	Leveraging Usage Data for Linked Data Movie Entity Summarization Andreas Thalhammer, Ioan Toma, Antonio Roa-Valverde, and Dieter Fensel Abstract ... Novel research in the field of Linked Data focuses on the problem of entity summarization. This field addresses the problem of ranking features according to their importance for the task of identifying a particular entity. Next to a more human friendly presentation, these summarizations can play a central role for semantic search engines and semantic recommender systems. In current approaches, it has been tried to apply entity summarization based on patterns that are inherent to the regarded data. The proposed approach of this paper focuses on the movie domain. It utilizes usage data in order to support measuring the similarity between movie entities. Using this similarity it is possible to determine the k-nearest neighbors of an entity. This leads to the idea that features that entities share with their nearest neighbors can be considered as significant or important for these entities. Additionally, we introduce a downgrading factor (similar to TF-IDF) in order to overcome the high number of commonly occurring features. We exemplify the approach based on a movie-ratings dataset that has been linked to Freebase entities.
`17:20-17:30`	Closing

Workshop Overview and Goals

The purpose of this workshop is to investigate new developments concerning the synergy between semantics and semantic-web technology on the one hand, and the analysis and mining of usage data on the other hand. As the first USEWOD workshop at WWW 2011 has shown, these two fields complement each other well. First, semantics can be used to enhance the analysis of usage data. Second, usage data analysis can enhance semantic resources as well as Semantic Web applications. Traces of users can be used to evaluate, adapt or personalise Semantic Web applications, and logs can form valuable resources from which semantic knowledge can be extracted bottom-up.

The emerging Web of Data demands a re-evaluation of existing evaluation techniques: the Linked Data community is recognising that it needs to move beyond triple counts. Usage analysis is a key method for the evaluation of datasets and applications. New ways of accessing information enabled by the Web of Data require the development or adaptation of algorithms, methods, and techniques to analyse and interpret the usage of Web data instead of Web pages, a research endeavour that can profit from what has been learned in more than a decade of Web usage mining. The results can provide fine-grained insights into how semantic datasets and applications are being accessed and used by both humans and machines - insights that are needed for optimising the design and ultimately ensuring the success of semantic resources.

The primary goals of this workshop are to foster the emerging community of researchers from various fields sharing an interest in usage mining and semantics, to evaluate the developments of the past year, and to further develop a roadmap for future research in this direction.

Topics of Interest (not limited to)

We welcome work that shows how the research areas combine: how semantic resources and techniques can be used to strengthen usage data analysis and, vice versa, how usage data can enhance semantic tools and applications. Within these boundaries, we keep the scope broad. We welcome contributions using any form of semantic information, from formal ontologies to linked data and folksonomies. All records of user actions are considered usage logs; we do not limit ourselves to any format or method of collection of usage information. This ranges from traditional content-consumption logs to various forms of content-production logs, i.e. navigation, application-related transactions, queries, tagging, editing, and similar activities. We welcome both papers using the USEWOD data set (competing in the challenge) and papers on other relevant topics. Topics of interest include, but are not limited to:

Analysis and mining of usage logs of semantic resources and applications.
Inferring semantic information from usage logs.
Methods and tools for semantic analysis of usage logs.
Representing and enriching usage logs with semantic information.
Usage-based evaluation methods and frameworks; gold standards for evaluation of web applications.
Specifics and semantics of logs for content-consumption and content-creation.
Using semantics for recommendation, personalisation and adaptation.
Usage-based recommendation, personalisation and adaptation of semantic web applications.
Exploiting usage logs for semantic search.
Data sharing, privacy, and privacy-protecting policies and techniques.

Contributions

We invite regular paper submissions, as well as challenge papers (for more information about the challenge see here). Papers must consist of original, unpublished research and must not be under review by another conference, journal, or workshop. Authors of accepted submissions will be invited to present their work at the workshop, and at least one author of each paper must register for the workshop.

Format requirements for the submission of papers are:

Regular Papers: max. 8 pages
(however, we also welcome shorter papers)

All accepted papers will be included in the online workshop proceedings. All papers must be prepared in ACM format.

Metadata about all papers, including title, abstract, authors and author affiliations, will also be made available publicly at http://data.semanticweb.org.

To submit a paper, please log on to the USEWOD2012 page on EasyChair.

A copy of the call for papers is available here.

USEWOD 2012 Data Challenge

In addition to regular papers, we will release a large dataset (several GB) of usage data (server log files) from several major Linked Open Data sources, including DBpedia (dbpedia.org), SWDF (data.semanticweb.org), Bio2rdf and LinkedGeoData. Participants are invited to present interesting analyses, applications, alignments, etc. for these datasets, and to submit their findings as a Data Challenge paper. The best Data Challenge paper will get a prize. For more information, check out the Data Challenge Page.

Important Dates

February 22nd, 2012	Extended submission deadline
February 15th, 2012	Original submission deadline (superseded by the extension)
March 17th, 2012	Acceptance notification
March 3rd, 2012	Original acceptance notification date (superseded)
April 17th, 2012	Workshop

Organisation

Organising Committee (alphabetically)

Bettina Berendt, K.U. Leuven, Belgium
Laura Hollink, TU Delft, Netherlands
Vera Hollink, CWI Amsterdam, Netherlands
Markus Luczak-Rösch, FU Berlin, Networked Information Systems, Germany
Knud Möller, Kasabi, United Kingdom
David Vallet, Universidad Autónoma de Madrid, Spain

Programme Committee (alphabetically)

Chris Bizer, Freie Universität Berlin, DE
Pablo Castells, Universidad Autónoma de Madrid, ES
Marko Grobelnik, Jozef Stefan Institute, SI
Paul Groth, VU Amsterdam, NL
Christoph Guéret, VU Amsterdam, NL
Geert-Jan Houben, Delft University of Technology, NL
Eero Hyvönen, University of Helsinki, FI
Hideo Joho, University of Tsukuba, JP
Jaap Kamps, University of Amsterdam, NL
Yiannis Kompatsiaris, Informatics and Telematics Institute, GR
Ruben Lara, Telefónica I+D, ES
Johan Oomen, Netherlands Institute for Sound and Vision, NL
Jacco van Ossenbruggen, Centre for Mathematics and Computer Science (CWI), NL
Maarten de Rijke, Universiteit van Amsterdam, NL
Marta Sabou, MODUL University Vienna, AT
Guus Schreiber, VU Amsterdam, NL
Fabrizio Silvestri, Istituto di Scienze e Tecnologie dell'Informazione, IT
Sarabjot Singh Anand, University of Warwick, UK
Markus Strohmaier, Graz University of Technology, AT
Theodora Tsikrika, Centre for Mathematics and Computer Science (CWI), NL
Arjen de Vries, Delft University of Technology, NL
