Overview
NewsREEL 2015 continued in the spirit of its predecessor, challenging participants to find the best news recommendation algorithms. Unlike in other campaigns, participants not only received a comprehensive data set but could also access an actual news recommender system. We thereby bridged the gap between academic and industrial evaluation paradigms. A renewed version of the plista data set provided similar data structures yet conveyed a more recent and more comprehensive snapshot of interactions on various news portals. Additionally, we provided a framework capable of replaying recorded interactions, so participants could compare algorithms in a fashion similar to the actual use case. An overview of the tasks is given below; a more detailed description of NewsREEL 2015 can be found in the proceedings of CLEF 2015. The performance results were presented during a half-day workshop in Toulouse. An overview of the submissions of participating teams is provided in the working notes of CLEF 2015.
Task description
- Task 1: Benchmarking News Recommendation in a Living Lab. Participants gained access to an operating news recommendation service via the Open Recommendation Platform (ORP). Having deployed a recommender system, participants received recommendation requests from various publishers. These news portals subsequently displayed the recommended news articles on their websites. ORP keeps track of users' reactions in terms of clicks. In addition, participants received a data stream reporting newly added or altered news articles and interactions between visitors and publishers. We challenged participants to achieve the highest click-through rate (CTR), that is, the number of clicks on recommendations divided by the number of requests. Participants had to keep functional aspects such as response time in mind, because failed requests reduce the CTR.
- Task 2: Benchmarking News Recommendation in a Simulated Environment. Participants received a comprehensive data set derived from log files recorded by ORP. The logs span July to August 2014 and include several large-scale news providers. Participants could replay the logs by means of Idomaar, thereby simulating an environment close to the actual use case. Additionally, they were able to compare different recommendation algorithms on identical data, which improved reproducibility and comparability. We measured how well their methods predicted which news articles a visitor would read in the future. Further, we analysed how many resources their methods consumed, which yielded insights into time and space complexity. This allowed us to estimate how well the methods could cope with response time restrictions and load peaks.
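The CTR used in Task 1 can be illustrated with a minimal Python sketch. It assumes that failed requests are counted in the denominator but can never produce a click, which is one plausible reading of "failed requests reduce the CTR"; the function name and parameters are hypothetical and not part of ORP.

```python
def click_through_rate(clicks: int, answered: int, failed: int = 0) -> float:
    """Return clicks divided by all requests received.

    Assumption: `failed` requests (for example, responses that arrived
    too late) count as requests but cannot yield clicks, so they only
    enlarge the denominator.
    """
    total_requests = answered + failed
    if total_requests == 0:
        return 0.0  # no requests received, so no rate to report
    return clicks / total_requests


# Same number of clicks, with and without failed requests (made-up counts).
ctr_without_failures = click_through_rate(clicks=10, answered=1000)
ctr_with_failures = click_through_rate(clicks=10, answered=1000, failed=250)
print(f"{ctr_without_failures:.2%} vs {ctr_with_failures:.2%}")
```

Under this assumption, a recommender that misses the response time limit on some requests ends up with a lower CTR than one that answers every request, even if both attract the same number of clicks.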
Results
Forty-two teams registered to participate in NewsREEL 2015. A vast majority of 38 teams signed up for both tasks, leaving four teams focusing on individual tasks. As the competition reached its decisive stage, nine teams competed for the best news recommendations in Task 1. The following figures illustrate individual performances by evaluation time slot.
We observed varying strategies. Some participants explored many recommendation algorithms, thus spreading their requests across multiple recommenders. Conversely, other participants concentrated their requests on individual recommenders. The CTR scattered around 1 per cent.
Organisers
- Frank Hopfgartner, University of Glasgow
- Torben Brodt, plista GmbH
- Andreas Lommatzsch, TU Berlin
- Benjamin Kille, TU Berlin
- Roberto Turrin, Contentwise
- Jonas Seiler, plista GmbH
- Balazs Hidasi, Gravity R&D
- Martha Larson, TU Delft
Steering Committee
- Paolo Cremonesi, Politecnico di Milano
- Hideo Joho, University of Tsukuba
- Joemon M. Jose, University of Glasgow
- Udo Kruschwitz, University of Essex
- Jimmy Lin, University of Maryland
- Vivien Petras, Humboldt University
- Domonkos Tikk, Gravity R&D and Óbuda University