SimpleText@CLEF-2021 Pilot tasks



SimpleText Pilot Task Guidelines

We invite you to submit both automatic and manual runs! Any manual intervention should be reported.



Access

Please register for the SimpleText@CLEF workshop to access the data: http://clef2021-labs-registration.dei.unipd.it/
After registration, you will receive an email with information on how to log in to the data server: https://guacamole.univ-avignon.fr

Result submission:

Participants should place their run results in the Documents folder created for their user account.

2021 DataSet

For this edition we use the Citation Network Dataset: DBLP+Citation, ACM Citation network (https://www.aminer.org/citation). An Elasticsearch index, accessible through a GUI API, is provided to participants. This index can be used to:

  • apply basic passage retrieval methods based on vector or language IR models,
  • generate Latent Dirichlet Allocation models,
  • train Graph Neural Networks for citation recommendation, as carried out in https://stellargraph.readthedocs.io/ for example,
  • apply deep bidirectional transformers for query expansion,
  • and much more…
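
As an illustration of the first item, a basic retrieval request against an Elasticsearch index can be written as a `match` query in the Elasticsearch Query DSL. The sketch below only builds the request body and does not contact any server. The helper name `build_match_query`, the field name `abstract`, and the result size are assumptions made for illustration; the schema of the provided index may differ.

```python
import json


def build_match_query(text, field="abstract", size=10):
    """Build an Elasticsearch Query DSL body for a full-text `match` query.

    `field` is a hypothetical field name; replace it with the field
    that holds the abstract text in the provided index.
    """
    return {
        "size": size,  # number of hits to return
        "query": {"match": {field: {"query": text}}},
    }


if __name__ == "__main__":
    # Hypothetical query text, for illustration only.
    body = build_match_query("digital assistants privacy")
    print(json.dumps(body, indent=2))
```

The resulting dictionary can be sent as the JSON body of a search request once the index name and endpoint are known.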

2021 Queries

For this edition, queries are a selection of recent press titles from The Guardian, enriched with keywords manually extracted from the content of each article. Each keyword has been checked to ensure that it retrieves at least 5 relevant abstracts. The use of these keywords is optional.
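
Because the keywords are optional, a run can be produced with or without them. The minimal sketch below shows one possible way to assemble the query text for a topic. The function name `build_query_text` and the example title and keywords are hypothetical and do not come from the task data.

```python
def build_query_text(title, keywords=None, use_keywords=True):
    """Return the query text for a topic.

    The press title is always used; the manually extracted keywords
    are appended only when `use_keywords` is True.
    """
    parts = [title.strip()]
    if use_keywords and keywords:
        # Skip empty keywords and keep the original order.
        parts.extend(k.strip() for k in keywords if k.strip())
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical topic, for illustration only.
    title = "An example press title"
    keywords = ["keyword one", "keyword two"]
    print(build_query_text(title, keywords))
    print(build_query_text(title, keywords, use_keywords=False))
```

Keeping the keyword switch as a parameter makes it easy to submit two comparable runs, one using the title alone and one using the title plus keywords.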

Input format for all tasks:

  • Topics are in the MD format