DS@GT ARC

FAQ

1. What is CLEF?

CLEF (Conference and Labs of the Evaluation Forum) is an annual independent peer-reviewed conference that focuses on advancing research in information access systems, particularly in multilingual and multimodal contexts.

CLEF aims to maintain and expand upon its tradition of community-based evaluation and discussion on evaluation issues in information retrieval. It provides an infrastructure for testing, tuning, and evaluating information access systems, creating reusable test collections, and exploring new evaluation methodologies.

Overall, CLEF strives to advance the state-of-the-art in information access technologies through its combination of academic conferences and practical evaluation labs.

2. What lab/task should I pursue?

Each lab and task addresses a problem in a particular domain. Are you more interested in natural language processing? Computer vision? Biodiversity conservation? Medical applications? Multi-modality? You may find a summary of each lab in the last question of this FAQ. Review the task overview papers on the CLEF working notes page: CEUR-WS.org/Vol-3740 - Working Notes of CLEF 2024.

3. How many tasks can I participate in?

There is no limit to the number of tasks you can participate in. However, we strongly recommend that first-time members take on no more than one task.

4. What’s the difference between a team lead and a team member?

A team lead is the main person responsible for delivering the task, including:

A team member is responsible for the following:

Team leads and members are expected to commit time, have strong programming skills, and be genuinely curious about their lab and task.

5. How do I become a team lead for a task?

Apply in the form DS@GT CLEF 2026 Competition Signup (Form TBD) and specify your desired role (team lead or team member). If you apply to be a team lead, reach out to Anthony Miyaguchi (acmiyaguchi at gatech.edu) or Murilo Gustineli (murilogustineli at gatech.edu) with an overview of your proposed solution for the particular task.

6. What is the time commitment required to participate?

The time commitment varies depending on your role and the effort you want to put in. However, to make a meaningful contribution, you should expect to dedicate around 100–150 hours over the course of the project, roughly the equivalent of a 2–3 unit course requiring consistent effort. Team leads should expect to spend additional time managing their task and coordinating with team members. Ultimately, your level of involvement is up to you, but you get out what you put in: consistent effort is key to gaining valuable experience and making an impact.

7. Can two teams participate in the same task?

No. Each task is handled by a single team. A person can be part of one or more teams, but each team works on only one task.

8. Why can’t I edit the meeting documents?

Editing access is granted after you join the CLEF 2026 Google Group (TBD).

9. Is this opportunity only available for current students, or can alumni participate as well?

This opportunity is not limited to current students; GT alumni are also welcome to join our group! However, participants must be members of the Data Science @ Georgia Tech (DS@GT) club and have paid their membership dues. To join:

10. How can I earn academic credit for participating in CLEF?

If you are an OMSCS student, there are two primary ways to earn academic credit through CLEF participation:

  1. CS 8903 – Special Problems
  2. CS 8803 O24 – Intro to Research

CS 8903 – Special Problems

This is a supervised research course that requires special permission to enroll. To take this course, you need to:

CS 8803 O24 – Intro to Research

This course offers a general introduction to research methods and computer science research. Unlike CS 8903, you can register for this course as part of your regular course selection in the OMSCS program.

11. What are the labs available at CLEF 2025?

Below is a short overview of the CLEF labs. You may find more information on each lab by reviewing its respective overview paper on the CLEF working notes page: CEUR-WS.org/Vol-3740 - Working Notes of CLEF 2024.

BioASQ

BioASQ focuses on biomedical semantic indexing and question answering, aiming to advance systems that utilize online biomedical information to address the needs of scientists in the field.

CheckThat!

In its seventh edition, CheckThat! offers challenges related to journalistic verification processes, including assessing check-worthiness, understanding influence strategies, and identifying stances on questionable affairs.

ELOQUENT

The ELOQUENT lab focuses on evaluating generative language models through innovative methods.

eRisk

eRisk explores early risk detection on the Internet, focusing on evaluation methodologies, effectiveness metrics, and practical applications related to health and safety.

EXIST

EXIST (sEXism Identification in Social Networks) aims to detect and analyze sexist content on social media platforms.

ImageCLEF

ImageCLEF evaluates technologies for annotation, indexing, classification, and retrieval of multimodal data, focusing on large collections of multimodal data across various domains.

JOKER Lab

The JOKER Lab aims to advance research on automated processing of verbal humor, including tasks such as retrieval, classification, interpretation, generation, and translation of humorous content.

LifeCLEF

LifeCLEF is one of the oldest CLEF labs, focusing on multimedia and machine learning for biodiversity monitoring.

LongEval

LongEval explores the temporal persistence of Information Retrieval systems and Text Classifiers, evaluating system performance degradation over time using evolving data.

PAN

PAN is a series of scientific events and shared tasks on digital text forensics and stylometry, aiming to advance the state of the art in these areas.

QuantumCLEF

QuantumCLEF provides an evaluation infrastructure for developing Quantum Computing algorithms, particularly Quantum Annealing algorithms, for Information Retrieval and Recommender Systems.

SimpleText Lab

SimpleText Lab addresses challenges associated with making scientific information accessible to a wide audience, providing data and benchmarks for scientific text summarization and simplification.

TalentCLEF

TalentCLEF focuses on technological advancement in Human Capital Management by establishing a public benchmark for NLP models that facilitates their application in real-world Human Resources (HR) scenarios.

Touché

Touché focuses on developing technologies that support people in decision-making and opinion-forming, aiming to improve our understanding of these processes.

These labs collectively contribute to the CLEF tradition of community-based evaluation and discussion on various aspects of information access systems.