Overview & Goals

The Clinical Problem ❓

Early-stage cancers are difficult to detect in routine clinical practice. They often present with subtle or non-specific signs and are easily missed among the many normal or benign findings seen every day. Catching these cases early is crucial, because timely diagnosis can significantly improve patient outcomes (Etzioni et al., 2003). However, these cancers are rare, and their subtle features make them hard to distinguish, even for experienced clinicians. This rarity also poses a major challenge for developing automated tools to support diagnosis: most available data come from normal or non-cancerous cases, which makes it difficult to train models that reliably identify the few critical anomalies without being overwhelmed by common findings.

The Need for a Benchmark 📊

Given how hard rare and subtle early-stage cancers are to detect, clinicians need well-designed tools to support them. Computer-aided detection (CADe) systems hold promise, but developing and evaluating them in low-prevalence settings remains a major hurdle. In real-world clinical environments, early cancers are vastly outnumbered by normal or benign findings. This imbalance can skew model performance and lead to misleading results if it is not properly addressed during development (Godau et al., 2025). Systems trained on artificially balanced datasets may appear accurate in testing but often fail in clinical practice, where the true distribution is heavily skewed.

Without rigorous benchmarking, CADe systems risk two major pitfalls: being too sensitive and generating a flood of false positives, or being too conservative and missing early cancers. Striking the right balance between sensitivity and specificity is essential. A carefully constructed benchmark is therefore crucial: it provides a realistic, standardized way to evaluate performance under conditions that mirror clinical reality. This challenge aims to fill that gap.
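
The effect of prevalence on apparent accuracy can be illustrated with a small calculation. The sketch below is not the challenge metric, which is not defined in this section. It only applies Bayes' rule to compute the positive predictive value (PPV) of a detector at a fixed operating point. The function name and the 90% sensitivity and specificity values are assumptions chosen for illustration.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """Fraction of positive calls that are true positives (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0.0:
        return 0.0  # the detector never fires, so PPV is undefined; report 0
    return true_pos / (true_pos + false_pos)


# Hypothetical operating point, chosen only for illustration.
sens, spec = 0.90, 0.90

# Balanced test set, moderately skewed set, and a setting with 1% positives.
for prevalence in (0.50, 0.10, 0.01):
    ppv = positive_predictive_value(sens, spec, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}")
```

With these assumed values, the same detector goes from about 9 in 10 positive calls being correct on a balanced set to fewer than 1 in 10 at 1% prevalence, which is why evaluation on artificially balanced data can be misleading.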

The RARE Challenge 🎯

The RARE challenge is part of EndoVis and is dedicated to advancing automated detection of rare, clinically important findings in low-prevalence settings. It provides a benchmark designed to reflect real-world clinical distributions, where positive cases are scarce and traditional evaluation metrics can be insufficient to assess practical utility.

The clinical use case of the challenge remains focused on early-stage cancer detection in patients with Barrett’s Esophagus (BE). BE is a condition where the lining of the esophagus changes, increasing the risk of developing cancer. During routine endoscopic surveillance, early neoplastic changes can be extremely subtle and are often missed. Yet early detection is critical: when identified in time, patients can often be treated with minimally invasive endoscopic procedures, achieving long-term success rates above 90% (Pech et al., 2014). In contrast, if progression to advanced cancer occurs, five-year survival rates drop dramatically to around 15% (American Cancer Society, 2025). Despite these high stakes, the prevalence of early neoplasia in BE surveillance remains exceptionally low—typically below 1% (Hvid-Jensen et al., 2011)—making this a particularly challenging and clinically relevant setting.

In last year’s RARE25 Challenge, more than 10 teams submitted to the final test phase. Overall performance was strong, with AUC values exceeding 0.91. However, the results again showed that algorithms with a high AUC could still perform poorly on the challenge metric, which explicitly incorporates prevalence. Substantial room for improvement therefore remains in developing clinically viable systems.

Building on these insights, RARE26 introduces a substantial expansion of the available data through the release of a large-scale unlabeled dataset. This creates new opportunities for self-supervised, unsupervised, and semi-supervised approaches, enabling participants to leverage representation learning and data-efficient methods to better tackle rare-case detection in realistic clinical scenarios.

Prizes 🏆

All participating teams will be invited to contribute to, and be listed as authors on, the forthcoming journal publication summarizing the outcomes of the challenge.

Additionally, each of the top 5 teams will receive a prize in the form of a gift card:

🥇 1st place: €1000
🥈 2nd place: €500
🥉 3rd place: €250
🏅 4th place: €150
🏅 5th place: €100