Available Benchmarks

`BEIR`

Zero-shot retrieval quality across a heterogeneous set of IR tasks and domains, providing a common framework for comparing NLP-based retrieval models.

Tasks

name	type	modalities	languages
TRECCOVID	Retrieval	text	eng
NFCorpus	Retrieval	text	eng
NQ	Retrieval	text	eng
HotpotQA	Retrieval	text	eng
FiQA2018	Retrieval	text	eng
ArguAna	Retrieval	text	eng
Touche2020	Retrieval	text	eng
CQADupstackRetrieval	Retrieval	text	eng
QuoraRetrieval	Retrieval	text	eng
DBPedia	Retrieval	text	eng
SCIDOCS	Retrieval	text	eng
FEVER	Retrieval	text	eng
ClimateFEVER	Retrieval	text	eng
SciFact	Retrieval	text	eng
MSMARCO	Retrieval	text	eng

Citation

@inproceedings{thakur2021beir,
  author = {Nandan Thakur and Nils Reimers and Andreas R{\"u}ckl{\'e} and Abhishek Srivastava and Iryna Gurevych},
  booktitle = {Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2)},
  title = {{BEIR}: A Heterogeneous Benchmark for Zero-shot Evaluation of Information Retrieval Models},
  url = {https://openreview.net/forum?id=wCu6T5xFjeJ},
  year = {2021},
}

`BEIR-NL`

Zero-shot retrieval quality in Dutch across the BEIR task suite, created through automated translation of the original English benchmark.

Tasks

name	type	modalities	languages
ArguAna-NL	Retrieval	text	nld
CQADupstack-NL	Retrieval	text	nld
FEVER-NL	Retrieval	text	nld
NQ-NL	Retrieval	text	nld
Touche2020-NL	Retrieval	text	nld
FiQA2018-NL	Retrieval	text	nld
Quora-NL	Retrieval	text	nld
HotpotQA-NL	Retrieval	text	nld
SCIDOCS-NL	Retrieval	text	nld
ClimateFEVER-NL	Retrieval	text	nld
mMARCO-NL	Retrieval	text	nld
SciFact-NL	Retrieval	text	nld
DBPedia-NL	Retrieval	text	nld
NFCorpus-NL	Retrieval	text	nld
TRECCOVID-NL	Retrieval	text	nld

Citation

@misc{banar2024beirnlzeroshotinformationretrieval,
  archiveprefix = {arXiv},
  author = {Nikolay Banar and Ehsan Lotfi and Walter Daelemans},
  eprint = {2412.08329},
  primaryclass = {cs.CL},
  title = {BEIR-NL: Zero-shot Information Retrieval Benchmark for the Dutch Language},
  url = {https://arxiv.org/abs/2412.08329},
  year = {2024},
}

`BRIGHT`

Reasoning-intensive retrieval quality across real-world queries spanning diverse domains including economics, psychology, mathematics, and coding, drawn from naturally occurring and carefully curated human data.

Tasks

name	type	modalities	languages
BrightRetrieval	Retrieval	text	eng

Citation

@article{su2024bright,
  author = {Su, Hongjin and Yen, Howard and Xia, Mengzhou and Shi, Weijia and Muennighoff, Niklas and Wang, Han-yu and Liu, Haisu and Shi, Quan and Siegel, Zachary S and Tang, Michael and others},
  journal = {arXiv preprint arXiv:2407.12883},
  title = {Bright: A realistic and challenging benchmark for reasoning-intensive retrieval},
  year = {2024},
}

`BRIGHT (long)`

Reasoning-intensive retrieval quality across real-world queries spanning diverse domains, filtered to longer documents to stress-test models on extended contexts.

Tasks

name	type	modalities	languages
BrightLongRetrieval	Retrieval	text	eng

Citation

@article{su2024bright,
  author = {Su, Hongjin and Yen, Howard and Xia, Mengzhou and Shi, Weijia and Muennighoff, Niklas and Wang, Han-yu and Liu, Haisu and Shi, Quan and Siegel, Zachary S and Tang, Michael and others},
  journal = {arXiv preprint arXiv:2407.12883},
  title = {Bright: A realistic and challenging benchmark for reasoning-intensive retrieval},
  year = {2024},
}

`BRIGHT(v1.1)`

Reasoning-intensive retrieval quality across real-world queries spanning diverse domains including economics, psychology, mathematics, and coding. v1.1 restructures tasks into separate datasets and adds per-task prompts.

Tasks

name	type	modalities	languages
BrightBiologyRetrieval	Retrieval	text	eng
BrightEarthScienceRetrieval	Retrieval	text	eng
BrightEconomicsRetrieval	Retrieval	text	eng
BrightPsychologyRetrieval	Retrieval	text	eng
BrightRoboticsRetrieval	Retrieval	text	eng
BrightStackoverflowRetrieval	Retrieval	text	eng
BrightSustainableLivingRetrieval	Retrieval	text	eng
BrightPonyRetrieval	Retrieval	text	eng
BrightLeetcodeRetrieval	Retrieval	text	eng
BrightAopsRetrieval	Retrieval	text	eng
BrightTheoremQATheoremsRetrieval	Retrieval	text	eng
BrightTheoremQAQuestionsRetrieval	Retrieval	text	eng
BrightBiologyLongRetrieval	Retrieval	text	eng
BrightEarthScienceLongRetrieval	Retrieval	text	eng
BrightEconomicsLongRetrieval	Retrieval	text	eng
BrightPsychologyLongRetrieval	Retrieval	text	eng
BrightRoboticsLongRetrieval	Retrieval	text	eng
BrightStackoverflowLongRetrieval	Retrieval	text	eng
BrightSustainableLivingLongRetrieval	Retrieval	text	eng
BrightPonyLongRetrieval	Retrieval	text	eng
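
The split into per-domain datasets can be cross-checked against the table above. The following is a minimal sketch, assuming only the naming pattern visible in the table (a `...Retrieval` task may have a `...LongRetrieval` counterpart); the helper `long_name` is hypothetical and not part of any library.

```python
# Task names copied from the BRIGHT(v1.1) table above.
STANDARD_TASKS = [
    "BrightBiologyRetrieval",
    "BrightEarthScienceRetrieval",
    "BrightEconomicsRetrieval",
    "BrightPsychologyRetrieval",
    "BrightRoboticsRetrieval",
    "BrightStackoverflowRetrieval",
    "BrightSustainableLivingRetrieval",
    "BrightPonyRetrieval",
    "BrightLeetcodeRetrieval",
    "BrightAopsRetrieval",
    "BrightTheoremQATheoremsRetrieval",
    "BrightTheoremQAQuestionsRetrieval",
]
LONG_TASKS = {
    "BrightBiologyLongRetrieval",
    "BrightEarthScienceLongRetrieval",
    "BrightEconomicsLongRetrieval",
    "BrightPsychologyLongRetrieval",
    "BrightRoboticsLongRetrieval",
    "BrightStackoverflowLongRetrieval",
    "BrightSustainableLivingLongRetrieval",
    "BrightPonyLongRetrieval",
}

SUFFIX = "Retrieval"


def long_name(task_name: str) -> str:
    """Hypothetical helper: derive the long-document name from a standard task name."""
    if not task_name.endswith(SUFFIX):
        raise ValueError(f"unexpected task name: {task_name}")
    return task_name[: -len(SUFFIX)] + "Long" + SUFFIX


# Partition the standard tasks by whether a long-document variant is listed.
with_long = [t for t in STANDARD_TASKS if long_name(t) in LONG_TASKS]
without_long = [t for t in STANDARD_TASKS if long_name(t) not in LONG_TASKS]

print(f"{len(with_long)} tasks have a long variant, {len(without_long)} do not")
for name in without_long:
    print("no long variant listed:", name)
```

Under that naming assumption, the four tasks without a listed long variant are the Leetcode, Aops, and two TheoremQA datasets.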

Citation

@article{su2024bright,
  author = {Su, Hongjin and Yen, Howard and Xia, Mengzhou and Shi, Weijia and Muennighoff, Niklas and Wang, Han-yu and Liu, Haisu and Shi, Quan and Siegel, Zachary S and Tang, Michael and others},
  journal = {arXiv preprint arXiv:2407.12883},
  title = {Bright: A realistic and challenging benchmark for reasoning-intensive retrieval},
  year = {2024},
}

`BuiltBench(eng)`

Text embedding quality in the built environment domain across clustering, retrieval, and reranking, spanning architecture, engineering, construction, and operations management.

Tasks

name	type	modalities	languages
BuiltBenchClusteringP2P	Clustering	text	eng
BuiltBenchClusteringS2S	Clustering	text	eng
BuiltBenchRetrieval	Retrieval	text	eng
BuiltBenchReranking	Reranking	text	eng
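
The task tables on this page share one four-column, tab-separated layout, so they are easy to process programmatically. Here is a minimal sketch using only the Python standard library that parses the BuiltBench(eng) rows and counts tasks per type; the function name `parse_task_table` is an illustrative assumption, not an existing API.

```python
import csv
import io
from collections import Counter

# Header and rows copied from the BuiltBench(eng) table above (tab-separated).
ROWS = [
    "name\ttype\tmodalities\tlanguages",
    "BuiltBenchClusteringP2P\tClustering\ttext\teng",
    "BuiltBenchClusteringS2S\tClustering\ttext\teng",
    "BuiltBenchRetrieval\tRetrieval\ttext\teng",
    "BuiltBenchReranking\tReranking\ttext\teng",
]


def parse_task_table(lines):
    """Hypothetical helper: turn tab-separated lines into one dict per task."""
    reader = csv.DictReader(io.StringIO("\n".join(lines)), delimiter="\t")
    return list(reader)


tasks = parse_task_table(ROWS)

# Count how many tasks of each type the benchmark contains.
type_counts = Counter(task["type"] for task in tasks)
print(dict(type_counts))  # {'Clustering': 2, 'Retrieval': 1, 'Reranking': 1}
```

The same helper should work on any other table in this listing, since they all use the `name`, `type`, `modalities`, `languages` header.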

Citation

@article{shahinmoghadam2024benchmarking,
  author = {Shahinmoghadam, Mehrzad and Motamedi, Ali},
  journal = {arXiv preprint arXiv:2411.12056},
  title = {Benchmarking pre-trained text embedding models in aligning built asset information},
  year = {2024},
}

`ChemTEB`

Chemical domain text embedding quality across bitext mining, classification, clustering, pair classification, and retrieval.

Tasks

name	type	modalities	languages
PubChemSMILESBitextMining	BitextMining	text	eng
SDSEyeProtectionClassification	Classification	text	eng
SDSGlovesClassification	Classification	text	eng
WikipediaBioMetChemClassification	Classification	text	eng
WikipediaGreenhouseEnantiopureClassification	Classification	text	eng
WikipediaSolidStateColloidalClassification	Classification	text	eng
WikipediaOrganicInorganicClassification	Classification	text	eng
WikipediaCryobiologySeparationClassification	Classification	text	eng
WikipediaChemistryTopicsClassification	Classification	text	eng
WikipediaTheoreticalAppliedClassification	Classification	text	eng
WikipediaChemFieldsClassification	Classification	text	eng
WikipediaLuminescenceClassification	Classification	text	eng
WikipediaIsotopesFissionClassification	Classification	text	eng
WikipediaSaltsSemiconductorsClassification	Classification	text	eng
WikipediaBiolumNeurochemClassification	Classification	text	eng
WikipediaCrystallographyAnalyticalClassification	Classification	text	eng
WikipediaCompChemSpectroscopyClassification	Classification	text	eng
WikipediaChemEngSpecialtiesClassification	Classification	text	eng
WikipediaChemistryTopicsClustering	Clustering	text	eng
WikipediaSpecialtiesInChemistryClustering	Clustering	text	eng
PubChemAISentenceParaphrasePC	PairClassification	text	eng
PubChemSMILESPC	PairClassification	text	eng
PubChemSynonymPC	PairClassification	text	eng
PubChemWikiParagraphsPC	PairClassification	text	eng
PubChemWikiPairClassification	PairClassification	text	ces, deu, eng, fra, hin, ... (13)
ChemNQRetrieval	Retrieval	text	eng
ChemHotpotQARetrieval	Retrieval	text	eng

Citation

@article{kasmaee2024chemteb,
  author = {Kasmaee, Ali Shiraee and Khodadad, Mohammad and Saloot, Mohammad Arshi and Sherck, Nick and Dokas, Stephen and Mahyar, Hamidreza and Samiee, Soheila},
  journal = {arXiv preprint arXiv:2412.00532},
  title = {ChemTEB: Chemical Text Embedding Benchmark, an Overview of Embedding Models Performance \& Efficiency on a Specific Domain},
  year = {2024},
}

`ChemTEB(v1.1)`

Chemical domain text embedding quality across bitext mining, classification, clustering, pair classification, and retrieval. v1.1 adds the ChemRxivRetrieval task.

Tasks

name	type	modalities	languages
PubChemSMILESBitextMining	BitextMining	text	eng
SDSEyeProtectionClassification	Classification	text	eng
SDSGlovesClassification	Classification	text	eng
WikipediaBioMetChemClassification	Classification	text	eng
WikipediaGreenhouseEnantiopureClassification	Classification	text	eng
WikipediaSolidStateColloidalClassification	Classification	text	eng
WikipediaOrganicInorganicClassification	Classification	text	eng
WikipediaCryobiologySeparationClassification	Classification	text	eng
WikipediaChemistryTopicsClassification	Classification	text	eng
WikipediaTheoreticalAppliedClassification	Classification	text	eng
WikipediaChemFieldsClassification	Classification	text	eng
WikipediaLuminescenceClassification	Classification	text	eng
WikipediaIsotopesFissionClassification	Classification	text	eng
WikipediaSaltsSemiconductorsClassification	Classification	text	eng
WikipediaBiolumNeurochemClassification	Classification	text	eng
WikipediaCrystallographyAnalyticalClassification	Classification	text	eng
WikipediaCompChemSpectroscopyClassification	Classification	text	eng
WikipediaChemEngSpecialtiesClassification	Classification	text	eng
WikipediaChemistryTopicsClustering	Clustering	text	eng
WikipediaSpecialtiesInChemistryClustering	Clustering	text	eng
PubChemAISentenceParaphrasePC	PairClassification	text	eng
PubChemSMILESPC	PairClassification	text	eng
PubChemSynonymPC	PairClassification	text	eng
PubChemWikiParagraphsPC	PairClassification	text	eng
PubChemWikiPairClassification	PairClassification	text	ces, deu, eng, fra, hin, ... (13)
ChemNQRetrieval	Retrieval	text	eng
ChemHotpotQARetrieval	Retrieval	text	eng
ChemRxivRetrieval	Retrieval	text	eng

Citation

@article{kasmaee2024chemteb,
  author = {Kasmaee, Ali Shiraee and Khodadad, Mohammad and Saloot, Mohammad Arshi and Sherck, Nick and Dokas, Stephen and Mahyar, Hamidreza and Samiee, Soheila},
  journal = {arXiv preprint arXiv:2412.00532},
  title = {ChemTEB: Chemical Text Embedding Benchmark, an Overview of Embedding Models Performance \& Efficiency on a Specific Domain},
  year = {2024},
}

@article{kasmaee2025chembed,
  author = {Kasmaee, Ali Shiraee and Khodadad, Mohammad and Astaraki, Mahdi and Saloot, Mohammad Arshi and Sherck, Nicholas and Mahyar, Hamidreza and Samiee, Soheila},
  journal = {arXiv preprint arXiv:2508.01643},
  title = {Chembed: Enhancing chemical literature search through domain-specific text embeddings},
  year = {2025},
}

`CoIR`

Code information retrieval across diverse programming languages and coding tasks, including code search, question answering, and text-to-SQL retrieval.

Tasks

name	type	modalities	languages
AppsRetrieval	Retrieval	text	eng, python
CodeFeedbackMT	Retrieval	text	eng
CodeFeedbackST	Retrieval	text	eng
CodeSearchNetCCRetrieval	Retrieval	text	go, java, javascript, php, python, ... (6)
CodeTransOceanContest	Retrieval	text	c++, python
CodeTransOceanDL	Retrieval	text	python
CosQA	Retrieval	text	eng, python
COIRCodeSearchNetRetrieval	Retrieval	text	go, java, javascript, php, python, ... (6)
StackOverflowQA	Retrieval	text	eng
SyntheticText2SQL	Retrieval	text	eng, sql
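
Some `languages` cells in these tables are abbreviated, for example `go, java, javascript, php, python, ... (6)`. The sketch below parses such a cell, assuming that the number in parentheses is the total number of languages for the task and that only the first few codes are displayed; that reading and the helper `parse_languages` are assumptions made for illustration.

```python
import re

# Matches an abbreviated cell such as "go, java, ... (6)".
_ABBREVIATED = re.compile(r"^(?P<shown>.*?),\s*\.\.\.\s*\((?P<total>\d+)\)$")


def parse_languages(cell: str):
    """Hypothetical helper: return (codes shown in the cell, assumed total count)."""
    cell = cell.strip()
    match = _ABBREVIATED.match(cell)
    if match:
        shown = [code.strip() for code in match.group("shown").split(",")]
        return shown, int(match.group("total"))
    # Unabbreviated cell: every language code is listed explicitly.
    shown = [code.strip() for code in cell.split(",")]
    return shown, len(shown)


shown, total = parse_languages("go, java, javascript, php, python, ... (6)")
print(shown, total)
print(f"{total - len(shown)} language code(s) not displayed")

print(parse_languages("eng, python"))
```

For the CodeSearchNet rows this yields five visible codes and a total of six, so one code is hidden by the abbreviation.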

Citation

@misc{li2024coircomprehensivebenchmarkcode,
  archiveprefix = {arXiv},
  author = {Xiangyang Li and Kuicai Dong and Yi Quan Lee and Wei Xia and Yichun Yin and Hao Zhang and Yong Liu and Yasheng Wang and Ruiming Tang},
  eprint = {2407.02883},
  primaryclass = {cs.IR},
  title = {CoIR: A Comprehensive Benchmark for Code Information Retrieval Models},
  url = {https://arxiv.org/abs/2407.02883},
  year = {2024},
}

`CoREB(v1)`

Code embedding and reranking quality across code-to-text, text-to-code, and code-to-code retrieval tasks, using counterfactually rewritten problems in five programming languages to limit training data contamination.

Tasks

name	type	modalities	languages
CorebC2TRetrieval	Retrieval	text	c++, eng, go, java, python, ... (6)
CorebC2CRetrieval	Retrieval	text	c++, eng, go, java, python, ... (6)
CorebT2CRetrieval	Retrieval	text	c++, eng, go, java, python, ... (6)
CorebC2TReranking	Reranking	text	c++, eng, go, java, python, ... (6)
CorebC2CReranking	Reranking	text	c++, eng, go, java, python, ... (6)
CorebT2CReranking	Reranking	text	c++, eng, go, java, python, ... (6)

Citation

@article{xue2026coreb,
  author = {Xue, Siqiao and Liao, Zihan and Qin, Jin and Zhang, Ziyin and Mu, Yixiang and Zhou, Fan and Yu, Hang},
  journal = {arXiv preprint arXiv:2605.04615},
  title = {Beyond Retrieval: A Multitask Benchmark and Model for Code Search},
  url = {https://arxiv.org/abs/2605.04615},
  year = {2026},
}

`CodeRAG`

Code retrieval quality for retrieval-augmented generation, covering programming solutions, online tutorials, library documentation, and Stack Overflow posts.

Tasks

name	type	modalities	languages
CodeRAGLibraryDocumentationSolutions	Reranking	text	python
CodeRAGOnlineTutorials	Reranking	text	python
CodeRAGProgrammingSolutions	Reranking	text	python
CodeRAGStackoverflowPosts	Reranking	text	python

Citation

@misc{wang2024coderagbenchretrievalaugmentcode,
  archiveprefix = {arXiv},
  author = {Zora Zhiruo Wang and Akari Asai and Xinyan Velocity Yu and Frank F. Xu and Yiqing Xie and Graham Neubig and Daniel Fried},
  eprint = {2406.14497},
  primaryclass = {cs.SE},
  title = {CodeRAG-Bench: Can Retrieval Augment Code Generation?},
  url = {https://arxiv.org/abs/2406.14497},
  year = {2024},
}

`Encodechka`

Russian text embedding quality across paraphrase identification, sentiment analysis, toxicity classification, intent classification, natural language inference, and semantic similarity.

Tasks

name	type	modalities	languages
RUParaPhraserSTS	STS	text	rus
SentiRuEval2016	Classification	text	rus
RuToxicOKMLCUPClassification	Classification	text	rus
InappropriatenessClassificationv2	Classification	text	rus
RuNLUIntentClassification	Classification	text	rus
XNLI	PairClassification	text	ara, bul, deu, ell, eng, ... (14)
RuSTSBenchmarkSTS	STS	text	rus

Citation

@misc{dale_encodechka,
  author = {Dale, David},
  editor = {habr.com},
  month = {June},
  note = {[Online; posted 12-June-2022]},
  title = {Russian rating of sentence encoders},
  url = {https://habr.com/ru/articles/669674/},
  year = {2022},
}

`FollowIR`

Instruction-following retrieval quality, measuring how well models retrieve relevant documents when given detailed natural language instructions alongside queries.

Tasks

name	type	modalities	languages
Robust04InstructionRetrieval	InstructionReranking	text	eng
News21InstructionRetrieval	InstructionReranking	text	eng
Core17InstructionRetrieval	InstructionReranking	text	eng

Citation

@misc{weller2024followir,
  archiveprefix = {arXiv},
  author = {Orion Weller and Benjamin Chang and Sean MacAvaney and Kyle Lo and Arman Cohan and Benjamin Van Durme and Dawn Lawrie and Luca Soldaini},
  eprint = {2403.15246},
  primaryclass = {cs.IR},
  title = {FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions},
  year = {2024},
}

`HUME(v1)`

Text embedding performance benchmarked against human annotator scores across classification, clustering, reranking, and semantic similarity tasks, capturing where models exceed or fall short of human-level judgment.

Tasks

name	type	modalities	languages
HUMEEmotionClassification	Classification	text	eng
HUMEToxicConversationsClassification	Classification	text	eng
HUMETweetSentimentExtractionClassification	Classification	text	eng
HUMEMultilingualSentimentClassification	Classification	text	ara, eng, nob, rus
HUMEArxivClusteringP2P	Clustering	text	eng
HUMERedditClusteringP2P	Clustering	text	eng
HUMEWikiCitiesClustering	Clustering	text	eng
HUMESIB200ClusteringS2S	Clustering	text	ara, dan, eng, fra, rus
HUMECore17InstructionReranking	Reranking	text	eng
HUMENews21InstructionReranking	Reranking	text	eng
HUMERobust04InstructionReranking	Reranking	text	eng
HUMEWikipediaRerankingMultilingual	Reranking	text	dan, eng, nob
HUMESICK-R	STS	text	eng
HUMESTS12	STS	text	eng
HUMESTSBenchmark	STS	text	eng
HUMESTS22	STS	text	ara, eng, fra, rus

`JMTEB(v2)`

Japanese text embedding quality across clustering, classification, semantic similarity, retrieval, and reranking. v2 extends the benchmark to 28 datasets for a more comprehensive evaluation than MTEB(jpn, v1).

Tasks

name	type	modalities	languages
LivedoorNewsClustering.v2	Clustering	text	jpn
MewsC16JaClustering	Clustering	text	jpn
SIB200ClusteringS2S	Clustering	text	ace, acm, acq, aeb, afr, ... (197)
AmazonReviewsClassification	Classification	text	cmn, deu, eng, fra, jpn, ... (6)
AmazonCounterfactualClassification	Classification	text	deu, eng, jpn
MassiveIntentClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
MassiveScenarioClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
JapaneseSentimentClassification	Classification	text	jpn
SIB200Classification	Classification	text	ace, acm, acq, aeb, afr, ... (197)
WRIMEClassification	Classification	text	jpn
JSTS	STS	text	jpn
JSICK	STS	text	jpn
JaqketRetrieval	Retrieval	text	jpn
MrTidyRetrieval	Retrieval	text	ara, ben, eng, fin, ind, ... (11)
JaGovFaqsRetrieval	Retrieval	text	jpn
NLPJournalTitleAbsRetrieval.V2	Retrieval	text	jpn
NLPJournalTitleIntroRetrieval.V2	Retrieval	text	jpn
NLPJournalAbsIntroRetrieval.V2	Retrieval	text	jpn
NLPJournalAbsArticleRetrieval.V2	Retrieval	text	jpn
JaCWIRRetrieval	Retrieval	text	jpn
MIRACLRetrieval	Retrieval	text	ara, ben, deu, eng, fas, ... (18)
MintakaRetrieval	Retrieval	text	ara, deu, fra, hin, ita, ... (8)
MultiLongDocRetrieval	Retrieval	text	ara, cmn, deu, eng, fra, ... (13)
ESCIReranking	Reranking	text	eng, jpn, spa
JQaRAReranking	Reranking	text	jpn
JaCWIRReranking	Reranking	text	jpn
MIRACLReranking	Reranking	text	ara, ben, deu, eng, fas, ... (18)
MultiLongDocReranking	Reranking	text	ara, deu, eng, fra, hin, ... (13)

Citation

@article{li2025jmteb,
  author = {Li, Shengzhe and Ohagi, Masaya and Ri, Ryokan and Fukuchi, Akihiko and Shibata, Tomohide and Kawahara, Daisuke},
  issue = {3},
  journal = {Vol.2025-NL-265,No.3,1-15},
  month = {sep},
  title = {{JMTEB and JMTEB-lite: Japanese Massive Text Embedding Benchmark and Its Lightweight Version}},
  year = {2025},
}

`JMTEB-lite(v1)`

Japanese text embedding quality across clustering, classification, semantic similarity, retrieval, and reranking, with heavy datasets optimized via hard negative pooling to enable faster evaluation while maintaining rankings consistent with JMTEB.

Tasks

name	type	modalities	languages
LivedoorNewsClustering.v2	Clustering	text	jpn
MewsC16JaClustering	Clustering	text	jpn
SIB200ClusteringS2S	Clustering	text	ace, acm, acq, aeb, afr, ... (197)
AmazonReviewsClassification	Classification	text	cmn, deu, eng, fra, jpn, ... (6)
AmazonCounterfactualClassification	Classification	text	deu, eng, jpn
MassiveIntentClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
MassiveScenarioClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
JapaneseSentimentClassification	Classification	text	jpn
SIB200Classification	Classification	text	ace, acm, acq, aeb, afr, ... (197)
WRIMEClassification	Classification	text	jpn
JSTS	STS	text	jpn
JSICK	STS	text	jpn
JaqketRetrievalLite	Retrieval	text	jpn
MrTyDiJaRetrievalLite	Retrieval	text	jpn
JaGovFaqsRetrieval	Retrieval	text	jpn
NLPJournalTitleAbsRetrieval.V2	Retrieval	text	jpn
NLPJournalTitleIntroRetrieval.V2	Retrieval	text	jpn
NLPJournalAbsIntroRetrieval.V2	Retrieval	text	jpn
NLPJournalAbsArticleRetrieval.V2	Retrieval	text	jpn
JaCWIRRetrievalLite	Retrieval	text	jpn
MIRACLJaRetrievalLite	Retrieval	text	jpn
MintakaRetrieval	Retrieval	text	ara, deu, fra, hin, ita, ... (8)
MultiLongDocRetrieval	Retrieval	text	ara, cmn, deu, eng, fra, ... (13)
ESCIReranking	Reranking	text	eng, jpn, spa
JQaRARerankingLite	Reranking	text	jpn
JaCWIRRerankingLite	Reranking	text	jpn
MIRACLReranking	Reranking	text	ara, ben, deu, eng, fas, ... (18)
MultiLongDocReranking	Reranking	text	ara, deu, eng, fra, hin, ... (13)
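
To see which datasets the lite benchmark swaps out, the two task lists can be compared as sets. This is a minimal sketch based only on the names in the JMTEB(v2) and JMTEB-lite(v1) tables; the twelve clustering, classification, and STS rows are identical in both tables and are omitted here for brevity.

```python
# Retrieval and reranking task names copied from the JMTEB(v2) table.
JMTEB_V2 = {
    "JaqketRetrieval",
    "MrTidyRetrieval",
    "JaGovFaqsRetrieval",
    "NLPJournalTitleAbsRetrieval.V2",
    "NLPJournalTitleIntroRetrieval.V2",
    "NLPJournalAbsIntroRetrieval.V2",
    "NLPJournalAbsArticleRetrieval.V2",
    "JaCWIRRetrieval",
    "MIRACLRetrieval",
    "MintakaRetrieval",
    "MultiLongDocRetrieval",
    "ESCIReranking",
    "JQaRAReranking",
    "JaCWIRReranking",
    "MIRACLReranking",
    "MultiLongDocReranking",
}

# Retrieval and reranking task names copied from the JMTEB-lite(v1) table.
JMTEB_LITE = {
    "JaqketRetrievalLite",
    "MrTyDiJaRetrievalLite",
    "JaGovFaqsRetrieval",
    "NLPJournalTitleAbsRetrieval.V2",
    "NLPJournalTitleIntroRetrieval.V2",
    "NLPJournalAbsIntroRetrieval.V2",
    "NLPJournalAbsArticleRetrieval.V2",
    "JaCWIRRetrievalLite",
    "MIRACLJaRetrievalLite",
    "MintakaRetrieval",
    "MultiLongDocRetrieval",
    "ESCIReranking",
    "JQaRARerankingLite",
    "JaCWIRRerankingLite",
    "MIRACLReranking",
    "MultiLongDocReranking",
}

only_in_v2 = sorted(JMTEB_V2 - JMTEB_LITE)      # tasks dropped in the lite benchmark
only_in_lite = sorted(JMTEB_LITE - JMTEB_V2)    # tasks introduced by the lite benchmark
shared = JMTEB_V2 & JMTEB_LITE                  # tasks used unchanged by both

print("only in JMTEB(v2):    ", only_in_v2)
print("only in JMTEB-lite(v1):", only_in_lite)
print("shared retrieval/reranking tasks:", len(shared))
```

Six tasks differ on each side. Every task that appears only in the lite table carries a `Lite` suffix, which suggests these are the reduced counterparts of the six tasks that appear only in JMTEB(v2).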

Citation

@article{li2025jmteb,
  author = {Li, Shengzhe and Ohagi, Masaya and Ri, Ryokan and Fukuchi, Akihiko and Shibata, Tomohide and Kawahara, Daisuke},
  issue = {3},
  journal = {Vol.2025-NL-265,No.3,1-15},
  month = {sep},
  title = {{JMTEB and JMTEB-lite: Japanese Massive Text Embedding Benchmark and Its Lightweight Version}},
  year = {2025},
}

`JinaVDR`

Visual document retrieval across multilingual, layout-rich document types spanning medical, legal, financial, technical, and other domains.

Tasks

name	type	modalities	languages
JinaVDRMedicalPrescriptionsRetrieval	DocumentUnderstanding	text, image	eng
JinaVDRStanfordSlideRetrieval	DocumentUnderstanding	text, image	eng
JinaVDRDonutVQAISynHMPRetrieval	DocumentUnderstanding	text, image	eng
JinaVDRTableVQARetrieval	DocumentUnderstanding	text, image	eng
JinaVDRChartQARetrieval	DocumentUnderstanding	text, image	eng
JinaVDRTQARetrieval	DocumentUnderstanding	text, image	eng
JinaVDROpenAINewsRetrieval	DocumentUnderstanding	text, image	eng
JinaVDREuropeanaDeNewsRetrieval	DocumentUnderstanding	text, image	deu
JinaVDREuropeanaEsNewsRetrieval	DocumentUnderstanding	text, image	spa
JinaVDREuropeanaItScansRetrieval	DocumentUnderstanding	text, image	ita
JinaVDREuropeanaNlLegalRetrieval	DocumentUnderstanding	text, image	nld
JinaVDRHindiGovVQARetrieval	DocumentUnderstanding	text, image	hin
JinaVDRAutomobileCatelogRetrieval	DocumentUnderstanding	text, image	jpn
JinaVDRBeveragesCatalogueRetrieval	DocumentUnderstanding	text, image	rus
JinaVDRRamensBenchmarkRetrieval	DocumentUnderstanding	text, image	jpn
JinaVDRJDocQARetrieval	DocumentUnderstanding	text, image	jpn
JinaVDRHungarianDocQARetrieval	DocumentUnderstanding	text, image	hun
JinaVDRArabicChartQARetrieval	DocumentUnderstanding	text, image	ara
JinaVDRArabicInfographicsVQARetrieval	DocumentUnderstanding	text, image	ara
JinaVDROWIDChartsRetrieval	DocumentUnderstanding	text, image	eng
JinaVDRMPMQARetrieval	DocumentUnderstanding	text, image	eng
JinaVDRJina2024YearlyBookRetrieval	DocumentUnderstanding	text, image	eng
JinaVDRWikimediaCommonsMapsRetrieval	DocumentUnderstanding	text, image	eng
JinaVDRPlotQARetrieval	DocumentUnderstanding	text, image	eng
JinaVDRMMTabRetrieval	DocumentUnderstanding	text, image	eng
JinaVDRCharXivOCRRetrieval	DocumentUnderstanding	text, image	eng
JinaVDRStudentEnrollmentSyntheticRetrieval	DocumentUnderstanding	text, image	eng
JinaVDRGitHubReadmeRetrieval	DocumentUnderstanding	text, image	ara, ben, deu, eng, fra, ... (17)
JinaVDRTweetStockSyntheticsRetrieval	DocumentUnderstanding	text, image	ara, deu, eng, fra, hin, ... (10)
JinaVDRAirbnbSyntheticRetrieval	DocumentUnderstanding	text, image	ara, deu, eng, fra, hin, ... (10)
JinaVDRShanghaiMasterPlanRetrieval	DocumentUnderstanding	text, image	zho
JinaVDRWikimediaCommonsDocumentsRetrieval	DocumentUnderstanding	text, image	ara, ben, deu, eng, fra, ... (20)
JinaVDREuropeanaFrNewsRetrieval	DocumentUnderstanding	text, image	fra
JinaVDRDocQAHealthcareIndustryRetrieval	DocumentUnderstanding	text, image	eng
JinaVDRDocQAAI	DocumentUnderstanding	text, image	eng
JinaVDRShiftProjectRetrieval	DocumentUnderstanding	text, image	eng
JinaVDRTatQARetrieval	DocumentUnderstanding	text, image	eng
JinaVDRInfovqaRetrieval	DocumentUnderstanding	text, image	eng
JinaVDRDocVQARetrieval	DocumentUnderstanding	text, image	eng
JinaVDRDocQAGovReportRetrieval	DocumentUnderstanding	text, image	eng
JinaVDRTabFQuadRetrieval	DocumentUnderstanding	text, image	eng
JinaVDRDocQAEnergyRetrieval	DocumentUnderstanding	text, image	eng
JinaVDRArxivQARetrieval	DocumentUnderstanding	text, image	eng

Citation

@misc{günther2025jinaembeddingsv4universalembeddingsmultimodal,
  archiveprefix = {arXiv},
  author = {Michael Günther and Saba Sturua and Mohammad Kalim Akram and Isabelle Mohr and Andrei Ungureanu and Bo Wang and Sedigheh Eslami and Scott Martens and Maximilian Werk and Nan Wang and Han Xiao},
  eprint = {2506.18902},
  primaryclass = {cs.AI},
  title = {jina-embeddings-v4: Universal Embeddings for Multimodal Multilingual Retrieval},
  url = {https://arxiv.org/abs/2506.18902},
  year = {2025},
}

`KoViDoRe(v2)`

Korean visual document retrieval across enterprise document domains including cybersecurity, economics, energy, and HR.

Tasks

name	type	modalities	languages
KoVidore2CybersecurityRetrieval	DocumentUnderstanding	text, image	kor
KoVidore2EconomicRetrieval	DocumentUnderstanding	text, image	kor
KoVidore2EnergyRetrieval	DocumentUnderstanding	text, image	kor
KoVidore2HrRetrieval	DocumentUnderstanding	text, image	kor

Citation

@misc{choi2026kovidorev2,
  author = {Yongbin Choi},
  note = {A benchmark for evaluating Korean vision document retrieval with multi-page reasoning queries in practical domains},
  title = {KoViDoRe v2: a comprehensive evaluation of vision document retrieval for enterprise use-cases},
  url = {https://github.com/whybe-choi/kovidore-data-generator},
  year = {2026},
}

`LMEB`

Long-horizon memory retrieval quality across episodic, dialogue, semantic, and procedural retrieval tasks, measuring how well embedding models retrieve evidence in long-term memory scenarios.

Tasks

name	type	modalities	languages
EPBench	Retrieval	text	eng
KnowMeBench	Retrieval	text	eng
LoCoMo	Retrieval	text	eng
LongMemEval	Retrieval	text	eng
REALTALK	Retrieval	text	eng
TMD	Retrieval	text	eng
MemBench	Retrieval	text	eng
ConvoMem	Retrieval	text	eng
QASPER	Retrieval	text	eng
NovelQA	Retrieval	text	eng
PeerQA	Retrieval	text	eng
CovidQA	Retrieval	text	eng
ESGReports	Retrieval	text	eng
LMEBMLDR	Retrieval	text	eng
LooGLE	Retrieval	text	eng
LMEB_SciFact	Retrieval	text	eng
Gorilla	Retrieval	text	eng
ToolBench	Retrieval	text	eng
ReMe	Retrieval	text	eng
ProceduralMemBench	Retrieval	text	eng
MemGovern	Retrieval	text	eng
DeepPlanning	Retrieval	text	eng

Citation

@misc{zhao2026lmeb,
  archiveprefix = {arXiv},
  author = {Zhao, Xinping and Hu, Xinshuo and Xu, Jiaxin and Tang, Danyu and Zhang, Xin and Zhou, Mengjia and Zhong, Yan and Zhou, Yao and Shan, Zifei and Zhang, Meishan and Hu, Baotian and Zhang, Min},
  eprint = {2603.12572},
  primaryclass = {cs.CL},
  title = {LMEB: Long-horizon Memory Embedding Benchmark},
  url = {https://arxiv.org/abs/2603.12572},
  year = {2026},
}

`LongEmbed`

Long-context retrieval quality across synthetic and real-world tasks featuring documents of varying length with dispersed target information.

Tasks

name	type	modalities	languages
LEMBNarrativeQARetrieval	Retrieval	text	eng
LEMBNeedleRetrieval	Retrieval	text	eng
LEMBPasskeyRetrieval	Retrieval	text	eng
LEMBQMSumRetrieval	Retrieval	text	eng
LEMBSummScreenFDRetrieval	Retrieval	text	eng
LEMBWikimQARetrieval	Retrieval	text	eng

Citation

@article{zhu2024longembed,
  author = {Zhu, Dawei and Wang, Liang and Yang, Nan and Song, Yifan and Wu, Wenhao and Wei, Furu and Li, Sujian},
  journal = {arXiv preprint arXiv:2404.12096},
  title = {LongEmbed: Extending Embedding Models for Long Context Retrieval},
  year = {2024},
}

`MAEB(beta)`

Audio embedding quality across both audio-only and audio-text cross-modal tasks, spanning retrieval, classification, clustering, multilabel classification, pair classification, reranking, and zero-shot classification. Currently in beta pending peer review.

Tasks

name	type	modalities	languages
ClothoT2ARetrieval	Any2AnyRetrieval	text, audio	eng
CommonVoiceMini21T2ARetrieval	Any2AnyRetrieval	text, audio	abk, afr, amh, ara, asm, ... (114)
FleursT2ARetrieval	Any2AnyRetrieval	text, audio	afr, amh, ara, asm, ast, ... (102)
GigaSpeechT2ARetrieval	Any2AnyRetrieval	text, audio	eng
JamAltArtistA2ARetrieval	Any2AnyRetrieval	audio	deu, eng, fra, spa
JamAltLyricA2TRetrieval	Any2AnyRetrieval	text, audio	deu, eng, fra, spa
MACST2ARetrieval	Any2AnyRetrieval	text, audio	eng
SpokenSQuADT2ARetrieval	Any2AnyRetrieval	text, audio	eng
UrbanSound8KT2ARetrieval	Any2AnyRetrieval	text, audio	zxx
BeijingOpera	AudioClassification	audio	zxx
BirdCLEF	AudioClassification	audio	zxx
CREMA_D	AudioClassification	audio	eng
CommonLanguageAgeDetection	AudioClassification	audio	eng
GTZANGenre	AudioClassification	audio	zxx
IEMOCAPGender	AudioClassification	audio	eng
MInDS14	AudioClassification	audio	ces, deu, eng, fra, ita, ... (12)
MridinghamTonic	AudioClassification	audio	zxx
SIBFLEURS	AudioClassification	audio	afr, amh, arb, asm, ast, ... (101)
VoxCelebSA	AudioClassification	audio	eng
VoxPopuliLanguageID	AudioClassification	audio	deu, eng, fra, pol, spa
CREMA_DClustering	AudioClustering	audio	eng
VehicleSoundClustering	AudioClustering	audio	zxx
VoxPopuliGenderClustering	AudioClustering	audio	deu, eng, fra, pol, spa
CREMADPairClassification	AudioPairClassification	audio	eng
NMSQAPairClassification	AudioPairClassification	audio	eng
VoxPopuliAccentPairClassification	AudioPairClassification	audio	eng
GTZANAudioReranking	AudioReranking	audio	zxx
RavdessZeroshot	AudioZeroshotClassification	audio, text	eng
SpeechCommandsZeroshotv0.02	AudioZeroshotClassification	audio, text	eng
FSD2019Kaggle	AudioMultilabelClassification	audio	eng

Citation

@misc{assadi2026maebmassiveaudioembedding,
  archiveprefix = {arXiv},
  author = {Adnan El Assadi and Isaac Chung and Chenghao Xiao and Roman Solomatin and Animesh Jha and Rahul Chand and Silky Singh and Kaitlyn Wang and Ali Sartaz Khan and Marc Moussa Nasser and Sufen Fong and Pengfei He and Alan Xiao and Ayush Sunil Munot and Aditya Shrivastava and Artem Gazizov and Niklas Muennighoff and Kenneth Enevoldsen},
  eprint = {2602.16008},
  primaryclass = {cs.SD},
  title = {MAEB: Massive Audio Embedding Benchmark},
  url = {https://arxiv.org/abs/2602.16008},
  year = {2026},
}

`MAEB(beta, audio-only)`

Audio-only embedding quality across classification, clustering, pair classification, reranking, and retrieval tasks. Currently in beta pending peer review.

Tasks

name	type	modalities	languages
JamAltArtistA2ARetrieval	Any2AnyRetrieval	audio	deu, eng, fra, spa
BeijingOpera	AudioClassification	audio	zxx
BirdCLEF	AudioClassification	audio	zxx
CREMA_D	AudioClassification	audio	eng
CommonLanguageAgeDetection	AudioClassification	audio	eng
GTZANGenre	AudioClassification	audio	zxx
IEMOCAPGender	AudioClassification	audio	eng
MInDS14	AudioClassification	audio	ces, deu, eng, fra, ita, ... (12)
MridinghamTonic	AudioClassification	audio	zxx
SIBFLEURS	AudioClassification	audio	afr, amh, arb, asm, ast, ... (101)
VoxCelebSA	AudioClassification	audio	eng
VoxPopuliLanguageID	AudioClassification	audio	deu, eng, fra, pol, spa
CREMA_DClustering	AudioClustering	audio	eng
VehicleSoundClustering	AudioClustering	audio	zxx
VoxPopuliGenderClustering	AudioClustering	audio	deu, eng, fra, pol, spa
CREMADPairClassification	AudioPairClassification	audio	eng
NMSQAPairClassification	AudioPairClassification	audio	eng
VoxPopuliAccentPairClassification	AudioPairClassification	audio	eng
GTZANAudioReranking	AudioReranking	audio	zxx

Citation

@misc{assadi2026maebmassiveaudioembedding,
  archiveprefix = {arXiv},
  author = {Adnan El Assadi and Isaac Chung and Chenghao Xiao and Roman Solomatin and Animesh Jha and Rahul Chand and Silky Singh and Kaitlyn Wang and Ali Sartaz Khan and Marc Moussa Nasser and Sufen Fong and Pengfei He and Alan Xiao and Ayush Sunil Munot and Aditya Shrivastava and Artem Gazizov and Niklas Muennighoff and Kenneth Enevoldsen},
  eprint = {2602.16008},
  primaryclass = {cs.SD},
  title = {MAEB: Massive Audio Embedding Benchmark},
  url = {https://arxiv.org/abs/2602.16008},
  year = {2026},
}

`MIEB(Img)`

Image-only embedding quality across retrieval, classification, clustering, and visual STS, excluding tasks that require a text encoder.

Tasks

name	type	modalities	languages
CUB200I2IRetrieval	Any2AnyRetrieval	image	eng
FORBI2IRetrieval	Any2AnyRetrieval	image	eng
GLDv2I2IRetrieval	Any2AnyRetrieval	image	eng
METI2IRetrieval	Any2AnyRetrieval	image	eng
NIGHTSI2IRetrieval	Any2AnyRetrieval	image	eng
ROxfordEasyI2IRetrieval	Any2AnyRetrieval	image	eng
ROxfordMediumI2IRetrieval	Any2AnyRetrieval	image	eng
ROxfordHardI2IRetrieval	Any2AnyRetrieval	image	eng
RP2kI2IRetrieval	Any2AnyRetrieval	image	eng
RParisEasyI2IRetrieval	Any2AnyRetrieval	image	eng
RParisMediumI2IRetrieval	Any2AnyRetrieval	image	eng
RParisHardI2IRetrieval	Any2AnyRetrieval	image	eng
SketchyI2IRetrieval	Any2AnyRetrieval	image	eng
SOPI2IRetrieval	Any2AnyRetrieval	image	eng
StanfordCarsI2IRetrieval	Any2AnyRetrieval	image	eng
Birdsnap	ImageClassification	image	eng
Caltech101	ImageClassification	image	eng
CIFAR10	ImageClassification	image	eng
CIFAR100	ImageClassification	image	eng
Country211	ImageClassification	image	eng
DTD	ImageClassification	image	eng
EuroSAT	ImageClassification	image	eng
FER2013	ImageClassification	image	eng
FGVCAircraft	ImageClassification	image	eng
Food101Classification	ImageClassification	image	eng
GTSRB	ImageClassification	image	eng
Imagenet1k	ImageClassification	image	eng
MNIST	ImageClassification	image	eng
OxfordFlowersClassification	ImageClassification	image	eng
OxfordPets	ImageClassification	image	eng
PatchCamelyon	ImageClassification	image	eng
RESISC45	ImageClassification	image	eng
StanfordCars	ImageClassification	image	eng
STL10	ImageClassification	image	eng
SUN397	ImageClassification	image	eng
UCF101	ImageClassification	image	eng
CIFAR10Clustering	ImageClustering	image	eng
CIFAR100Clustering	ImageClustering	image	eng
ImageNetDog15Clustering	ImageClustering	image	eng
ImageNet10Clustering	ImageClustering	image	eng
TinyImageNetClustering	ImageClustering	image	eng
VOC2007	ImageClassification	image	eng
STS12VisualSTS	VisualSTS(eng)	image	eng
STS13VisualSTS	VisualSTS(eng)	image	eng
STS14VisualSTS	VisualSTS(eng)	image	eng
STS15VisualSTS	VisualSTS(eng)	image	eng
STS16VisualSTS	VisualSTS(eng)	image	eng
STS17MultilingualVisualSTS	VisualSTS(multi)	image	ara, deu, eng, fra, ita, ... (9)
STSBenchmarkMultilingualVisualSTS	VisualSTS(multi)	image	cmn, deu, eng, fra, ita, ... (10)

Citation

@inproceedings{xiao2025mieb,
  author = {Xiao, Chenghao and Chung, Isaac and Kerboua, Imene and Stirling, Jamie and Zhang, Xin and Kardos, M\'arton and Solomatin, Roman and Al Moubayed, Noura and Enevoldsen, Kenneth and Muennighoff, Niklas},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  month = {October},
  pages = {22187-22198},
  title = {MIEB: Massive Image Embedding Benchmark},
  year = {2025},
}

`MIEB(Multilingual)`¶

Multilingual image embedding quality across 39 languages, spanning image classification (zero-shot and linear probing), clustering, retrieval, compositionality evaluation, document understanding, visual STS, and CV-centric tasks. Extends MIEB(eng) with multilingual retrieval datasets and the multilingual portions of VisualSTS-b and VisualSTS17.

Learn more →

Tasks

name	type	modalities	languages
Birdsnap	ImageClassification	image	eng
Caltech101	ImageClassification	image	eng
CIFAR10	ImageClassification	image	eng
CIFAR100	ImageClassification	image	eng
Country211	ImageClassification	image	eng
DTD	ImageClassification	image	eng
EuroSAT	ImageClassification	image	eng
FER2013	ImageClassification	image	eng
FGVCAircraft	ImageClassification	image	eng
Food101Classification	ImageClassification	image	eng
GTSRB	ImageClassification	image	eng
Imagenet1k	ImageClassification	image	eng
MNIST	ImageClassification	image	eng
OxfordFlowersClassification	ImageClassification	image	eng
OxfordPets	ImageClassification	image	eng
PatchCamelyon	ImageClassification	image	eng
RESISC45	ImageClassification	image	eng
StanfordCars	ImageClassification	image	eng
STL10	ImageClassification	image	eng
SUN397	ImageClassification	image	eng
UCF101	ImageClassification	image	eng
VOC2007	ImageClassification	image	eng
CIFAR10Clustering	ImageClustering	image	eng
CIFAR100Clustering	ImageClustering	image	eng
ImageNetDog15Clustering	ImageClustering	image	eng
ImageNet10Clustering	ImageClustering	image	eng
TinyImageNetClustering	ImageClustering	image	eng
BirdsnapZeroShot	ZeroShotClassification	image, text	eng
Caltech101ZeroShot	ZeroShotClassification	text, image	eng
CIFAR10ZeroShot	ZeroShotClassification	text, image	eng
CIFAR100ZeroShot	ZeroShotClassification	text, image	eng
CLEVRZeroShot	ZeroShotClassification	text, image	eng
CLEVRCountZeroShot	ZeroShotClassification	text, image	eng
Country211ZeroShot	ZeroShotClassification	image, text	eng
DTDZeroShot	ZeroShotClassification	image, text	eng
EuroSATZeroShot	ZeroShotClassification	image, text	eng
FER2013ZeroShot	ZeroShotClassification	image, text	eng
FGVCAircraftZeroShot	ZeroShotClassification	text, image	eng
Food101ZeroShot	ZeroShotClassification	text, image	eng
GTSRBZeroShot	ZeroShotClassification	image, text	eng
Imagenet1kZeroShot	ZeroShotClassification	image, text	eng
MNISTZeroShot	ZeroShotClassification	image, text	eng
OxfordPetsZeroShot	ZeroShotClassification	text, image	eng
PatchCamelyonZeroShot	ZeroShotClassification	image, text	eng
RenderedSST2	ZeroShotClassification	text, image	eng
RESISC45ZeroShot	ZeroShotClassification	image, text	eng
StanfordCarsZeroShot	ZeroShotClassification	image, text	eng
STL10ZeroShot	ZeroShotClassification	image, text	eng
SUN397ZeroShot	ZeroShotClassification	image, text	eng
UCF101ZeroShot	ZeroShotClassification	image, text	eng
BLINKIT2IMultiChoice	VisionCentricQA	text, image	eng
BLINKIT2TMultiChoice	VisionCentricQA	text, image	eng
CVBenchCount	VisionCentricQA	image, text	eng
CVBenchRelation	VisionCentricQA	text, image	eng
CVBenchDepth	VisionCentricQA	text, image	eng
CVBenchDistance	VisionCentricQA	text, image	eng
AROCocoOrder	Compositionality	text, image	eng
AROFlickrOrder	Compositionality	text, image	eng
AROVisualAttribution	Compositionality	text, image	eng
AROVisualRelation	Compositionality	text, image	eng
SugarCrepe	Compositionality	text, image	eng
Winoground	Compositionality	text, image	eng
ImageCoDe	Compositionality	text, image	eng
STS12VisualSTS	VisualSTS(eng)	image	eng
STS13VisualSTS	VisualSTS(eng)	image	eng
STS14VisualSTS	VisualSTS(eng)	image	eng
STS15VisualSTS	VisualSTS(eng)	image	eng
STS16VisualSTS	VisualSTS(eng)	image	eng
BLINKIT2IRetrieval	Any2AnyRetrieval	text, image	eng
BLINKIT2TRetrieval	Any2AnyRetrieval	text, image	eng
CIRRIT2IRetrieval	Any2AnyRetrieval	text, image	eng
CUB200I2IRetrieval	Any2AnyRetrieval	image	eng
EDIST2ITRetrieval	Any2AnyRetrieval	text, image	eng
Fashion200kI2TRetrieval	Any2AnyRetrieval	text, image	eng
Fashion200kT2IRetrieval	Any2AnyRetrieval	text, image	eng
FashionIQIT2IRetrieval	Any2AnyRetrieval	text, image	eng
Flickr30kI2TRetrieval	Any2AnyRetrieval	text, image	eng
Flickr30kT2IRetrieval	Any2AnyRetrieval	text, image	eng
FORBI2IRetrieval	Any2AnyRetrieval	image	eng
GLDv2I2IRetrieval	Any2AnyRetrieval	image	eng
GLDv2I2TRetrieval	Any2AnyRetrieval	text, image	eng
HatefulMemesI2TRetrieval	Any2AnyRetrieval	text, image	eng
HatefulMemesT2IRetrieval	Any2AnyRetrieval	text, image	eng
ImageCoDeT2IRetrieval	Any2AnyRetrieval	text, image	eng
InfoSeekIT2ITRetrieval	Any2AnyRetrieval	text, image	eng
InfoSeekIT2TRetrieval	Any2AnyRetrieval	text, image	eng
MemotionI2TRetrieval	Any2AnyRetrieval	text, image	eng
MemotionT2IRetrieval	Any2AnyRetrieval	text, image	eng
METI2IRetrieval	Any2AnyRetrieval	image	eng
MSCOCOI2TRetrieval	Any2AnyRetrieval	text, image	eng
MSCOCOT2IRetrieval	Any2AnyRetrieval	text, image	eng
NIGHTSI2IRetrieval	Any2AnyRetrieval	image	eng
OVENIT2ITRetrieval	Any2AnyRetrieval	image, text	eng
OVENIT2TRetrieval	Any2AnyRetrieval	text, image	eng
ROxfordEasyI2IRetrieval	Any2AnyRetrieval	image	eng
ROxfordMediumI2IRetrieval	Any2AnyRetrieval	image	eng
ROxfordHardI2IRetrieval	Any2AnyRetrieval	image	eng
RP2kI2IRetrieval	Any2AnyRetrieval	image	eng
RParisEasyI2IRetrieval	Any2AnyRetrieval	image	eng
RParisMediumI2IRetrieval	Any2AnyRetrieval	image	eng
RParisHardI2IRetrieval	Any2AnyRetrieval	image	eng
SciMMIRI2TRetrieval	Any2AnyRetrieval	text, image	eng
SciMMIRT2IRetrieval	Any2AnyRetrieval	text, image	eng
SketchyI2IRetrieval	Any2AnyRetrieval	image	eng
SOPI2IRetrieval	Any2AnyRetrieval	image	eng
StanfordCarsI2IRetrieval	Any2AnyRetrieval	image	eng
TUBerlinT2IRetrieval	Any2AnyRetrieval	text, image	eng
VidoreArxivQARetrieval	DocumentUnderstanding	text, image	eng
VidoreDocVQARetrieval	DocumentUnderstanding	text, image	eng
VidoreInfoVQARetrieval	DocumentUnderstanding	text, image	eng
VidoreTabfquadRetrieval	DocumentUnderstanding	text, image	eng
VidoreTatdqaRetrieval	DocumentUnderstanding	text, image	eng
VidoreShiftProjectRetrieval	DocumentUnderstanding	text, image	eng
VidoreSyntheticDocQAAIRetrieval	DocumentUnderstanding	text, image	eng
VidoreSyntheticDocQAEnergyRetrieval	DocumentUnderstanding	text, image	eng
VidoreSyntheticDocQAGovernmentReportsRetrieval	DocumentUnderstanding	text, image	eng
VidoreSyntheticDocQAHealthcareIndustryRetrieval	DocumentUnderstanding	text, image	eng
VisualNewsI2TRetrieval	Any2AnyRetrieval	image, text	eng
VisualNewsT2IRetrieval	Any2AnyRetrieval	image, text	eng
VizWizIT2TRetrieval	Any2AnyRetrieval	text, image	eng
VQA2IT2TRetrieval	Any2AnyRetrieval	text, image	eng
WebQAT2ITRetrieval	Any2AnyRetrieval	image, text	eng
WebQAT2TRetrieval	Any2AnyRetrieval	text	eng
WITT2IRetrieval	Any2AnyMultilingualRetrieval	text, image	ara, bul, dan, ell, eng, ... (11)
XFlickr30kCoT2IRetrieval	Any2AnyMultilingualRetrieval	text, image	deu, eng, ind, jpn, rus, ... (8)
XM3600T2IRetrieval	Any2AnyMultilingualRetrieval	text, image	ara, ben, ces, dan, deu, ... (38)
VisualSTS17Eng	VisualSTS(eng)	image	ara, deu, eng, fra, ita, ... (9)
VisualSTS-b-Eng	VisualSTS(eng)	image	cmn, deu, eng, fra, ita, ... (10)
VisualSTS17Multilingual	VisualSTS(multi)	image	ara, deu, eng, fra, ita, ... (9)
VisualSTS-b-Multilingual	VisualSTS(multi)	image	cmn, deu, eng, fra, ita, ... (10)

Citation

@inproceedings{xiao2025mieb,
  author = {Xiao, Chenghao and Chung, Isaac and Kerboua, Imene and Stirling, Jamie and Zhang, Xin and Kardos, M\'arton and Solomatin, Roman and Al Moubayed, Noura and Enevoldsen, Kenneth and Muennighoff, Niklas},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  month = {October},
  pages = {22187-22198},
  title = {MIEB: Massive Image Embedding Benchmark},
  year = {2025},
}

`MIEB(eng)`¶

English image embedding quality across image classification (zero-shot and linear probing), clustering, retrieval, compositionality evaluation, document understanding, visual STS, and CV-centric tasks.

Learn more →

Tasks

name	type	modalities	languages
Birdsnap	ImageClassification	image	eng
Caltech101	ImageClassification	image	eng
CIFAR10	ImageClassification	image	eng
CIFAR100	ImageClassification	image	eng
Country211	ImageClassification	image	eng
DTD	ImageClassification	image	eng
EuroSAT	ImageClassification	image	eng
FER2013	ImageClassification	image	eng
FGVCAircraft	ImageClassification	image	eng
Food101Classification	ImageClassification	image	eng
GTSRB	ImageClassification	image	eng
Imagenet1k	ImageClassification	image	eng
MNIST	ImageClassification	image	eng
OxfordFlowersClassification	ImageClassification	image	eng
OxfordPets	ImageClassification	image	eng
PatchCamelyon	ImageClassification	image	eng
RESISC45	ImageClassification	image	eng
StanfordCars	ImageClassification	image	eng
STL10	ImageClassification	image	eng
SUN397	ImageClassification	image	eng
UCF101	ImageClassification	image	eng
VOC2007	ImageClassification	image	eng
CIFAR10Clustering	ImageClustering	image	eng
CIFAR100Clustering	ImageClustering	image	eng
ImageNetDog15Clustering	ImageClustering	image	eng
ImageNet10Clustering	ImageClustering	image	eng
TinyImageNetClustering	ImageClustering	image	eng
BirdsnapZeroShot	ZeroShotClassification	image, text	eng
Caltech101ZeroShot	ZeroShotClassification	text, image	eng
CIFAR10ZeroShot	ZeroShotClassification	text, image	eng
CIFAR100ZeroShot	ZeroShotClassification	text, image	eng
CLEVRZeroShot	ZeroShotClassification	text, image	eng
CLEVRCountZeroShot	ZeroShotClassification	text, image	eng
Country211ZeroShot	ZeroShotClassification	image, text	eng
DTDZeroShot	ZeroShotClassification	image, text	eng
EuroSATZeroShot	ZeroShotClassification	image, text	eng
FER2013ZeroShot	ZeroShotClassification	image, text	eng
FGVCAircraftZeroShot	ZeroShotClassification	text, image	eng
Food101ZeroShot	ZeroShotClassification	text, image	eng
GTSRBZeroShot	ZeroShotClassification	image, text	eng
Imagenet1kZeroShot	ZeroShotClassification	image, text	eng
MNISTZeroShot	ZeroShotClassification	image, text	eng
OxfordPetsZeroShot	ZeroShotClassification	text, image	eng
PatchCamelyonZeroShot	ZeroShotClassification	image, text	eng
RenderedSST2	ZeroShotClassification	text, image	eng
RESISC45ZeroShot	ZeroShotClassification	image, text	eng
StanfordCarsZeroShot	ZeroShotClassification	image, text	eng
STL10ZeroShot	ZeroShotClassification	image, text	eng
SUN397ZeroShot	ZeroShotClassification	image, text	eng
UCF101ZeroShot	ZeroShotClassification	image, text	eng
BLINKIT2IMultiChoice	VisionCentricQA	text, image	eng
BLINKIT2TMultiChoice	VisionCentricQA	text, image	eng
CVBenchCount	VisionCentricQA	image, text	eng
CVBenchRelation	VisionCentricQA	text, image	eng
CVBenchDepth	VisionCentricQA	text, image	eng
CVBenchDistance	VisionCentricQA	text, image	eng
AROCocoOrder	Compositionality	text, image	eng
AROFlickrOrder	Compositionality	text, image	eng
AROVisualAttribution	Compositionality	text, image	eng
AROVisualRelation	Compositionality	text, image	eng
SugarCrepe	Compositionality	text, image	eng
Winoground	Compositionality	text, image	eng
ImageCoDe	Compositionality	text, image	eng
STS12VisualSTS	VisualSTS(eng)	image	eng
STS13VisualSTS	VisualSTS(eng)	image	eng
STS14VisualSTS	VisualSTS(eng)	image	eng
STS15VisualSTS	VisualSTS(eng)	image	eng
STS16VisualSTS	VisualSTS(eng)	image	eng
BLINKIT2IRetrieval	Any2AnyRetrieval	text, image	eng
BLINKIT2TRetrieval	Any2AnyRetrieval	text, image	eng
CIRRIT2IRetrieval	Any2AnyRetrieval	text, image	eng
CUB200I2IRetrieval	Any2AnyRetrieval	image	eng
EDIST2ITRetrieval	Any2AnyRetrieval	text, image	eng
Fashion200kI2TRetrieval	Any2AnyRetrieval	text, image	eng
Fashion200kT2IRetrieval	Any2AnyRetrieval	text, image	eng
FashionIQIT2IRetrieval	Any2AnyRetrieval	text, image	eng
Flickr30kI2TRetrieval	Any2AnyRetrieval	text, image	eng
Flickr30kT2IRetrieval	Any2AnyRetrieval	text, image	eng
FORBI2IRetrieval	Any2AnyRetrieval	image	eng
GLDv2I2IRetrieval	Any2AnyRetrieval	image	eng
GLDv2I2TRetrieval	Any2AnyRetrieval	text, image	eng
HatefulMemesI2TRetrieval	Any2AnyRetrieval	text, image	eng
HatefulMemesT2IRetrieval	Any2AnyRetrieval	text, image	eng
ImageCoDeT2IRetrieval	Any2AnyRetrieval	text, image	eng
InfoSeekIT2ITRetrieval	Any2AnyRetrieval	text, image	eng
InfoSeekIT2TRetrieval	Any2AnyRetrieval	text, image	eng
MemotionI2TRetrieval	Any2AnyRetrieval	text, image	eng
MemotionT2IRetrieval	Any2AnyRetrieval	text, image	eng
METI2IRetrieval	Any2AnyRetrieval	image	eng
MSCOCOI2TRetrieval	Any2AnyRetrieval	text, image	eng
MSCOCOT2IRetrieval	Any2AnyRetrieval	text, image	eng
NIGHTSI2IRetrieval	Any2AnyRetrieval	image	eng
OVENIT2ITRetrieval	Any2AnyRetrieval	image, text	eng
OVENIT2TRetrieval	Any2AnyRetrieval	text, image	eng
ROxfordEasyI2IRetrieval	Any2AnyRetrieval	image	eng
ROxfordMediumI2IRetrieval	Any2AnyRetrieval	image	eng
ROxfordHardI2IRetrieval	Any2AnyRetrieval	image	eng
RP2kI2IRetrieval	Any2AnyRetrieval	image	eng
RParisEasyI2IRetrieval	Any2AnyRetrieval	image	eng
RParisMediumI2IRetrieval	Any2AnyRetrieval	image	eng
RParisHardI2IRetrieval	Any2AnyRetrieval	image	eng
SciMMIRI2TRetrieval	Any2AnyRetrieval	text, image	eng
SciMMIRT2IRetrieval	Any2AnyRetrieval	text, image	eng
SketchyI2IRetrieval	Any2AnyRetrieval	image	eng
SOPI2IRetrieval	Any2AnyRetrieval	image	eng
StanfordCarsI2IRetrieval	Any2AnyRetrieval	image	eng
TUBerlinT2IRetrieval	Any2AnyRetrieval	text, image	eng
VidoreArxivQARetrieval	DocumentUnderstanding	text, image	eng
VidoreDocVQARetrieval	DocumentUnderstanding	text, image	eng
VidoreInfoVQARetrieval	DocumentUnderstanding	text, image	eng
VidoreTabfquadRetrieval	DocumentUnderstanding	text, image	eng
VidoreTatdqaRetrieval	DocumentUnderstanding	text, image	eng
VidoreShiftProjectRetrieval	DocumentUnderstanding	text, image	eng
VidoreSyntheticDocQAAIRetrieval	DocumentUnderstanding	text, image	eng
VidoreSyntheticDocQAEnergyRetrieval	DocumentUnderstanding	text, image	eng
VidoreSyntheticDocQAGovernmentReportsRetrieval	DocumentUnderstanding	text, image	eng
VidoreSyntheticDocQAHealthcareIndustryRetrieval	DocumentUnderstanding	text, image	eng
VisualNewsI2TRetrieval	Any2AnyRetrieval	image, text	eng
VisualNewsT2IRetrieval	Any2AnyRetrieval	image, text	eng
VizWizIT2TRetrieval	Any2AnyRetrieval	text, image	eng
VQA2IT2TRetrieval	Any2AnyRetrieval	text, image	eng
WebQAT2ITRetrieval	Any2AnyRetrieval	image, text	eng
WebQAT2TRetrieval	Any2AnyRetrieval	text	eng
VisualSTS17Eng	VisualSTS(eng)	image	ara, deu, eng, fra, ita, ... (9)
VisualSTS-b-Eng	VisualSTS(eng)	image	cmn, deu, eng, fra, ita, ... (10)

Citation

@inproceedings{xiao2025mieb,
  author = {Xiao, Chenghao and Chung, Isaac and Kerboua, Imene and Stirling, Jamie and Zhang, Xin and Kardos, M\'arton and Solomatin, Roman and Al Moubayed, Noura and Enevoldsen, Kenneth and Muennighoff, Niklas},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  month = {October},
  pages = {22187-22198},
  title = {MIEB: Massive Image Embedding Benchmark},
  year = {2025},
}

`MIEB(lite)`¶

Multilingual image embedding quality across the same task types as MIEB(Multilingual), designed to be run at a fraction of the cost while maintaining relative model rankings.

Learn more →

Tasks

name	type	modalities	languages
Country211	ImageClassification	image	eng
DTD	ImageClassification	image	eng
EuroSAT	ImageClassification	image	eng
GTSRB	ImageClassification	image	eng
OxfordPets	ImageClassification	image	eng
PatchCamelyon	ImageClassification	image	eng
RESISC45	ImageClassification	image	eng
SUN397	ImageClassification	image	eng
ImageNetDog15Clustering	ImageClustering	image	eng
TinyImageNetClustering	ImageClustering	image	eng
CIFAR100ZeroShot	ZeroShotClassification	text, image	eng
Country211ZeroShot	ZeroShotClassification	image, text	eng
FER2013ZeroShot	ZeroShotClassification	image, text	eng
FGVCAircraftZeroShot	ZeroShotClassification	text, image	eng
Food101ZeroShot	ZeroShotClassification	text, image	eng
OxfordPetsZeroShot	ZeroShotClassification	text, image	eng
StanfordCarsZeroShot	ZeroShotClassification	image, text	eng
BLINKIT2IMultiChoice	VisionCentricQA	text, image	eng
CVBenchCount	VisionCentricQA	image, text	eng
CVBenchRelation	VisionCentricQA	text, image	eng
CVBenchDepth	VisionCentricQA	text, image	eng
CVBenchDistance	VisionCentricQA	text, image	eng
AROCocoOrder	Compositionality	text, image	eng
AROFlickrOrder	Compositionality	text, image	eng
AROVisualAttribution	Compositionality	text, image	eng
AROVisualRelation	Compositionality	text, image	eng
Winoground	Compositionality	text, image	eng
ImageCoDe	Compositionality	text, image	eng
STS13VisualSTS	VisualSTS(eng)	image	eng
STS15VisualSTS	VisualSTS(eng)	image	eng
VisualSTS17Multilingual	VisualSTS(multi)	image	ara, deu, eng, fra, ita, ... (9)
VisualSTS-b-Multilingual	VisualSTS(multi)	image	cmn, deu, eng, fra, ita, ... (10)
CIRRIT2IRetrieval	Any2AnyRetrieval	text, image	eng
CUB200I2IRetrieval	Any2AnyRetrieval	image	eng
Fashion200kI2TRetrieval	Any2AnyRetrieval	text, image	eng
HatefulMemesI2TRetrieval	Any2AnyRetrieval	text, image	eng
InfoSeekIT2TRetrieval	Any2AnyRetrieval	text, image	eng
NIGHTSI2IRetrieval	Any2AnyRetrieval	image	eng
OVENIT2TRetrieval	Any2AnyRetrieval	text, image	eng
RP2kI2IRetrieval	Any2AnyRetrieval	image	eng
VidoreDocVQARetrieval	DocumentUnderstanding	text, image	eng
VidoreInfoVQARetrieval	DocumentUnderstanding	text, image	eng
VidoreTabfquadRetrieval	DocumentUnderstanding	text, image	eng
VidoreTatdqaRetrieval	DocumentUnderstanding	text, image	eng
VidoreShiftProjectRetrieval	DocumentUnderstanding	text, image	eng
VidoreSyntheticDocQAAIRetrieval	DocumentUnderstanding	text, image	eng
VisualNewsI2TRetrieval	Any2AnyRetrieval	image, text	eng
VQA2IT2TRetrieval	Any2AnyRetrieval	text, image	eng
WebQAT2ITRetrieval	Any2AnyRetrieval	image, text	eng
WITT2IRetrieval	Any2AnyMultilingualRetrieval	text, image	ara, bul, dan, ell, eng, ... (11)
XM3600T2IRetrieval	Any2AnyMultilingualRetrieval	text, image	ara, ben, ces, dan, deu, ... (38)

Citation

@inproceedings{xiao2025mieb,
  author = {Xiao, Chenghao and Chung, Isaac and Kerboua, Imene and Stirling, Jamie and Zhang, Xin and Kardos, M\'arton and Solomatin, Roman and Al Moubayed, Noura and Enevoldsen, Kenneth and Muennighoff, Niklas},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  month = {October},
  pages = {22187-22198},
  title = {MIEB: Massive Image Embedding Benchmark},
  year = {2025},
}
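
The task tables on this page are tab-separated, with the columns name, type, modalities, and languages. As a minimal sketch of how such a table could be summarized, the snippet below tallies tasks per type. The six rows are copied from the MIEB(lite) table above; the helper name `count_by_type` is hypothetical and is not part of any library.

```python
from collections import Counter

# A few rows copied from the MIEB(lite) task table.
# Columns are tab-separated: name, type, modalities, languages.
ROWS = (
    "Country211\tImageClassification\timage\teng\n"
    "DTD\tImageClassification\timage\teng\n"
    "TinyImageNetClustering\tImageClustering\timage\teng\n"
    "CVBenchCount\tVisionCentricQA\timage, text\teng\n"
    "Winoground\tCompositionality\ttext, image\teng\n"
    "WITT2IRetrieval\tAny2AnyMultilingualRetrieval\ttext, image\tara, bul, dan, ell, eng, ... (11)\n"
)


def count_by_type(table_text: str) -> Counter:
    """Count how many rows of a tab-separated task table fall under each task type."""
    counts = Counter()
    for line in table_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, task_type, modalities, languages = line.split("\t")
        counts[task_type] += 1
    return counts


counts = count_by_type(ROWS)
for task_type, n in counts.most_common():
    print(f"{task_type}: {n}")
```

Pasting a full table (without its header row) into `ROWS` gives the per-type breakdown for that benchmark.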

`MINERSBitextMining`¶

Multilingual bitext mining quality across diverse language pairs, drawn from the MINERS benchmark for evaluating semantic retrieval in multilingual settings.

Learn more →

Tasks

name	type	modalities	languages
BUCC	BitextMining	text	cmn, deu, eng, fra, rus
LinceMTBitextMining	BitextMining	text	eng, hin
NollySentiBitextMining	BitextMining	text	eng, hau, ibo, pcm, yor
NusaXBitextMining	BitextMining	text	ace, ban, bbc, bjn, bug, ... (12)
NusaTranslationBitextMining	BitextMining	text	abs, bbc, bew, bhp, ind, ... (12)
PhincBitextMining	BitextMining	text	eng, hin
Tatoeba	BitextMining	text	afr, amh, ang, ara, arq, ... (113)

Citation

@article{winata2024miners,
  author = {Winata, Genta Indra and Zhang, Ruochen and Adelani, David Ifeoluwa},
  journal = {arXiv preprint arXiv:2406.07424},
  title = {MINERS: Multilingual Language Models as Semantic Retrievers},
  year = {2024},
}

`MTEB(Code, v1)`¶

Code retrieval quality across a wide range of popular programming languages, covering code search, text-to-SQL, and code feedback tasks.

Tasks

name	type	modalities	languages
AppsRetrieval	Retrieval	text	eng, python
CodeEditSearchRetrieval	Retrieval	text	c, c++, go, java, javascript, ... (13)
CodeFeedbackMT	Retrieval	text	eng
CodeFeedbackST	Retrieval	text	eng
CodeSearchNetCCRetrieval	Retrieval	text	go, java, javascript, php, python, ... (6)
CodeSearchNetRetrieval	Retrieval	text	go, java, javascript, php, python, ... (6)
CodeTransOceanContest	Retrieval	text	c++, python
CodeTransOceanDL	Retrieval	text	python
CosQA	Retrieval	text	eng, python
COIRCodeSearchNetRetrieval	Retrieval	text	go, java, javascript, php, python, ... (6)
StackOverflowQA	Retrieval	text	eng
SyntheticText2SQL	Retrieval	text	eng, sql

Citation

@article{enevoldsen2025mmtebmassivemultilingualtext,
  author = {Kenneth Enevoldsen and Isaac Chung and Imene Kerboua and Márton Kardos and Ashwin Mathur and David Stap and Jay Gala and Wissam Siblini and Dominik Krzemiński and Genta Indra Winata and Saba Sturua and Saiteja Utpala and Mathieu Ciancone and Marion Schaeffer and Gabriel Sequeira and Diganta Misra and Shreeya Dhakal and Jonathan Rystrøm and Roman Solomatin and Ömer Çağatan and Akash Kundu and Martin Bernstorff and Shitao Xiao and Akshita Sukhlecha and Bhavish Pahwa and Rafał Poświata and Kranthi Kiran GV and Shawon Ashraf and Daniel Auras and Björn Plüster and Jan Philipp Harries and Loïc Magne and Isabelle Mohr and Mariya Hendriksen and Dawei Zhu and Hippolyte Gisserot-Boukhlef and Tom Aarsen and Jan Kostkan and Konrad Wojtasik and Taemin Lee and Marek Šuppa and Crystina Zhang and Roberta Rocca and Mohammed Hamdy and Andrianos Michail and John Yang and Manuel Faysse and Aleksei Vatolin and Nandan Thakur and Manan Dey and Dipam Vasani and Pranjal Chitale and Simone Tedeschi and Nguyen Tai and Artem Snegirev and Michael Günther and Mengzhou Xia and Weijia Shi and Xing Han Lù and Jordan Clive and Gayatri Krishnakumar and Anna Maksimova and Silvan Wehrli and Maria Tikhonova and Henil Panchal and Aleksandr Abramov and Malte Ostendorff and Zheng Liu and Simon Clematide and Lester James Miranda and Alena Fenogenova and Guangyu Song and Ruqiya Bin Safi and Wen-Ding Li and Alessia Borghini and Federico Cassano and Hongjin Su and Jimmy Lin and Howard Yen and Lasse Hansen and Sara Hooker and Chenghao Xiao and Vaibhav Adlakha and Orion Weller and Siva Reddy and Niklas Muennighoff},
  doi = {10.48550/arXiv.2502.13595},
  journal = {arXiv preprint arXiv:2502.13595},
  publisher = {arXiv},
  title = {MMTEB: Massive Multilingual Text Embedding Benchmark},
  url = {https://arxiv.org/abs/2502.13595},
  year = {2025},
}

`MTEB(Europe, v1)`¶

Text embedding quality across European languages spanning bitext mining, classification, clustering, pair classification, retrieval, reranking, and semantic similarity.

Tasks

name	type	modalities	languages
BornholmBitextMining	BitextMining	text	dan
BibleNLPBitextMining	BitextMining	text	aai, aak, aau, aaz, abt, ... (829)
BUCC.v2	BitextMining	text	cmn, deu, eng, fra, rus
DiaBlaBitextMining	BitextMining	text	eng, fra
FloresBitextMining	BitextMining	text	ace, acm, acq, aeb, afr, ... (196)
NorwegianCourtsBitextMining	BitextMining	text	nno, nob
NTREXBitextMining	BitextMining	text	afr, amh, arb, aze, bak, ... (119)
BulgarianStoreReviewSentimentClassfication	Classification	text	bul
CzechProductReviewSentimentClassification	Classification	text	ces
GreekLegalCodeClassification	Classification	text	ell
DBpediaClassification	Classification	text	eng
FinancialPhrasebankClassification	Classification	text	eng
PoemSentimentClassification	Classification	text	eng
ToxicChatClassification	Classification	text	eng
ToxicConversationsClassification	Classification	text	eng
EstonianValenceClassification	Classification	text	est
ItaCaseholdClassification	Classification	text	ita
AmazonCounterfactualClassification	Classification	text	deu, eng, jpn
MassiveScenarioClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
MultiHateClassification	Classification	text	ara, cmn, deu, eng, fra, ... (11)
ScalaClassification	Classification	text	dan, nno, nob, swe
SwissJudgementClassification	Classification	text	deu, fra, ita
TweetSentimentClassification	Classification	text	ara, deu, eng, fra, hin, ... (8)
CBD	Classification	text	pol
PolEmo2.0-OUT	Classification	text	pol
CSFDSKMovieReviewSentimentClassification	Classification	text	slk
DalajClassification	Classification	text	swe
WikiCitiesClustering	Clustering	text	eng
RomaniBibleClustering	Clustering	text	rom
BigPatentClustering.v2	Clustering	text	eng
BiorxivClusteringP2P.v2	Clustering	text	eng
AlloProfClusteringS2S.v2	Clustering	text	fra
HALClusteringS2S.v2	Clustering	text	fra
SIB200ClusteringS2S	Clustering	text	ace, acm, acq, aeb, afr, ... (197)
WikiClusteringP2P.v2	Clustering	text	bos, cat, ces, dan, eus, ... (14)
StackOverflowQA	Retrieval	text	eng
TwitterHjerneRetrieval	Retrieval	text	dan
LegalQuAD	Retrieval	text	deu
ArguAna	Retrieval	text	eng
HagridRetrieval	Retrieval	text	eng
LegalBenchCorporateLobbying	Retrieval	text	eng
LEMBPasskeyRetrieval	Retrieval	text	eng
SCIDOCS	Retrieval	text	eng
SpartQA	Retrieval	text	eng
TempReasonL1	Retrieval	text	eng
WinoGrande	Retrieval	text	eng
AlloprofRetrieval	Retrieval	text	fra
BelebeleRetrieval	Retrieval	text	acm, afr, als, amh, apc, ... (115)
StatcanDialogueDatasetRetrieval	Retrieval	text	eng, fra
WikipediaRetrievalMultilingual	Retrieval	text	ben, bul, ces, dan, deu, ... (16)
Core17InstructionRetrieval	InstructionReranking	text	eng
News21InstructionRetrieval	InstructionReranking	text	eng
Robust04InstructionRetrieval	InstructionReranking	text	eng
MalteseNewsClassification	MultilabelClassification	text	mlt
MultiEURLEXMultilabelClassification	MultilabelClassification	text	bul, ces, dan, deu, ell, ... (23)
CTKFactsNLI	PairClassification	text	ces
SprintDuplicateQuestions	PairClassification	text	eng
OpusparcusPC	PairClassification	text	deu, eng, fin, fra, rus, ... (6)
RTE3	PairClassification	text	deu, eng, fra, ita
XNLI	PairClassification	text	ara, bul, deu, ell, eng, ... (14)
PSC	PairClassification	text	pol
WebLINXCandidatesReranking	Reranking	text	eng
AlloprofReranking	Reranking	text	fra
WikipediaRerankingMultilingual	Reranking	text	ben, bul, ces, dan, deu, ... (18)
SICK-R	STS	text	eng
STS12	STS	text	eng
STS14	STS	text	eng
STS15	STS	text	eng
STSBenchmark	STS	text	eng
FinParaSTS	STS	text	fin
STS17	STS	text	ara, deu, eng, fra, ita, ... (9)
SICK-R-PL	STS	text	pol
STSES	STS	text	spa

Citation

@article{enevoldsen2025mmtebmassivemultilingualtext,
  author = {Kenneth Enevoldsen and Isaac Chung and Imene Kerboua and Márton Kardos and Ashwin Mathur and David Stap and Jay Gala and Wissam Siblini and Dominik Krzemiński and Genta Indra Winata and Saba Sturua and Saiteja Utpala and Mathieu Ciancone and Marion Schaeffer and Gabriel Sequeira and Diganta Misra and Shreeya Dhakal and Jonathan Rystrøm and Roman Solomatin and Ömer Çağatan and Akash Kundu and Martin Bernstorff and Shitao Xiao and Akshita Sukhlecha and Bhavish Pahwa and Rafał Poświata and Kranthi Kiran GV and Shawon Ashraf and Daniel Auras and Björn Plüster and Jan Philipp Harries and Loïc Magne and Isabelle Mohr and Mariya Hendriksen and Dawei Zhu and Hippolyte Gisserot-Boukhlef and Tom Aarsen and Jan Kostkan and Konrad Wojtasik and Taemin Lee and Marek Šuppa and Crystina Zhang and Roberta Rocca and Mohammed Hamdy and Andrianos Michail and John Yang and Manuel Faysse and Aleksei Vatolin and Nandan Thakur and Manan Dey and Dipam Vasani and Pranjal Chitale and Simone Tedeschi and Nguyen Tai and Artem Snegirev and Michael Günther and Mengzhou Xia and Weijia Shi and Xing Han Lù and Jordan Clive and Gayatri Krishnakumar and Anna Maksimova and Silvan Wehrli and Maria Tikhonova and Henil Panchal and Aleksandr Abramov and Malte Ostendorff and Zheng Liu and Simon Clematide and Lester James Miranda and Alena Fenogenova and Guangyu Song and Ruqiya Bin Safi and Wen-Ding Li and Alessia Borghini and Federico Cassano and Hongjin Su and Jimmy Lin and Howard Yen and Lasse Hansen and Sara Hooker and Chenghao Xiao and Vaibhav Adlakha and Orion Weller and Siva Reddy and Niklas Muennighoff},
  doi = {10.48550/arXiv.2502.13595},
  journal = {arXiv preprint arXiv:2502.13595},
  publisher = {arXiv},
  title = {MMTEB: Massive Multilingual Text Embedding Benchmark},
  url = {https://arxiv.org/abs/2502.13595},
  year = {2025},
}

`MTEB(Indic, v1)`¶

Text embedding quality across Indic languages spanning bitext mining, classification, clustering, pair classification, retrieval, reranking, and semantic similarity.

Tasks

name	type	modalities	languages
IN22ConvBitextMining	BitextMining	text	asm, ben, brx, doi, eng, ... (23)
IN22GenBitextMining	BitextMining	text	asm, ben, brx, doi, eng, ... (23)
SIB200ClusteringS2S	Clustering	text	ace, acm, acq, aeb, afr, ... (197)
BengaliSentimentAnalysis	Classification	text	ben
GujaratiNewsClassification	Classification	text	guj
HindiDiscourseClassification	Classification	text	hin
SentimentAnalysisHindi	Classification	text	hin
MalayalamNewsClassification	Classification	text	mal
MTOPIntentClassification	Classification	text	deu, eng, fra, hin, spa, ... (6)
MultiHateClassification	Classification	text	ara, cmn, deu, eng, fra, ... (11)
TweetSentimentClassification	Classification	text	ara, deu, eng, fra, hin, ... (8)
NepaliNewsClassification	Classification	text	nep
PunjabiNewsClassification	Classification	text	pan
SanskritShlokasClassification	Classification	text	san
UrduRomanSentimentClassification	Classification	text	urd
XNLI	PairClassification	text	ara, bul, deu, ell, eng, ... (14)
BelebeleRetrieval	Retrieval	text	acm, afr, als, amh, apc, ... (115)
XQuADRetrieval	Retrieval	text	arb, deu, ell, eng, hin, ... (12)
WikipediaRerankingMultilingual	Reranking	text	ben, bul, ces, dan, deu, ... (18)
IndicCrosslingualSTS	STS	text	asm, ben, eng, guj, hin, ... (13)

Citation

@article{enevoldsen2025mmtebmassivemultilingualtext,
  author = {Kenneth Enevoldsen and Isaac Chung and Imene Kerboua and Márton Kardos and Ashwin Mathur and David Stap and Jay Gala and Wissam Siblini and Dominik Krzemiński and Genta Indra Winata and Saba Sturua and Saiteja Utpala and Mathieu Ciancone and Marion Schaeffer and Gabriel Sequeira and Diganta Misra and Shreeya Dhakal and Jonathan Rystrøm and Roman Solomatin and Ömer Çağatan and Akash Kundu and Martin Bernstorff and Shitao Xiao and Akshita Sukhlecha and Bhavish Pahwa and Rafał Poświata and Kranthi Kiran GV and Shawon Ashraf and Daniel Auras and Björn Plüster and Jan Philipp Harries and Loïc Magne and Isabelle Mohr and Mariya Hendriksen and Dawei Zhu and Hippolyte Gisserot-Boukhlef and Tom Aarsen and Jan Kostkan and Konrad Wojtasik and Taemin Lee and Marek Šuppa and Crystina Zhang and Roberta Rocca and Mohammed Hamdy and Andrianos Michail and John Yang and Manuel Faysse and Aleksei Vatolin and Nandan Thakur and Manan Dey and Dipam Vasani and Pranjal Chitale and Simone Tedeschi and Nguyen Tai and Artem Snegirev and Michael Günther and Mengzhou Xia and Weijia Shi and Xing Han Lù and Jordan Clive and Gayatri Krishnakumar and Anna Maksimova and Silvan Wehrli and Maria Tikhonova and Henil Panchal and Aleksandr Abramov and Malte Ostendorff and Zheng Liu and Simon Clematide and Lester James Miranda and Alena Fenogenova and Guangyu Song and Ruqiya Bin Safi and Wen-Ding Li and Alessia Borghini and Federico Cassano and Hongjin Su and Jimmy Lin and Howard Yen and Lasse Hansen and Sara Hooker and Chenghao Xiao and Vaibhav Adlakha and Orion Weller and Siva Reddy and Niklas Muennighoff},
  doi = {10.48550/arXiv.2502.13595},
  journal = {arXiv preprint arXiv:2502.13595},
  publisher = {arXiv},
  title = {MMTEB: Massive Multilingual Text Embedding Benchmark},
  url = {https://arxiv.org/abs/2502.13595},
  year = {2025},
}

`MTEB(Law, v1)`¶

Legal document retrieval across case documents, statutes, legal Q&A, and legal summarization in multiple languages.

Tasks

name	type	modalities	languages
AILACasedocs	Retrieval	text	eng
AILAStatutes	Retrieval	text	eng
LegalSummarization	Retrieval	text	eng
GerDaLIRSmall	Retrieval	text	deu
LeCaRDv2	Retrieval	text	zho
LegalBenchConsumerContractsQA	Retrieval	text	eng
LegalBenchCorporateLobbying	Retrieval	text	eng
LegalQuAD	Retrieval	text	deu

`MTEB(Medical, v1)`¶

Medical information retrieval across clinical, biomedical, and consumer health domains, spanning retrieval, reranking, and clustering tasks.

Tasks

name	type	modalities	languages
CUREv1	Retrieval	text	eng, fra, spa
NFCorpus	Retrieval	text	eng
TRECCOVID	Retrieval	text	eng
TRECCOVID-PL	Retrieval	text	pol
SciFact	Retrieval	text	eng
SciFact-PL	Retrieval	text	pol
MedicalQARetrieval	Retrieval	text	eng
PublicHealthQA	Retrieval	text	ara, eng, fra, kor, rus, ... (8)
MedrxivClusteringP2P.v2	Clustering	text	eng
MedrxivClusteringS2S.v2	Clustering	text	eng
CmedqaRetrieval	Retrieval	text	cmn
CMedQAv2-reranking	Reranking	text	cmn

`MTEB(Multilingual, v1)`¶

Multilingual text embedding quality across 250+ languages spanning bitext mining, classification, clustering, retrieval, reranking, and semantic similarity. Superseded by MTEB(Multilingual, v2), which omits SNLHierarchicalClusteringP2P because it was removed from the Hugging Face Hub.

Learn more →

Tasks

name	type	modalities	languages
BornholmBitextMining	BitextMining	text	dan
BibleNLPBitextMining	BitextMining	text	aai, aak, aau, aaz, abt, ... (829)
BUCC.v2	BitextMining	text	cmn, deu, eng, fra, rus
DiaBlaBitextMining	BitextMining	text	eng, fra
FloresBitextMining	BitextMining	text	ace, acm, acq, aeb, afr, ... (196)
IN22GenBitextMining	BitextMining	text	asm, ben, brx, doi, eng, ... (23)
IndicGenBenchFloresBitextMining	BitextMining	text	asm, awa, ben, bgc, bho, ... (30)
NollySentiBitextMining	BitextMining	text	eng, hau, ibo, pcm, yor
NorwegianCourtsBitextMining	BitextMining	text	nno, nob
NTREXBitextMining	BitextMining	text	afr, amh, arb, aze, bak, ... (119)
NusaTranslationBitextMining	BitextMining	text	abs, bbc, bew, bhp, ind, ... (12)
NusaXBitextMining	BitextMining	text	ace, ban, bbc, bjn, bug, ... (12)
Tatoeba	BitextMining	text	afr, amh, ang, ara, arq, ... (113)
BulgarianStoreReviewSentimentClassfication	Classification	text	bul
CzechProductReviewSentimentClassification	Classification	text	ces
GreekLegalCodeClassification	Classification	text	ell
DBpediaClassification	Classification	text	eng
FinancialPhrasebankClassification	Classification	text	eng
PoemSentimentClassification	Classification	text	eng
ToxicConversationsClassification	Classification	text	eng
TweetTopicSingleClassification	Classification	text	eng
EstonianValenceClassification	Classification	text	est
FilipinoShopeeReviewsClassification	Classification	text	fil
GujaratiNewsClassification	Classification	text	guj
SentimentAnalysisHindi	Classification	text	hin
IndonesianIdClickbaitClassification	Classification	text	ind
ItaCaseholdClassification	Classification	text	ita
KorSarcasmClassification	Classification	text	kor
KurdishSentimentClassification	Classification	text	kur
MacedonianTweetSentimentClassification	Classification	text	mkd
AfriSentiClassification	Classification	text	amh, arq, ary, hau, ibo, ... (12)
AmazonCounterfactualClassification	Classification	text	deu, eng, jpn
CataloniaTweetClassification	Classification	text	cat, spa
CyrillicTurkicLangClassification	Classification	text	bak, chv, kaz, kir, krc, ... (9)
IndicLangClassification	Classification	text	asm, ben, brx, doi, gom, ... (22)
MasakhaNEWSClassification	Classification	text	amh, eng, fra, hau, ibo, ... (16)
MassiveIntentClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
MultiHateClassification	Classification	text	ara, cmn, deu, eng, fra, ... (11)
NordicLangClassification	Classification	text	dan, fao, isl, nno, nob, ... (6)
NusaParagraphEmotionClassification	Classification	text	bbc, bew, bug, jav, mad, ... (10)
NusaX-senti	Classification	text	ace, ban, bbc, bjn, bug, ... (12)
ScalaClassification	Classification	text	dan, nno, nob, swe
SwissJudgementClassification	Classification	text	deu, fra, ita
NepaliNewsClassification	Classification	text	nep
OdiaNewsClassification	Classification	text	ory
PunjabiNewsClassification	Classification	text	pan
PolEmo2.0-OUT	Classification	text	pol
PAC	Classification	text	pol
SinhalaNewsClassification	Classification	text	sin
CSFDSKMovieReviewSentimentClassification	Classification	text	slk
SiswatiNewsClassification	Classification	text	ssw
SlovakMovieReviewSentimentClassification	Classification	text	slk
SwahiliNewsClassification	Classification	text	swa
DalajClassification	Classification	text	swe
TswanaNewsClassification	Classification	text	tsn
IsiZuluNewsClassification	Classification	text	zul
WikiCitiesClustering	Clustering	text	eng
MasakhaNEWSClusteringS2S	Clustering	text	amh, eng, fra, hau, ibo, ... (16)
RomaniBibleClustering	Clustering	text	rom
ArXivHierarchicalClusteringP2P	Clustering	text	eng
ArXivHierarchicalClusteringS2S	Clustering	text	eng
BigPatentClustering.v2	Clustering	text	eng
BiorxivClusteringP2P.v2	Clustering	text	eng
MedrxivClusteringP2P.v2	Clustering	text	eng
StackExchangeClustering.v2	Clustering	text	eng
AlloProfClusteringS2S.v2	Clustering	text	fra
HALClusteringS2S.v2	Clustering	text	fra
SIB200ClusteringS2S	Clustering	text	ace, acm, acq, aeb, afr, ... (197)
WikiClusteringP2P.v2	Clustering	text	bos, cat, ces, dan, eus, ... (14)
PlscClusteringP2P.v2	Clustering	text	pol
SwednClusteringP2P	Clustering	text	swe
CLSClusteringP2P.v2	Clustering	text	cmn
StackOverflowQA	Retrieval	text	eng
TwitterHjerneRetrieval	Retrieval	text	dan
AILAStatutes	Retrieval	text	eng
ArguAna	Retrieval	text	eng
HagridRetrieval	Retrieval	text	eng
LegalBenchCorporateLobbying	Retrieval	text	eng
LEMBPasskeyRetrieval	Retrieval	text	eng
SCIDOCS	Retrieval	text	eng
SpartQA	Retrieval	text	eng
TempReasonL1	Retrieval	text	eng
TRECCOVID	Retrieval	text	eng
WinoGrande	Retrieval	text	eng
BelebeleRetrieval	Retrieval	text	acm, afr, als, amh, apc, ... (115)
MLQARetrieval	Retrieval	text	ara, deu, eng, hin, spa, ... (7)
StatcanDialogueDatasetRetrieval	Retrieval	text	eng, fra
WikipediaRetrievalMultilingual	Retrieval	text	ben, bul, ces, dan, deu, ... (16)
CovidRetrieval	Retrieval	text	cmn
Core17InstructionRetrieval	InstructionReranking	text	eng
News21InstructionRetrieval	InstructionReranking	text	eng
Robust04InstructionRetrieval	InstructionReranking	text	eng
KorHateSpeechMLClassification	MultilabelClassification	text	kor
MalteseNewsClassification	MultilabelClassification	text	mlt
MultiEURLEXMultilabelClassification	MultilabelClassification	text	bul, ces, dan, deu, ell, ... (23)
BrazilianToxicTweetsClassification	MultilabelClassification	text	por
CEDRClassification	MultilabelClassification	text	rus
CTKFactsNLI	PairClassification	text	ces
SprintDuplicateQuestions	PairClassification	text	eng
TwitterURLCorpus	PairClassification	text	eng
ArmenianParaphrasePC	PairClassification	text	hye
indonli	PairClassification	text	ind
OpusparcusPC	PairClassification	text	deu, eng, fin, fra, rus, ... (6)
PawsXPairClassification	PairClassification	text	cmn, deu, eng, fra, jpn, ... (7)
RTE3	PairClassification	text	deu, eng, fra, ita
XNLI	PairClassification	text	ara, bul, deu, ell, eng, ... (14)
PpcPC	PairClassification	text	pol
TERRa	PairClassification	text	rus
WebLINXCandidatesReranking	Reranking	text	eng
AlloprofReranking	Reranking	text	fra
VoyageMMarcoReranking	Reranking	text	jpn
WikipediaRerankingMultilingual	Reranking	text	ben, bul, ces, dan, deu, ... (18)
RuBQReranking	Reranking	text	rus
T2Reranking	Reranking	text	cmn
GermanSTSBenchmark	STS	text	deu
SICK-R	STS	text	eng
STS12	STS	text	eng
STS13	STS	text	eng
STS14	STS	text	eng
STS15	STS	text	eng
STSBenchmark	STS	text	eng
FaroeseSTS	STS	text	fao
FinParaSTS	STS	text	fin
JSICK	STS	text	jpn
IndicCrosslingualSTS	STS	text	asm, ben, eng, guj, hin, ... (13)
SemRel24STS	STS	text	afr, amh, arb, arq, ary, ... (12)
STS17	STS	text	ara, deu, eng, fra, ita, ... (9)
STS22.v2	STS	text	ara, cmn, deu, eng, fra, ... (10)
STSES	STS	text	spa
STSB	STS	text	cmn
MIRACLRetrievalHardNegatives	Retrieval	text	ara, ben, deu, eng, fas, ... (18)
SNLHierarchicalClusteringP2P	Clustering	text	nob
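
Since this benchmark is superseded, a script that accepts a benchmark name may want to resolve it to its successor before loading tasks. The following is a minimal sketch, not the library's own mechanism. The `SUPERSEDED_BY` mapping and both helper names are hypothetical, with entries taken from the descriptions on this page. The calls `mteb.get_benchmark`, `.tasks`, and `.metadata.name` are assumed from the `mteb` Python package and may differ between versions.

```python
# Successor names as stated in the benchmark descriptions on this page.
SUPERSEDED_BY = {
    "MTEB(Multilingual, v1)": "MTEB(Multilingual, v2)",
    "MTEB(eng, v1)": "MTEB(eng, v2)",
}


def resolve_benchmark(name: str) -> str:
    """Return the recommended successor, or the name itself if none is listed."""
    return SUPERSEDED_BY.get(name, name)


def list_task_names(name: str) -> list[str]:
    """Load a benchmark by name and return its task names (requires mteb)."""
    # Imported lazily so resolve_benchmark works without mteb installed.
    import mteb

    benchmark = mteb.get_benchmark(resolve_benchmark(name))  # assumed API
    return [task.metadata.name for task in benchmark.tasks]


print(resolve_benchmark("MTEB(Multilingual, v1)"))
```

Names without a listed successor pass through unchanged, so the helper can be applied to any heading on this page.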

Citation

@article{enevoldsen2025mmtebmassivemultilingualtext,
  author = {Kenneth Enevoldsen and Isaac Chung and Imene Kerboua and Márton Kardos and Ashwin Mathur and David Stap and Jay Gala and Wissam Siblini and Dominik Krzemiński and Genta Indra Winata and Saba Sturua and Saiteja Utpala and Mathieu Ciancone and Marion Schaeffer and Gabriel Sequeira and Diganta Misra and Shreeya Dhakal and Jonathan Rystrøm and Roman Solomatin and Ömer Çağatan and Akash Kundu and Martin Bernstorff and Shitao Xiao and Akshita Sukhlecha and Bhavish Pahwa and Rafał Poświata and Kranthi Kiran GV and Shawon Ashraf and Daniel Auras and Björn Plüster and Jan Philipp Harries and Loïc Magne and Isabelle Mohr and Mariya Hendriksen and Dawei Zhu and Hippolyte Gisserot-Boukhlef and Tom Aarsen and Jan Kostkan and Konrad Wojtasik and Taemin Lee and Marek Šuppa and Crystina Zhang and Roberta Rocca and Mohammed Hamdy and Andrianos Michail and John Yang and Manuel Faysse and Aleksei Vatolin and Nandan Thakur and Manan Dey and Dipam Vasani and Pranjal Chitale and Simone Tedeschi and Nguyen Tai and Artem Snegirev and Michael Günther and Mengzhou Xia and Weijia Shi and Xing Han Lù and Jordan Clive and Gayatri Krishnakumar and Anna Maksimova and Silvan Wehrli and Maria Tikhonova and Henil Panchal and Aleksandr Abramov and Malte Ostendorff and Zheng Liu and Simon Clematide and Lester James Miranda and Alena Fenogenova and Guangyu Song and Ruqiya Bin Safi and Wen-Ding Li and Alessia Borghini and Federico Cassano and Hongjin Su and Jimmy Lin and Howard Yen and Lasse Hansen and Sara Hooker and Chenghao Xiao and Vaibhav Adlakha and Orion Weller and Siva Reddy and Niklas Muennighoff},
  doi = {10.48550/arXiv.2502.13595},
  journal = {arXiv preprint arXiv:2502.13595},
  publisher = {arXiv},
  title = {MMTEB: Massive Multilingual Text Embedding Benchmark},
  url = {https://arxiv.org/abs/2502.13595},
  year = {2025},
}

`MTEB(Multilingual, v2)`¶

MMTEB measures multilingual text embedding quality across 250+ languages, spanning classification, clustering, retrieval, semantic similarity, and more, driven by curated community contributions.

Learn more →

Tasks

name	type	modalities	languages
BornholmBitextMining	BitextMining	text	dan
BibleNLPBitextMining	BitextMining	text	aai, aak, aau, aaz, abt, ... (829)
BUCC.v2	BitextMining	text	cmn, deu, eng, fra, rus
DiaBlaBitextMining	BitextMining	text	eng, fra
FloresBitextMining	BitextMining	text	ace, acm, acq, aeb, afr, ... (196)
IN22GenBitextMining	BitextMining	text	asm, ben, brx, doi, eng, ... (23)
IndicGenBenchFloresBitextMining	BitextMining	text	asm, awa, ben, bgc, bho, ... (30)
NollySentiBitextMining	BitextMining	text	eng, hau, ibo, pcm, yor
NorwegianCourtsBitextMining	BitextMining	text	nno, nob
NTREXBitextMining	BitextMining	text	afr, amh, arb, aze, bak, ... (119)
NusaTranslationBitextMining	BitextMining	text	abs, bbc, bew, bhp, ind, ... (12)
NusaXBitextMining	BitextMining	text	ace, ban, bbc, bjn, bug, ... (12)
Tatoeba	BitextMining	text	afr, amh, ang, ara, arq, ... (113)
BulgarianStoreReviewSentimentClassfication	Classification	text	bul
CzechProductReviewSentimentClassification	Classification	text	ces
GreekLegalCodeClassification	Classification	text	ell
DBpediaClassification	Classification	text	eng
FinancialPhrasebankClassification	Classification	text	eng
PoemSentimentClassification	Classification	text	eng
ToxicConversationsClassification	Classification	text	eng
TweetTopicSingleClassification	Classification	text	eng
EstonianValenceClassification	Classification	text	est
FilipinoShopeeReviewsClassification	Classification	text	fil
GujaratiNewsClassification	Classification	text	guj
SentimentAnalysisHindi	Classification	text	hin
IndonesianIdClickbaitClassification	Classification	text	ind
ItaCaseholdClassification	Classification	text	ita
KorSarcasmClassification	Classification	text	kor
KurdishSentimentClassification	Classification	text	kur
MacedonianTweetSentimentClassification	Classification	text	mkd
AfriSentiClassification	Classification	text	amh, arq, ary, hau, ibo, ... (12)
AmazonCounterfactualClassification	Classification	text	deu, eng, jpn
CataloniaTweetClassification	Classification	text	cat, spa
CyrillicTurkicLangClassification	Classification	text	bak, chv, kaz, kir, krc, ... (9)
IndicLangClassification	Classification	text	asm, ben, brx, doi, gom, ... (22)
MasakhaNEWSClassification	Classification	text	amh, eng, fra, hau, ibo, ... (16)
MassiveIntentClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
MultiHateClassification	Classification	text	ara, cmn, deu, eng, fra, ... (11)
NordicLangClassification	Classification	text	dan, fao, isl, nno, nob, ... (6)
NusaParagraphEmotionClassification	Classification	text	bbc, bew, bug, jav, mad, ... (10)
NusaX-senti	Classification	text	ace, ban, bbc, bjn, bug, ... (12)
ScalaClassification	Classification	text	dan, nno, nob, swe
SwissJudgementClassification	Classification	text	deu, fra, ita
NepaliNewsClassification	Classification	text	nep
OdiaNewsClassification	Classification	text	ory
PunjabiNewsClassification	Classification	text	pan
PolEmo2.0-OUT	Classification	text	pol
PAC	Classification	text	pol
SinhalaNewsClassification	Classification	text	sin
CSFDSKMovieReviewSentimentClassification	Classification	text	slk
SiswatiNewsClassification	Classification	text	ssw
SlovakMovieReviewSentimentClassification	Classification	text	slk
SwahiliNewsClassification	Classification	text	swa
DalajClassification	Classification	text	swe
TswanaNewsClassification	Classification	text	tsn
IsiZuluNewsClassification	Classification	text	zul
WikiCitiesClustering	Clustering	text	eng
MasakhaNEWSClusteringS2S	Clustering	text	amh, eng, fra, hau, ibo, ... (16)
RomaniBibleClustering	Clustering	text	rom
ArXivHierarchicalClusteringP2P	Clustering	text	eng
ArXivHierarchicalClusteringS2S	Clustering	text	eng
BigPatentClustering.v2	Clustering	text	eng
BiorxivClusteringP2P.v2	Clustering	text	eng
MedrxivClusteringP2P.v2	Clustering	text	eng
StackExchangeClustering.v2	Clustering	text	eng
AlloProfClusteringS2S.v2	Clustering	text	fra
HALClusteringS2S.v2	Clustering	text	fra
SIB200ClusteringS2S	Clustering	text	ace, acm, acq, aeb, afr, ... (197)
WikiClusteringP2P.v2	Clustering	text	bos, cat, ces, dan, eus, ... (14)
PlscClusteringP2P.v2	Clustering	text	pol
SwednClusteringP2P	Clustering	text	swe
CLSClusteringP2P.v2	Clustering	text	cmn
StackOverflowQA	Retrieval	text	eng
TwitterHjerneRetrieval	Retrieval	text	dan
AILAStatutes	Retrieval	text	eng
ArguAna	Retrieval	text	eng
HagridRetrieval	Retrieval	text	eng
LegalBenchCorporateLobbying	Retrieval	text	eng
LEMBPasskeyRetrieval	Retrieval	text	eng
SCIDOCS	Retrieval	text	eng
SpartQA	Retrieval	text	eng
TempReasonL1	Retrieval	text	eng
TRECCOVID	Retrieval	text	eng
WinoGrande	Retrieval	text	eng
BelebeleRetrieval	Retrieval	text	acm, afr, als, amh, apc, ... (115)
MLQARetrieval	Retrieval	text	ara, deu, eng, hin, spa, ... (7)
StatcanDialogueDatasetRetrieval	Retrieval	text	eng, fra
WikipediaRetrievalMultilingual	Retrieval	text	ben, bul, ces, dan, deu, ... (16)
CovidRetrieval	Retrieval	text	cmn
Core17InstructionRetrieval	InstructionReranking	text	eng
News21InstructionRetrieval	InstructionReranking	text	eng
Robust04InstructionRetrieval	InstructionReranking	text	eng
KorHateSpeechMLClassification	MultilabelClassification	text	kor
MalteseNewsClassification	MultilabelClassification	text	mlt
MultiEURLEXMultilabelClassification	MultilabelClassification	text	bul, ces, dan, deu, ell, ... (23)
BrazilianToxicTweetsClassification	MultilabelClassification	text	por
CEDRClassification	MultilabelClassification	text	rus
CTKFactsNLI	PairClassification	text	ces
SprintDuplicateQuestions	PairClassification	text	eng
TwitterURLCorpus	PairClassification	text	eng
ArmenianParaphrasePC	PairClassification	text	hye
indonli	PairClassification	text	ind
OpusparcusPC	PairClassification	text	deu, eng, fin, fra, rus, ... (6)
PawsXPairClassification	PairClassification	text	cmn, deu, eng, fra, jpn, ... (7)
RTE3	PairClassification	text	deu, eng, fra, ita
XNLI	PairClassification	text	ara, bul, deu, ell, eng, ... (14)
PpcPC	PairClassification	text	pol
TERRa	PairClassification	text	rus
WebLINXCandidatesReranking	Reranking	text	eng
AlloprofReranking	Reranking	text	fra
VoyageMMarcoReranking	Reranking	text	jpn
WikipediaRerankingMultilingual	Reranking	text	ben, bul, ces, dan, deu, ... (18)
RuBQReranking	Reranking	text	rus
T2Reranking	Reranking	text	cmn
GermanSTSBenchmark	STS	text	deu
SICK-R	STS	text	eng
STS12	STS	text	eng
STS13	STS	text	eng
STS14	STS	text	eng
STS15	STS	text	eng
STSBenchmark	STS	text	eng
FaroeseSTS	STS	text	fao
FinParaSTS	STS	text	fin
JSICK	STS	text	jpn
IndicCrosslingualSTS	STS	text	asm, ben, eng, guj, hin, ... (13)
SemRel24STS	STS	text	afr, amh, arb, arq, ary, ... (12)
STS17	STS	text	ara, deu, eng, fra, ita, ... (9)
STS22.v2	STS	text	ara, cmn, deu, eng, fra, ... (10)
STSES	STS	text	spa
STSB	STS	text	cmn
MIRACLRetrievalHardNegatives	Retrieval	text	ara, ben, deu, eng, fas, ... (18)

Citation

@article{enevoldsen2025mmtebmassivemultilingualtext,
  author = {Kenneth Enevoldsen and Isaac Chung and Imene Kerboua and Márton Kardos and Ashwin Mathur and David Stap and Jay Gala and Wissam Siblini and Dominik Krzemiński and Genta Indra Winata and Saba Sturua and Saiteja Utpala and Mathieu Ciancone and Marion Schaeffer and Gabriel Sequeira and Diganta Misra and Shreeya Dhakal and Jonathan Rystrøm and Roman Solomatin and Ömer Çağatan and Akash Kundu and Martin Bernstorff and Shitao Xiao and Akshita Sukhlecha and Bhavish Pahwa and Rafał Poświata and Kranthi Kiran GV and Shawon Ashraf and Daniel Auras and Björn Plüster and Jan Philipp Harries and Loïc Magne and Isabelle Mohr and Mariya Hendriksen and Dawei Zhu and Hippolyte Gisserot-Boukhlef and Tom Aarsen and Jan Kostkan and Konrad Wojtasik and Taemin Lee and Marek Šuppa and Crystina Zhang and Roberta Rocca and Mohammed Hamdy and Andrianos Michail and John Yang and Manuel Faysse and Aleksei Vatolin and Nandan Thakur and Manan Dey and Dipam Vasani and Pranjal Chitale and Simone Tedeschi and Nguyen Tai and Artem Snegirev and Michael Günther and Mengzhou Xia and Weijia Shi and Xing Han Lù and Jordan Clive and Gayatri Krishnakumar and Anna Maksimova and Silvan Wehrli and Maria Tikhonova and Henil Panchal and Aleksandr Abramov and Malte Ostendorff and Zheng Liu and Simon Clematide and Lester James Miranda and Alena Fenogenova and Guangyu Song and Ruqiya Bin Safi and Wen-Ding Li and Alessia Borghini and Federico Cassano and Hongjin Su and Jimmy Lin and Howard Yen and Lasse Hansen and Sara Hooker and Chenghao Xiao and Vaibhav Adlakha and Orion Weller and Siva Reddy and Niklas Muennighoff},
  doi = {10.48550/arXiv.2502.13595},
  journal = {arXiv preprint arXiv:2502.13595},
  publisher = {arXiv},
  title = {MMTEB: Massive Multilingual Text Embedding Benchmark},
  url = {https://arxiv.org/abs/2502.13595},
  year = {2025},
}

`MTEB(Scandinavian, v1)`¶

Scandinavian text embedding quality covering Danish, Swedish, and Norwegian (Bokmål and Nynorsk), spanning classification, clustering, and retrieval, as well as bitext mining across dialects and written forms.

Learn more →

Tasks

name	type	modalities	languages
BornholmBitextMining	BitextMining	text	dan
NorwegianCourtsBitextMining	BitextMining	text	nno, nob
AngryTweetsClassification	Classification	text	dan
DanishPoliticalCommentsClassification	Classification	text	dan
DalajClassification	Classification	text	swe
DKHateClassification	Classification	text	dan
LccSentimentClassification	Classification	text	dan
MassiveIntentClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
MassiveScenarioClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
NordicLangClassification	Classification	text	dan, fao, isl, nno, nob, ... (6)
NoRecClassification	Classification	text	nob
NorwegianParliamentClassification	Classification	text	nob
ScalaClassification	Classification	text	dan, nno, nob, swe
SwedishSentimentClassification	Classification	text	swe
SweRecClassification	Classification	text	swe
DanFeverRetrieval	Retrieval	text	dan
NorQuadRetrieval	Retrieval	text	nob
SNLRetrieval	Retrieval	text	nob
SwednRetrieval	Retrieval	text	swe
SweFaqRetrieval	Retrieval	text	swe
TV2Nordretrieval	Retrieval	text	dan
TwitterHjerneRetrieval	Retrieval	text	dan
SNLHierarchicalClusteringS2S	Clustering	text	nob
SNLHierarchicalClusteringP2P	Clustering	text	nob
SwednClusteringP2P	Clustering	text	swe
SwednClusteringS2S	Clustering	text	swe
VGHierarchicalClusteringS2S	Clustering	text	nob
VGHierarchicalClusteringP2P	Clustering	text	nob

Citation

@article{enevoldsenScandinavianEmbeddingBenchmarks2024,
  author = {Enevoldsen, Kenneth and Kardos, Márton and Muennighoff, Niklas and Nielbo, Kristoffer},
  language = {en},
  month = feb,
  shorttitle = {The {Scandinavian} {Embedding} {Benchmarks}},
  title = {The {Scandinavian} {Embedding} {Benchmarks}: {Comprehensive} {Assessment} of {Multilingual} and {Monolingual} {Text} {Embedding}},
  url = {https://openreview.net/forum?id=pJl_i7HIA72},
  urldate = {2024-04-12},
  year = {2024},
}

`MTEB(cmn, v1)`¶

Chinese text embedding quality across retrieval, reranking, pair classification, clustering, classification, and semantic similarity.

Learn more →

Tasks

name	type	modalities	languages
T2Retrieval	Retrieval	text	cmn
MMarcoRetrieval	Retrieval	text	cmn
DuRetrieval	Retrieval	text	cmn
CovidRetrieval	Retrieval	text	cmn
CmedqaRetrieval	Retrieval	text	cmn
EcomRetrieval	Retrieval	text	cmn
MedicalRetrieval	Retrieval	text	cmn
VideoRetrieval	Retrieval	text	cmn
T2Reranking	Reranking	text	cmn
MMarcoReranking	Reranking	text	cmn
CMedQAv1-reranking	Reranking	text	cmn
CMedQAv2-reranking	Reranking	text	cmn
Ocnli	PairClassification	text	cmn
Cmnli	PairClassification	text	cmn
CLSClusteringS2S	Clustering	text	cmn
CLSClusteringP2P	Clustering	text	cmn
ThuNewsClusteringS2S	Clustering	text	cmn
ThuNewsClusteringP2P	Clustering	text	cmn
LCQMC	STS	text	cmn
PAWSX	STS	text	cmn
AFQMC	STS	text	cmn
QBQTC	STS	text	cmn
TNews	Classification	text	cmn
IFlyTek	Classification	text	cmn
Waimai	Classification	text	cmn
OnlineShopping	Classification	text	cmn
JDReview	Classification	text	cmn
MultilingualSentiment	Classification	text	cmn
ATEC	STS	text	cmn
BQ	STS	text	cmn
STSB	STS	text	cmn

Citation

@misc{xiao2024cpackpackagedresourcesadvance,
  archiveprefix = {arXiv},
  author = {Shitao Xiao and Zheng Liu and Peitian Zhang and Niklas Muennighoff and Defu Lian and Jian-Yun Nie},
  eprint = {2309.07597},
  primaryclass = {cs.CL},
  title = {C-Pack: Packaged Resources To Advance General Chinese Embedding},
  url = {https://arxiv.org/abs/2309.07597},
  year = {2024},
}

`MTEB(deu, v1)`¶

German text embedding quality across classification, clustering, pair classification, reranking, retrieval, and semantic similarity.

Learn more →

Tasks

name	type	modalities	languages
AmazonCounterfactualClassification	Classification	text	deu, eng, jpn
AmazonReviewsClassification	Classification	text	cmn, deu, eng, fra, jpn, ... (6)
MTOPDomainClassification	Classification	text	deu, eng, fra, hin, spa, ... (6)
MTOPIntentClassification	Classification	text	deu, eng, fra, hin, spa, ... (6)
MassiveIntentClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
MassiveScenarioClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
BlurbsClusteringP2P	Clustering	text	deu
BlurbsClusteringS2S	Clustering	text	deu
TenKGnadClusteringP2P	Clustering	text	deu
TenKGnadClusteringS2S	Clustering	text	deu
FalseFriendsGermanEnglish	PairClassification	text	deu
PawsXPairClassification	PairClassification	text	cmn, deu, eng, fra, jpn, ... (7)
MIRACLReranking	Reranking	text	ara, ben, deu, eng, fas, ... (18)
GermanQuAD-Retrieval	Retrieval	text	deu
GermanDPR	Retrieval	text	deu
XMarket	Retrieval	text	deu, eng, spa
GerDaLIR	Retrieval	text	deu
GermanSTSBenchmark	STS	text	deu
STS22	STS	text	ara, cmn, deu, eng, fra, ... (10)
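
To check how the tasks above are distributed over the task types named in the description, the `type` column can be tallied. This is an illustrative sketch: the `TASKS` list is copied by hand from the table above rather than loaded from any library.

```python
from collections import Counter

# (name, type) pairs copied from the MTEB(deu, v1) table above.
TASKS = [
    ("AmazonCounterfactualClassification", "Classification"),
    ("AmazonReviewsClassification", "Classification"),
    ("MTOPDomainClassification", "Classification"),
    ("MTOPIntentClassification", "Classification"),
    ("MassiveIntentClassification", "Classification"),
    ("MassiveScenarioClassification", "Classification"),
    ("BlurbsClusteringP2P", "Clustering"),
    ("BlurbsClusteringS2S", "Clustering"),
    ("TenKGnadClusteringP2P", "Clustering"),
    ("TenKGnadClusteringS2S", "Clustering"),
    ("FalseFriendsGermanEnglish", "PairClassification"),
    ("PawsXPairClassification", "PairClassification"),
    ("MIRACLReranking", "Reranking"),
    ("GermanQuAD-Retrieval", "Retrieval"),
    ("GermanDPR", "Retrieval"),
    ("XMarket", "Retrieval"),
    ("GerDaLIR", "Retrieval"),
    ("GermanSTSBenchmark", "STS"),
    ("STS22", "STS"),
]

# Count how many tasks fall under each task type.
counts = Counter(task_type for _, task_type in TASKS)

for task_type, n in sorted(counts.items()):
    print(f"{task_type}: {n}")
```

The same tally applies to any other table on this page once its rows are copied in the same form.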

Citation

@misc{wehrli2024germantextembeddingclustering,
  archiveprefix = {arXiv},
  author = {Silvan Wehrli and Bert Arnrich and Christopher Irrgang},
  eprint = {2401.02709},
  primaryclass = {cs.CL},
  title = {German Text Embedding Clustering Benchmark},
  url = {https://arxiv.org/abs/2401.02709},
  year = {2024},
}

`MTEB(eng, v1)`¶

English text embedding quality across classification, clustering, retrieval, reranking, pair classification, and semantic similarity. We recommend using MTEB(eng, v2) instead, which resolves a known scoring bug, uses updated task versions, and removes common fine-tuning datasets such as MSMARCO for more comparable scores.

Tasks

name	type	modalities	languages
AmazonPolarityClassification	Classification	text	eng
AmazonReviewsClassification	Classification	text	cmn, deu, eng, fra, jpn, ... (6)
ArguAna	Retrieval	text	eng
ArxivClusteringP2P	Clustering	text	eng
ArxivClusteringS2S	Clustering	text	eng
AskUbuntuDupQuestions	Reranking	text	eng
BIOSSES	STS	text	eng
Banking77Classification	Classification	text	eng
BiorxivClusteringP2P	Clustering	text	eng
BiorxivClusteringS2S	Clustering	text	eng
CQADupstackRetrieval	Retrieval	text	eng
ClimateFEVER	Retrieval	text	eng
DBPedia	Retrieval	text	eng
EmotionClassification	Classification	text	eng
FEVER	Retrieval	text	eng
FiQA2018	Retrieval	text	eng
HotpotQA	Retrieval	text	eng
ImdbClassification	Classification	text	eng
MTOPDomainClassification	Classification	text	deu, eng, fra, hin, spa, ... (6)
MTOPIntentClassification	Classification	text	deu, eng, fra, hin, spa, ... (6)
MassiveIntentClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
MassiveScenarioClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
MedrxivClusteringP2P	Clustering	text	eng
MedrxivClusteringS2S	Clustering	text	eng
MindSmallReranking	Reranking	text	eng
NFCorpus	Retrieval	text	eng
NQ	Retrieval	text	eng
QuoraRetrieval	Retrieval	text	eng
RedditClustering	Clustering	text	eng
RedditClusteringP2P	Clustering	text	eng
SCIDOCS	Retrieval	text	eng
SICK-R	STS	text	eng
STS12	STS	text	eng
STS13	STS	text	eng
STS14	STS	text	eng
STS15	STS	text	eng
STS16	STS	text	eng
STSBenchmark	STS	text	eng
SciDocsRR	Reranking	text	eng
SciFact	Retrieval	text	eng
SprintDuplicateQuestions	PairClassification	text	eng
StackExchangeClustering	Clustering	text	eng
StackExchangeClusteringP2P	Clustering	text	eng
StackOverflowDupQuestions	Reranking	text	eng
SummEval	Summarization	text	eng
TRECCOVID	Retrieval	text	eng
Touche2020	Retrieval	text	eng
ToxicConversationsClassification	Classification	text	eng
TweetSentimentExtractionClassification	Classification	text	eng
TwentyNewsgroupsClustering	Clustering	text	eng
TwitterSemEval2015	PairClassification	text	eng
TwitterURLCorpus	PairClassification	text	eng
MSMARCO	Retrieval	text	eng
AmazonCounterfactualClassification	Classification	text	deu, eng, jpn
STS17	STS	text	ara, deu, eng, fra, ita, ... (9)
STS22	STS	text	ara, cmn, deu, eng, fra, ... (10)

Citation

@article{muennighoff2022mteb,
  author = {Muennighoff, Niklas and Tazi, Nouamane and Magne, Loïc and Reimers, Nils},
  doi = {10.48550/ARXIV.2210.07316},
  journal = {arXiv preprint arXiv:2210.07316},
  publisher = {arXiv},
  title = {MTEB: Massive Text Embedding Benchmark},
  url = {https://arxiv.org/abs/2210.07316},
  year = {2022},
}

`MTEB(eng, v2)`¶

English text embedding quality across classification, clustering, retrieval, reranking, pair classification, and semantic similarity, prioritizing tasks not commonly used for fine-tuning to give a more realistic estimate of generalization performance. The original v1 leaderboard is available under MTEB(eng, v1).

Tasks

name	type	modalities	languages
ArguAna	Retrieval	text	eng
ArXivHierarchicalClusteringP2P	Clustering	text	eng
ArXivHierarchicalClusteringS2S	Clustering	text	eng
AskUbuntuDupQuestions	Reranking	text	eng
BIOSSES	STS	text	eng
Banking77Classification	Classification	text	eng
BiorxivClusteringP2P.v2	Clustering	text	eng
CQADupstackGamingRetrieval	Retrieval	text	eng
CQADupstackUnixRetrieval	Retrieval	text	eng
ClimateFEVERHardNegatives	Retrieval	text	eng
FEVERHardNegatives	Retrieval	text	eng
FiQA2018	Retrieval	text	eng
HotpotQAHardNegatives	Retrieval	text	eng
ImdbClassification	Classification	text	eng
MTOPDomainClassification	Classification	text	deu, eng, fra, hin, spa, ... (6)
MassiveIntentClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
MassiveScenarioClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
MedrxivClusteringP2P.v2	Clustering	text	eng
MedrxivClusteringS2S.v2	Clustering	text	eng
MindSmallReranking	Reranking	text	eng
SCIDOCS	Retrieval	text	eng
SICK-R	STS	text	eng
STS12	STS	text	eng
STS13	STS	text	eng
STS14	STS	text	eng
STS15	STS	text	eng
STSBenchmark	STS	text	eng
SprintDuplicateQuestions	PairClassification	text	eng
StackExchangeClustering.v2	Clustering	text	eng
StackExchangeClusteringP2P.v2	Clustering	text	eng
TRECCOVID	Retrieval	text	eng
Touche2020Retrieval.v3	Retrieval	text	eng
ToxicConversationsClassification	Classification	text	eng
TweetSentimentExtractionClassification	Classification	text	eng
TwentyNewsgroupsClustering.v2	Clustering	text	eng
TwitterSemEval2015	PairClassification	text	eng
TwitterURLCorpus	PairClassification	text	eng
SummEvalSummarization.v2	Summarization	text	eng
AmazonCounterfactualClassification	Classification	text	deu, eng, jpn
STS17	STS	text	ara, deu, eng, fra, ita, ... (9)
STS22.v2	STS	text	ara, cmn, deu, eng, fra, ... (10)
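
To see which retrieval tasks changed between MTEB(eng, v1) and MTEB(eng, v2), the two task lists can be compared by name. The sketch below copies the Retrieval rows from both tables by hand and uses exact name matching only. Under that assumption, a task replaced by a renamed variant (for example `FEVER` and `FEVERHardNegatives`) appears as both removed and added.

```python
# Retrieval task names copied from the MTEB(eng, v1) table.
V1_RETRIEVAL = {
    "ArguAna", "CQADupstackRetrieval", "ClimateFEVER", "DBPedia", "FEVER",
    "FiQA2018", "HotpotQA", "NFCorpus", "NQ", "QuoraRetrieval", "SCIDOCS",
    "SciFact", "TRECCOVID", "Touche2020", "MSMARCO",
}

# Retrieval task names copied from the MTEB(eng, v2) table.
V2_RETRIEVAL = {
    "ArguAna", "CQADupstackGamingRetrieval", "CQADupstackUnixRetrieval",
    "ClimateFEVERHardNegatives", "FEVERHardNegatives", "FiQA2018",
    "HotpotQAHardNegatives", "SCIDOCS", "TRECCOVID", "Touche2020Retrieval.v3",
}

kept = sorted(V1_RETRIEVAL & V2_RETRIEVAL)     # same name in both versions
only_v1 = sorted(V1_RETRIEVAL - V2_RETRIEVAL)  # dropped or renamed in v2
only_v2 = sorted(V2_RETRIEVAL - V1_RETRIEVAL)  # new or renamed in v2

print("kept:", kept)
print("only in v1:", only_v1)
print("only in v2:", only_v2)
```

`MSMARCO` lands in the v1-only set, consistent with the note that v2 removes common fine-tuning datasets.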

Citation

@article{enevoldsen2025mmtebmassivemultilingualtext,
  author = {Kenneth Enevoldsen and Isaac Chung and Imene Kerboua and Márton Kardos and Ashwin Mathur and David Stap and Jay Gala and Wissam Siblini and Dominik Krzemiński and Genta Indra Winata and Saba Sturua and Saiteja Utpala and Mathieu Ciancone and Marion Schaeffer and Gabriel Sequeira and Diganta Misra and Shreeya Dhakal and Jonathan Rystrøm and Roman Solomatin and Ömer Çağatan and Akash Kundu and Martin Bernstorff and Shitao Xiao and Akshita Sukhlecha and Bhavish Pahwa and Rafał Poświata and Kranthi Kiran GV and Shawon Ashraf and Daniel Auras and Björn Plüster and Jan Philipp Harries and Loïc Magne and Isabelle Mohr and Mariya Hendriksen and Dawei Zhu and Hippolyte Gisserot-Boukhlef and Tom Aarsen and Jan Kostkan and Konrad Wojtasik and Taemin Lee and Marek Šuppa and Crystina Zhang and Roberta Rocca and Mohammed Hamdy and Andrianos Michail and John Yang and Manuel Faysse and Aleksei Vatolin and Nandan Thakur and Manan Dey and Dipam Vasani and Pranjal Chitale and Simone Tedeschi and Nguyen Tai and Artem Snegirev and Michael Günther and Mengzhou Xia and Weijia Shi and Xing Han Lù and Jordan Clive and Gayatri Krishnakumar and Anna Maksimova and Silvan Wehrli and Maria Tikhonova and Henil Panchal and Aleksandr Abramov and Malte Ostendorff and Zheng Liu and Simon Clematide and Lester James Miranda and Alena Fenogenova and Guangyu Song and Ruqiya Bin Safi and Wen-Ding Li and Alessia Borghini and Federico Cassano and Hongjin Su and Jimmy Lin and Howard Yen and Lasse Hansen and Sara Hooker and Chenghao Xiao and Vaibhav Adlakha and Orion Weller and Siva Reddy and Niklas Muennighoff},
  doi = {10.48550/arXiv.2502.13595},
  journal = {arXiv preprint arXiv:2502.13595},
  publisher = {arXiv},
  title = {MMTEB: Massive Multilingual Text Embedding Benchmark},
  url = {https://arxiv.org/abs/2502.13595},
  year = {2025},
}

`MTEB(fas, v1)`¶

Persian text embedding quality across classification, clustering, pair classification, reranking, retrieval, semantic similarity, and summarization retrieval.

Learn more →

Tasks

name	type	modalities	languages
PersianFoodSentimentClassification	Classification	text	fas
SynPerChatbotConvSAClassification	Classification	text	fas
SynPerChatbotConvSAToneChatbotClassification	Classification	text	fas
SynPerChatbotConvSAToneUserClassification	Classification	text	fas
SynPerChatbotSatisfactionLevelClassification	Classification	text	fas
SynPerChatbotRAGToneChatbotClassification	Classification	text	fas
SynPerChatbotRAGToneUserClassification	Classification	text	fas
SynPerChatbotToneChatbotClassification	Classification	text	fas
SynPerChatbotToneUserClassification	Classification	text	fas
SynPerTextToneClassification	Classification	text	fas
SIDClassification	Classification	text	fas
DeepSentiPers	Classification	text	fas
PersianTextEmotion	Classification	text	fas
SentimentDKSF	Classification	text	fas
NLPTwitterAnalysisClassification	Classification	text	fas
DigikalamagClassification	Classification	text	fas
MassiveIntentClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
MassiveScenarioClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
BeytooteClustering	Clustering	text	fas
DigikalamagClustering	Clustering	text	fas
HamshahriClustring	Clustering	text	fas
NLPTwitterAnalysisClustering	Clustering	text	fas
SIDClustring	Clustering	text	fas
FarsTail	PairClassification	text	fas
CExaPPC	PairClassification	text	fas
SynPerChatbotRAGFAQPC	PairClassification	text	fas
FarsiParaphraseDetection	PairClassification	text	fas
SynPerTextKeywordsPC	PairClassification	text	fas
SynPerQAPC	PairClassification	text	fas
ParsinluEntail	PairClassification	text	fas
ParsinluQueryParaphPC	PairClassification	text	fas
MIRACLReranking	Reranking	text	ara, ben, deu, eng, fas, ... (18)
WikipediaRerankingMultilingual	Reranking	text	ben, bul, ces, dan, deu, ... (18)
SynPerQARetrieval	Retrieval	text	fas
SynPerChatbotTopicsRetrieval	Retrieval	text	fas
SynPerChatbotRAGTopicsRetrieval	Retrieval	text	fas
SynPerChatbotRAGFAQRetrieval	Retrieval	text	fas
PersianWebDocumentRetrieval	Retrieval	text	fas
WikipediaRetrievalMultilingual	Retrieval	text	ben, bul, ces, dan, deu, ... (16)
MIRACLRetrieval	Retrieval	text	ara, ben, deu, eng, fas, ... (18)
ClimateFEVER-Fa	Retrieval	text	fas
DBPedia-Fa	Retrieval	text	fas
HotpotQA-Fa	Retrieval	text	fas
MSMARCO-Fa	Retrieval	text	fas
NQ-Fa	Retrieval	text	fas
ArguAna-Fa	Retrieval	text	fas
CQADupstackRetrieval-Fa	Retrieval	text	fas
FiQA2018-Fa	Retrieval	text	fas
NFCorpus-Fa	Retrieval	text	fas
QuoraRetrieval-Fa	Retrieval	text	fas
SCIDOCS-Fa	Retrieval	text	fas
SciFact-Fa	Retrieval	text	fas
TRECCOVID-Fa	Retrieval	text	fas
Touche2020-Fa	Retrieval	text	fas
Farsick	STS	text	fas
SynPerSTS	STS	text	fas
Query2Query	STS	text	fas
SAMSumFa	BitextMining	text	fas
SynPerChatbotSumSRetrieval	BitextMining	text	fas
SynPerChatbotRAGSumSRetrieval	BitextMining	text	fas
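
In the task tables, long `languages` cells are shortened to the first few codes followed by `... (N)`. The sketch below parses one row of such a table. It assumes the columns are tab-separated (as in this text rendering) and that `N` is the total number of languages for the task. `parse_languages` and `parse_row` are hypothetical helpers written for illustration; they are not part of the `mteb` package.

```python
import re

# Matches an abbreviated cell such as "ara, ben, deu, eng, fas, ... (18)".
ABBREVIATED = re.compile(r"(.*?),\s*\.\.\.\s*\((\d+)\)")


def parse_languages(cell):
    """Return (listed_codes, total_count) for one languages cell."""
    match = ABBREVIATED.fullmatch(cell.strip())
    if match:
        codes = [code.strip() for code in match.group(1).split(",")]
        return codes, int(match.group(2))
    # Unabbreviated cell: every language is listed explicitly.
    codes = [code.strip() for code in cell.split(",")]
    return codes, len(codes)


def parse_row(line):
    """Split one tab-separated table row into a dictionary."""
    name, task_type, modalities, languages = line.split("\t")
    codes, total = parse_languages(languages)
    return {
        "name": name,
        "type": task_type,
        "modalities": [m.strip() for m in modalities.split(",")],
        "languages": codes,
        "n_languages": total,
    }


row = parse_row("MIRACLReranking\tReranking\ttext\tara, ben, deu, eng, fas, ... (18)")
print(row["name"], row["n_languages"])
```

Keeping the listed codes separate from the total makes it clear that an abbreviated cell names only a sample of the languages, so a membership check on `languages` alone is not reliable for multilingual tasks.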

Citation

@article{zinvandi2025famteb,
  author = {Zinvandi, Erfan and Alikhani, Morteza and Sarmadi, Mehran and Pourbahman, Zahra and Arvin, Sepehr and Kazemi, Reza and Amini, Arash},
  journal = {arXiv preprint arXiv:2502.11571},
  title = {Famteb: Massive text embedding benchmark in persian language},
  year = {2025},
}

`MTEB(fas, v2)`¶

Persian text embedding quality across classification, clustering, pair classification, reranking, retrieval, semantic similarity, and summarization retrieval. In v2, large datasets were optimized for accessibility, low-quality datasets were removed, and higher-quality data was added; see the main PR for details.

Learn more →

Tasks

name	type	modalities	languages
PersianFoodSentimentClassification	Classification	text	fas
SynPerChatbotConvSAClassification	Classification	text	fas
SynPerChatbotConvSAToneChatbotClassification	Classification	text	fas
SynPerChatbotConvSAToneUserClassification	Classification	text	fas
SynPerChatbotSatisfactionLevelClassification	Classification	text	fas
SynPerTextToneClassification.v3	Classification	text	fas
SIDClassification.v2	Classification	text	fas
DeepSentiPers.v2	Classification	text	fas
PersianTextEmotion.v2	Classification	text	fas
NLPTwitterAnalysisClassification.v2	Classification	text	fas
DigikalamagClassification	Classification	text	fas
MassiveIntentClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
MassiveScenarioClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
StyleClassification	Classification	text	fas
PerShopDomainClassification	Classification	text	fas
PerShopIntentClassification	Classification	text	fas
BeytooteClustering	Clustering	text	fas
DigikalamagClustering	Clustering	text	fas
HamshahriClustring	Clustering	text	fas
NLPTwitterAnalysisClustering	Clustering	text	fas
SIDClustring	Clustering	text	fas
FarsTail	PairClassification	text	fas
SynPerChatbotRAGFAQPC	PairClassification	text	fas
FarsiParaphraseDetection	PairClassification	text	fas
SynPerTextKeywordsPC	PairClassification	text	fas
SynPerQAPC	PairClassification	text	fas
ParsinluEntail	PairClassification	text	fas
ParsinluQueryParaphPC	PairClassification	text	fas
MIRACLReranking	Reranking	text	ara, ben, deu, eng, fas, ... (18)
WikipediaRerankingMultilingual	Reranking	text	ben, bul, ces, dan, deu, ... (18)
SynPerQARetrieval	Retrieval	text	fas
SynPerChatbotRAGFAQRetrieval	Retrieval	text	fas
PersianWebDocumentRetrieval	Retrieval	text	fas
WikipediaRetrievalMultilingual	Retrieval	text	ben, bul, ces, dan, deu, ... (16)
MIRACLRetrievalHardNegatives	Retrieval	text	ara, ben, deu, eng, fas, ... (18)
HotpotQA-FaHardNegatives	Retrieval	text	fas
MSMARCO-FaHardNegatives	Retrieval	text	fas
NQ-FaHardNegatives	Retrieval	text	fas
ArguAna-Fa.v2	Retrieval	text	fas
FiQA2018-Fa.v2	Retrieval	text	fas
QuoraRetrieval-Fa.v2	Retrieval	text	fas
SCIDOCS-Fa.v2	Retrieval	text	fas
SciFact-Fa.v2	Retrieval	text	fas
TRECCOVID-Fa.v2	Retrieval	text	fas
FEVER-FaHardNegatives	Retrieval	text	fas
NeuCLIR2023RetrievalHardNegatives	Retrieval	text	fas, rus, zho
WebFAQRetrieval	Retrieval	text	ara, aze, ben, bul, cat, ... (51)
Farsick	STS	text	fas
SynPerSTS	STS	text	fas
SAMSumFa	BitextMining	text	fas
SynPerChatbotSumSRetrieval	BitextMining	text	fas
SynPerChatbotRAGSumSRetrieval	BitextMining	text	fas

Citation

@article{zinvandi2025famteb,
  author = {Zinvandi, Erfan and Alikhani, Morteza and Sarmadi, Mehran and Pourbahman, Zahra and Arvin, Sepehr and Kazemi, Reza and Amini, Arash},
  journal = {arXiv preprint arXiv:2502.11571},
  title = {Famteb: Massive text embedding benchmark in persian language},
  year = {2025},
}

`MTEB(fra, v1)`¶

French text embedding quality across classification, clustering, pair classification, reranking, retrieval, and semantic similarity, using high-quality native French datasets.

Learn more →

Tasks

name	type	modalities	languages
AmazonReviewsClassification	Classification	text	cmn, deu, eng, fra, jpn, ... (6)
MasakhaNEWSClassification	Classification	text	amh, eng, fra, hau, ibo, ... (16)
MassiveIntentClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
MassiveScenarioClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
MTOPDomainClassification	Classification	text	deu, eng, fra, hin, spa, ... (6)
MTOPIntentClassification	Classification	text	deu, eng, fra, hin, spa, ... (6)
AlloProfClusteringP2P	Clustering	text	fra
AlloProfClusteringS2S	Clustering	text	fra
HALClusteringS2S	Clustering	text	fra
MasakhaNEWSClusteringP2P	Clustering	text	amh, eng, fra, hau, ibo, ... (16)
MasakhaNEWSClusteringS2S	Clustering	text	amh, eng, fra, hau, ibo, ... (16)
MLSUMClusteringP2P	Clustering	text	deu, fra, rus, spa
MLSUMClusteringS2S	Clustering	text	deu, fra, rus, spa
PawsXPairClassification	PairClassification	text	cmn, deu, eng, fra, jpn, ... (7)
AlloprofReranking	Reranking	text	fra
SyntecReranking	Reranking	text	fra
AlloprofRetrieval	Retrieval	text	fra
BSARDRetrieval	Retrieval	text	fra
MintakaRetrieval	Retrieval	text	ara, deu, fra, hin, ita, ... (8)
SyntecRetrieval	Retrieval	text	fra
XPQARetrieval	Retrieval	text	ara, cmn, deu, eng, fra, ... (13)
SICKFr	STS	text	fra
STSBenchmarkMultilingualSTS	STS	text	cmn, deu, eng, fra, ita, ... (10)
SummEvalFr	Summarization	text	fra
STS22	STS	text	ara, cmn, deu, eng, fra, ... (10)

Citation

@misc{ciancone2024mtebfrenchresourcesfrenchsentence,
  archiveprefix = {arXiv},
  author = {Mathieu Ciancone and Imene Kerboua and Marion Schaeffer and Wissam Siblini},
  eprint = {2405.20468},
  primaryclass = {cs.CL},
  title = {MTEB-French: Resources for French Sentence Embedding Evaluation and Analysis},
  url = {https://arxiv.org/abs/2405.20468},
  year = {2024},
}

`MTEB(jpn, v1)`¶

Japanese text embedding quality across clustering, classification, semantic similarity, pair classification, retrieval, and reranking.

Learn more →

Tasks

name	type	modalities	languages
LivedoorNewsClustering.v2	Clustering	text	jpn
MewsC16JaClustering	Clustering	text	jpn
AmazonReviewsClassification	Classification	text	cmn, deu, eng, fra, jpn, ... (6)
AmazonCounterfactualClassification	Classification	text	deu, eng, jpn
MassiveIntentClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
MassiveScenarioClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
JSTS	STS	text	jpn
JSICK	STS	text	jpn
PawsXPairClassification	PairClassification	text	cmn, deu, eng, fra, jpn, ... (7)
JaqketRetrieval	Retrieval	text	jpn
MrTidyRetrieval	Retrieval	text	ara, ben, eng, fin, ind, ... (11)
JaGovFaqsRetrieval	Retrieval	text	jpn
NLPJournalTitleAbsRetrieval	Retrieval	text	jpn
NLPJournalAbsIntroRetrieval	Retrieval	text	jpn
NLPJournalTitleIntroRetrieval	Retrieval	text	jpn
ESCIReranking	Reranking	text	eng, jpn, spa

`MTEB(kor, v1)`¶

Korean text embedding quality across classification, reranking, retrieval, and semantic similarity.

Tasks

name	type	modalities	languages
KLUE-TC	Classification	text	kor
MIRACLReranking	Reranking	text	ara, ben, deu, eng, fas, ... (18)
MIRACLRetrieval	Retrieval	text	ara, ben, deu, eng, fas, ... (18)
Ko-StrategyQA	Retrieval	text	kor
KLUE-STS	STS	text	kor
KorSTS	STS	text	kor

`MTEB(nld, v1)`¶

Dutch text embedding quality across classification, clustering, pair classification, multilabel classification, reranking, retrieval, and semantic similarity.

Learn more →

Tasks

name	type	modalities	languages
DutchBookReviewSentimentClassification.v2	Classification	text	nld
MassiveIntentClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
MassiveScenarioClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
SIB200Classification	Classification	text	ace, acm, acq, aeb, afr, ... (197)
MultiHateClassification	Classification	text	ara, cmn, deu, eng, fra, ... (11)
VaccinChatNLClassification	Classification	text	nld
DutchColaClassification	Classification	text	nld
DutchGovernmentBiasClassification	Classification	text	nld
DutchSarcasticHeadlinesClassification	Classification	text	nld
DutchNewsArticlesClassification	Classification	text	nld
OpenTenderClassification	Classification	text	nld
IconclassClassification	Classification	text	nld
SICKNLPairClassification	PairClassification	text	nld
XLWICNLPairClassification	PairClassification	text	nld
CovidDisinformationNLMultiLabelClassification	MultilabelClassification	text	nld
MultiEURLEXMultilabelClassification	MultilabelClassification	text	bul, ces, dan, deu, ell, ... (23)
VABBMultiLabelClassification	MultilabelClassification	text	nld
DutchNewsArticlesClusteringS2S	Clustering	text	nld
DutchNewsArticlesClusteringP2P	Clustering	text	nld
SIB200ClusteringS2S	Clustering	text	ace, acm, acq, aeb, afr, ... (197)
VABBClusteringS2S	Clustering	text	nld
VABBClusteringP2P	Clustering	text	nld
OpenTenderClusteringS2S	Clustering	text	nld
OpenTenderClusteringP2P	Clustering	text	nld
IconclassClusteringS2S	Clustering	text	nld
WikipediaRerankingMultilingual	Reranking	text	ben, bul, ces, dan, deu, ... (18)
ArguAna-NL.v2	Retrieval	text	nld
SCIDOCS-NL.v2	Retrieval	text	nld
SciFact-NL.v2	Retrieval	text	nld
NFCorpus-NL.v2	Retrieval	text	nld
BelebeleRetrieval	Retrieval	text	acm, afr, als, amh, apc, ... (115)
WebFAQRetrieval	Retrieval	text	ara, aze, ben, bul, cat, ... (51)
DutchNewsArticlesRetrieval	Retrieval	text	nld
bBSARDNLRetrieval	Retrieval	text	nld
LegalQANLRetrieval	Retrieval	text	nld
OpenTenderRetrieval	Retrieval	text	nld
VABBRetrieval	Retrieval	text	nld
WikipediaRetrievalMultilingual	Retrieval	text	ben, bul, ces, dan, deu, ... (16)
SICK-NL-STS	STS	text	nld
STSBenchmarkMultilingualSTS	STS	text	cmn, deu, eng, fra, ita, ... (10)

Citation

@misc{banar2025mtebnle5nlembeddingbenchmark,
  archiveprefix = {arXiv},
  author = {Nikolay Banar and Ehsan Lotfi and Jens Van Nooten and Cristina Arhiliuc and Marija Kliocaite and Walter Daelemans},
  eprint = {2509.12340},
  primaryclass = {cs.CL},
  title = {MTEB-NL and E5-NL: Embedding Benchmark and Models for Dutch},
  url = {https://arxiv.org/abs/2509.12340},
  year = {2025},
}

`MTEB(pol, v1)`¶

Polish text embedding quality across classification, clustering, pair classification, retrieval, and semantic similarity, combining adapted community datasets with a novel Polish scientific literature corpus (PLSC).

Learn more →

Tasks

name	type	modalities	languages
AllegroReviews	Classification	text	pol
CBD	Classification	text	pol
MassiveIntentClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
MassiveScenarioClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
PolEmo2.0-IN	Classification	text	pol
PolEmo2.0-OUT	Classification	text	pol
PAC	Classification	text	pol
EightTagsClustering	Clustering	text	pol
PlscClusteringS2S	Clustering	text	pol
PlscClusteringP2P	Clustering	text	pol
CDSC-E	PairClassification	text	pol
PpcPC	PairClassification	text	pol
PSC	PairClassification	text	pol
SICK-E-PL	PairClassification	text	pol
CDSC-R	STS	text	pol
SICK-R-PL	STS	text	pol
STS22	STS	text	ara, cmn, deu, eng, fra, ... (10)

Citation

@article{poswiata2024plmteb,
  author = {Rafał Poświata and Sławomir Dadas and Michał Perełkiewicz},
  journal = {arXiv preprint arXiv:2405.10138},
  title = {PL-MTEB: Polish Massive Text Embedding Benchmark},
  year = {2024},
}

`MTEB(por, v1)`¶

Portuguese text embedding quality across semantic textual similarity, classification, reranking, and retrieval.

Tasks

name	type	modalities	languages
MultiHateClassification	Classification	text	ara, cmn, deu, eng, fra, ... (11)
TweetSentimentClassification	Classification	text	ara, deu, eng, fra, hin, ... (8)
WebFAQRetrieval	Retrieval	text	ara, aze, ben, bul, cat, ... (51)

`MTEB(rus, v1)`¶

Russian text embedding quality across classification, clustering, reranking, pair classification, retrieval, and semantic similarity, including novel Russian-specific tasks in each category.

Learn more →

Tasks

name	type	modalities	languages
GeoreviewClassification	Classification	text	rus
HeadlineClassification	Classification	text	rus
InappropriatenessClassification	Classification	text	rus
KinopoiskClassification	Classification	text	rus
MassiveIntentClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
MassiveScenarioClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
RuReviewsClassification	Classification	text	rus
RuSciBenchGRNTIClassification	Classification	text	rus
RuSciBenchOECDClassification	Classification	text	rus
GeoreviewClusteringP2P	Clustering	text	rus
RuSciBenchGRNTIClusteringP2P	Clustering	text	rus
RuSciBenchOECDClusteringP2P	Clustering	text	rus
CEDRClassification	MultilabelClassification	text	rus
SensitiveTopicsClassification	MultilabelClassification	text	rus
TERRa	PairClassification	text	rus
MIRACLReranking	Reranking	text	ara, ben, deu, eng, fas, ... (18)
RuBQReranking	Reranking	text	rus
MIRACLRetrieval	Retrieval	text	ara, ben, deu, eng, fas, ... (18)
RiaNewsRetrieval	Retrieval	text	rus
RuBQRetrieval	Retrieval	text	rus
RUParaPhraserSTS	STS	text	rus
STS22	STS	text	ara, cmn, deu, eng, fra, ... (10)
RuSTSBenchmarkSTS	STS	text	rus

Citation

@misc{snegirev2024russianfocusedembeddersexplorationrumteb,
  archiveprefix = {arXiv},
  author = {Artem Snegirev and Maria Tikhonova and Anna Maksimova and Alena Fenogenova and Alexander Abramov},
  eprint = {2408.12503},
  primaryclass = {cs.CL},
  title = {The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design},
  url = {https://arxiv.org/abs/2408.12503},
  year = {2024},
}

`MTEB(rus, v1.1)`¶

Russian text embedding quality across classification, clustering, reranking, pair classification, retrieval, and semantic similarity. In v1.1, MIRACLRetrieval and RiaNewsRetrieval were replaced with their HardNegatives variants (v2), which include improved default prompts.

Learn more →

Tasks

name	type	modalities	languages
GeoreviewClassification	Classification	text	rus
HeadlineClassification	Classification	text	rus
InappropriatenessClassification	Classification	text	rus
KinopoiskClassification	Classification	text	rus
MassiveIntentClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
MassiveScenarioClassification	Classification	text	afr, amh, ara, aze, ben, ... (50)
RuReviewsClassification	Classification	text	rus
RuSciBenchGRNTIClassification	Classification	text	rus
RuSciBenchOECDClassification	Classification	text	rus
GeoreviewClusteringP2P	Clustering	text	rus
RuSciBenchGRNTIClusteringP2P	Clustering	text	rus
RuSciBenchOECDClusteringP2P	Clustering	text	rus
CEDRClassification	MultilabelClassification	text	rus
SensitiveTopicsClassification	MultilabelClassification	text	rus
TERRa	PairClassification	text	rus
MIRACLReranking	Reranking	text	ara, ben, deu, eng, fas, ... (18)
RuBQReranking	Reranking	text	rus
MIRACLRetrievalHardNegatives.v2	Retrieval	text	ara, ben, deu, eng, fas, ... (18)
RiaNewsRetrievalHardNegatives.v2	Retrieval	text	rus
RuBQRetrieval	Retrieval	text	rus
RUParaPhraserSTS	STS	text	rus
STS22	STS	text	ara, cmn, deu, eng, fra, ... (10)
RuSTSBenchmarkSTS	STS	text	rus
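
The difference between the v1 and v1.1 retrieval rows can be expressed as a simple name substitution. The following is a minimal sketch using only the task names shown in the two tables; `upgrade_task_names` is a hypothetical helper for illustration and does not reflect how the benchmark is defined in the library.

```python
# Retrieval task names copied from the MTEB(rus, v1) table.
RUS_V1_RETRIEVAL = ["MIRACLRetrieval", "RiaNewsRetrieval", "RuBQRetrieval"]

# Replacements stated in the MTEB(rus, v1.1) description.
V1_1_REPLACEMENTS = {
    "MIRACLRetrieval": "MIRACLRetrievalHardNegatives.v2",
    "RiaNewsRetrieval": "RiaNewsRetrievalHardNegatives.v2",
}


def upgrade_task_names(tasks, replacements):
    """Return a new list with replaced names, preserving the original order."""
    return [replacements.get(name, name) for name in tasks]


rus_v1_1_retrieval = upgrade_task_names(RUS_V1_RETRIEVAL, V1_1_REPLACEMENTS)
print(rus_v1_1_retrieval)
```

Tasks without an entry in the mapping, such as `RuBQRetrieval`, pass through unchanged, which matches the v1.1 table.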

Citation

@misc{snegirev2024russianfocusedembeddersexplorationrumteb,
  archiveprefix = {arXiv},
  author = {Artem Snegirev and Maria Tikhonova and Anna Maksimova and Alena Fenogenova and Alexander Abramov},
  eprint = {2408.12503},
  primaryclass = {cs.CL},
  title = {The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design},
  url = {https://arxiv.org/abs/2408.12503},
  year = {2024},
}

`MTEB(spa, v1)`¶

Spanish text embedding quality across classification, clustering, pair classification, reranking, retrieval, and semantic similarity. For discussion on benchmark construction, see the original submission.

Tasks

name	type	modalities	languages
SpanishNewsClassification.v2	Classification	text	spa
SpanishSentimentClassification.v2	Classification	text	spa
MLSUMClusteringP2P	Clustering	text	deu, fra, rus, spa
MLSUMClusteringS2S	Clustering	text	deu, fra, rus, spa
PawsXPairClassification	PairClassification	text	cmn, deu, eng, fra, jpn, ... (7)
XNLI	PairClassification	text	ara, bul, deu, ell, eng, ... (14)
MIRACLReranking	Reranking	text	ara, ben, deu, eng, fas, ... (18)
MIRACLRetrievalHardNegatives.v2	Retrieval	text	ara, ben, deu, eng, fas, ... (18)
MintakaRetrieval	Retrieval	text	ara, deu, fra, hin, ita, ... (8)
SpanishPassageRetrievalS2P	Retrieval	text	spa
SpanishPassageRetrievalS2S	Retrieval	text	spa
XPQARetrieval	Retrieval	text	ara, cmn, deu, eng, fra, ... (13)
STSES	STS	text	spa
STSBenchmarkMultilingualSTS	STS	text	cmn, deu, eng, fra, ita, ... (10)
STS17	STS	text	ara, deu, eng, fra, ita, ... (9)
STS22	STS	text	ara, cmn, deu, eng, fra, ... (10)

`MTEB(tha, v1)`¶

Thai text embedding quality across classification, clustering, pair classification, reranking, and retrieval. Tasks are native Thai or high-quality human translations; machine-translated and cross-lingual tasks are excluded.

Tasks

name	type	modalities	languages
MTOPDomainClassification	Classification	text	deu, eng, fra, hin, spa, ... (6)
MTOPIntentClassification	Classification	text	deu, eng, fra, hin, spa, ... (6)
SIB200Classification	Classification	text	ace, acm, acq, aeb, afr, ... (197)
WisesightSentimentClassification.v2	Classification	text	tha
SIB200ClusteringS2S	Clustering	text	ace, acm, acq, aeb, afr, ... (197)
XNLI	PairClassification	text	ara, bul, deu, ell, eng, ... (14)
MIRACLReranking	Reranking	text	ara, ben, deu, eng, fas, ... (18)
MultiLongDocReranking	Reranking	text	ara, deu, eng, fra, hin, ... (13)
BelebeleRetrieval	Retrieval	text	acm, afr, als, amh, apc, ... (115)
MIRACLRetrievalHardNegatives.v2	Retrieval	text	ara, ben, deu, eng, fas, ... (18)
MKQARetrieval	Retrieval	text	ara, dan, deu, eng, fin, ... (26)
MrTidyRetrieval	Retrieval	text	ara, ben, eng, fin, ind, ... (11)
MultiLongDocRetrieval	Retrieval	text	ara, cmn, deu, eng, fra, ... (13)
WebFAQRetrieval	Retrieval	text	ara, aze, ben, bul, cat, ... (51)
XQuADRetrieval	Retrieval	text	arb, deu, ell, eng, hin, ... (12)

`MVEB(beta)`¶

Audio-visual video embedding quality across retrieval, classification, clustering, pair classification, zero-shot classification, and video-centric QA, with tasks selected to maximize coverage of audio-video joint modality inputs.

Tasks

name	type	modalities	languages
AVMemeExamAT2VRetrieval	Any2AnyRetrieval	audio, text, video	eng
ActivityNetCaptionsT2VRetrieval	Any2AnyRetrieval	video, text	eng
AudioCapsAVVA2TRetrieval	Any2AnyRetrieval	audio, video, text	eng
AudioCapsAVVT2ARetrieval	Any2AnyRetrieval	video, text, audio	eng
MSVDT2VRetrieval	Any2AnyRetrieval	text, video	eng
VALOR32KT2VARetrieval	Any2AnyRetrieval	text, audio, video	eng
VATEXV2ARetrieval	Any2AnyRetrieval	video, audio	eng
VATEXVA2TRetrieval	Any2AnyRetrieval	audio, video, text	eng
VGGSoundAVA2VRetrieval	Any2AnyRetrieval	audio, video	eng
YouCook2T2VARetrieval	Any2AnyRetrieval	text, audio, video	eng
EgoSchemaVideoCentricQA	VideoCentricQA	video, text	eng
AVEDatasetClassification	VideoClassification	video, audio	eng
AVMemeAudioVideoClassification	VideoClassification	video, audio	bos, bre, deu, eng, fas, ... (16)
BreakfastClassification	VideoClassification	video	eng
Kinetics700VA	VideoClassification	video, audio	eng
RAVDESSAVClassification	VideoClassification	video, audio	eng
UCF101VideoAudioClassification	VideoClassification	video, audio	eng
MELDEmotionAudioVideoClustering	VideoClustering	video, audio	eng
MusicAVQACLSAudioVideoClustering	VideoClustering	video, audio	eng
HumanAnimalCartoonVAPairClassification	VideoPairClassification	video, audio	eng
MusicAVQAVAPairClassification	VideoPairClassification	video, audio	eng
HMDB51ZeroShot	VideoZeroshotClassification	video, text	eng
WorldSenseAudioVideoZeroShot	VideoZeroshotClassification	video, audio, text	eng

`MVEB(text, video, beta)`¶

Text and video embedding quality across retrieval, classification, clustering, pair classification, zero-shot classification, and video-centric QA, for models without an audio encoder.

Tasks

name	type	modalities	languages
AVMemeExamT2VRetrieval	Any2AnyRetrieval	text, video	eng
ActivityNetCaptionsT2VRetrieval	Any2AnyRetrieval	video, text	eng
AudioCapsAVT2VRetrieval	Any2AnyRetrieval	text, video	eng
DiDeMoV2TRetrieval	Any2AnyRetrieval	video, text	eng
MSVDV2TRetrieval	Any2AnyRetrieval	video, text	eng
Panda70MT2VRetrieval	Any2AnyRetrieval	text, video	eng
VALOR32KT2VRetrieval	Any2AnyRetrieval	text, video	eng
VATEXT2VRetrieval	Any2AnyRetrieval	text, video	eng
OmniVideoBenchVideoCentricQA	VideoCentricQA	video, text	eng
AVMemeVideoClassification	VideoClassification	video	bos, bre, deu, eng, fas, ... (16)
BreakfastClassification	VideoClassification	video	eng
Kinetics700V	VideoClassification	video	eng
VGGSoundV	VideoClassification	video	eng
RAVDESSVideoClustering	VideoClustering	video	eng
HumanAnimalCartoonVPairClassification	VideoPairClassification	video	eng
Kinetics400ZeroShot	VideoZeroshotClassification	video, text	eng
MELDVideoZeroShot	VideoZeroshotClassification	video, text	eng
UCF101VideoZeroShotClassification	VideoZeroshotClassification	video, text	eng
WorldSenseVideoZeroShot	VideoZeroshotClassification	video, text	eng

`MVEB(video, beta)`¶

Video-only embedding quality across classification and pair classification, for encoders without a text component. Retrieval, QA, and zero-shot tasks are excluded as they require a text encoder.

Tasks

name	type	modalities	languages
AVMemeVideoClassification	VideoClassification	video	bos, bre, deu, eng, fas, ... (16)
BreakfastClassification	VideoClassification	video	eng
HMDB51Classification	VideoClassification	video	eng
Kinetics600V	VideoClassification	video	eng
MELDVideoClassification	VideoClassification	video	eng
WorldSenseVideoClassification	VideoClassification	video	eng
HumanAnimalCartoonVPairClassification	VideoPairClassification	video	eng
MusicAVQAVPairClassification	VideoPairClassification	video	eng
RAVDESSAVVPairClassification	VideoPairClassification	video	eng
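
The split between the three MVEB variants follows from which modalities a model can encode. As a rough sketch of that idea, the snippet below keeps only the tasks whose modalities are all supported by a given model. The task names and modalities are taken from the tables above; `runnable_tasks` is a hypothetical helper, and the actual selection logic in the library may differ.

```python
# A few (task name -> required modalities) pairs from the MVEB tables.
TASKS = {
    "BreakfastClassification": {"video"},
    "HMDB51ZeroShot": {"video", "text"},
    "Kinetics700VA": {"video", "audio"},
    "MSVDT2VRetrieval": {"text", "video"},
}


def runnable_tasks(tasks, supported):
    """Return the sorted names of tasks whose modalities the model supports."""
    return sorted(name for name, mods in tasks.items() if mods <= supported)


# A video-only encoder cannot run tasks that also need text or audio.
video_only = runnable_tasks(TASKS, {"video"})
# A text-and-video model without an audio encoder.
text_video = runnable_tasks(TASKS, {"video", "text"})
print(video_only)
print(text_video)
```

The subset test `mods <= supported` is what excludes retrieval and zero-shot tasks for a video-only encoder, since those rows list `text` as a required modality.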

`NanoBEIR`¶

Zero-shot retrieval quality using subsets of the BEIR datasets, designed for faster evaluation with reduced computational cost.

Learn more →

Tasks

name	type	modalities	languages
NanoArguAnaRetrieval	Retrieval	text	eng
NanoClimateFeverRetrieval	Retrieval	text	eng
NanoDBPediaRetrieval	Retrieval	text	eng
NanoFEVERRetrieval	Retrieval	text	eng
NanoFiQA2018Retrieval	Retrieval	text	eng
NanoHotpotQARetrieval	Retrieval	text	eng
NanoMSMARCORetrieval	Retrieval	text	eng
NanoNFCorpusRetrieval	Retrieval	text	eng
NanoNQRetrieval	Retrieval	text	eng
NanoQuoraRetrieval	Retrieval	text	eng
NanoSCIDOCSRetrieval	Retrieval	text	eng
NanoSciFactRetrieval	Retrieval	text	eng
NanoTouche2020Retrieval	Retrieval	text	eng

`R2MED`¶

Reasoning-driven medical retrieval quality across biology, bioinformatics, medical sciences, clinical, and treatment scenarios, requiring models to perform multi-step reasoning over medical literature.

Learn more →

Tasks

name	type	modalities	languages
R2MEDBiologyRetrieval	Retrieval	text	eng
R2MEDBioinformaticsRetrieval	Retrieval	text	eng
R2MEDMedicalSciencesRetrieval	Retrieval	text	eng
R2MEDMedXpertQAExamRetrieval	Retrieval	text	eng
R2MEDMedQADiagRetrieval	Retrieval	text	eng
R2MEDPMCTreatmentRetrieval	Retrieval	text	eng
R2MEDPMCClinicalRetrieval	Retrieval	text	eng
R2MEDIIYiClinicalRetrieval	Retrieval	text	eng

Citation

@article{li2025r2med,
  author = {Li, Lei and Zhou, Xiao and Liu, Zheng},
  journal = {arXiv preprint arXiv:2505.14558},
  title = {R2MED: A Benchmark for Reasoning-Driven Medical Retrieval},
  year = {2025},
}

`RAR-b`¶

Reasoning capabilities of retrieval models, framing commonsense, temporal, and domain-specific reasoning tasks as retrieval problems.

Learn more →

Tasks

name	type	modalities	languages
ARCChallenge	Retrieval	text	eng
AlphaNLI	Retrieval	text	eng
HellaSwag	Retrieval	text	eng
WinoGrande	Retrieval	text	eng
PIQA	Retrieval	text	eng
SIQA	Retrieval	text	eng
Quail	Retrieval	text	eng
SpartQA	Retrieval	text	eng
TempReasonL1	Retrieval	text	eng
TempReasonL2Pure	Retrieval	text	eng
TempReasonL2Fact	Retrieval	text	eng
TempReasonL2Context	Retrieval	text	eng
TempReasonL3Pure	Retrieval	text	eng
TempReasonL3Fact	Retrieval	text	eng
TempReasonL3Context	Retrieval	text	eng
RARbCode	Retrieval	text	eng
RARbMath	Retrieval	text	eng

Citation

@article{xiao2024rar,
  author = {Xiao, Chenghao and Hudson, G Thomas and Al Moubayed, Noura},
  journal = {arXiv preprint arXiv:2404.06347},
  title = {RAR-b: Reasoning as Retrieval Benchmark},
  year = {2024},
}

`RTEB(Code, beta)`¶

Retrieval quality in the code domain across algorithmic problems, data science tasks, code evaluation, SQL retrieval, and multilingual code retrieval, with tasks representative of real-world production retrieval demands. A domain-specific subset of RTEB. Includes both open and closed datasets; to submit results on private tasks, please open an issue.

Note: We have temporarily removed the 'Private' column. To read more about this decision, check out the announcement.

Tasks

name	type	modalities	languages
AppsRetrieval	Retrieval	text	eng, python
DS1000Retrieval	Retrieval	text	eng, python
HumanEvalRetrieval	Retrieval	text	eng, python
MBPPRetrieval	Retrieval	text	eng, python
WikiSQLRetrieval	Retrieval	text	eng, sql
FreshStackRetrieval	Retrieval	text	eng, go, javascript, python
SWEbenchCodeRetrieval	Retrieval	text	eng, python
Code1Retrieval	Retrieval	text	eng
JapaneseCode1Retrieval	Retrieval	text	jpn

Citation

@article{rteb2025,
  author = {Liu, Frank and Enevoldsen, Kenneth and Solomatin, Roman and Chung, Isaac and Aarsen, Tom and Fődi, Zoltán},
  title = {Introducing RTEB: A New Standard for Retrieval Evaluation},
  year = {2025},
}

`RTEB(Health, beta)`¶

Retrieval quality in the healthcare and medical domain across medical Q&A, healthcare information retrieval, and multilingual medical consultation, with tasks representative of real-world production retrieval demands. A domain-specific subset of RTEB. Includes both open and closed datasets; to submit results on private tasks, please open an issue.

Note: We have temporarily removed the 'Private' column. To read more about this decision, check out the announcement.

Tasks

name	type	modalities	languages
ChatDoctorRetrieval	Retrieval	text	eng
CUREv1	Retrieval	text	eng, fra, spa
EnglishHealthcare1Retrieval	Retrieval	text	eng
GermanHealthcare1Retrieval	Retrieval	text	deu

Citation

@article{rteb2025,
  author = {Liu, Frank and Enevoldsen, Kenneth and Solomatin, Roman and Chung, Isaac and Aarsen, Tom and Fődi, Zoltán},
  title = {Introducing RTEB: A New Standard for Retrieval Evaluation},
  year = {2025},
}

`RTEB(Law, beta)`¶

Retrieval quality in the legal domain across case documents, statutes, legal summarization, and multilingual legal Q&A, with tasks representative of real-world production retrieval demands. A domain-specific subset of RTEB. Includes both open and closed datasets; to submit results on private tasks, please open an issue.

Note: We have temporarily removed the 'Private' column. To read more about this decision, check out the announcement.

Tasks

name	type	modalities	languages
AILACasedocs	Retrieval	text	eng
AILAStatutes	Retrieval	text	eng
LegalSummarization	Retrieval	text	eng
LegalQuAD	Retrieval	text	deu
FrenchLegal1Retrieval	Retrieval	text	fra
GermanLegal1Retrieval	Retrieval	text	deu
JapaneseLegal1Retrieval	Retrieval	text	jpn

Citation

@article{rteb2025,
  author = {Liu, Frank and Enevoldsen, Kenneth and Solomatin, Roman and Chung, Isaac and Aarsen, Tom and Fődi, Zoltán},
  title = {Introducing RTEB: A New Standard for Retrieval Evaluation},
  year = {2025},
}

`RTEB(beta)`¶

Retrieval quality across specialized domains including legal, finance, code, and healthcare in multiple languages, with tasks representative of real-world production retrieval demands. Includes both open and closed datasets; to submit results on private tasks, please open an issue.

Note: We have temporarily removed the 'Private' column. To read more about this decision, check out the announcement.

Tasks

name	type	modalities	languages
AILACasedocs	Retrieval	text	eng
AILAStatutes	Retrieval	text	eng
LegalSummarization	Retrieval	text	eng
LegalQuAD	Retrieval	text	deu
FinanceBenchRetrieval	Retrieval	text	eng
HC3FinanceRetrieval	Retrieval	text	eng
FinQARetrieval	Retrieval	text	eng
AppsRetrieval	Retrieval	text	eng, python
DS1000Retrieval	Retrieval	text	eng, python
HumanEvalRetrieval	Retrieval	text	eng, python
MBPPRetrieval	Retrieval	text	eng, python
WikiSQLRetrieval	Retrieval	text	eng, sql
FreshStackRetrieval	Retrieval	text	eng, go, javascript, python
SWEbenchCodeRetrieval	Retrieval	text	eng, python
ChatDoctorRetrieval	Retrieval	text	eng
CUREv1	Retrieval	text	eng, fra, spa
MIRACLRetrievalHardNegatives	Retrieval	text	ara, ben, deu, eng, fas, ... (18)
Code1Retrieval	Retrieval	text	eng
JapaneseCode1Retrieval	Retrieval	text	jpn
EnglishFinance1Retrieval	Retrieval	text	eng
EnglishFinance2Retrieval	Retrieval	text	eng
EnglishFinance3Retrieval	Retrieval	text	eng
EnglishFinance4Retrieval	Retrieval	text	eng
EnglishHealthcare1Retrieval	Retrieval	text	eng
French1Retrieval	Retrieval	text	fra
FrenchLegal1Retrieval	Retrieval	text	fra
German1Retrieval	Retrieval	text	deu
GermanHealthcare1Retrieval	Retrieval	text	deu
GermanLegal1Retrieval	Retrieval	text	deu
JapaneseLegal1Retrieval	Retrieval	text	jpn

Citation

@article{rteb2025,
  author = {Liu, Frank and Enevoldsen, Kenneth and Solomatin, Roman and Chung, Isaac and Aarsen, Tom and Fődi, Zoltán},
  title = {Introducing RTEB: A New Standard for Retrieval Evaluation},
  year = {2025},
}

`RTEB(deu, beta)`¶

Retrieval quality in German across legal, healthcare, and business domains, with tasks representative of real-world production retrieval demands. A German-language subset of RTEB. Includes both open and closed datasets; to submit results on private tasks, please open an issue.

Note: We have temporarily removed the 'Private' column. To read more about this decision, check out the announcement.

Tasks

name	type	modalities	languages
LegalQuAD	Retrieval	text	deu
German1Retrieval	Retrieval	text	deu
GermanHealthcare1Retrieval	Retrieval	text	deu
GermanLegal1Retrieval	Retrieval	text	deu
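
The description calls this benchmark a subset of RTEB. A minimal sketch of how that relationship could be checked is shown below, using task names copied from the `RTEB(beta)` and `RTEB(deu, beta)` tables. The constant names and the helper `missing_from_parent` are illustrative only and do not come from any library.

```python
# Task names copied from the RTEB(beta) table.
RTEB_BETA = [
    "AILACasedocs", "AILAStatutes", "LegalSummarization", "LegalQuAD",
    "FinanceBenchRetrieval", "HC3FinanceRetrieval", "FinQARetrieval",
    "AppsRetrieval", "DS1000Retrieval", "HumanEvalRetrieval", "MBPPRetrieval",
    "WikiSQLRetrieval", "FreshStackRetrieval", "SWEbenchCodeRetrieval",
    "ChatDoctorRetrieval", "CUREv1", "MIRACLRetrievalHardNegatives",
    "Code1Retrieval", "JapaneseCode1Retrieval", "EnglishFinance1Retrieval",
    "EnglishFinance2Retrieval", "EnglishFinance3Retrieval",
    "EnglishFinance4Retrieval", "EnglishHealthcare1Retrieval",
    "French1Retrieval", "FrenchLegal1Retrieval", "German1Retrieval",
    "GermanHealthcare1Retrieval", "GermanLegal1Retrieval",
    "JapaneseLegal1Retrieval",
]

# Task names copied from the RTEB(deu, beta) table.
RTEB_DEU_BETA = [
    "LegalQuAD", "German1Retrieval",
    "GermanHealthcare1Retrieval", "GermanLegal1Retrieval",
]


def missing_from_parent(subset: list[str], parent: list[str]) -> list[str]:
    """Return the names in `subset` that do not appear in `parent`."""
    parent_names = set(parent)
    return [name for name in subset if name not in parent_names]


# An empty list means every German task is also part of RTEB(beta).
missing = missing_from_parent(RTEB_DEU_BETA, RTEB_BETA)
print(missing)  # prints []
```

The same check can be applied to the other language and domain subsets by swapping in their task lists.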

Citation

@article{rteb2025,
  author = {Liu, Frank and Enevoldsen, Kenneth and Solomatin, Roman and Chung, Isaac and Aarsen, Tom and Fődi, Zoltán},
  title = {Introducing RTEB: A New Standard for Retrieval Evaluation},
  year = {2025},
}

`RTEB(eng, beta)`¶

Retrieval quality in English across legal, finance, code, and healthcare domains, with tasks representative of real-world production retrieval demands. An English-only subset of RTEB. Includes both open and closed datasets; to submit results on private tasks, please open an issue.

Note: We have temporarily removed the 'Private' column. To read more about this decision, check out the announcement.

Tasks

name	type	modalities	languages
AILACasedocs	Retrieval	text	eng
AILAStatutes	Retrieval	text	eng
LegalSummarization	Retrieval	text	eng
FinanceBenchRetrieval	Retrieval	text	eng
HC3FinanceRetrieval	Retrieval	text	eng
FinQARetrieval	Retrieval	text	eng
AppsRetrieval	Retrieval	text	eng, python
DS1000Retrieval	Retrieval	text	eng, python
HumanEvalRetrieval	Retrieval	text	eng, python
MBPPRetrieval	Retrieval	text	eng, python
WikiSQLRetrieval	Retrieval	text	eng, sql
FreshStackRetrieval	Retrieval	text	eng, go, javascript, python
SWEbenchCodeRetrieval	Retrieval	text	eng, python
ChatDoctorRetrieval	Retrieval	text	eng
Code1Retrieval	Retrieval	text	eng
EnglishFinance1Retrieval	Retrieval	text	eng
EnglishFinance2Retrieval	Retrieval	text	eng
EnglishFinance3Retrieval	Retrieval	text	eng
EnglishFinance4Retrieval	Retrieval	text	eng
EnglishHealthcare1Retrieval	Retrieval	text	eng
CUREv1	Retrieval	text	eng, fra, spa

Citation

@article{rteb2025,
  author = {Liu, Frank and Enevoldsen, Kenneth and Solomatin, Roman and Chung, Isaac and Aarsen, Tom and Fődi, Zoltán},
  title = {Introducing RTEB: A New Standard for Retrieval Evaluation},
  year = {2025},
}

`RTEB(fin, beta)`¶

Retrieval quality in the financial domain across finance benchmarks, Q&A, financial document retrieval, and corporate governance, with tasks representative of real-world production retrieval demands. A domain-specific subset of RTEB. Includes both open and closed datasets; to submit results on private tasks, please open an issue.

Note: We have temporarily removed the 'Private' column. To read more about this decision, check out the announcement.

Tasks

name	type	modalities	languages
FinanceBenchRetrieval	Retrieval	text	eng
HC3FinanceRetrieval	Retrieval	text	eng
FinQARetrieval	Retrieval	text	eng
EnglishFinance1Retrieval	Retrieval	text	eng
EnglishFinance2Retrieval	Retrieval	text	eng
EnglishFinance3Retrieval	Retrieval	text	eng
EnglishFinance4Retrieval	Retrieval	text	eng

Citation

@article{rteb2025,
  author = {Liu, Frank and Enevoldsen, Kenneth and Solomatin, Roman and Chung, Isaac and Aarsen, Tom and Fődi, Zoltán},
  title = {Introducing RTEB: A New Standard for Retrieval Evaluation},
  year = {2025},
}

`RTEB(fra, beta)`¶

Retrieval quality in French across legal and general knowledge domains, with tasks representative of real-world production retrieval demands. A French-language subset of RTEB. Includes both open and closed datasets; to submit results on private tasks, please open an issue.

Note: We have temporarily removed the 'Private' column. To read more about this decision, check out the announcement.

Tasks

name	type	modalities	languages
CUREv1	Retrieval	text	eng, fra, spa
French1Retrieval	Retrieval	text	fra
FrenchLegal1Retrieval	Retrieval	text	fra

Citation

@article{rteb2025,
  author = {Liu, Frank and Enevoldsen, Kenneth and Solomatin, Roman and Chung, Isaac and Aarsen, Tom and Fődi, Zoltán},
  title = {Introducing RTEB: A New Standard for Retrieval Evaluation},
  year = {2025},
}

`RTEB(jpn, beta)`¶

Retrieval quality in Japanese across legal and code domains, with tasks representative of real-world production retrieval demands. A Japanese-language subset of RTEB. Includes both open and closed datasets; to submit results on private tasks, please open an issue.

Note: We have temporarily removed the 'Private' column. To read more about this decision, check out the announcement.

Tasks

name	type	modalities	languages
JapaneseCode1Retrieval	Retrieval	text	jpn
JapaneseLegal1Retrieval	Retrieval	text	jpn

Citation

@article{rteb2025,
  author = {Liu, Frank and Enevoldsen, Kenneth and Solomatin, Roman and Chung, Isaac and Aarsen, Tom and Fődi, Zoltán},
  title = {Introducing RTEB: A New Standard for Retrieval Evaluation},
  year = {2025},
}

`RuSciBench`¶

Scientific text embedding quality in Russian and English across bitext mining, classification, retrieval, and regression tasks, using data sourced from eLibrary, Russia's largest electronic library of scientific publications.

Learn more →

Tasks

name	type	modalities	languages
RuSciBenchBitextMining.v2	BitextMining	text	eng, rus
RuSciBenchCoreRiscClassification	Classification	text	eng, rus
RuSciBenchGRNTIClassification.v2	Classification	text	eng, rus
RuSciBenchOECDClassification.v2	Classification	text	eng, rus
RuSciBenchPubTypeClassification	Classification	text	eng, rus
RuSciBenchCiteRetrieval	Retrieval	text	eng, rus
RuSciBenchCociteRetrieval	Retrieval	text	eng, rus
RuSciBenchCitedCountRegression	Regression	text	eng, rus
RuSciBenchYearPublRegression	Regression	text	eng, rus

Citation

@article{vatolin2024ruscibench,
  author = {Vatolin, A. and Gerasimenko, N. and Ianina, A. and Vorontsov, K.},
  doi = {10.1134/S1064562424602191},
  issn = {1531-8362},
  journal = {Doklady Mathematics},
  month = {12},
  number = {1},
  pages = {S251--S260},
  title = {RuSciBench: Open Benchmark for Russian and English Scientific Document Representations},
  url = {https://doi.org/10.1134/S1064562424602191},
  volume = {110},
  year = {2024},
}

`VN-MTEB (vie, v1)`¶

Vietnamese text embedding quality across retrieval, classification, pair classification, clustering, reranking, and semantic similarity.

Learn more →

Tasks

name	type	modalities	languages
ArguAna-VN	Retrieval	text	vie
SciFact-VN	Retrieval	text	vie
ClimateFEVER-VN	Retrieval	text	vie
FEVER-VN	Retrieval	text	vie
DBPedia-VN	Retrieval	text	vie
NQ-VN	Retrieval	text	vie
HotpotQA-VN	Retrieval	text	vie
MSMARCO-VN	Retrieval	text	vie
TRECCOVID-VN	Retrieval	text	vie
FiQA2018-VN	Retrieval	text	vie
NFCorpus-VN	Retrieval	text	vie
SCIDOCS-VN	Retrieval	text	vie
Touche2020-VN	Retrieval	text	vie
Quora-VN	Retrieval	text	vie
CQADupstackAndroid-VN	Retrieval	text	vie
CQADupstackGis-VN	Retrieval	text	vie
CQADupstackMathematica-VN	Retrieval	text	vie
CQADupstackPhysics-VN	Retrieval	text	vie
CQADupstackProgrammers-VN	Retrieval	text	vie
CQADupstackStats-VN	Retrieval	text	vie
CQADupstackTex-VN	Retrieval	text	vie
CQADupstackUnix-VN	Retrieval	text	vie
CQADupstackWebmasters-VN	Retrieval	text	vie
CQADupstackWordpress-VN	Retrieval	text	vie
Banking77VNClassification	Classification	text	vie
EmotionVNClassification	Classification	text	vie
AmazonCounterfactualVNClassification	Classification	text	vie
MTOPDomainVNClassification	Classification	text	vie
TweetSentimentExtractionVNClassification	Classification	text	vie
ToxicConversationsVNClassification	Classification	text	vie
ImdbVNClassification	Classification	text	vie
MTOPIntentVNClassification	Classification	text	vie
MassiveScenarioVNClassification	Classification	text	vie
MassiveIntentVNClassification	Classification	text	vie
AmazonReviewsVNClassification	Classification	text	vie
AmazonPolarityVNClassification	Classification	text	vie
SprintDuplicateQuestions-VN	PairClassification	text	vie
TwitterSemEval2015-VN	PairClassification	text	vie
TwitterURLCorpus-VN	PairClassification	text	vie
TwentyNewsgroupsClustering-VN	Clustering	text	vie
RedditClusteringP2P-VN	Clustering	text	vie
StackExchangeClusteringP2P-VN	Clustering	text	vie
StackExchangeClustering-VN	Clustering	text	vie
RedditClustering-VN	Clustering	text	vie
SciDocsRR-VN	Reranking	text	vie
AskUbuntuDupQuestions-VN	Reranking	text	vie
StackOverflowDupQuestions-VN	Reranking	text	vie
BIOSSES-VN	STS	text	vie
SICK-R-VN	STS	text	vie
STSBenchmark-VN	STS	text	vie

Citation

@misc{pham2025vnmtebvietnamesemassivetext,
  archiveprefix = {arXiv},
  author = {Loc Pham and Tung Luu and Thu Vo and Minh Nguyen and Viet Hoang},
  eprint = {2507.21500},
  primaryclass = {cs.CL},
  title = {VN-MTEB: Vietnamese Massive Text Embedding Benchmark},
  url = {https://arxiv.org/abs/2507.21500},
  year = {2025},
}

`ViDoRe(v1&v2)`¶

Visual document retrieval across diverse document types and domains, combining the ViDoRe v1 and v2 task sets.

Learn more →

Tasks

name	type	modalities	languages
VidoreArxivQARetrieval	DocumentUnderstanding	text, image	eng
VidoreDocVQARetrieval	DocumentUnderstanding	text, image	eng
VidoreInfoVQARetrieval	DocumentUnderstanding	text, image	eng
VidoreTabfquadRetrieval	DocumentUnderstanding	text, image	eng
VidoreTatdqaRetrieval	DocumentUnderstanding	text, image	eng
VidoreShiftProjectRetrieval	DocumentUnderstanding	text, image	eng
VidoreSyntheticDocQAAIRetrieval	DocumentUnderstanding	text, image	eng
VidoreSyntheticDocQAEnergyRetrieval	DocumentUnderstanding	text, image	eng
VidoreSyntheticDocQAGovernmentReportsRetrieval	DocumentUnderstanding	text, image	eng
VidoreSyntheticDocQAHealthcareIndustryRetrieval	DocumentUnderstanding	text, image	eng
Vidore2ESGReportsRetrieval	DocumentUnderstanding	text, image	deu, eng, fra, spa
Vidore2EconomicsReportsRetrieval	DocumentUnderstanding	text, image	deu, eng, fra, spa
Vidore2BioMedicalLecturesRetrieval	DocumentUnderstanding	text, image	deu, eng, fra, spa
Vidore2ESGReportsHLRetrieval	DocumentUnderstanding	text, image	eng
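
Since this benchmark is described as combining the v1 and v2 task sets, the table above can be reproduced by concatenating the task lists from the `ViDoRe(v1)` and `ViDoRe(v2)` sections. The following is only an illustrative sketch with names copied from those tables; the constant names are assumptions made for the example.

```python
# Task names copied from the ViDoRe(v1) table.
VIDORE_V1 = [
    "VidoreArxivQARetrieval",
    "VidoreDocVQARetrieval",
    "VidoreInfoVQARetrieval",
    "VidoreTabfquadRetrieval",
    "VidoreTatdqaRetrieval",
    "VidoreShiftProjectRetrieval",
    "VidoreSyntheticDocQAAIRetrieval",
    "VidoreSyntheticDocQAEnergyRetrieval",
    "VidoreSyntheticDocQAGovernmentReportsRetrieval",
    "VidoreSyntheticDocQAHealthcareIndustryRetrieval",
]

# Task names copied from the ViDoRe(v2) table.
VIDORE_V2 = [
    "Vidore2ESGReportsRetrieval",
    "Vidore2EconomicsReportsRetrieval",
    "Vidore2BioMedicalLecturesRetrieval",
    "Vidore2ESGReportsHLRetrieval",
]

# The combined benchmark lists the v1 tasks first, then the v2 tasks.
VIDORE_V1_V2 = VIDORE_V1 + VIDORE_V2

print(len(VIDORE_V1_V2))  # prints 14
```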

Citation

@article{mace2025vidorev2,
  author = {Macé, Quentin and Loison, António and Faysse, Manuel},
  journal = {arXiv preprint arXiv:2505.17166},
  title = {ViDoRe Benchmark V2: Raising the Bar for Visual Retrieval},
  year = {2025},
}

`ViDoRe(v1)`¶

Visual document retrieval across diverse document types and domains, matching natural language queries to document page images.

Learn more →

Tasks

name	type	modalities	languages
VidoreArxivQARetrieval	DocumentUnderstanding	text, image	eng
VidoreDocVQARetrieval	DocumentUnderstanding	text, image	eng
VidoreInfoVQARetrieval	DocumentUnderstanding	text, image	eng
VidoreTabfquadRetrieval	DocumentUnderstanding	text, image	eng
VidoreTatdqaRetrieval	DocumentUnderstanding	text, image	eng
VidoreShiftProjectRetrieval	DocumentUnderstanding	text, image	eng
VidoreSyntheticDocQAAIRetrieval	DocumentUnderstanding	text, image	eng
VidoreSyntheticDocQAEnergyRetrieval	DocumentUnderstanding	text, image	eng
VidoreSyntheticDocQAGovernmentReportsRetrieval	DocumentUnderstanding	text, image	eng
VidoreSyntheticDocQAHealthcareIndustryRetrieval	DocumentUnderstanding	text, image	eng

Citation

@article{faysse2024colpali,
  author = {Faysse, Manuel and Sibille, Hugues and Wu, Tony and Viaud, Gautier and Hudelot, C{\'e}line and Colombo, Pierre},
  journal = {arXiv preprint arXiv:2407.01449},
  title = {ColPali: Efficient Document Retrieval with Vision Language Models},
  year = {2024},
}

`ViDoRe(v2)`¶

Visual document retrieval across ESG reports, economics reports, biomedical lectures, and related enterprise document types, matching natural language queries to document page images.

Learn more →

Tasks

name	type	modalities	languages
Vidore2ESGReportsRetrieval	DocumentUnderstanding	text, image	deu, eng, fra, spa
Vidore2EconomicsReportsRetrieval	DocumentUnderstanding	text, image	deu, eng, fra, spa
Vidore2BioMedicalLecturesRetrieval	DocumentUnderstanding	text, image	deu, eng, fra, spa
Vidore2ESGReportsHLRetrieval	DocumentUnderstanding	text, image	eng

Citation

@article{mace2025vidorev2,
  author = {Macé, Quentin and Loison, António and Faysse, Manuel},
  journal = {arXiv preprint arXiv:2505.17166},
  title = {ViDoRe Benchmark V2: Raising the Bar for Visual Retrieval},
  year = {2025},
}

`ViDoRe(v3)`¶

Visual document retrieval across multi-modal enterprise documents spanning finance, industrial, computer science, pharmaceutical, and other professional domains. Includes both open and closed datasets; to submit results on private tasks, please open an issue.

Learn more →

Tasks

name	type	modalities	languages
Vidore3FinanceEnRetrieval	DocumentUnderstanding	text, image	deu, eng, fra, ita, por, ... (6)
Vidore3IndustrialRetrieval	DocumentUnderstanding	text, image	deu, eng, fra, ita, por, ... (6)
Vidore3ComputerScienceRetrieval	DocumentUnderstanding	text, image	deu, eng, fra, ita, por, ... (6)
Vidore3PharmaceuticalsRetrieval	DocumentUnderstanding	text, image	deu, eng, fra, ita, por, ... (6)
Vidore3HrRetrieval	DocumentUnderstanding	text, image	deu, eng, fra, ita, por, ... (6)
Vidore3FinanceFrRetrieval	DocumentUnderstanding	text, image	deu, eng, fra, ita, por, ... (6)
Vidore3PhysicsRetrieval	DocumentUnderstanding	text, image	deu, eng, fra, ita, por, ... (6)
Vidore3EnergyRetrieval	DocumentUnderstanding	text, image	deu, eng, fra, ita, por, ... (6)
Vidore3TelecomRetrieval	DocumentUnderstanding	text, image	deu, eng, fra, ita, por, ... (6)
Vidore3NuclearRetrieval	DocumentUnderstanding	text, image	deu, eng, fra, ita, por, ... (6)

Citation

@article{loison2026vidorev3comprehensiveevaluation,
  archiveprefix = {arXiv},
  author = {António Loison and Quentin Macé and Antoine Edy and Victor Xing and Tom Balough and Gabriel Moreira and Bo Liu and Manuel Faysse and Céline Hudelot and Gautier Viaud},
  eprint = {2601.08620},
  primaryclass = {cs.AI},
  title = {ViDoRe V3: A Comprehensive Evaluation of Retrieval Augmented Generation in Complex Real-World Scenarios},
  url = {https://arxiv.org/abs/2601.08620},
  year = {2026},
}

`ViDoRe(v3.1)`¶

Visual document retrieval across multi-modal enterprise documents spanning finance, industrial, computer science, pharmaceutical, and other professional domains. Includes both open and closed datasets; to submit results on private tasks, please open an issue. v3.1 adds markdown derived from OCR to support text-only and joint image-text baselines.

Learn more →

Tasks

name	type	modalities	languages
Vidore3FinanceEnRetrieval.v2	DocumentUnderstanding	text, image	deu, eng, fra, ita, por, ... (6)
Vidore3IndustrialRetrieval.v2	DocumentUnderstanding	text, image	deu, eng, fra, ita, por, ... (6)
Vidore3ComputerScienceRetrieval.v2	DocumentUnderstanding	text, image	deu, eng, fra, ita, por, ... (6)
Vidore3PharmaceuticalsRetrieval.v2	DocumentUnderstanding	text, image	deu, eng, fra, ita, por, ... (6)
Vidore3HrRetrieval.v2	DocumentUnderstanding	text, image	deu, eng, fra, ita, por, ... (6)
Vidore3FinanceFrRetrieval.v2	DocumentUnderstanding	text, image	deu, eng, fra, ita, por, ... (6)
Vidore3PhysicsRetrieval.v2	DocumentUnderstanding	text, image	deu, eng, fra, ita, por, ... (6)
Vidore3EnergyRetrieval.v2	DocumentUnderstanding	text, image	deu, eng, fra, ita, por, ... (6)
Vidore3TelecomRetrieval.v2	DocumentUnderstanding	text, image	deu, eng, fra, ita, por, ... (6)
Vidore3NuclearRetrieval.v2	DocumentUnderstanding	text, image	deu, eng, fra, ita, por, ... (6)
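
Judging from the two tables, each v3.1 task name is the corresponding `ViDoRe(v3)` task name with a `.v2` suffix. A small sketch of that mapping follows; the helper `to_v3_1_name` is hypothetical and simply mirrors the naming pattern visible above.

```python
# Task names copied from the ViDoRe(v3) table.
VIDORE_V3 = [
    "Vidore3FinanceEnRetrieval",
    "Vidore3IndustrialRetrieval",
    "Vidore3ComputerScienceRetrieval",
    "Vidore3PharmaceuticalsRetrieval",
    "Vidore3HrRetrieval",
    "Vidore3FinanceFrRetrieval",
    "Vidore3PhysicsRetrieval",
    "Vidore3EnergyRetrieval",
    "Vidore3TelecomRetrieval",
    "Vidore3NuclearRetrieval",
]


def to_v3_1_name(task_name: str) -> str:
    """Append the `.v2` suffix used by the ViDoRe(v3.1) task names."""
    return f"{task_name}.v2"


# Derive the v3.1 task names in the same order as the table.
VIDORE_V3_1 = [to_v3_1_name(name) for name in VIDORE_V3]

print(VIDORE_V3_1[0])  # prints Vidore3FinanceEnRetrieval.v2
```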

Citation

@article{loison2026vidorev3comprehensiveevaluation,
  archiveprefix = {arXiv},
  author = {António Loison and Quentin Macé and Antoine Edy and Victor Xing and Tom Balough and Gabriel Moreira and Bo Liu and Manuel Faysse and Céline Hudelot and Gautier Viaud},
  eprint = {2601.08620},
  primaryclass = {cs.AI},
  title = {ViDoRe V3: A Comprehensive Evaluation of Retrieval Augmented Generation in Complex Real-World Scenarios},
  url = {https://arxiv.org/abs/2601.08620},
  year = {2026},
}
