Tool-Retrieval benchmark leaderboard
Welcome to the ToolRet benchmark leaderboard!
- Search: Enter keywords for the model name in the search box. Use a semicolon (`;`) to separate multiple keywords.
- Model Type: We provide a wide range of open-source models. Choose the model type(s) you're interested in.
- Model Size: Select the parameter-count range to filter models accordingly.

Click the Filter Data button to update the display with the filtered data.
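The semicolon-separated search described above amounts to a case-insensitive OR-match over model names. A minimal sketch of that behavior (the `filter_models` helper is hypothetical, not the leaderboard's actual code):

```python
def filter_models(rows, query):
    """Keep rows whose 'Model' field contains any of the
    semicolon-separated keywords (case-insensitive OR-match)."""
    keywords = [k.strip().lower() for k in query.split(";") if k.strip()]
    if not keywords:
        return rows  # an empty query matches everything
    return [r for r in rows if any(k in r["Model"].lower() for k in keywords)]

rows = [
    {"Model": "BAAI/bge-reranker-v2-m3"},
    {"Model": "intfloat/e5-mistral-7b-instruct"},
    {"Model": "bm25"},
]
# "bge; e5" keeps the first two rows and drops bm25
print([r["Model"] for r in filter_models(rows, "bge; e5")])
```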
Rank | Model | Average | Comp@10 | Recall@10 | Prec@10 | NDCG@10 | Number of Parameters | Model Type |
---|---|---|---|---|---|---|---|---|
1 | BAAI/bge-reranker-v2-m3 | 45.33 | 58.83 | 65.29 | 6.06 | 51.15 | 568M | re-ranking model |
2 | jinaai/jina-reranker-v2-base-multilingual | 44.73 | 57.68 | 64.11 | 5.96 | 51.18 | 278M | re-ranking model |
3 | bzantium/NV-Embed-v1 | 42.07 | 55.26 | 60.83 | 5.85 | 46.35 | 7.85B | embedding model |
4 | BAAI/bge-large-en-v1.5 | 41.81 | 58.16 | 58.39 | 4.76 | 45.91 | 335M | embedding model |
5 | intfloat/e5-mistral-7b-instruct | 39.1 | 50.72 | 57.11 | 5.24 | 43.33 | 7B | embedding model |
6 | bm25 | 38.05 | 53.44 | 53.64 | 5.46 | 39.67 | Unknown | sparse retrieval |
7 | BAAI/bge-base-en-v1.5 | 37.93 | 52.91 | 53.14 | 4.23 | 41.45 | 109M | embedding model |
8 | Alibaba-NLP/gte-base-en-v1.5 | 35.87 | 45.9 | 52.31 | 4.75 | 40.52 | 137M | embedding model |
9 | castorini/monot5-base-msmarco | 35.86 | 51.29 | 51.6 | 4.07 | 36.46 | Unknown | re-ranking model |
10 | Alibaba-NLP/gte-large-en-v1.5 | 34.3 | 43.74 | 50.14 | 4.54 | 38.76 | 434M | embedding model |
11 | sentence-transformers/all-MiniLM-L6-v2 | 34.17 | 48.27 | 48.58 | 3.77 | 36.08 | 22M | embedding model |
12 | BAAI/bge-reranker-v2-gemma | 33.76 | 42.97 | 49.35 | 4.45 | 38.28 | 2.51B | re-ranking model |
13 | mixedbread-ai/mxbai-rerank-large-v1 | 33.23 | 46.42 | 47.94 | 5.14 | 33.4 | 435M | re-ranking model |
14 | sentence-transformers/gtr-t5-large | 33.07 | 42.25 | 48.61 | 4.36 | 37.07 | 335M | dense retrieval |
15 | intfloat/e5-large-v2 | 31.93 | 40.34 | 46.64 | 4.17 | 36.56 | 335M | embedding model |
16 | facebook/contriever-msmarco | 30.16 | 38.31 | 44.67 | 3.96 | 33.69 | Unknown | dense retrieval |
17 | sentence-transformers/gtr-t5-base | 30.01 | 37.65 | 43.96 | 3.9 | 34.52 | 110M | dense retrieval |
18 | intfloat/e5-base-v2 | 29.11 | 37.19 | 43.51 | 3.88 | 31.87 | 109M | embedding model |
19 | intfloat/e5-small-v2 | 28.87 | 35.89 | 42.17 | 3.72 | 33.69 | 33M | embedding model |
20 | colbert | 28.45 | 40.07 | 40.35 | 4.09 | 29.28 | 110M | dense retrieval |
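For reading the metric columns: Recall@10, Prec@10, and NDCG@10 are standard top-10 retrieval metrics, and Comp@10 is the benchmark's completeness measure (whether all required tools appear in the top 10). A minimal sketch of the three standard metrics for a single query, assuming binary relevance labels (the function and IDs below are illustrative, not the benchmark's evaluation code):

```python
import math

def metrics_at_k(ranked_ids, relevant_ids, k=10):
    """Recall@k, Precision@k, and NDCG@k for one query, binary relevance."""
    top = ranked_ids[:k]
    hits = [1 if doc in relevant_ids else 0 for doc in top]
    recall = sum(hits) / len(relevant_ids)
    precision = sum(hits) / k
    # DCG with the standard log2(rank + 1) discount (ranks are 1-based)
    dcg = sum(h / math.log2(i + 2) for i, h in enumerate(hits))
    # Ideal DCG: all relevant tools ranked first
    idcg = sum(1 / math.log2(i + 2) for i in range(min(len(relevant_ids), k)))
    ndcg = dcg / idcg if idcg > 0 else 0.0
    return recall, precision, ndcg

# 2 of 3 relevant tools retrieved, at ranks 1 and 2
ranked = ["t3", "t1", "t9", "t4"] + [f"x{i}" for i in range(6)]
r, p, n = metrics_at_k(ranked, {"t1", "t2", "t3"})
print(round(r, 3), round(p, 3), round(n, 3))  # → 0.667 0.2 0.765
```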
Rank | Model | Average | Comp@10 | Recall@10 | Prec@10 | NDCG@10 | Number of Parameters | Model Type |
---|---|---|---|---|---|---|---|---|
1 | jinaai/jina-reranker-v2-base-multilingual | 33.43 | 32.39 | 49.67 | 10.32 | 41.34 | 278M | re-ranking model |
2 | BAAI/bge-reranker-v2-m3 | 29.78 | 30.75 | 44.46 | 8.72 | 35.2 | 568M | re-ranking model |
3 | BAAI/bge-reranker-v2-gemma | 27.93 | 27.05 | 42.06 | 8.75 | 33.86 | 2.51B | re-ranking model |
4 | intfloat/e5-mistral-7b-instruct | 25.59 | 25.69 | 39.18 | 7.53 | 29.95 | 7B | embedding model |
5 | bzantium/NV-Embed-v1 | 25.5 | 24.9 | 38.8 | 7.69 | 30.6 | 7.85B | embedding model |
6 | BAAI/bge-large-en-v1.5 | 25.43 | 24.94 | 39.14 | 7.34 | 30.29 | 335M | embedding model |
7 | Alibaba-NLP/gte-base-en-v1.5 | 24.19 | 23.72 | 37.09 | 6.86 | 29.06 | 137M | embedding model |
8 | bm25 | 23.93 | 23.4 | 36.63 | 6.94 | 28.74 | Unknown | sparse retrieval |
9 | BAAI/bge-base-en-v1.5 | 22.92 | 23.47 | 35.28 | 6.43 | 26.48 | 109M | embedding model |
10 | intfloat/e5-small-v2 | 21.83 | 20.75 | 33.67 | 6.38 | 26.54 | 33M | embedding model |
11 | intfloat/e5-base-v2 | 21.16 | 21.14 | 32.9 | 6 | 24.59 | 109M | embedding model |
12 | Alibaba-NLP/gte-large-en-v1.5 | 20.43 | 19.76 | 31.61 | 5.84 | 24.51 | 434M | embedding model |
13 | intfloat/e5-large-v2 | 20.35 | 20.58 | 31.5 | 5.8 | 23.51 | 335M | embedding model |
14 | sentence-transformers/gtr-t5-large | 20.25 | 20.55 | 30.66 | 5.39 | 24.4 | 335M | dense retrieval |
15 | facebook/contriever-msmarco | 18.96 | 18.69 | 28.82 | 5.31 | 23.04 | Unknown | dense retrieval |
16 | castorini/monot5-base-msmarco | 18.77 | 17.05 | 29.39 | 5.94 | 22.69 | Unknown | re-ranking model |
17 | sentence-transformers/gtr-t5-base | 18.03 | 18.95 | 27.54 | 4.78 | 20.85 | 110M | dense retrieval |
18 | mixedbread-ai/mxbai-rerank-large-v1 | 16.52 | 18.18 | 26.02 | 4.06 | 17.81 | 435M | re-ranking model |
19 | colbert | 13.42 | 14.47 | 20.05 | 2.98 | 16.18 | 110M | dense retrieval |
20 | sentence-transformers/all-MiniLM-L6-v2 | 11.96 | 12.98 | 18.88 | 3.27 | 12.71 | 22M | embedding model |
Rank | Model | Average | Comp@10 | Recall@10 | Prec@10 | NDCG@10 | Number of Parameters | Model Type |
---|---|---|---|---|---|---|---|---|
1 | bzantium/NV-Embed-v1 | 40.66 | 44.76 | 58.32 | 10.11 | 49.45 | 7.85B | embedding model |
2 | jinaai/jina-reranker-v2-base-multilingual | 39.1 | 44.53 | 56.94 | 10.25 | 44.67 | 278M | re-ranking model |
3 | BAAI/bge-base-en-v1.5 | 37.82 | 43.35 | 54.59 | 8.98 | 44.37 | 109M | embedding model |
4 | BAAI/bge-reranker-v2-m3 | 37.16 | 40.63 | 54.85 | 9.75 | 43.4 | 568M | re-ranking model |
5 | intfloat/e5-large-v2 | 37.12 | 41.95 | 53.01 | 8.62 | 44.9 | 335M | embedding model |
6 | BAAI/bge-large-en-v1.5 | 36.98 | 42.62 | 52.5 | 8.51 | 44.31 | 335M | embedding model |
7 | sentence-transformers/gtr-t5-large | 35.86 | 40.73 | 51.56 | 8.58 | 42.59 | 335M | dense retrieval |
8 | BAAI/bge-reranker-v2-gemma | 35.42 | 39.69 | 51.53 | 9.19 | 41.26 | 2.51B | re-ranking model |
9 | sentence-transformers/gtr-t5-base | 34.6 | 39.45 | 48.63 | 7.69 | 42.62 | 110M | dense retrieval |
10 | intfloat/e5-mistral-7b-instruct | 34.23 | 37.78 | 49.14 | 7.93 | 42.08 | 7B | embedding model |
11 | castorini/monot5-base-msmarco | 33.69 | 41.03 | 48.67 | 6.93 | 38.11 | Unknown | re-ranking model |
12 | intfloat/e5-base-v2 | 33.55 | 38.29 | 48.8 | 7.8 | 39.32 | 109M | embedding model |
13 | bm25 | 32.48 | 36.62 | 45.76 | 7.83 | 39.71 | Unknown | sparse retrieval |
14 | Alibaba-NLP/gte-base-en-v1.5 | 32.08 | 37.59 | 45.86 | 6.91 | 37.98 | 137M | embedding model |
15 | Alibaba-NLP/gte-large-en-v1.5 | 31.88 | 35.7 | 47.48 | 7.77 | 36.57 | 434M | embedding model |
16 | intfloat/e5-small-v2 | 30.17 | 33.68 | 43.4 | 7.03 | 36.58 | 33M | embedding model |
17 | sentence-transformers/all-MiniLM-L6-v2 | 29.95 | 34.05 | 45.07 | 7.3 | 33.37 | 22M | embedding model |
18 | mixedbread-ai/mxbai-rerank-large-v1 | 25.39 | 29.04 | 37.83 | 6.72 | 27.97 | 435M | re-ranking model |
19 | colbert | 22.03 | 26.59 | 31.87 | 4.7 | 24.96 | 110M | dense retrieval |
20 | facebook/contriever-msmarco | 18.66 | 22.63 | 26.91 | 3.78 | 21.32 | Unknown | dense retrieval |
Rank | Model | Average | Comp@10 | Recall@10 | Prec@10 | NDCG@10 | Number of Parameters | Model Type |
---|---|---|---|---|---|---|---|---|
1 | BAAI/bge-reranker-v2-gemma | 29.92 | 33.51 | 43.6 | 7.68 | 34.88 | 2.51B | re-ranking model |
2 | bzantium/NV-Embed-v1 | 28.52 | 30.8 | 40.91 | 6.86 | 35.5 | 7.85B | embedding model |
3 | jinaai/jina-reranker-v2-base-multilingual | 27.29 | 30.5 | 39.89 | 6.97 | 31.82 | 278M | re-ranking model |
4 | castorini/monot5-base-msmarco | 24.54 | 27.63 | 36.27 | 6.21 | 28.04 | Unknown | re-ranking model |
5 | BAAI/bge-reranker-v2-m3 | 24.52 | 27.04 | 35.94 | 6.48 | 28.62 | 568M | re-ranking model |
6 | mixedbread-ai/mxbai-rerank-large-v1 | 21.73 | 25.32 | 32.3 | 4.98 | 24.32 | 435M | re-ranking model |
7 | intfloat/e5-mistral-7b-instruct | 21.29 | 23.74 | 31.66 | 5.59 | 24.19 | 7B | embedding model |
8 | BAAI/bge-large-en-v1.5 | 19.16 | 21.69 | 28.48 | 4.85 | 21.61 | 335M | embedding model |
9 | BAAI/bge-base-en-v1.5 | 18.85 | 21.04 | 28.03 | 4.79 | 21.56 | 109M | embedding model |
10 | bm25 | 18.69 | 21.15 | 27.31 | 4.53 | 21.77 | Unknown | sparse retrieval |
11 | sentence-transformers/gtr-t5-large | 18.39 | 21.06 | 26.93 | 4.47 | 21.08 | 335M | dense retrieval |
12 | Alibaba-NLP/gte-base-en-v1.5 | 17.83 | 20.41 | 26.72 | 4.43 | 19.78 | 137M | embedding model |
13 | intfloat/e5-large-v2 | 17.73 | 19.76 | 26.15 | 4.53 | 20.48 | 335M | embedding model |
14 | Alibaba-NLP/gte-large-en-v1.5 | 17.46 | 19.14 | 26.32 | 4.68 | 19.71 | 434M | embedding model |
15 | intfloat/e5-small-v2 | 16.32 | 17.68 | 24.2 | 4.23 | 19.2 | 33M | embedding model |
16 | colbert | 16.13 | 18.38 | 23.93 | 3.77 | 18.43 | 110M | dense retrieval |
17 | sentence-transformers/gtr-t5-base | 16 | 18.51 | 23.59 | 3.83 | 18.07 | 110M | dense retrieval |
18 | intfloat/e5-base-v2 | 15.8 | 17.33 | 23.58 | 4.12 | 18.17 | 109M | embedding model |
19 | facebook/contriever-msmarco | 14.58 | 16.06 | 21.72 | 3.66 | 16.88 | Unknown | dense retrieval |
20 | sentence-transformers/all-MiniLM-L6-v2 | 13.63 | 15.51 | 20.44 | 3.37 | 15.18 | 22M | embedding model |
Rank | Model | Average | Comp@10 | Recall@10 | Prec@10 | NDCG@10 | Number of Parameters | Model Type |
---|---|---|---|---|---|---|---|---|
1 | bzantium/NV-Embed-v1 | 28.45 | 35.43 | 39.14 | 3.87 | 35.37 | 7.85B | embedding model |
2 | BAAI/bge-reranker-v2-gemma | 25.04 | 34.42 | 34.94 | 3.75 | 27.04 | 2.51B | re-ranking model |
3 | mixedbread-ai/mxbai-rerank-large-v1 | 21.83 | 30.74 | 31.39 | 3.3 | 21.88 | 435M | re-ranking model |
4 | BAAI/bge-reranker-v2-m3 | 21.63 | 30.12 | 30.69 | 3.29 | 22.42 | 568M | re-ranking model |
5 | jinaai/jina-reranker-v2-base-multilingual | 21.61 | 29.98 | 30.38 | 3.27 | 22.82 | 278M | re-ranking model |
6 | castorini/monot5-base-msmarco | 19.03 | 26.67 | 27.3 | 2.96 | 19.2 | Unknown | re-ranking model |
7 | bm25 | 18.27 | 25.24 | 25.63 | 2.78 | 19.44 | Unknown | sparse retrieval |
8 | BAAI/bge-large-en-v1.5 | 16.32 | 22.86 | 23.27 | 2.55 | 16.6 | 335M | embedding model |
9 | intfloat/e5-mistral-7b-instruct | 15.41 | 21.89 | 22.37 | 2.47 | 14.89 | 7B | embedding model |
10 | sentence-transformers/gtr-t5-large | 15.22 | 21.07 | 21.42 | 2.35 | 16.05 | 335M | dense retrieval |
11 | Alibaba-NLP/gte-base-en-v1.5 | 14.92 | 20.76 | 21.18 | 2.33 | 15.4 | 137M | embedding model |
12 | BAAI/bge-base-en-v1.5 | 14.84 | 20.62 | 21.12 | 2.34 | 15.28 | 109M | embedding model |
13 | Alibaba-NLP/gte-large-en-v1.5 | 14.81 | 20.98 | 21.37 | 2.35 | 14.55 | 434M | embedding model |
14 | intfloat/e5-large-v2 | 14.41 | 19.76 | 20.13 | 2.22 | 15.55 | 335M | embedding model |
15 | colbert | 14.22 | 20 | 20.26 | 2.19 | 14.44 | 110M | dense retrieval |
16 | sentence-transformers/gtr-t5-base | 13.09 | 18.18 | 18.5 | 2.03 | 13.66 | 110M | dense retrieval |
17 | intfloat/e5-small-v2 | 12.18 | 16.68 | 17.13 | 1.91 | 13 | 33M | embedding model |
18 | facebook/contriever-msmarco | 11.92 | 16.54 | 17.05 | 1.89 | 12.19 | Unknown | dense retrieval |
19 | intfloat/e5-base-v2 | 11.64 | 16.14 | 16.56 | 1.88 | 11.98 | 109M | embedding model |
20 | sentence-transformers/all-MiniLM-L6-v2 | 11.19 | 15.63 | 16.06 | 1.82 | 11.25 | 22M | embedding model |
Rank | Model | Average | Comp@10 | Recall@10 | Prec@10 | NDCG@10 | Number of Parameters | Model Type |
---|---|---|---|---|---|---|---|---|
1 | BAAI/bge-reranker-v2-gemma | 29.77 | 27.03 | 45.37 | 10.02 | 36.68 | 2.51B | re-ranking model |
2 | jinaai/jina-reranker-v2-base-multilingual | 27.86 | 25.39 | 42.75 | 9.43 | 33.85 | 278M | re-ranking model |
3 | BAAI/bge-reranker-v2-m3 | 26.64 | 24.15 | 40.87 | 9.09 | 32.47 | 568M | re-ranking model |
4 | bzantium/NV-Embed-v1 | 24.58 | 22.08 | 37.36 | 8.44 | 30.43 | 7.85B | embedding model |
5 | castorini/monot5-base-msmarco | 22.04 | 18.35 | 34.4 | 7.78 | 27.63 | Unknown | re-ranking model |
6 | intfloat/e5-mistral-7b-instruct | 21.52 | 19.47 | 33.34 | 7.45 | 25.81 | 7B | embedding model |
7 | mixedbread-ai/mxbai-rerank-large-v1 | 19.53 | 18.29 | 30.44 | 5.76 | 23.64 | 435M | re-ranking model |
8 | BAAI/bge-large-en-v1.5 | 19.17 | 17.41 | 30.25 | 6.44 | 22.59 | 335M | embedding model |
9 | BAAI/bge-base-en-v1.5 | 18.76 | 17.2 | 29.55 | 6.16 | 22.13 | 109M | embedding model |
10 | Alibaba-NLP/gte-large-en-v1.5 | 18.63 | 16.89 | 29.08 | 6.32 | 22.21 | 434M | embedding model |
11 | Alibaba-NLP/gte-base-en-v1.5 | 18.45 | 17.03 | 29.16 | 6.03 | 21.58 | 137M | embedding model |
12 | sentence-transformers/gtr-t5-large | 17.92 | 17.48 | 27.57 | 5.36 | 21.28 | 335M | dense retrieval |
13 | colbert | 17.04 | 14.95 | 26.37 | 5.3 | 21.55 | 110M | dense retrieval |
14 | intfloat/e5-base-v2 | 16.59 | 14.69 | 26.15 | 5.48 | 20.06 | 109M | embedding model |
15 | bm25 | 16.59 | 14.99 | 25.83 | 5.34 | 20.2 | Unknown | sparse retrieval |
16 | facebook/contriever-msmarco | 16.29 | 13.58 | 25.56 | 5.81 | 20.2 | Unknown | dense retrieval |
17 | intfloat/e5-small-v2 | 16.06 | 14.87 | 25 | 5.13 | 19.24 | 33M | embedding model |
18 | intfloat/e5-large-v2 | 15.9 | 15.13 | 24.97 | 5.09 | 18.42 | 335M | embedding model |
19 | sentence-transformers/gtr-t5-base | 14.75 | 14.93 | 22.95 | 4.36 | 16.75 | 110M | dense retrieval |
20 | sentence-transformers/all-MiniLM-L6-v2 | 9.88 | 9.86 | 15.58 | 3.04 | 11.04 | 22M | embedding model |
Rank | Model | Average | Comp@10 | Recall@10 | Prec@10 | NDCG@10 | Number of Parameters | Model Type |
---|---|---|---|---|---|---|---|---|
1 | BAAI/bge-reranker-v2-gemma | 34.94 | 39.07 | 50.49 | 9.27 | 40.93 | 2.51B | re-ranking model |
2 | castorini/monot5-base-msmarco | 32.55 | 37.88 | 47.12 | 7.89 | 37.29 | Unknown | re-ranking model |
3 | bzantium/NV-Embed-v1 | 32.52 | 34.89 | 46.24 | 8.27 | 40.68 | 7.85B | embedding model |
4 | jinaai/jina-reranker-v2-base-multilingual | 32.41 | 36.13 | 46.54 | 8.19 | 38.8 | 278M | re-ranking model |
5 | intfloat/e5-mistral-7b-instruct | 26.96 | 29.87 | 39.26 | 6.84 | 31.86 | 7B | embedding model |
6 | BAAI/bge-reranker-v2-m3 | 25.29 | 26.86 | 36.25 | 7.06 | 30.98 | 568M | re-ranking model |
7 | mixedbread-ai/mxbai-rerank-large-v1 | 23.84 | 26.94 | 35.07 | 5.89 | 27.45 | 435M | re-ranking model |
8 | BAAI/bge-base-en-v1.5 | 22.97 | 25.29 | 33.41 | 5.88 | 27.29 | 109M | embedding model |
9 | intfloat/e5-large-v2 | 22.88 | 24.39 | 33.36 | 6.27 | 27.49 | 335M | embedding model |
10 | sentence-transformers/gtr-t5-large | 22.02 | 24.62 | 31.82 | 5.7 | 25.92 | 335M | dense retrieval |
11 | BAAI/bge-large-en-v1.5 | 21.97 | 24.78 | 31.91 | 5.57 | 25.63 | 335M | embedding model |
12 | bm25 | 21.2 | 23.22 | 30.48 | 5.45 | 25.66 | Unknown | sparse retrieval |
13 | intfloat/e5-small-v2 | 20.73 | 21.48 | 30.46 | 5.64 | 25.36 | 33M | embedding model |
14 | sentence-transformers/gtr-t5-base | 20.16 | 22.41 | 29.32 | 5.12 | 23.78 | 110M | dense retrieval |
15 | Alibaba-NLP/gte-base-en-v1.5 | 20.14 | 23.44 | 29.82 | 4.92 | 22.37 | 137M | embedding model |
16 | sentence-transformers/all-MiniLM-L6-v2 | 19.81 | 21.05 | 29.69 | 5.25 | 23.25 | 22M | embedding model |
17 | intfloat/e5-base-v2 | 19.16 | 21.16 | 28.02 | 4.99 | 22.48 | 109M | embedding model |
18 | Alibaba-NLP/gte-large-en-v1.5 | 18.95 | 19.57 | 28.52 | 5.38 | 22.35 | 434M | embedding model |
19 | colbert | 17.11 | 20.2 | 25.14 | 3.81 | 19.29 | 110M | dense retrieval |
20 | facebook/contriever-msmarco | 15.54 | 18.06 | 22.56 | 3.28 | 18.23 | Unknown | dense retrieval |
Acknowledgement
This work presents the first diverse tool-retrieval benchmark for evaluating the tool-retrieval performance of a wide range of information retrieval models. We sincerely thank prior work such as MAIR and ToolBench, which inspired this project and provided strong technical references.
Citation
@article{ToolRetrieval,
  title   = {Retrieval Models Aren't Tool-Savvy: Benchmarking Tool Retrieval for Large Language Models},
  author  = {Zhengliang Shi and Yuhan Wang and Lingyong Yan and Pengjie Ren and Shuaiqiang Wang and Dawei Yin and Zhaochun Ren},
  year    = {2025},
  journal = {arXiv},
}