Cloud-CV/EvalAI

Daily Information Dashboard · 2026-03-04


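Open-Source Projects

AI Summary

Cloud-CV has open-sourced the EvalAI evaluation platform, which provides a unified leaderboard, remote and containerized evaluation, and an extensible backend. It matters because it improves the reproducibility, fairness, and efficiency of comparing AI algorithms.

- EvalAI is positioned as a platform for evaluating and comparing ML/AI algorithms at scale, addressing the difficulty of comparison caused by implementation differences, inconsistent dataset splits, and non-uniform metrics.
- It supports custom evaluation protocols and multi-phase evaluation, configurable public and private leaderboards, and submissions written in any programming language.
- It offers remote evaluation and cluster scaling, so organizers can attach their own compute nodes to handle compute-intensive challenges.
- It supports submitting Docker images that are evaluated inside test environments, which suits agent/environment-interaction tasks.
- Through warmed-up workers, in-memory dataset preloading, and multi-core chunked parallelism, it cuts evaluation time by an order of magnitude in some scenarios.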

#GitHub #repo #OpenSource #EvalAI #Docker


Content Excerpt

<p align="center"><img width="65%" src="docs/source/_static/img/evalai_logo.png"/></p>

------------------------------------------------------------------------------------------


EvalAI is an open source platform for evaluating and comparing machine learning (ML) and artificial intelligence (AI) algorithms at scale.

In recent years, it has become increasingly difficult to compare an algorithm solving a given task with other existing approaches. These comparisons suffer from minor differences in algorithm implementation, use of non-standard dataset splits, and different evaluation metrics. By providing a central leaderboard and submission interface, EvalAI makes it easier for researchers to reproduce the results reported in papers and to perform reliable, accurate quantitative analysis. Its backends, based on map-reduce frameworks, are built to be swift and robust and to speed up evaluation on the fly.
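
As a rough illustration of why a central leaderboard helps, the sketch below scores every submission against the same test split with the same metric and then ranks the results. It is not EvalAI's implementation: `GROUND_TRUTH`, `accuracy`, `Entry`, `build_leaderboard`, and the team names are hypothetical, and accuracy is only an assumed metric.

```python
from dataclasses import dataclass

# Hypothetical shared test split: example id -> true label.
GROUND_TRUTH = {"img1": "cat", "img2": "dog", "img3": "cat", "img4": "bird"}


def accuracy(predictions):
    """The single metric applied to every submission."""
    hits = sum(1 for key, label in GROUND_TRUTH.items()
               if predictions.get(key) == label)
    return hits / len(GROUND_TRUTH)


@dataclass
class Entry:
    team: str
    score: float


def build_leaderboard(submissions):
    """Score all submissions identically and sort best-first."""
    entries = [Entry(team, accuracy(preds)) for team, preds in submissions.items()]
    return sorted(entries, key=lambda e: e.score, reverse=True)


# Hypothetical submissions, each a mapping of example id -> predicted label.
submissions = {
    "team_a": {"img1": "cat", "img2": "dog", "img3": "dog", "img4": "bird"},
    "team_b": {"img1": "cat", "img2": "cat", "img3": "dog", "img4": "bird"},
}

for rank, entry in enumerate(build_leaderboard(submissions), start=1):
    print(rank, entry.team, f"{entry.score:.2f}")
```

Because the split and the metric live on the server side, two teams cannot end up with incomparable numbers simply by evaluating differently.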

Features

- **Custom evaluation protocols and phases**: We allow creation of an arbitrary number of evaluation phases and dataset splits, support submissions written in any programming language, and organize results in both public and private leaderboards.
- **Remote evaluation**: Certain large-scale challenges need special compute capabilities for evaluation. If a challenge needs extra computational power, its organizers can add their own cluster of worker nodes to process participant submissions while we take care of hosting the challenge, handling user submissions, and maintaining the leaderboard.
- **Evaluation inside environments**: EvalAI lets participants submit code for their agent in the form of Docker images, which are evaluated against test environments on the evaluation server. During evaluation, the worker fetches the image, the test environment, and the model snapshot, then spins up a new container to perform the evaluation.
- **CLI support**: <a href="https://github.com/Cloud-CV/evalai-cli" target="_blank">evalai-cli</a> extends the functionality of the EvalAI web application to your command line, making the platform more accessible and terminal-friendly.
- **Portability**: EvalAI was designed from the outset with scalability and portability in mind. Most of its components rely heavily on open-source technologies: Docker, Django, Node.js, and PostgreSQL.
- **Faster evaluation**: We warm up the worker nodes at start-up by importing the challenge code and pre-loading the dataset in memory. We also split the dataset into small chunks that are evaluated simultaneously on multiple cores. These simple tricks reduce evaluation time by an order of magnitude in some cases.
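
The warm-up and chunking described under Faster evaluation can be sketched roughly as follows. This is a minimal illustration, not EvalAI's worker code: `load_ground_truth`, `warm_up`, `score_chunk`, `split`, and `evaluate` are hypothetical names, accuracy is an assumed metric, and Python's standard `concurrent.futures` stands in for whatever the platform actually uses.

```python
from concurrent.futures import ProcessPoolExecutor

_DATASET = None  # per-process cache, filled once by warm_up()


def load_ground_truth():
    """Hypothetical stand-in for an expensive dataset load."""
    return {i: i % 3 for i in range(1000)}


def warm_up():
    """Worker initializer: load the dataset once, before any chunk arrives."""
    global _DATASET
    _DATASET = load_ground_truth()


def score_chunk(chunk):
    """Count correct predictions in one chunk of (example_id, label) pairs."""
    if _DATASET is None:  # fallback when called outside a warmed-up worker
        warm_up()
    return sum(1 for ex_id, label in chunk if _DATASET[ex_id] == label)


def split(items, n_chunks):
    """Split a list into at most n_chunks contiguous pieces."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def evaluate(predictions, n_workers=4):
    """Score chunks in parallel processes, then reduce to one accuracy value."""
    chunks = split(sorted(predictions.items()), n_workers)
    with ProcessPoolExecutor(max_workers=n_workers, initializer=warm_up) as pool:
        correct = sum(pool.map(score_chunk, chunks))
    return correct / len(predictions)


if __name__ == "__main__":
    preds = {i: 0 for i in range(1000)}  # hypothetical submission
    print(f"accuracy = {evaluate(preds):.3f}")
```

Passing `warm_up` as the pool initializer means each worker process pays the loading cost once rather than once per chunk, and the final `sum` is the reduce step over the per-chunk counts.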

Goal

Our ultimate goal is to build a centralized platform for hosting, participating in, and collaborating on AI challenges organized around the globe, and we hope to help benchmark progress in AI.

Installation instructions

Setting up EvalAI on your local machine is straightforward. You can set up EvalAI using Docker. The steps are:

1. Install <a href="https://docs.docker.com/install/" target="_blank">docker</a> and <a href="https://docs.docker.com/compose/install/" target="_blank">docker-compose</a> on your machine.
2. Get the source code onto your machine via git.
3. Build and run the Docker containers. This might take a while.

 

By default, this starts only the required services (db, sqs, and django). If you need **worker** services, start them separately.

That's it. Open a web browser and go to <a href="http://127.0.0.1:8888" target="_blank">http://127.0.0.1:8888</a>. Three users are created by default:

- **SUPERUSER**: username: admin, password: password
- **HOST USER**: username: host, password: password
- **PARTICIPANT USER**: username: participant, password: password

If you run into any issues during installation, please see our <a href="https://evalai.readthedocs.io/en/latest/faq(developers).html#common-errors-during-installation" target="_blank">common errors during installation</a> page.

Setup Instructions for EvalAI Documentation

If you're looking to contribute to the EvalAI documentation, refer to the docs-specific setup instructions in docs/README.md to set up the docs builder locally.

Citing EvalAI

If you are using EvalAI for hosting challenges, please cite the following technical report:

<p>
 <a href="http://learningsys.org/sosp19/assets/papers/23_CameraReadySubmission_EvalAI_SOSP_2019%20(8)%20(1).pdf" target="_blank"><img src="docs/source/_static/img/evalai-paper.jpg"/></a>
</p>

Team

EvalAI is maintained by <a href="https://rishabhjain.xyz/" target="_blank">Rishabh Jain</a>, <a href="https://www.linkedin.com/in/akanshajain231999/" target="_blank">Akansha Jain</a>, <a href="https://gchhablani.github.io/" target="_blank">Gunjan Chhablani</a> and <a href="https://www.cc.gatech.edu/~dbatra/" target="_blank">Dhruv Batra</a>.

A non-exhaustive list of past contributors includes: <a href="http://deshraj.xyz/" target="_blank">Deshraj Yadav</a>, <a href="https://ram81.github.io/" target="_blank">Ram Ramrakhya</a>, <a href="http://www.jainakash.in/" target="_blank">Akash Jain</a>, <a href="https://taranjeet.cc/" target="_blank">Taranjeet Singh</a>, <a href="https://github.com/spyshiv" target="_blank">Shiv Baran Singh</a>, <a href="https://dexter1691.github.io/" target="_blank">Harsh Agarwal</a>, <a href="https://prithv1.github.io/" target="_blank">Prithvijit Chattopadhyay</a>, and <a href="https://www.cc.gatech.edu/~parikh/" target="_blank">Devi Parikh</a>.

Contribution guidelines

If you are interested in contributing to EvalAI, follow our <a href="https://github.com/Cloud-CV/EvalAI/blob/master/.github/CONTRIBUTING.md" target="_blank">contribution…