Roadmap

Support more engines

Support multi-instance/multi-node frameworks

Dashboard: Live dashboard for engines and GPUs (show real-time performance metrics of different inference engines with Prometheus + Grafana)

Dashboard: Multi-select sub-runs and generate one command line to run + data visualization + report

Leaderboard: Multi-select sub-runs and generate one command line to run + data visualization + report

Make leaderboard results shareable through URL

Combine leaderboard and dashboard (clicking a leaderboard record opens the detail page of that sub-run)

Generate reports (PDF/Markdown) of all/selected experiments

Support local model paths (while still tracking the Hugging Face repo name for the leaderboard)

Benchmark: different benchmark types should use different configs depending on the model (mainly restricted by the model's context length)
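
As a rough illustration of the context-length restriction in the last item, the sketch below shrinks a requested benchmark shape so that input plus output tokens fit a model's context window. This feature is not implemented yet; `BenchmarkShape`, `fit_to_context`, and the choice to trim the input before the output are all hypothetical, not part of the project's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkShape:
    """Hypothetical benchmark shape: prompt length and generation length in tokens."""
    input_tokens: int
    output_tokens: int


def fit_to_context(shape: BenchmarkShape, max_context_len: int) -> BenchmarkShape:
    """Return a shape whose input + output fits within max_context_len.

    Assumption: the output length is preserved where possible and the
    input is trimmed first, keeping at least one input token.
    """
    if max_context_len < 2:
        raise ValueError("max_context_len must allow at least 1 input and 1 output token")
    if shape.input_tokens + shape.output_tokens <= max_context_len:
        return shape  # already fits, nothing to change
    # Cap the output so at least one token remains for the input.
    output_tokens = min(shape.output_tokens, max_context_len - 1)
    # Give the input whatever budget is left.
    input_tokens = max_context_len - output_tokens
    return BenchmarkShape(input_tokens, output_tokens)
```

For example, a shape of 8192 input and 512 output tokens on a model with a 4096-token context would be reduced to 3584 input and 512 output tokens, while a shape that already fits is returned unchanged.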
