Dashboard: Live dashboard for engines and GPUs (show real-time performance metrics of different inference engines in dashboards with Prometheus + Grafana)
Dashboard: Multi-select sub-runs and generate one command line to run + data visualization + report
Leaderboard: Multi-select sub-runs and generate one command line to run + data visualization + report
Make leaderboard results shareable through URL
Combine leaderboard and dashboard (click on one record of the leaderboard to go to the sub-run's detail page)
Generate reports (PDF/Markdown) of all/selected experiments
Support local model path (but also need to track the Hugging Face repo name for the leaderboard)
Benchmark: each benchmark type should use a different config depending on the model (mainly constrained by the model's context length)
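
One possible shape for the per-model benchmark config item is sketched below: a requested config is clamped so that input plus output tokens fit in the model's context window. Everything named here (`BenchmarkConfig`, `fit_to_context`, `CONTEXT_LIMITS`, the model names and token counts) is a hypothetical placeholder, not part of the existing code; real limits would have to come from each model's own configuration.

```python
from dataclasses import dataclass

# Hypothetical context limits in tokens, for illustration only.
# Real values should be read from each model's own configuration.
CONTEXT_LIMITS = {
    "example-org/small-model": 4096,
    "example-org/long-model": 32768,
}


@dataclass(frozen=True)
class BenchmarkConfig:
    benchmark_type: str
    input_tokens: int
    output_tokens: int


def fit_to_context(cfg: BenchmarkConfig, context_length: int) -> BenchmarkConfig:
    """Return a config whose input + output tokens fit in context_length."""
    if cfg.output_tokens >= context_length:
        raise ValueError("output_tokens alone exceeds the context length")
    # Keep the requested output budget and shrink the prompt if needed.
    max_input = context_length - cfg.output_tokens
    return BenchmarkConfig(
        cfg.benchmark_type,
        min(cfg.input_tokens, max_input),
        cfg.output_tokens,
    )


requested = BenchmarkConfig("long-prompt", input_tokens=16000, output_tokens=512)
for model, limit in CONTEXT_LIMITS.items():
    print(model, fit_to_context(requested, limit))
```

Shrinking the prompt rather than the output budget is only one choice; a benchmark type that depends on long prompts might instead be skipped for models whose context is too short.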