Evaluation
3 repositories
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Code for the paper "Evaluating Large Language Models Trained on Code"