Environment integration: When evaluating code, various environments need to be pre-installed, such as JDK for Java, Node for JavaScript, various versions of numpy and torch in DS1000, etc. This ...
We introduce the Berkeley Function Leaderboard (BFCL), the first comprehensive and executable function call evaluation dedicated to assessing Large Language Models' (LLMs) ability to invoke functions.
Some results have been hidden because they may be inaccessible to you
Show inaccessible results
Feedback