S3 legal Evals on Fable 5: not perfect (screen 2) but better than the rest (screen 1)

June 15, 2026 · 1 minute

First, building objective benchmarks for legal work with LLMs is itself subjective, especially in case-law-heavy jurisdictions like the USA 🇺🇸, where benchmarks depend on 'choices' about what counts as a 'perfect' answer.

Second, code-law-heavy jurisdictions, which require exact references to articles of legislation or to cases, allow for a more objective eval.
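
To make "more objective" concrete, here is a minimal sketch of how exact references could be scored. It is an illustration only, not the eval used in this post: the function name `citation_scores`, the normalization rule, and the sample citations are all assumptions.

```python
def citation_scores(predicted, gold):
    """Exact-match precision, recall, and F1 over sets of citation strings.

    Hypothetical helper for illustration; normalization (lowercase,
    collapsed whitespace) is an assumed choice, not a standard.
    """
    def norm(citation):
        return " ".join(citation.lower().split())

    pred = {norm(c) for c in predicted}
    ref = {norm(c) for c in gold}
    hits = len(pred & ref)  # citations that match a gold reference exactly

    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Placeholder citations, not real legislation.
model_answer = ["Art. 1  Example Code", "art. 2 example code"]
gold_answer = ["Art. 1 Example Code", "Art. 3 Example Code"]
print(citation_scores(model_answer, gold_answer))
```

A score like this only checks whether the right articles or cases were cited. It says nothing about the quality of the reasoning built on them.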

Ultimately, all legal reasoning relies on the 'correct' codes and cases 😇..

..and with Fable 5, we were getting closer to this reality

Like to debate 'perfect' legal answers? Go here 👉🏽 https://lnkd.in/eGRr4uMC
