S3 legal Evals on Fable 5: not perfect (screen 2) but better than the rest (screen 1)
June 15, 2026 · 1 minute read
First, building objective benchmarks for legal work with LLMs is itself subjective, especially for case-law-heavy jurisdictions like the USA 🇺🇸, where benchmarks rely on 'choices' about what counts as a 'perfect' answer.
Second, code-law-heavy jurisdictions, which require exact references to legislation articles or cases, allow a more objective eval.
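To make the 'exact references' idea concrete, here is a minimal sketch of how such an eval could be scored. It is an assumption for illustration, not the scoring used in the evals above: the function names `normalize` and `citation_scores` and the placeholder citations are hypothetical, and real citation matching would need jurisdiction-specific normalization.

```python
def normalize(citation: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences do not count as misses."""
    return " ".join(citation.lower().split())


def citation_scores(predicted: list[str], gold: list[str]) -> tuple[float, float, float]:
    """Exact-match precision, recall and F1 between model citations and reference citations."""
    pred = {normalize(c) for c in predicted}
    ref = {normalize(c) for c in gold}
    hits = len(pred & ref)  # citations the model got exactly right
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Placeholder citations, made up for illustration only.
model_answer = ["Art. 12 Code A", "Art. 40 Code B"]
reference = ["art. 12  code a"]
precision, recall, f1 = citation_scores(model_answer, reference)
```

Here the model cites one correct article and one extra, so recall is perfect but precision is not. No judgment about answer quality is needed, which is what makes this kind of check more objective.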
Ultimately, all legal reasoning relies on the 'correct' codes and cases...
...and with Fable 5, we were getting closer to this reality.
Like to debate 'perfect' legal answers? Go here: https://lnkd.in/eGRr4uMC


