Zenn

2026-05-21
Measuring the "emotional intelligence" of a local LLM — Benchmarking Qwen3.5-122B(int4) with EQ-Bench 3
When evaluating LLMs, knowledge and reasoning benchmarks such as MMLU and GSM8K are the standard, but we also want to ask questions like "How well can it interpret interpersonal nuances?" and "...