Jul, 2024
Benchmarking Complex Instruction-Following with Multiple Constraints Composition
TL;DR: ComplexBench, a new benchmark, evaluates LLMs' ability to follow complex instructions composed of multiple constraints and exposes deficiencies in existing models.