WESR-Bench is an expert-annotated evaluation set (900+ utterances) designed for precise localization of 21 non-verbal vocal events. It features a novel position-aware protocol that separates ASR errors from event-detection errors, enabling accurate measurement of both discrete and continuous events.
WESR-Bench is a new, expertly labeled dataset and evaluation method for precisely finding non-verbal sounds like laughter or crying within speech. It helps researchers accurately measure how well AI systems can detect these sounds, even when mixed with spoken words, by separating speech recognition errors from sound detection errors.
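To make the idea of separating speech recognition errors from event detection errors concrete, here is a minimal sketch. It is not the WESR-Bench protocol itself: the inline `[label]` tag format, the word-index notion of position, the greedy matching, and the `tolerance` parameter are all assumptions made for illustration, and it covers only discrete (point-like) events.

```python
import re

# Hypothetical inline tag format for discrete events, e.g. "[laughter]".
EVENT = re.compile(r"\[(\w+)\]")

def split_words_and_events(transcript):
    """Return (words, events); each event is (label, number of words before it)."""
    words, events = [], []
    for token in transcript.split():
        m = EVENT.fullmatch(token)
        if m:
            events.append((m.group(1), len(words)))
        else:
            words.append(token.lower())
    return words, events

def word_errors(ref, hyp):
    """Word-level Levenshtein distance (substitutions + insertions + deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (r != h)))
        prev = cur
    return prev[-1]

def event_f1(ref_events, hyp_events, tolerance=1):
    """Greedy one-to-one matching: same label, position within `tolerance` words."""
    unmatched = list(ref_events)
    hits = 0
    for label, pos in hyp_events:
        for cand in unmatched:
            if cand[0] == label and abs(cand[1] - pos) <= tolerance:
                unmatched.remove(cand)
                hits += 1
                break
    if hits == 0:
        return 0.0
    precision = hits / len(hyp_events)
    recall = hits / len(ref_events)
    return 2 * precision * recall / (precision + recall)

# Made-up example: one word substitution and one missed event.
ref = "well [laughter] that was unexpected [sigh]"
hyp = "well [laughter] that is unexpected"
ref_words, ref_events = split_words_and_events(ref)
hyp_words, hyp_events = split_words_and_events(hyp)

wer = word_errors(ref_words, hyp_words) / len(ref_words)  # scored on words only
f1 = event_f1(ref_events, hyp_events)                     # scored on events only
```

Because the event tags are stripped before the word error rate is computed, the substituted word ("was" vs. "is") counts only against the speech recognition score, while the missed `[sigh]` counts only against the event score.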