Speech Recognition

Proof pending

25 papers

5.5 viability

-33% 30d

Proof pending. Core topic summary fields are still materializing.

State of the Field

Recent speech recognition work concentrates on performance across diverse languages and recording contexts, particularly for low-resource languages. Vividh-ASR and Ethio-ASR address weaknesses of multilingual models, while Whisper-CD targets long-form transcription accuracy. The common levers are newly curated datasets and optimized training strategies, which help systems adapt to linguistic variation and real-world audio. For builders, this matters most when a speech application has to serve underrepresented languages and dialects, where broader coverage translates directly into accessibility.

Last updated May 24, 2026

Topic trend

Topic-specific paper and score movement from the daily diff ledger.

Papers

1-10 of 25

Research Paper·May 13, 2026

Vividh-ASR: A Complexity-Tiered Benchmark and Optimization Dynamics for Robust Indic Speech Recognition

Fine-tuning multilingual ASR models like Whisper for low-resource languages often improves read speech but degrades spontaneous audio performance, a phenomenon we term studio-bias. To diagnose this mi...

8.0 viability

Research Paper·Mar 6, 2026

Whisper-CD: Accurate Long-Form Speech Recognition using Multi-Negative Contrastive Decoding

Long-form speech recognition with large encoder-decoder models such as Whisper often exhibits hallucinations, repetition loops, and content omissions. These errors can accumulate and be further amplifi...

8.0 viability

Research Paper·Mar 8, 2026

Nwāchā Munā: A Devanagari Speech Corpus and Proximal Transfer Benchmark for Nepal Bhasha ASR

Nepal Bhasha (Newari), an endangered language of the Kathmandu Valley, remains digitally marginalized due to the severe scarcity of annotated speech resources. In this work, we introduce Nwāchā Munā, ...

7.0 viability

Research Paper·Mar 16, 2026

Tagarela - A Portuguese speech dataset from podcasts

Despite significant advances in speech processing, Portuguese remains under-resourced due to the scarcity of public, large-scale, and high-quality datasets. To address this gap, we present a new datas...

7.0 viability·Has code

Research Paper·Mar 24, 2026

Ethio-ASR: Joint Multilingual Speech Recognition and Language Identification for Ethiopian Languages

We present Ethio-ASR, a suite of multilingual CTC-based automatic speech recognition (ASR) models jointly trained on five Ethiopian languages: Amharic, Tigrinya, Oromo, Sidaama, and Wolaytta. These la...

7.0 viability

Research Paper·Apr 23, 2026

Do LLM Decoders Listen Fairly? Benchmarking How Language Model Priors Shape Bias in Speech Recognition

As pretrained large language models replace task-specific decoders in speech recognition, a critical question arises: do their text-derived priors make recognition fairer or more biased across demogra...

7.0 viability

Research Paper·Mar 5, 2026

Exploring the potential and limitations of Model Merging for Multi-Domain Adaptation in ASR

Model merging is a scalable alternative to multi-task training that combines the capabilities of multiple specialised models into a single model. This is particularly attractive for large speech found...

7.0 viability

Research Paper·Mar 16, 2026

Vietnamese Automatic Speech Recognition: A Revisit

Automatic Speech Recognition (ASR) performance is heavily dependent on the availability of large-scale, high-quality datasets. For low-resource languages, existing open-source ASR datasets often suffe...

7.0 viability·Has code

Research Paper·Mar 23, 2026

Cascade-Free Mandarin Visual Speech Recognition via Semantic-Guided Cross-Representation Alignment

Chinese Mandarin visual speech recognition (VSR) is a task that has advanced in recent years, yet still lags behind the performance on non-tonal languages such as English. One primary challenge arises...

7.0 viability

Research Paper·Feb 26, 2026

Efficient Dialect-Aware Modeling and Conditioning for Low-Resource Taiwanese Hakka Speech Processing

Taiwanese Hakka is a low-resource, endangered language that poses significant challenges for automatic speech recognition (ASR), including high dialectal variability and the presence of two distinct w...

7.0 viability
