Events, Challenges & Workshops

SALMA Workshop @ ICASSP 2025

Date: April 6-11, 2025

Location: Hyderabad, India

Description: The first Workshop on Speech and Audio Language Models (SALMA), co-located with IEEE ICASSP 2025, focuses on leveraging Large Language Models (LLMs) to advance speech and audio processing. The workshop aims to bring together researchers to explore effective methodologies for improving performance across various tasks in speech, audio, and music domains, including classification, generation, and retrieval.

JSALT Challenge and Summer School

Date: June 9 - August 1, 2025

Location: Brno, Czechia

Description: The 2025 Jelinek Workshop on Speech and Language Technologies (JSALT) is an eight-week residential summer research workshop. It brings together international teams to work intensively on challenging problems in speech and language engineering, machine learning, and artificial intelligence. The workshop fosters collaboration and has a lasting influence through the publications, software, and data it produces. Our proposed topic, Advancing Expert-Level Reasoning and Understanding in Large Audio Language Models, focuses on advancing expert-level understanding and complex reasoning in audio-language models. The team, drawn from universities and industry across the US, Europe, and Asia, includes students and senior professionals from a range of disciplines, positioning us well to achieve these goals.

DCASE 2025 Challenge

Challenge Period: April 1 - June 15, 2025

Workshop Dates: October 30-31, 2025

Location: Barcelona, Spain

Description: The IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE) 2025 focuses on developing signal processing methods to automatically extract information from everyday environmental sounds. The challenge includes tasks such as acoustic scene classification, anomalous sound detection, and audio question answering. The associated workshop provides a venue for researchers to present and discuss their results. Our proposed Audio Question Answering (AQA) task focuses on advancing question-answering capabilities in interactive audio understanding, covering both general acoustic events and knowledge-heavy sound information within a single track. The AQA task encourages participants to develop systems that accurately interpret and answer complex multiple-choice questions about audio (i.e., selecting option (A), (B), or (C)), requiring models to process and reason across diverse audio types. Such systems could be useful in many applications, including audio-text and multimodal model evaluation, as well as building interactive audio-agent models for the audio research community and beyond. Reproducible baselines will include a resource-efficient computing setting (i.e., a single 8 GB RAM setup); challenge entries that directly use enterprise APIs are prohibited.