May 13, 2024, 4:46 a.m. | Vyas Raina, Rao Ma, Charles McGhee, Kate Knill, Mark Gales

cs.CL updates on arXiv.org

arXiv:2405.06134v1 Announce Type: new
Abstract: Recent developments in large speech foundation models like Whisper have led to their widespread use in many automatic speech recognition (ASR) applications. These systems incorporate "special tokens" in their vocabulary, such as $\texttt{}$, to guide their language generation process. However, we demonstrate that these tokens can be exploited by adversarial attacks to manipulate the model's behavior. We propose a simple yet effective method to learn a universal acoustic realization of Whisper's $\texttt{}$ token, which, when …
