Robust Spoken Language Understanding with Unsupervised ASR-Error Adaptation

Su Zhu, Ouyu Lan, and Kai Yu

ICASSP, 2018

Paper

Abstract

Robustness to errors produced by automatic speech recognition (ASR) is essential for Spoken Language Understanding (SLU). Traditional robust SLU typically needs ASR hypotheses with semantic annotations for training. However, semantic annotation is very expensive, and the corresponding ASR system may change frequently. Here, we propose a novel unsupervised ASR-error adaptation method, obviating the need of annotated ASR hypotheses. It only requires semantically annotated transcripts for the slot-tagging task and the transcripts paired with hypotheses for an input sentence reconstruction task. In this method, feature encoders which share part of the parameters are exploited to enforce the tasks in a similar feature space. Therefore, the transcript side slottagging model can be transferred to ASR hypotheses side easily.

Method

method

We are the first to investigate unsupervised ASR-error adaptation problem for slot tagging without annotation on ASR hypotheses. It would potentially be useful for the deployment of commercial dialogue systems. We propose an adversarial adaptation method for ASR-error adaptation problem in SLU, exploiting pairs of the utterance transcript and ASR hypotheses. This method only requires semantically annotated transcripts for slot tagging and raw transcripts paired with ASR hypotheses for the ASR-error adaptation, obviating the annotation on the hypotheses. The corresponding data sources used in this method are described as below: • tag: utterance transcripts with the annotations of slottag sequence. • tscp: utterance transcripts. • asr: utterance hypotheses given by ASR system.

Experiments

method

Experiments show that the proposed approach can yield significant improvement over strong baselines, and achieve performance very close to the oracle system.

To cite us

@INPROCEEDINGS{zhu2018robust,
  author={S. {Zhu} and O. {Lan} and K. {Yu}},
  booktitle={2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)}, 
  title={Robust Spoken Language Understanding with Unsupervised ASR-Error Adaptation}, 
  year={2018},
  pages={6179-6183},}