Fundamentals of Speech
Recognition
• Goal
– Automatic recognition of speech by machine
• Disciplines applied to most speech recognition problems:
– Signal processing: the process of extracting relevant
information from the speech signal in an efficient and
robust manner.
– Physics: the science of understanding the relationship
between the physical speech signal and the physiological
mechanisms that produce speech and with which
speech is perceived.
– Pattern recognition: the research area that studies the
operation and design of systems that recognize
patterns in data.
– Communication and information theory: the methods for
detecting the presence of a particular speech pattern.
– Linguistics: the relationship between sounds (phonology), words
in a language (syntax), meaning of spoken words (semantics),
and sense derived from the meaning (pragmatics).
– Physiology: understanding of the mechanisms within the human
central nervous system that account for speech production and
perception in human beings.
– Computer science: the study of efficient algorithms for
implementing, in software and hardware, the various methods used
in a practical speech recognition system.
– Psychology: the science of understanding the factors that
enable a technology to be used by human beings in
practical tasks.
The Paradigm for Speech Recognition
• Word recognition model: the spoken output is recognized, that is,
the speech signal is decoded into a series of words that are
meaningful according to syntax, semantics, and pragmatics.
• Higher-level processor: the meaning of the recognized words
is obtained. The processor uses a dynamic knowledge
representation to modify the syntax, semantics and the
pragmatics according to the context of what it has previously
recognized.
• The feedback limits the search for valid input sentences from
the user.
• The system responds to the user in the form of a voice output.
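The loop described in the bullets above can be sketched as a toy program. This is only an illustrative sketch under stated assumptions, not the model from the slides: the grammar table VALID_SENTENCES, the class HigherLevelProcessor, the function recognize, and the per-word score dictionaries (standing in for the acoustic front end) are all hypothetical names invented for the example. A returned text string stands in for the voice output.

```python
from dataclasses import dataclass, field

# Hypothetical toy grammar: the sentences that are valid in each dialogue state.
VALID_SENTENCES = {
    "start": [("call", "home"), ("call", "office"), ("play", "music")],
    "confirm": [("yes",), ("no",)],
}


@dataclass
class HigherLevelProcessor:
    """Tracks context and reports which sentences are valid next."""
    state: str = "start"
    history: list = field(default_factory=list)

    def allowed(self):
        # Feedback to the recognizer: only these sentences are searched.
        return VALID_SENTENCES[self.state]

    def interpret(self, words):
        # Obtain a "meaning", update the context, and build the reply
        # (a text string standing in for the voice output).
        self.history.append(words)
        if self.state == "start":
            self.state = "confirm"
            return "Did you say: " + " ".join(words) + "?"
        self.state = "start"
        return "OK." if words == ("yes",) else "Cancelled."


def recognize(scores, allowed):
    """Return the allowed sentence with the highest total word score.

    `scores` is a stand-in for the decoded speech signal: one dict per
    word position, mapping candidate words to a score.
    """
    candidates = [s for s in allowed if len(s) == len(scores)]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda s: sum(pos.get(w, 0.0) for pos, w in zip(scores, s)),
    )


processor = HigherLevelProcessor()

# First utterance: searched against the "start" sentences.
words = recognize(
    [{"call": 0.6, "play": 0.4}, {"home": 0.7, "music": 0.3}],
    processor.allowed(),
)
print(processor.interpret(words))

# Second utterance: "call" scores higher, but the feedback from the
# processor restricts the search to "yes" / "no".
words = recognize([{"call": 0.6, "yes": 0.4}], processor.allowed())
print(processor.interpret(words))
```

In the second call the raw scores favour "call", yet the context allows only a confirmation, so the search is limited to valid input sentences, which is the role the feedback path plays in the paradigm.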
Go through a Brief History