useSpeechRecognition
Hook to use the SpeechRecognition API.
See: Web Speech API.
Demo
INFO
This demo uses the useSpeechRecognition hook to build a speech-driven color-change app. Click the Start button, then say an HTML color keyword: the bordered div's background changes to that color.
```tsx
import { useCallback, useMemo, useRef, useState } from "react";
import { SpeechGrammarList, SpeechRecognitionErrorEvent, usePerformAction, useSpeechRecognition } from "../../../..";

const SpeechGrammarListP = ((window as any).SpeechGrammarList || (window as any).webkitSpeechGrammarList);

export const UseSpeechRecognition = () => {
    const colors = useRef(['aqua', 'azure', 'beige', 'bisque', 'black', 'blue', 'brown', 'chocolate', 'coral', 'crimson', 'cyan', 'fuchsia', 'ghostwhite', 'gold', 'goldenrod', 'gray', 'green', 'indigo', 'ivory', 'khaki', 'lavender', 'lime', 'linen', 'magenta', 'maroon', 'moccasin', 'navy', 'olive', 'orange', 'orchid', 'peru', 'pink', 'plum', 'purple', 'red', 'salmon', 'sienna', 'silver', 'snow', 'tan', 'teal', 'thistle', 'tomato', 'turquoise', 'violet', 'white', 'yellow', 'transparent']);
    const grammar = `#JSGF V1.0; grammar colors; public <color> = ${colors.current.join(' | ')} ;`;
    const btnRef = useRef<HTMLButtonElement>(null);
    const perform = usePerformAction(() => btnRef.current?.focus());
    const [message, setMessage] = useState("Ready");

    const [state, start] = useSpeechRecognition({
        onStart: useCallback(() => {
            setMessage("Listening...");
        }, []),
        onEnd: useCallback((ev, { stop }) => {
            stop();
            setMessage("Finished");
            perform();
        }, [perform]),
        onNoMatch: useCallback(() => {
            setMessage("Color not recognized.");
        }, []),
        onError: useCallback((ev: SpeechRecognitionErrorEvent) => {
            setMessage(`Error occurred in recognition: ${ev.message ? ev.message : ev.error}`);
        }, []),
    });

    const onStart = () => {
        const grammars = new SpeechGrammarListP() as SpeechGrammarList;
        grammars.addFromString(grammar, 1);
        start({
            lang: "en-US",
            continuous: false,
            interimResults: false,
            maxAlternatives: 1,
            grammars
        });
    };

    // Derive the recognized color from the latest result; default to transparent.
    const color = useMemo(() => {
        let matched = "transparent";
        if (state.result.results) {
            const transcript = state.result.results[0][0].transcript;
            if (colors.current.includes(transcript)) {
                matched = transcript;
            }
        }
        return matched;
    }, [state.result.results]);

    return <div style={{ display: "flex", flexDirection: "column", justifyContent: "center", gap: 10 }}>
        {
            state.isSupported
                ? <>
                    <div style={{ display: 'flex', flexDirection: 'column' }}>
                        <p>Click Start, then say a color to change the background of the bordered div. Try:</p>
                        <div style={{ display: 'flex', flexWrap: "wrap", gap: 10 }}>
                            {
                                colors.current.map(el => <span key={el} style={{ color: el }}>{el}</span>)
                            }
                        </div>
                    </div>
                    <p>{message}</p>
                    <div style={{ border: "1px solid lightgray", width: 300, height: 150, backgroundColor: color, margin: '0 auto' }}>
                        {
                            state.result.results && <p>Color is: {color}</p>
                        }
                    </div>
                    <div style={{ display: 'flex', justifyContent: "center", gap: 10 }}>
                        <button ref={btnRef} onClick={onStart} disabled={state.isListening}>Start</button>
                    </div>
                </>
                : <p>Speech Recognition not supported</p>
        }
    </div>;
};
```

Types
SpeechRecognitionControls
Imperative controls for the speech recognition instance, injected as the second argument of every event callback in {@link UseSpeechRecognitionProps}. Using these controls inside a callback is always safe — they delegate to the latest internal implementation via stable refs, so no circular dependency or stale closure issues arise.
| Property | Type | Required | Description |
|---|---|---|---|
| start | (config?: SpeechRecognitionConfig) => void | ✓ | Starts a new recognition session, optionally overriding the defaultConfig passed to the hook. |
| stop | () => void | ✓ | Stops the current recognition session and attempts to return results for audio captured so far. |
| reset | (resultAlso?: boolean) => void | ✓ | Aborts the current recognition session. Pass true to also clear the last result from state. |
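A minimal sketch of how a callback can use the injected controls, for example to stop the session once a result has been handled. The `Controls` type and `handleResult` helper below are hypothetical stand-ins, not part of the hook's API; in real usage the controls object arrives as the second callback argument.

```typescript
// Hypothetical local mirror of SpeechRecognitionControls, for illustration only.
type Controls = {
  start: (config?: unknown) => void;
  stop: () => void;
  reset: (resultAlso?: boolean) => void;
};

// A callback body that consumes the controls it receives: stop the session,
// then report what was heard. Calling stop() here is safe because the
// controls delegate to the latest implementation via stable refs.
function handleResult(transcript: string, controls: Controls): string {
  controls.stop();
  return `Recognized: ${transcript}`;
}

// Exercise the callback outside a browser with a mock controls object.
let stopped = false;
const mock: Controls = {
  start: () => {},
  stop: () => { stopped = true; },
  reset: () => {},
};
console.log(handleResult("blue", mock), stopped); // → Recognized: blue true
```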
UseSpeechRecognitionProps
Configuration object accepted by useSpeechRecognition.
All callback properties mirror the corresponding event handlers on the native SpeechRecognition interface.
| Property | Type | Required | Description |
|---|---|---|---|
| alreadyStarted | boolean | | When true, recognition starts automatically on mount (equivalent to calling start() immediately). Useful for "always listening" scenarios. |
| defaultConfig | SpeechRecognitionConfig | | Default configuration applied to the SpeechRecognition instance. See {@link SpeechRecognitionConfig} for the available options. |
| onAudioStart | (this: SpeechRecognition, ev: Event, controls: SpeechRecognitionControls) => void | | Fired when the user agent has started capturing audio. |
| onAudioEnd | (this: SpeechRecognition, ev: Event, controls: SpeechRecognitionControls) => void | | Fired when the user agent has finished capturing audio. |
| onEnd | (this: SpeechRecognition, ev: Event, controls: SpeechRecognitionControls) => void | | Fired when the speech recognition service disconnects. |
| onError | (this: SpeechRecognition, ev: SpeechRecognitionErrorEvent, controls: SpeechRecognitionControls) => void | | Fired when a speech recognition error occurs. |
| onNoMatch | (this: SpeechRecognition, ev: SpeechRecognitionEvent, controls: SpeechRecognitionControls) => void | | Fired when no significant recognition was returned. |
| onResult | (this: SpeechRecognition, ev: SpeechRecognitionEvent, controls: SpeechRecognitionControls) => void | | Fired when a word or phrase has been positively recognised. |
| onSoundStart | (this: SpeechRecognition, ev: Event, controls: SpeechRecognitionControls) => void | | Fired when any sound (recognisable or not) has been detected. |
| onSoundEnd | (this: SpeechRecognition, ev: Event, controls: SpeechRecognitionControls) => void | | Fired when any sound has stopped being detected. |
| onSpeechStart | (this: SpeechRecognition, ev: Event, controls: SpeechRecognitionControls) => void | | Fired when speech recognised by the service has been detected. |
| onSpeechEnd | (this: SpeechRecognition, ev: Event, controls: SpeechRecognitionControls) => void | | Fired when speech recognised by the service has stopped being detected. |
| onStart | (this: SpeechRecognition, ev: Event, controls: SpeechRecognitionControls) => void | | Fired when the speech recognition service has begun listening. |
SpeechRecognitionConfig
Initial configuration options for the SpeechRecognition instance.
| Property | Type | Required | Description |
|---|---|---|---|
| grammars | SpeechGrammarList | | Collection of SpeechGrammar objects that represent the grammars that will be understood by the current SpeechRecognition. |
| lang | LanguageBCP47Tags | | BCP 47 language tag for recognition (e.g. "en-US", "it-IT"). Falls back to the <html lang> attribute or the user agent's language setting. |
| continuous | boolean | | When true, continuous results are returned for each recognition phrase. When false (default), recognition stops after the first result. |
| interimResults | boolean | | When true, interim (non-final) results are also returned. When false, only final results are delivered. |
| maxAlternatives | number | | Maximum number of SpeechRecognitionAlternative objects per result. |
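The `grammars` option is typically built from a JSGF string, as the demo does. The sketch below isolates that step; `buildColorGrammar` is a hypothetical helper, not part of the hook's API. The browser-only portion (constructing a SpeechGrammarList) is shown in comments because the constructor is vendor-prefixed and unavailable outside a browser.

```typescript
// Build a JSGF grammar string restricting recognition to a word list,
// matching the format used by the demo component.
function buildColorGrammar(words: string[]): string {
  return `#JSGF V1.0; grammar colors; public <color> = ${words.join(" | ")} ;`;
}

const grammar = buildColorGrammar(["red", "green", "blue"]);
console.log(grammar);
// → #JSGF V1.0; grammar colors; public <color> = red | green | blue ;

// In the browser you would then pass it through SpeechRecognitionConfig:
// const List = (window as any).SpeechGrammarList || (window as any).webkitSpeechGrammarList;
// const grammars = new List();
// grammars.addFromString(grammar, 1);
// start({ lang: "en-US", maxAlternatives: 1, grammars });
```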
SpeechRecognitionErrorCode
```ts
export type SpeechRecognitionErrorCode =
  | 'aborted'
  | 'audio-capture'
  | 'bad-grammar'
  | 'language-not-supported'
  | 'network'
  | 'no-speech'
  | 'not-allowed'
  | 'service-not-allowed';
```
SpeechRecognitionErrorEvent
| Property | Type | Required | Description |
|---|---|---|---|
| error | SpeechRecognitionErrorCode | ✓ | |
| message | string | | |
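Since `message` is optional, error handlers usually fall back to the `error` code, as the demo's onError callback does. A small sketch (the `ErrorEventLike` shape and `describeError` helper are illustrative stand-ins for the real event object):

```typescript
type SpeechRecognitionErrorCode =
  | "aborted" | "audio-capture" | "bad-grammar" | "language-not-supported"
  | "network" | "no-speech" | "not-allowed" | "service-not-allowed";

// Simplified stand-in for SpeechRecognitionErrorEvent.
interface ErrorEventLike {
  error: SpeechRecognitionErrorCode;
  message?: string;
}

// Prefer the human-readable message when present, otherwise the error code.
function describeError(ev: ErrorEventLike): string {
  return `Error occurred in recognition: ${ev.message ? ev.message : ev.error}`;
}

console.log(describeError({ error: "no-speech" }));
// → Error occurred in recognition: no-speech
```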
SpeechRecognitionEvent
| Property | Type | Required | Description |
|---|---|---|---|
| resultIndex | number | ✓ | Returns the lowest index value result in the SpeechRecognitionResultList "array" that has actually changed. |
| results | SpeechRecognitionResultList | ✓ | Returns a SpeechRecognitionResultList object representing all the speech recognition results for the current session. |
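Reading a transcript means indexing into `results` twice: first by result, then by alternative. The shapes below are simplified stand-ins for SpeechRecognitionResultList and SpeechRecognitionAlternative; in the browser you index the real objects the same way, e.g. `results[resultIndex][0].transcript` for the best alternative.

```typescript
// Illustrative stand-in for SpeechRecognitionAlternative.
interface AlternativeLike { transcript: string; confidence: number; }
// Illustrative stand-in for SpeechRecognitionResultList: a list of results,
// each holding up to maxAlternatives alternatives, best first.
type ResultListLike = AlternativeLike[][];

// Pick the top-ranked alternative of the result at resultIndex.
function bestTranscript(results: ResultListLike, resultIndex: number): string {
  return results[resultIndex][0].transcript;
}

const results: ResultListLike = [[{ transcript: "blue", confidence: 0.92 }]];
console.log(bestTranscript(results, 0)); // → blue
```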
SpeechGrammar
The SpeechGrammar interface of the Web Speech API represents a set of words or patterns of words that we want the recognition service to recognize.
| Property | Type | Required | Description |
|---|---|---|---|
| src | string | ✓ | Sets and returns a string containing the grammar from within the SpeechGrammar object instance. |
| weight | number | | Sets and returns the weight of the SpeechGrammar object. |
SpeechGrammarList
The SpeechGrammarList interface of the Web Speech API represents a list of SpeechGrammar objects containing words or patterns of words that we want the recognition service to recognize.
| Property | Type | Required | Description |
|---|---|---|---|
| length | number | ✓ | Returns the number of SpeechGrammar objects contained in the SpeechGrammarList. |
| item | (index: number) => SpeechGrammar | ✓ | Standard getter: allows individual SpeechGrammar objects to be retrieved from the SpeechGrammarList using array syntax. |
| addFromURI | (src: string, weight?: number) => undefined | ✓ | Takes a grammar present at a specific URI and adds it to the SpeechGrammarList as a new SpeechGrammar object. |
| addFromString | (src: string, weight?: number) => undefined | ✓ | Adds a grammar in a string to the SpeechGrammarList as a new SpeechGrammar object. |
SpeechRecognitionState
Reactive state snapshot returned by useSpeechRecognition.
| Property | Type | Required | Description |
|---|---|---|---|
| isSupported | boolean | ✓ | true when the Web Speech API (SpeechRecognition or webkitSpeechRecognition) is available in the current browser. |
| isListening | boolean | ✓ | true while the speech recognition service is actively listening for speech. Becomes false after stop() is called or recognition ends automatically. |
| result | { results: SpeechRecognitionEvent["results"] \| null; resultIndex: SpeechRecognitionEvent["resultIndex"] \| null; } | ✓ | The most recent recognition result: results is a SpeechRecognitionResultList containing all result alternatives, and resultIndex is the index of the most recent result in the list. Both are null before the first result arrives. |
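Components typically branch on these three fields, as the demo does with its support check, Start button, and status message. A pure sketch of that branching (the `StateLike` shape and `statusLabel` helper are illustrative, mirroring the documented SpeechRecognitionState):

```typescript
// Illustrative mirror of the documented SpeechRecognitionState shape.
interface StateLike {
  isSupported: boolean;
  isListening: boolean;
  result: { results: unknown | null; resultIndex: number | null };
}

// Derive a UI status label from the reactive snapshot.
function statusLabel(state: StateLike): string {
  if (!state.isSupported) return "Speech Recognition not supported";
  if (state.isListening) return "Listening...";
  return state.result.results === null ? "Ready" : "Finished";
}

console.log(statusLabel({
  isSupported: true,
  isListening: false,
  result: { results: null, resultIndex: null },
})); // → Ready
```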
SpeechRecognition
The SpeechRecognition Web API interface.
| Property | Type | Required | Description |
|---|---|---|---|
| grammars | SpeechGrammarList | ✓ | Grammar list used by this recognition instance. |
| lang | LanguageBCP47Tags | ✓ | BCP 47 language tag. Falls back to the document's lang attribute or the user agent's default language. |
| continuous | boolean | ✓ | When true, recognition continues returning results until stop() is called. |
| interimResults | boolean | ✓ | When true, interim (non-final) results are delivered via onresult. |
| maxAlternatives | number | ✓ | Maximum number of recognition alternatives per result. |
| onaudiostart | ((this: SpeechRecognition, ev: Event) => void) \| null | ✓ | Called when audio capture starts. |
| onaudioend | ((this: SpeechRecognition, ev: Event) => void) \| null | ✓ | Called when audio capture ends. |
| onend | ((this: SpeechRecognition, ev: Event) => void) \| null | ✓ | Called when the service disconnects. |
| onerror | ((this: SpeechRecognition, ev: SpeechRecognitionErrorEvent) => void) \| null | ✓ | Called when a recognition error occurs. |
| onnomatch | ((this: SpeechRecognition, ev: SpeechRecognitionEvent) => void) \| null | ✓ | Called when no significant recognition was found. |
| onresult | ((this: SpeechRecognition, ev: SpeechRecognitionEvent) => void) \| null | ✓ | Called when a recognition result is returned. |
| onsoundstart | ((this: SpeechRecognition, ev: Event) => void) \| null | ✓ | Called when any detectable sound starts. |
| onsoundend | ((this: SpeechRecognition, ev: Event) => void) \| null | ✓ | Called when detectable sound stops. |
| onspeechstart | ((this: SpeechRecognition, ev: Event) => void) \| null | ✓ | Called when recognised speech starts. |
| onspeechend | ((this: SpeechRecognition, ev: Event) => void) \| null | ✓ | Called when recognised speech ends. |
| onstart | ((this: SpeechRecognition, ev: Event) => void) \| null | ✓ | Called when the recognition service starts listening. |
| start | () => void | ✓ | Starts listening for speech. |
| stop | () => void | ✓ | Stops listening; returns results for speech recognised so far. |
| abort | () => void | ✓ | Stops listening without returning results. |
