JSAPI 2.0

javax.speech.recognition
Interface FinalResult

All Superinterfaces:
Result
All Known Subinterfaces:
FinalRuleResult

public interface FinalResult
extends Result

Provides information about a Result that has been finalized - that is, recognition is complete.

A finalized Result is a Result that has received a RESULT_ACCEPTED or RESULT_REJECTED event that puts it in either the ACCEPTED or REJECTED state as indicated by the getResultState method. If any method of the FinalResult interface is called on a Result in the UNFINALIZED state, a ResultStateException is thrown.
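
The state check described above can be sketched as follows. The nested types are hypothetical stand-ins that mirror only the documented method names so that the example compiles without a JSAPI implementation; the constant values and the superclass of ResultStateException are assumptions made for this sketch, not values taken from the specification.

```java
public class FinalizedGuard {
    // Stand-in for javax.speech.recognition.Result. The constant values are
    // placeholders chosen for this sketch, not the values defined by JSAPI.
    interface Result {
        int UNFINALIZED = 0;
        int ACCEPTED = 1;
        int REJECTED = 2;
        int getResultState();
    }

    // Stand-in for FinalResult, reduced to one documented method.
    interface FinalResult extends Result {
        int getNumberAlternatives();
    }

    // Stand-in for ResultStateException; its superclass here is an assumption.
    static class ResultStateException extends RuntimeException {
        ResultStateException(String message) { super(message); }
    }

    /** True once the Result has reached the ACCEPTED or REJECTED state. */
    static boolean isFinalized(Result r) {
        int state = r.getResultState();
        return state == Result.ACCEPTED || state == Result.REJECTED;
    }

    /** Returns the alternative count, or -1 while the Result is unfinalized. */
    static int alternativesOrMinusOne(FinalResult r) {
        return isFinalized(r) ? r.getNumberAlternatives() : -1;
    }

    /** Hypothetical fake that throws, as documented, when unfinalized. */
    static FinalResult fake(final int state, final int alternatives) {
        return new FinalResult() {
            public int getResultState() { return state; }
            public int getNumberAlternatives() {
                if (state == Result.UNFINALIZED) {
                    throw new ResultStateException("Result is not finalized");
                }
                return alternatives;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(alternativesOrMinusOne(fake(Result.UNFINALIZED, 3))); // prints -1
        System.out.println(alternativesOrMinusOne(fake(Result.ACCEPTED, 3)));    // prints 3
    }
}
```

Checking getResultState first keeps the exception path for genuine programming errors rather than for ordinary control flow.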

This interface provides training/correction capabilities and access to audio data. It additionally provides access to alternative ResultTokens. All three capabilities are optional because they are not all relevant to all Results or all recognition environments.

Training / Correction

Because Recognizer Results are not always correct, applications need to consider the possibility that a recognition error might occur. When an application detects an error, the application should inform the Recognizer so that it can learn from the mistake and try to improve future performance. The tokenCorrection method lets an application pass feedback from user correction to the Recognizer.

Sometimes, but certainly not always, the correct Result is selected by a user from the Result's N-best alternatives. In other cases, a user may type the correct Result or the application may infer a correction from subsequent user input.

Recognizers must store considerable information to support training from Results. Applications need to be involved in the management of that information so that it is not stored unnecessarily. The isTrainingInfoAvailable method tests whether training information is available for a finalized Result. When an application/user has finished correction/training for a Result, it should call releaseTrainingInfo to free up system resources. Also, a Recognizer may choose at any time to free up training information. In both cases, the application is notified of the release with a TRAINING_INFO_RELEASED event to ResultListeners.

Audio Data

Audio data for a finalized Result is optionally provided by Recognizers. Audio data can be stored for future use by an application or user and in certain circumstances can be provided by one Recognizer to another.

Since storing audio requires substantial system resources, audio data requires special treatment. If an application wants to use audio data, it indicates this with the setResultAudioProvided method.

Not all Recognizers provide access to audio data. For those Recognizers, setResultAudioProvided has no effect, isAudioAvailable always returns false, and the getAudio methods always return null.

Recognizers that provide access to audio data cannot always provide audio for every Result. Applications should test audio availability for every FinalResult and should always test for null on the getAudio methods.
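
A minimal sketch of that defensive pattern follows. The nested interfaces are hypothetical stand-ins for AudioSegment and FinalResult that carry only the two documented methods used here, so the example runs without a JSAPI implementation.

```java
public class AudioAccess {
    // Stand-in for javax.speech.AudioSegment (contents omitted in this sketch).
    interface AudioSegment { }

    // Stand-in for FinalResult, reduced to the documented audio methods.
    interface FinalResult {
        boolean isAudioAvailable();
        AudioSegment getAudio();
    }

    /** Returns the utterance audio, or null when none can be obtained. */
    static AudioSegment audioOrNull(FinalResult r) {
        if (!r.isAudioAvailable()) {
            return null; // Recognizer has no audio for this Result
        }
        // Availability does not guarantee a non-null return, so callers
        // of this helper must still test the value they get back.
        return r.getAudio();
    }

    /** Hypothetical fake used only to exercise the helper. */
    static FinalResult fake(final boolean available, final AudioSegment audio) {
        return new FinalResult() {
            public boolean isAudioAvailable() { return available; }
            public AudioSegment getAudio() { return audio; }
        };
    }

    public static void main(String[] args) {
        AudioSegment segment = new AudioSegment() { };
        System.out.println(audioOrNull(fake(false, segment)) == null); // prints true
        System.out.println(audioOrNull(fake(true, segment)) == segment); // prints true
    }
}
```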

See Also:
Result, getResultState, ACCEPTED, REJECTED, UNFINALIZED, ResultEvent, RESULT_ACCEPTED, RESULT_REJECTED, TRAINING_INFO_RELEASED, ResultListener, FinalRuleResult, setResultAudioProvided

Field Summary
static int DONT_KNOW
          Constant indicating that the application does not know whether a change is because of MISRECOGNITION or USER_CHANGE.
static int MISRECOGNITION
          Constant indicating that the change is a correction of an error made by the Recognizer.
static int USER_CHANGE
          Constant indicating that the user has modified the text that was returned by the Recognizer to something different from what they actually said.
 
Fields inherited from interface Result
ACCEPTED, REJECTED, UNFINALIZED
 
Method Summary
 ResultToken[] getAlternativeTokens(int nBest)
          Gets the token sequence for the Nth-best alternative.
 AudioSegment getAudio()
          Gets the audio for the complete utterance of this Result.
 AudioSegment getAudio(ResultToken fromToken, ResultToken toToken)
          Gets the audio for a token or sequence of tokens.
 int getConfidenceLevel()
          Gets the confidence level for the best token sequence.
 int getConfidenceLevel(int nBest)
          Gets the confidence level for the Nth-best alternative.
 Grammar getGrammar(int nBest)
          Returns the Grammar matched by the Nth-best alternative.
 int getNumberAlternatives()
          Returns the number of alternatives for this Result.
 Object[] getTags(int nBest)
          Returns the list of tags matched by the token sequence for the Nth-best alternative.
 boolean isAudioAvailable()
          Tests whether audio data is available for this Result.
 boolean isTrainingInfoAvailable()
          Returns true if the Recognizer has training information available for this Result.
 void releaseAudio()
          Releases the audio for this Result.
 void releaseTrainingInfo()
          Releases the training information for this Result.
 void tokenCorrection(String[] correctTokens, ResultToken fromToken, ResultToken toToken, int correctionType)
          Informs the Recognizer of a correction to one or more tokens in a finalized Result so that the Recognizer can improve itself.
 
Methods inherited from interface Result
addResultListener, getBestToken, getBestTokens, getGrammar, getNumTokens, getResultState, getUnfinalizedTokens, removeResultListener
 

Field Detail

MISRECOGNITION

static final int MISRECOGNITION
Constant indicating that the change is a correction of an error made by the Recognizer.

See Also:
tokenCorrection, USER_CHANGE, DONT_KNOW, Constant Field Values

USER_CHANGE

static final int USER_CHANGE
Constant indicating that the user has modified the text that was returned by the Recognizer to something different from what they actually said.

See Also:
tokenCorrection, MISRECOGNITION, DONT_KNOW, Constant Field Values

DONT_KNOW

static final int DONT_KNOW
Constant indicating that the application does not know whether a change is because of MISRECOGNITION or USER_CHANGE.

See Also:
tokenCorrection, MISRECOGNITION, USER_CHANGE, Constant Field Values

Method Detail

getAlternativeTokens

ResultToken[] getAlternativeTokens(int nBest)
                                   throws ResultStateException,
                                          IllegalArgumentException
Gets the token sequence for the Nth-best alternative.

The range for nBest is 0 to (getNumberAlternatives()-1), inclusive, where 0 represents the best alternative.

If nBest == 0, this method returns a token sequence identical to the token sequence returned by the getBestTokens method.

If nBest == 1 (or 2, 3, ...), the method returns the token sequence for the 1st- (2nd-, 3rd-, ...) best alternative.

The number of tokens returned may vary for each nBest alternative.

If the Result is in the ACCEPTED state (not rejected), then the best ResultTokens and all the alternatives are accepted. If the Result is in the REJECTED state (not accepted), the Recognizer is not confident that the best ResultTokens or any of the alternatives are what the user said.
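
Walking the N-best list as described above might look like the sketch below. The nested interfaces are hypothetical stand-ins; in particular, the getText accessor on the ResultToken stand-in is an assumption of this example and is not described on this page.

```java
import java.util.ArrayList;
import java.util.List;

public class NBestList {
    // Stand-in for ResultToken; getText is assumed for illustration.
    interface ResultToken { String getText(); }

    // Stand-in for FinalResult, reduced to the two documented methods used.
    interface FinalResult {
        int getNumberAlternatives();
        ResultToken[] getAlternativeTokens(int nBest);
    }

    /** Joins the tokens of every alternative, best alternative first. */
    static List<String> alternativesAsText(FinalResult r) {
        List<String> out = new ArrayList<String>();
        int count = r.getNumberAlternatives(); // valid indexes are 0..count-1
        for (int n = 0; n < count; n++) {
            StringBuilder sb = new StringBuilder();
            // The number of tokens may differ from one alternative to the next.
            for (ResultToken token : r.getAlternativeTokens(n)) {
                if (sb.length() > 0) sb.append(' ');
                sb.append(token.getText());
            }
            out.add(sb.toString());
        }
        return out;
    }

    /** Hypothetical fake backed by a table of words. */
    static FinalResult fake(final String[][] alternatives) {
        return new FinalResult() {
            public int getNumberAlternatives() { return alternatives.length; }
            public ResultToken[] getAlternativeTokens(int nBest) {
                if (nBest < 0 || nBest >= alternatives.length) {
                    throw new IllegalArgumentException("nBest out of range");
                }
                String[] words = alternatives[nBest];
                ResultToken[] tokens = new ResultToken[words.length];
                for (int i = 0; i < words.length; i++) {
                    final String text = words[i];
                    tokens[i] = new ResultToken() {
                        public String getText() { return text; }
                    };
                }
                return tokens;
            }
        };
    }

    public static void main(String[] args) {
        String[][] table = { {"open", "file"}, {"open", "the", "file"} };
        System.out.println(alternativesAsText(fake(table)));
    }
}
```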

Parameters:
nBest - the Nth-best index
Returns:
ResultTokens for the nBest alternative
Throws:
ResultStateException - if called before a Result is finalized
IllegalArgumentException - if nBest is not in range
See Also:
getNumberAlternatives, ACCEPTED, REJECTED

getAudio

AudioSegment getAudio()
                      throws ResultStateException
Gets the audio for the complete utterance of this Result. Returns null if audio is not available or if it has been released.

Returns:
the complete utterance audio if available or null
Throws:
ResultStateException - if called before a Result is finalized
See Also:
isAudioAvailable, getAudio(ResultToken,ResultToken)

getAudio

AudioSegment getAudio(ResultToken fromToken,
                      ResultToken toToken)
                      throws IllegalArgumentException,
                             ResultStateException
Gets the audio for a token or sequence of tokens. Recognizers make a best effort at determining the start and end of tokens. However, it is not unusual for chunks of surrounding audio to be included or for the start or end token to be chopped.

Returns null if Result audio is not available, if it cannot be obtained for the specified sequence of tokens, or if it has been released.

If toToken is null or if fromToken and toToken are the same, the method returns audio for fromToken. If both fromToken and toToken are null, it returns the audio for the entire Result (same as getAudio()).

Not all Recognizers can provide per-token audio, even if they can provide audio for a complete Result.

Parameters:
fromToken - the beginning ResultToken for audio
toToken - the ending ResultToken for audio
Returns:
the specified audio if available or null
Throws:
IllegalArgumentException - if either ResultToken is not from this FinalResult and the same token sequence, or if toToken comes before fromToken
ResultStateException - if called before a Result is finalized
See Also:
isAudioAvailable, getAudio()

getConfidenceLevel

int getConfidenceLevel()
                       throws ResultStateException
Gets the confidence level for the best token sequence. This indicates the Recognizer's confidence in this token sequence.

This method is the same as getConfidenceLevel(0). See that method for details.

Returns:
the confidence level for the best token sequence.
Throws:
ResultStateException - if called before a Result is finalized
See Also:
getConfidenceLevel(int)

getConfidenceLevel

int getConfidenceLevel(int nBest)
                       throws IllegalArgumentException,
                              ResultStateException
Gets the confidence level for the Nth-best alternative. This indicates the Recognizer's confidence in this alternative.

For an ACCEPTED result, the value should be at or above the current value of RecognizerProperties.getConfidenceThreshold. Values lie in the range between MIN_CONFIDENCE and MAX_CONFIDENCE. A value of UNKNOWN_CONFIDENCE may be returned if the confidence level is unavailable or not supported.

For a REJECTED result, a useful confidence level may be returned, but this is application and platform dependent.

The range for nBest is 0 to (getNumberAlternatives()-1), inclusive, where 0 represents the best alternative.
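
As an illustration of the rules above, the sketch below picks the first alternative whose confidence is known and meets a threshold. The FinalResult stand-in is hypothetical, and the value given to UNKNOWN_CONFIDENCE is a placeholder for this example rather than the constant's real value.

```java
public class ConfidenceFilter {
    // Placeholder for the UNKNOWN_CONFIDENCE constant; the value is assumed.
    static final int UNKNOWN_CONFIDENCE = -1;

    // Stand-in for FinalResult, reduced to the two documented methods used.
    interface FinalResult {
        int getNumberAlternatives();
        int getConfidenceLevel(int nBest);
    }

    /**
     * Returns the index of the first alternative whose confidence is known
     * and at or above the threshold, or -1 if no alternative qualifies.
     */
    static int firstAtOrAbove(FinalResult r, int threshold) {
        int count = r.getNumberAlternatives();
        for (int n = 0; n < count; n++) {
            int level = r.getConfidenceLevel(n);
            if (level != UNKNOWN_CONFIDENCE && level >= threshold) {
                return n;
            }
        }
        return -1;
    }

    /** Hypothetical fake: one confidence level per alternative. */
    static FinalResult fake(final int... levels) {
        return new FinalResult() {
            public int getNumberAlternatives() { return levels.length; }
            public int getConfidenceLevel(int nBest) {
                if (nBest < 0 || nBest >= levels.length) {
                    throw new IllegalArgumentException("nBest out of range");
                }
                return levels[nBest];
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(firstAtOrAbove(fake(3, 7, 9), 5));               // prints 1
        System.out.println(firstAtOrAbove(fake(UNKNOWN_CONFIDENCE, 2), 5)); // prints -1
    }
}
```

Treating an unknown confidence as "does not qualify" is a design choice of this sketch; an application could equally decide to accept such alternatives.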

Parameters:
nBest - the Nth-best index
Returns:
the confidence level for this alternative.
Throws:
IllegalArgumentException - if nBest is not in range
ResultStateException - if called before a Result is finalized
See Also:
getConfidenceLevel(), RecognizerProperties, setConfidenceThreshold, MIN_CONFIDENCE, MAX_CONFIDENCE, UNKNOWN_CONFIDENCE, getResultState, getNumberAlternatives, ACCEPTED, REJECTED

getGrammar

Grammar getGrammar(int nBest)
                   throws ResultStateException,
                          IllegalArgumentException
Returns the Grammar matched by the Nth-best alternative.

The range for nBest is 0 to (getNumberAlternatives()-1), inclusive, where 0 represents the best alternative.

Note that for a finalized Result, the following holds true:

 getGrammar(0) == getGrammar()
 

Parameters:
nBest - the Nth-Best index
Returns:
the Grammar matched by the nBest alternative
Throws:
ResultStateException - if called before a Result is finalized
IllegalArgumentException - if nBest is not in range
See Also:
getNumberAlternatives, getAlternativeTokens, getGrammar()

getNumberAlternatives

int getNumberAlternatives()
                          throws ResultStateException
Returns the number of alternatives for this Result. The alternatives are numbered from 0 up. Alternative 0 is the best alternative.

If only the best alternative is available (no other alternatives), the return value is 1. If the Result was REJECTED, the return value may be 0 if no tokens are available. If the best alternative and additional alternatives are available, the return value is greater than 1.

Returns:
the number of alternatives for this Result.
Throws:
ResultStateException - if called before a Result is finalized
See Also:
getBestTokens, REJECTED, getAlternativeTokens

getTags

Object[] getTags(int nBest)
                 throws ResultStateException,
                        IllegalArgumentException,
                        IllegalStateException
Returns the list of tags matched by the token sequence for the Nth-best alternative. Returns the empty list if no tags correspond to this alternative.

The range for nBest is 0 to (getNumberAlternatives()-1), inclusive, where 0 represents the best alternative.

A REJECTED Result is not guaranteed to return tags and an empty list may be returned instead.

This method uses Grammars committed with the last call to resume before Result finalization and throws an IllegalStateException if Grammar changes are subsequently committed by resume.
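
One way an application might cope with these cases is sketched below, assuming the Result is already finalized. The FinalResult stand-in is hypothetical and exposes only the two documented methods the helper needs; falling back to an empty array is a choice made for this example.

```java
public class TagLookup {
    // Stand-in for FinalResult, reduced to the two documented methods used.
    interface FinalResult {
        int getNumberAlternatives();
        Object[] getTags(int nBest);
    }

    /**
     * Returns the tags for the best alternative, or an empty array when
     * there is no alternative or the tags can no longer be computed.
     */
    static Object[] bestTagsOrEmpty(FinalResult r) {
        if (r.getNumberAlternatives() == 0) {
            return new Object[0]; // e.g. a REJECTED Result without tokens
        }
        try {
            return r.getTags(0);
        } catch (IllegalStateException e) {
            // Grammar changes were committed after finalization.
            return new Object[0];
        }
    }

    /** Hypothetical fake; "stale" simulates later Grammar changes. */
    static FinalResult fake(final Object[] tags, final boolean stale) {
        return new FinalResult() {
            public int getNumberAlternatives() { return tags == null ? 0 : 1; }
            public Object[] getTags(int nBest) {
                if (stale) throw new IllegalStateException("Grammars changed");
                return tags;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(bestTagsOrEmpty(fake(new Object[] {"OPEN"}, false)).length); // prints 1
        System.out.println(bestTagsOrEmpty(fake(new Object[] {"OPEN"}, true)).length);  // prints 0
    }
}
```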

Parameters:
nBest - the Nth-Best index
Returns:
tags matched by the token sequence for the nBest alternative
Throws:
ResultStateException - if called before this Result is finalized
IllegalArgumentException - if nBest is not in range
IllegalStateException - if Grammar changes are subsequently committed by resume
See Also:
getNumberAlternatives, getBestTokens, REJECTED, resume

isAudioAvailable

boolean isAudioAvailable()
                         throws ResultStateException
Tests whether audio data is available for this Result. Audio is available only if the Recognizer can provide audio for this Result, audio was requested with setResultAudioProvided, and the audio has not been released.

The availability of audio for a Result does not mean that all getAudio calls will return audio. For example, some Recognizers might provide audio data only for the entire Result, only for individual tokens, or not for sequences of more than one token.

Returns:
true if audio data is available for this Result
Throws:
ResultStateException - if called before a Result is finalized
See Also:
getAudio(), getAudio(ResultToken,ResultToken), RecognizerProperties, setResultAudioProvided

isTrainingInfoAvailable

boolean isTrainingInfoAvailable()
                                throws ResultStateException
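Returns true if the Recognizer has training information available for this Result. Training information is available only if the Recognizer provides training (see isTrainingProvided in RecognizerProperties) and the information has not been released, either by a call to releaseTrainingInfo or by the Recognizer itself.

Calls to tokenCorrection have no effect if the training information is not available.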

Returns:
true if training information is available for this Result
Throws:
ResultStateException - if called before a Result is finalized
See Also:
RecognizerProperties, isTrainingProvided, releaseTrainingInfo, tokenCorrection, TRAINING_INFO_RELEASED

releaseAudio

void releaseAudio()
                  throws ResultStateException
Releases the audio for this Result. After audio is released, isAudioAvailable will return false. This call is ignored if the audio is not available or has already been released.

This method is asynchronous - audio data is not necessarily released immediately. An AUDIO_RELEASED event is issued to the ResultListener when the audio is released by a call to this method. An AUDIO_RELEASED event is also issued if the Recognizer releases the audio for some other reason (for example, to reclaim memory).
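
Because the release is asynchronous, an application that needs to know when it has completed can wait for the event. The sketch below uses hypothetical stand-ins: the listener receives a plain int event id instead of a ResultEvent, and the value of AUDIO_RELEASED is a placeholder. Both are simplifications made for this example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class AudioReleaseWait {
    // Placeholder for the AUDIO_RELEASED event id; the value is assumed.
    static final int AUDIO_RELEASED = 1;

    // Simplified stand-in for ResultListener (takes an id, not a ResultEvent).
    interface ResultListener { void resultUpdate(int eventId); }

    // Stand-in for FinalResult, reduced to the methods used here.
    interface FinalResult {
        void addResultListener(ResultListener listener);
        boolean isAudioAvailable();
        void releaseAudio();
    }

    /** Requests release and waits for the event; false on timeout or interrupt. */
    static boolean releaseAndWait(FinalResult r, long timeoutMillis) {
        if (!r.isAudioAvailable()) return true; // nothing to release
        final CountDownLatch released = new CountDownLatch(1);
        // Register before calling releaseAudio so the event cannot be missed.
        r.addResultListener(new ResultListener() {
            public void resultUpdate(int eventId) {
                if (eventId == AUDIO_RELEASED) released.countDown();
            }
        });
        r.releaseAudio();
        try {
            return released.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Hypothetical fake that releases its audio on a background thread. */
    static FinalResult fake() {
        return new FinalResult() {
            private volatile ResultListener listener;
            private volatile boolean audio = true;
            public void addResultListener(ResultListener l) { listener = l; }
            public boolean isAudioAvailable() { return audio; }
            public void releaseAudio() {
                new Thread(new Runnable() {
                    public void run() {
                        audio = false;
                        ResultListener l = listener;
                        if (l != null) l.resultUpdate(AUDIO_RELEASED);
                    }
                }).start();
            }
        };
    }

    public static void main(String[] args) {
        FinalResult r = fake();
        System.out.println(releaseAndWait(r, 2000)); // prints true
    }
}
```

If the Recognizer releases the audio on its own between the availability check and the listener registration, the wait in this sketch simply times out, so the timeout should be treated as "unknown" rather than as an error.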

Throws:
ResultStateException - if called before a Result is finalized
See Also:
isAudioAvailable, AUDIO_RELEASED, ResultListener

releaseTrainingInfo

void releaseTrainingInfo()
                         throws ResultStateException
Releases the training information for this Result. The release frees memory used for the training information - this information can be substantial.

After training information is released, isTrainingInfoAvailable will return false. It is not an error to call the method when training information is not available or has already been released.

This method is asynchronous - the training info is not necessarily released when the call returns. A TRAINING_INFO_RELEASED event is issued to the ResultListener once the information is released. The TRAINING_INFO_RELEASED event is also issued if the Recognizer releases the training information for any other reason (for example, to reclaim memory).

Throws:
ResultStateException - if called before a Result is finalized
See Also:
isTrainingInfoAvailable, TRAINING_INFO_RELEASED, ResultListener

tokenCorrection

void tokenCorrection(String[] correctTokens,
                     ResultToken fromToken,
                     ResultToken toToken,
                     int correctionType)
                     throws ResultStateException,
                            IllegalArgumentException,
                            SecurityException
Informs the Recognizer of a correction to one or more tokens in a finalized Result so that the Recognizer can improve itself. Training the Recognizer from its mistakes allows it to improve its performance and accuracy in future recognition.

Improvements from correction apply at least to the current Recognizer instance and may persist beyond it. If the Recognizer uses a SpeakerManager, improvements may only apply to the current SpeakerProfile.

The fromToken and toToken parameters indicate the inclusive sequence of tokens that are being trained or corrected. If toToken is null or if fromToken and toToken are the same, the training applies to a single recognized token.

The correctTokens sequence need not have the same length as the token sequence being corrected. Setting correctTokens to null indicates the deletion of tokens.

The correctionType parameter must be one of MISRECOGNITION, USER_CHANGE, or DONT_KNOW.

Note that tokenCorrection does not change the Result object. Future calls to the getBestToken, getBestTokens and getAlternativeTokens methods return exactly the same values as before this call.
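
The correction flow can be sketched for the common case where the user picks one of the N-best alternatives. Everything below is a hypothetical stand-in that mirrors the documented signatures: the getText accessor on ResultToken and the value of MISRECOGNITION are assumptions of this example, and releasing the training information immediately afterwards is one possible policy rather than a requirement.

```java
public class CorrectionFromAlternative {
    // Placeholder for FinalResult.MISRECOGNITION; the value is assumed.
    static final int MISRECOGNITION = 1;

    // Stand-in for ResultToken; getText is assumed for illustration.
    interface ResultToken { String getText(); }

    // Stand-in for FinalResult, reduced to the documented methods used here.
    interface FinalResult {
        boolean isTrainingInfoAvailable();
        ResultToken[] getAlternativeTokens(int nBest);
        void tokenCorrection(String[] correctTokens, ResultToken fromToken,
                             ResultToken toToken, int correctionType);
        void releaseTrainingInfo();
    }

    /** Replaces the whole best sequence with the chosen alternative's text. */
    static boolean correctWithAlternative(FinalResult r, int chosen) {
        if (chosen <= 0 || !r.isTrainingInfoAvailable()) return false;
        ResultToken[] best = r.getAlternativeTokens(0);
        ResultToken[] picked = r.getAlternativeTokens(chosen);
        if (best.length == 0) return false;
        String[] correct = new String[picked.length];
        for (int i = 0; i < picked.length; i++) correct[i] = picked[i].getText();
        // fromToken and toToken bound the corrected span, inclusive.
        r.tokenCorrection(correct, best[0], best[best.length - 1], MISRECOGNITION);
        r.releaseTrainingInfo(); // correction finished for this Result
        return true;
    }

    /** Hypothetical fake that records what it was asked to do. */
    static class RecordingResult implements FinalResult {
        final String[][] alternatives;
        String[] lastCorrection;
        boolean released;
        RecordingResult(String[][] alternatives) { this.alternatives = alternatives; }
        public boolean isTrainingInfoAvailable() { return !released; }
        public ResultToken[] getAlternativeTokens(int nBest) {
            String[] words = alternatives[nBest];
            ResultToken[] tokens = new ResultToken[words.length];
            for (int i = 0; i < words.length; i++) {
                final String text = words[i];
                tokens[i] = new ResultToken() {
                    public String getText() { return text; }
                };
            }
            return tokens;
        }
        public void tokenCorrection(String[] correctTokens, ResultToken fromToken,
                                    ResultToken toToken, int correctionType) {
            lastCorrection = correctTokens;
        }
        public void releaseTrainingInfo() { released = true; }
    }

    public static void main(String[] args) {
        RecordingResult r = new RecordingResult(new String[][] {
            {"wreck", "a", "nice", "beach"}, {"recognize", "speech"} });
        System.out.println(correctWithAlternative(r, 1)); // prints true
    }
}
```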

Parameters:
correctTokens - replacement sequence for fromToken to toToken
fromToken - first token in the sequence being corrected
toToken - last token in the sequence being corrected
correctionType - type of correction
Throws:
IllegalArgumentException - if either token is not from this FinalResult or if toToken comes before fromToken
ResultStateException - if called before a Result is finalized
SecurityException - if token correction is not allowed for the application
See Also:
MISRECOGNITION, USER_CHANGE, DONT_KNOW, getBestTokens, SpeakerManager, SpeakerProfile
Required permission:
javax.speech.recognition.FinalResult.tokenCorrection

Java™ Speech API 2.0, Final Release v2.0.6.
© 2008, Conversay and Sun Microsystems.
