Hi,
I'm working on a language-tutoring app that teaches students of English how to pronounce difficult words.
One of the things I'd like to do is ask the user to speak into the app and use the Speech framework to evaluate their speech.
It would be helpful if I could get the results of the speech recognition as phonemes (or syllables), instead of, or in addition to, the rendered text. That would let the app pinpoint pronunciation errors more accurately.
Is there a way to get this information out of the Speech framework? I would imagine that speech recognition starts by recognizing phonemes, which are eventually assembled into words, so I'd expect the information to be present somewhere in the pipeline.
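For context, here is roughly what I have today: a minimal sketch that transcribes a recorded audio file with SFSpeechRecognizer (the file URL is a placeholder). The finest granularity I can find in the result is the word-level SFTranscriptionSegment array, which has timing and confidence but nothing phoneme-level:

```swift
import Speech

// Placeholder: a recording the student just made.
let audioURL = URL(fileURLWithPath: "/path/to/recording.m4a")

SFSpeechRecognizer.requestAuthorization { status in
    guard status == .authorized else { return }

    guard let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US")),
          recognizer.isAvailable else { return }

    let request = SFSpeechURLRecognitionRequest(url: audioURL)

    recognizer.recognitionTask(with: request) { result, error in
        guard let result = result, result.isFinal else { return }

        // Word-level segments: substring, start time, duration, confidence.
        // This is the finest-grained output I can see; no phonemes here.
        for segment in result.bestTranscription.segments {
            print(segment.substring,
                  segment.timestamp,
                  segment.duration,
                  segment.confidence)
        }
    }
}
```

If there's a way to get below the word level from this API, or an alternative approach, I'd love to hear it.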
Thanks,
Frank