Speech on WebGL

Post by jcpalmer » Thu Sep 07, 2017 8:16 pm

I just completed a lip-sync capability backed by a pronunciation database (Carnegie Mellon University / DOD Arpabet). I have shown this scene before, but the "Talk" had not been implemented yet.

This is also a cross-post of this.
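
For readers new to the idea, a database-driven lip-sync can be sketched roughly as below. This is only an illustration and not the code behind the scene: the viseme names, the phoneme grouping, the two-entry dictionary, and the function name wordToVisemes are all assumptions. Only the phoneme spellings (with a trailing stress digit on vowels) follow the Arpabet convention used by the CMU dictionary.

```typescript
// Hypothetical mouth-shape (viseme) names; the real scene's shapes are not described here.
type Viseme = "closed" | "open" | "wide" | "round" | "teeth-lip" | "tongue" | "rest";

// Illustrative and incomplete grouping of Arpabet phonemes into visemes.
const PHONEME_TO_VISEME: Record<string, Viseme> = {
  P: "closed", B: "closed", M: "closed",
  F: "teeth-lip", V: "teeth-lip",
  AA: "open", AE: "open", AH: "open",
  IY: "wide", IH: "wide", EH: "wide", EY: "wide",
  AO: "round", OW: "round", UH: "round", UW: "round", W: "round",
  D: "tongue", DH: "tongue", L: "tongue", N: "tongue", T: "tongue", TH: "tongue",
};

// Tiny stand-in for the pronunciation database. Vowels carry a stress digit (0, 1 or 2).
const DICTIONARY: Record<string, string[]> = {
  HELLO: ["HH", "AH0", "L", "OW1"],
  WORLD: ["W", "ER1", "L", "D"],
};

// Look a word up, strip the stress digits, and map each phoneme to a viseme.
// Unknown words give an empty list; unmapped phonemes fall back to "rest".
function wordToVisemes(word: string): Viseme[] {
  const phonemes = DICTIONARY[word.toUpperCase()] ?? [];
  return phonemes.map((p) => {
    const base = p.replace(/[0-9]$/, "");
    return PHONEME_TO_VISEME[base] ?? "rest";
  });
}
```

A real system would also need a duration for each viseme, which this sketch leaves out.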
jcpalmer
 
Posts: 115
Joined: Tue Dec 16, 2014 4:14 pm

Re: Speech on WebGL

Post by alex_farlie » Thu Sep 14, 2017 8:47 pm

Interesting. Does the audio for this have to be pre-recorded? The reason I ask is the development of a Text to Speech system for Wikimedia (https://www.mediawiki.org/wiki/Wikispeech), and the existence of other Text to Speech systems like espeak-ng (https://github.com/espeak-ng), which could, within reason and bandwidth, be of use to 'avatar'-like concepts such as the one presented.

Not that it is within the scope of your project (or that of MakeHuman), but I was wondering about "free" storytelling avatars that read classic, now public-domain novels, with the model suitably 'dressed up' as a character from the novel. A WebGL remake of the speech boxes you got in old-school RPGs and adventure games :lol:

My first thought on seeing that model was that someone could apply the concept to write "story-teller" applets for phones or tablets that support WebGL. Is there an open standard yet for encoding a sequence of animated facial expressions for virtual actors (i.e. adding a time domain)? If not, I am not too worried, but being able to give a "free" applet a script and have it do its thing would be awesome :).

(Aside: As the model shown is fed by audio, can it sing Daisy? ;) )
alex_farlie
 
Posts: 38
Joined: Thu Sep 07, 2017 9:44 pm

Re: Speech on WebGL

Post by jcpalmer » Sat Sep 16, 2017 5:10 pm

No. The tool I made to generate animation sequences is sort of running that way as I speak into the microphone. A quick search found that there is a Web Speech API. If there is a W3C standard, I would use it over any other work.

Checking caniuse, there seems to be wide support for speech synthesis. The trick is to see if there is a way to control the speech in a predictable way that I can match. I am not going to do this at this time, though.

Re: Speech on WebGL

Post by jcpalmer » Mon Sep 18, 2017 9:36 pm

@alex_farlie
Sitting on the couch yesterday, I did look at the Web Speech API. You can have speech that corresponds to a mood / expression, indirectly. There are attributes for rate, volume & pitch. It would require playing with those to get 'angry', let's say, but it is probably doable.
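
As a rough sketch of what "playing with those" could look like: the mood names and preset numbers below are made up for illustration and would need tuning by ear, and applyMood is a hypothetical helper. Only the three attribute names and their allowed ranges (rate 0.1 to 10, pitch 0 to 2, volume 0 to 1) come from the Web Speech API.

```typescript
// The three utterance attributes discussed above.
interface Prosody {
  rate: number;   // 0.1 .. 10, default 1
  pitch: number;  // 0 .. 2, default 1
  volume: number; // 0 .. 1, default 1
}

const NEUTRAL: Prosody = { rate: 1.0, pitch: 1.0, volume: 1.0 };

// Hypothetical presets; the numbers are guesses, not measured values.
const MOODS: Record<string, Prosody> = {
  neutral: NEUTRAL,
  angry: { rate: 1.3, pitch: 0.7, volume: 1.0 },
  sad: { rate: 0.8, pitch: 0.8, volume: 0.6 },
};

// Keep a value inside the range the API allows.
const clamp = (x: number, lo: number, hi: number): number => Math.min(hi, Math.max(lo, x));

// Copy a mood preset onto any object that has rate, pitch and volume fields.
// Unknown moods fall back to the neutral preset.
function applyMood<T extends Prosody>(utterance: T, mood: string): T {
  const preset = MOODS[mood] ?? NEUTRAL;
  utterance.rate = clamp(preset.rate, 0.1, 10);
  utterance.pitch = clamp(preset.pitch, 0, 2);
  utterance.volume = clamp(preset.volume, 0, 1);
  return utterance;
}
```

In a browser the object passed in would be a SpeechSynthesisUtterance, for example applyMood(new SpeechSynthesisUtterance("Go away!"), "angry"), which is then handed to speechSynthesis.speak().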

The problem I see is that, while there is a rate of speech, it is up to each browser to define what that is. I made a test page to verify whether timing differences across browsers are large enough to ruin the animation. They are.

With much work this might be overcome, but notice that the speak() function does not actually start the speech, partly because it only queues it. There is no event which indicates it is ready. You see, according to the API, an implementation could retrieve the speech from a server. Testing would have to be done to check. If this is the case, the network could delay the audio while the animation has already started. Hope this helps.

BTW, the repo you list is C-based, not JavaScript.
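
One thing that might be worth testing: although there is no "ready" event, the utterance does have start, end and error events, so the animation could be held back until start fires. Whether start lines up closely enough with audible audio in each browser is an untested assumption here. The interfaces are stand-ins for the browser objects, and speakInSync, startAnimation and stopAnimation are hypothetical names.

```typescript
// Structural stand-ins for SpeechSynthesisUtterance and speechSynthesis,
// declared here so the sketch does not depend on DOM typings.
interface UtteranceLike {
  onstart: ((event: any) => void) | null;
  onend: ((event: any) => void) | null;
  onerror: ((event: any) => void) | null;
}

interface SynthLike<U> {
  speak(utterance: U): void;
}

// startAnimation / stopAnimation are hypothetical hooks into the lip-sync code.
function speakInSync<U extends UtteranceLike>(
  synth: SynthLike<U>,
  utterance: U,
  startAnimation: () => void,
  stopAnimation: () => void
): void {
  // speak() only queues the utterance, so the animation is not started here.
  // It waits until the engine reports that speaking has begun.
  utterance.onstart = () => startAnimation();
  // Stop the animation whether the speech finishes normally or fails.
  utterance.onend = () => stopAnimation();
  utterance.onerror = () => stopAnimation();
  synth.speak(utterance);
}
```

In a browser the call would look roughly like speakInSync(window.speechSynthesis, new SpeechSynthesisUtterance("Hello"), start, stop). This does nothing about the rate differences between browsers; it only addresses the delay between queueing and speaking.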

Re: Speech on WebGL

Post by alex_farlie » Wed Sep 20, 2017 9:12 am

Thanks. My previous comments were intended generally; I was not expecting anyone to suddenly try to implement the ideas in them. :lol: It's amazing that people find the time to research things like this.

