The speech issue was that you were using native speech, but Chrome does not support HTML5 speech on mobile. The server voices do work there.
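If you want to pick the speech mode automatically, a feature check along these lines may help. This is only a rough sketch: chooseSpeechMode() is a hypothetical helper, not part of our SDK, and it only tests whether the browser exposes the standard speechSynthesis and SpeechSynthesisUtterance objects. How you then switch the SDK to a server voice depends on your own setup and is not shown.

```javascript
// Hypothetical helper (not an SDK function): decide whether to use the
// browser's native speech or fall back to server voices.
// `env` defaults to the global object so the check also runs outside a browser.
function chooseSpeechMode(env = globalThis) {
  const hasNativeSpeech =
    typeof env.speechSynthesis !== "undefined" &&
    typeof env.SpeechSynthesisUtterance === "function";
  // Without the native speech objects, fall back to server voices.
  return hasNativeSpeech ? "native" : "server";
}
```

Call chooseSpeechMode() once when the page loads and use server voices whenever it returns "server". Some browsers expose these objects without actually producing sound, so treat a "native" result as a hint rather than a guarantee.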
You can use your own JavaScript to render your avatar however you wish. Our SDK is open source, so you can copy the code from createBox() and modify it to suit your needs.