Autosynced text scroll by
  • Do you know of any software, commercial or free, that, given a text and an audio file of someone reading that text, can generate a video with the words scrolling by in sync with the audio? Something sort of like dictation software but, you know, not really.
    It could be done manually in any video editing software, of course, but that'd be bloody time-intensive.
  • I am not aware of any such thing... it would have to be some kind of speech recognition software crossed with... well, I dunno what.
  • Indeed. It'd be the ultimate audiobook recording tool.

    Upon doing a bit more research, I found out this sort of thing is called "forced alignment" and there are a number of university research projects on the subject. Of course, the fact that they're research projects means what you can find is very bare-bones software, which only comes as source files to be compiled locally, runs on UNIX only, outputs stuff in ungodly custom formats and is mostly so old you'll have tons of fun trying to make it run. I just spent an hour with P2FA (or rather, its prerequisite HTK) before giving up on the umpteenth unmet library dependency.

    I'll leave some links if somebody wants to try their hands at this.

    https://www.ling.upenn.edu/phonetics/p2fa/
    http://prosodylab.org/tools/aligner/
    http://www.kyloo.net/software/doku.php/mgiza:overview (this is the most promising and possibly the only one actually worth looking into)
    http://www.voxforge.org/home/dev/autoaudioseg
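
    For whoever does get one of these aligners running: once you have per-word timings, turning them into synced on-screen text is the easy part. Below is a rough Python sketch that converts a list of (word, start, end) tuples into an SRT subtitle file. The input format is my own assumption; none of the tools above output exactly this, so you'd have to parse their format into it first. The names format_timestamp and words_to_srt are made up for this example, and subtitles are not a true scroll, just text that appears in sync.

```python
from typing import List, Tuple

# Assumed input shape: (word, start_seconds, end_seconds).
# This is a hypothetical format; a real aligner's output must be parsed into it.
Word = Tuple[str, float, float]


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: List[Word], max_words: int = 7) -> str:
    """Group aligned words into cues of at most max_words and render them as SRT."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = chunk[0][1]   # start time of the first word in the cue
        end = chunk[-1][2]    # end time of the last word in the cue
        text = " ".join(w[0] for w in chunk)
        cues.append(
            f"{len(cues) + 1}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text}\n"
        )
    # SRT cues are separated by a blank line.
    return "\n".join(cues)


if __name__ == "__main__":
    # Made-up timings, purely for illustration.
    demo = [("Call", 0.0, 0.3), ("me", 0.35, 0.5), ("Ishmael", 0.55, 1.2)]
    print(words_to_srt(demo))
```

    Most players and video tools can display an .srt file alongside the audio, so this at least gets you synced text without placing every word by hand.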