Hi fellow pushers,
I've been catching up on recent episodes while annotating them, and I have to admit that the process is more time-consuming than I expected. To automate part of it, I've written a quick-and-dirty Python script that generates improved annotation templates. It only works for episodes that are less than a week old. It does these two things:
- It downloads the ReChat logs for the target episode, tries to filter the questions from the Q&A and writes them, inside a comment, into the annotation file.
- It looks for YouTube auto-generated captions to figure out which of the questions are actually answered in the episode and when. It generates annotations for all of those.
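For anyone curious how those two steps could work, here is a minimal sketch. It is not the code from the script itself: the "Q:" prefix convention, the (start_seconds, text) caption format, the function names and the word-overlap threshold are all assumptions made for illustration.

```python
import re


def _words(text):
    """Lowercase a string and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def filter_questions(messages):
    """Keep chat messages marked as questions.

    Assumption: questions start with a "Q:" prefix (case-insensitive).
    The prefix is stripped from the returned text.
    """
    questions = []
    for msg in messages:
        match = re.match(r"\s*q:\s*(.+)", msg, re.IGNORECASE)
        if match:
            questions.append(match.group(1).strip())
    return questions


def find_answer_time(question, captions, window=3, threshold=0.6):
    """Guess when a question is read aloud, or return None.

    captions is a list of (start_seconds, text) tuples (hypothetical format).
    Each window of consecutive captions is scored by the fraction of the
    question's words it contains; the best window wins if it reaches the
    threshold.
    """
    q_words = set(_words(question))
    if not q_words:
        return None
    best_time, best_score = None, 0.0
    for i in range(len(captions)):
        chunk = captions[i:i + window]
        chunk_words = set()
        for _, text in chunk:
            chunk_words.update(_words(text))
        score = len(q_words & chunk_words) / len(q_words)
        if score > best_score:
            # Report the first caption in the window that shares a word
            # with the question, not the start of the window itself.
            best_time = next(
                start for start, text in chunk if q_words & set(_words(text))
            )
            best_score = score
    return best_time if best_score >= threshold else None
```

A bag-of-words overlap like this tolerates the small transcription errors that auto-generated captions tend to have, but it can only place a question at the granularity of a caption line, which would explain timings being off by a few seconds.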
Considering how little effort I've put in, I'm actually surprised at how well it works. Sometimes it misses a question or includes one that shouldn't be there, and quite often it gets the timing of a question wrong by a few seconds. Overall, however, it ends up saving me time.
In case any of you decide to give it a try, you will need a Python interpreter and the python-requests package (which I believe is pretty standard, but not part of the standard Python library). It relies on ReChat and KeepSubs, because I wanted to save myself the trouble of interacting with Google's authentication system.
In order to generate the template, just run:
python create_template.py episode_number
You can get it here.
Try episodes 143 and 144 if you want to see the kind of output it generates when YouTube includes or fails to include auto-generated captions.