The Annotation Pushers

Hi needeep, it'd be great to have your help with this! Abner was the one who set us all up on the GitHub repo so, if he hasn't already dropped by here, I'll let him know you're interested in helping when next I see him in the IRC channel.
Hey needeep,

An invitation was sent to you. Welcome :)
The guide is a wonderful resource and I use it quite often.

I'd like to contribute. I'm debiatan on github.
Thank you debiatan, we're excited to have you! Invitation sent.
Hi fellow pushers,

I've been catching up on recent episodes while annotating them and I have to admit that the process is more time-consuming than I expected. In order to automate part of it, I've written a quick-and-dirty python script that generates improved annotation templates. It only works for episodes that are less than a week old. It does these two things:
- It downloads the ReChat logs for the target episode, tries to filter the questions from the Q&A and writes them, inside a comment, into the annotation file.
- It looks for YouTube autogenerated captions to figure out which of the questions are actually answered in the episode and when. It generates annotations for all of those.
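
For anyone curious what the question-filtering step might look like, here is a minimal sketch. It is not the actual script. It assumes the chat log has already been downloaded and parsed into (seconds, user, text) tuples, and that questions are marked with a leading "Q:"; both the data shape and the names `filter_questions` and `format_timecode` are made up for illustration.

```python
import re

# Assumption: Q&A questions in chat start with "Q:" (any case).
QUESTION_RE = re.compile(r"^\s*q:\s*(.+)$", re.IGNORECASE)


def filter_questions(messages):
    """Return (seconds, user, question) for messages that look like questions.

    `messages` is an iterable of (seconds, user, text) tuples, a
    hypothetical shape for already-parsed chat log entries.
    """
    questions = []
    for seconds, user, text in messages:
        match = QUESTION_RE.match(text)
        if match:
            questions.append((seconds, user, match.group(1).strip()))
    return questions


def format_timecode(seconds):
    """Format a number of seconds as H:MM:SS."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"
```

A heuristic like this will miss questions that lack the prefix and let through anything that happens to carry it, so the output still needs a manual pass.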

Taking into account how little effort I've put in, I'm actually surprised at how well it works. Sometimes it misses a question or includes one that shouldn't be there, and quite often it also messes up the timing of questions by a few seconds. Overall, however, it ends up saving me time.

In case any of you decides to give it a try, you will need a python interpreter and the python-requests package (which I believe is pretty standard, but not part of the standard python library). It relies on ReChat and KeepSubs, because I wanted to save myself the trouble of interacting with Google's authentication system.

In order to generate the template, just run:
python create_template.py episode_number

You can get it here.

Try episodes 143 and 144 if you want to see the kind of output it generates when YouTube includes or fails to include automated captions.
Hei debiatan!

Man, absolutely. At a guess, the shortest amount of time I could possibly spend on one would be maybe 2× the length of the vid, but that'll be after a fair bit more practice.

Your script seems pretty nifty. I've tried it for Days 144 and 143 as you suggest but, personally, I'm not sure it'll be such a big win for me. Since I use my IRC log for doing the Q&As, I can get through them pretty much in real-time. If you feel comfortable not watching the whole Q&A, though, maybe you'll beat me, and if you weren't using a log then you will definitely get a big win.

The stub creation – just the bit before the markers – is cool, though, and is something I recently thought of automating but hadn't got round to doing. Also, now that I've started putting the quotes in the notes (like so), maybe it'd be an idea to automate that. We'll only need to add the timecode manually while annotating.
Hi folks, great to see that there's been a bunch of activity in the repo while I've been languishing in laptopless hell. My laptop is currently being repaired and should hopefully be back with me next week so I can pick up where I left off.

Just a couple of questions have occurred to me:

  1. Do we want to create issues for the new videos, even though debiatan has been annotating them like a ninja almost before Casey has signed off? I'd been creating issues, but there was one day that I fell asleep straight after the stream and awoke to find that episode already annotated! I did create an issue for it and retrospectively assigned our ninja, thinking that we want to keep track of all the videos in the issues system. Do we want to do this?
  2. Would it be a good idea to invite csnover to the team on the strength of his recent pull requests, even though he hasn't (as far as I know) asked to join?
Hi Miblo,

glad to hear your laptoplessness is about to come to an end.

1) Issues are useful as a centralized backlog when no one finds the time to annotate new episodes. They also allow us to keep track of who's doing what so as to avoid duplicated effort. It just so happens that all current active annotators are closing issues on separate fronts without having discussed it beforehand (kind of magical, if you ask me), so it might seem that issues are superfluous right now.

When I pushed the "non-issued" annotation last week, I considered creating the issue myself, but decided not to because it would have been just busy work for me. I wasn't thinking about the big picture, though.

I'm very pro-issues and I'm certainly pro-miblo-creating-the-issues. If we all collectively decided to keep them, I could refrain from committing annotations before there was an issue for them in the tracker (so as not to mess with miblo's workflow). We could also try writing a script to create the issues automatically whenever a new episode was uploaded to the archive (I think I could manage that).
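
A rough sketch of the issue-creation half of such a script, under stated assumptions: it targets GitHub's REST endpoint for creating an issue (POST /repos/{owner}/{repo}/issues), and the function name, the "Annotate Day NNN" title format and the owner/repo arguments are placeholders rather than anything we have agreed on. It only builds the request, so nothing is sent until you decide to.

```python
import json
import urllib.request


def build_issue_request(owner, repo, episode, token, assignee=None):
    """Build (but do not send) a request that would create an annotation issue.

    The title format is a hypothetical convention; `token` is a GitHub
    personal access token with permission to create issues on the repo.
    """
    payload = {"title": f"Annotate Day {episode:03d}"}
    if assignee:
        payload["assignees"] = [assignee]
    return urllib.request.Request(
        f"https://api.github.com/repos/{owner}/{repo}/issues",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would actually create the issue. Detecting that a new episode has been uploaded is left out here, since it depends on how the archive exposes new videos.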

2) If we let csnover in, we'll be out of work soon. Imagine us, nit-picking on the grammar and timing of each other's annotations* or building a subject index by grepping annotation files for keywords or, I don't know... maybe even programming!

* English is my second language, by the way, so feel free to correct my annotations as you see fit. I'll try not to repeat my mistakes.
Issues are something we decided to do so we don't have multiple people working on the same episode at once, though edits to make things better are great.

If it is already done and no one has made an issue, then I can see both sides of creating and not creating one, and I am not really sure which is correct. On one side, if we have commit messages we can just look at who pushed the annotations, or we could go look at the closed issue. I think we should find out what the other annotation pushers think on the topic so we can get a clear answer. To me it would be nice to have the issue so that, in case we rebase and lose commit messages, we would still have the log of who did what. So I think we should just take the 30 seconds to create the issue if one does not exist; anyone on the team can do this. It is a tiny bit of work, but I think the time is worth it if we ever have to go back and look at who did something. Since issues stay even if the history is rebased, we should always have them. Though let's hear what the others think.
Cheers, chaps. Okay, yep, let's see what the others think. My workflow is out of the window at the moment, though, so if we do decide to keep the issues (and retrospectively create them for ones already closed), then someone other than me will probably have to take care of them for the time being. Once I'm back up and running I'll be happy to fight it out for the issue creation responsibilities, or I could just slip quietly into pushing annotations.
I think the issues are useful for communicating to others who might already be working on annotating an episode.

The case where someone (notably debiatan) basically has an episode annotated as soon as it's over makes issues a little less useful. However, in the event this doesn't happen, having an issue around would probably be good to help make sure an episode doesn't get missed.

Following on from that, the simplest thing would probably be to have people create an issue for annotating an episode, if one doesn't already exist, to help keep things consistent (in addition to the other potential benefits listed by others).
I count four in favor of issue creation and none against, so I've created the issues I skipped this past week. I don't mean this as a discussion stopper; I just thought I would follow standard procedure until we resolve this question.
Aye, agreed. So I think it'd be worthwhile recognising one person as the habitual issue creator, but if someone is ready to start annotating an episode before the issue creator has done the deed, then they ought to create the issue and assign themself to it.

I'm back in business, by the way (yippee!), so could potentially be that habitual issue creator. However, if you're happy to continue creating the issues, debiatan, then I'll just concentrate on pushing annotations.
Hi Miblo. Glad you're back!

I agree with the scheme you propose and I'm also too lazy to pass up on your offer. You're the issue creator now, dog.
Cheers, debiatan! It's good to be back.

Ha, okay. Well your script eases the process a lot. Maybe next week I'll try making the most of it and get the Q&A in the stub. I may also try adding a search for "!addquote" to automatically create a Quotes section in the notes if that episode needs one.
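
The "!addquote" search could start out as something like the sketch below. It assumes the chat log is available as a list of plain text lines with the command appearing somewhere in the line; the function name and the log line format in the example are hypothetical, and turning the result into an actual Quotes section is left to whatever the annotation format needs.

```python
def find_addquote_lines(log_lines, command="!addquote"):
    """Return the text that follows `command` in each log line containing it.

    Lines where nothing follows the command are skipped.
    """
    quotes = []
    for line in log_lines:
        _, found, rest = line.partition(command)
        if found and rest.strip():
            quotes.append(rest.strip())
    return quotes
```

If the list comes back empty, the episode presumably doesn't need a Quotes section at all.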