Script to generate subtitles (srt files) from the .index files

I wrote a quick Ruby script that parses the .index files and turns them into srt files. Once loaded into a video player that supports them, the srt files show the simplified annotations as subtitles while the video plays.

I'm using this with the Resilio version of the video files, so I made the script output filenames to match. This should help with any automatic subtitle loading your video player does. It could also be useful for other sources of the video, and it is pretty straightforward to tweak the script to change the output filenames if you want to.
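
For anyone who wants different output names, here is a minimal sketch of that kind of tweak. It reuses the same regex the script applies to each entry name; `custom_filename`, the `hh_` prefix, and the sample name `day042` are made up for illustration and are not part of the gist.

```ruby
# Same pattern the script uses on entry names:
# leading non-digits, digits, trailing non-digits.
NAME_PATTERN = /(\D*)(\d*)(\D*)$/

# Hypothetical replacement for hero_filename: keeps only the number and any
# trailing suffix, and puts a prefix of your choice in front.
def custom_filename(name, prefix: "hh_")
  md = NAME_PATTERN.match(name)
  "#{prefix}#{md[2]}#{md[3]}.srt"
end

puts custom_filename("day042") # "day042" is an assumed example name
```

To use something like this, you would call it from `parse_name` in place of `hero_filename`, passing the name string instead of the match data.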

Hopefully someone else will find this useful, so I'm posting here to share. It's on a gist at https://gist.github.com/joestraitiff/7654860e4ed1eb20ca79a59ee4dba1cc

# Process handmade hero .index files into subtitles (srt files) in a specified
# directory (that must exist). Outputs all the files named to match the Resilio
# versions so they can be played alongside those files.
#
# WARNING: this code is rough, you might need to tweak it for your environment
# I tested with ruby 2.5.3
#
# This is just a rough conversion to use the annotations as subtitles for the
# entire duration until a new one is presented. (NOTE: this can be modified to
# show for a shorter time and/or fixed time -- left as an exercise for the
# reader :)  NOTE: these annotations from the .index files don't have all the same
# info as the ones on the web player as they are pure text and don't have the
# handles of the people asking the questions during Q&A.  You can see what these
# annotations look like by looking at the episode guides on the web and doing a
# search (e.g. for "4coder" at https://hero.handmade.network/episode/code).
# But, they are very useful for my purposes.
#
# To get the most up to date annotations you can query the files yourself from
# the website with wget (or something similar):
#
#   * https://hero.handmade.network/code.index
#   * https://hero.handmade.network/chat.index
#   * https://hero.handmade.network/intro-to-c.index
#   * https://hero.handmade.network/misc.index
#   * https://hero.handmade.network/ray.index
#
# This is free and unencumbered software released into the public domain.
#
# Anyone is free to copy, modify, publish, use, compile, sell, or
# distribute this software, either in source code form or as a compiled
# binary, for any purpose, commercial or non-commercial, and by any
# means.
#
# In jurisdictions that recognize copyright laws, the author or authors
# of this software dedicate any and all copyright interest in the
# software to the public domain. We make this dedication for the benefit
# of the public at large and to the detriment of our heirs and
# successors. We intend this dedication to be an overt act of
# relinquishment in perpetuity of all present and future rights to this
# software under copyright law.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
# IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
# OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
# ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
# OTHER DEALINGS IN THE SOFTWARE.
#
# For more information, please refer to <http://unlicense.org/>

SRT_DIR = "srts"
FILES_TO_CONVERT = [
  "code.index",
  "ray.index",
  "chat.index",
  "misc.index",
  "intro-to-c.index"
]

@curr_file = nil
@marker = nil
@msg = ""

def hero_filename(md)
  if md[3].empty?
    "handmade_hero_#{md[1]}_#{md[2]}.srt"
  else
    "handmade_hero_#{md[1]}_#{md[2]}_#{md[3]}.srt"
  end
end

def intro_filename(md)
  if md[3].empty?
    "introduction_to_c_windows_00#{md[2]}_main.srt"
  else
    "introduction_to_c_windows_00#{md[2]}_qa.srt"
  end
end

def chat_filename(md)
  "handmade_hero_chat_#{md[2]}.srt"
end

def misc_filename(md)
  case md[0]
  when "30mlocp" then "the_thirty_million_line_problem.srt"
  when "basic_emacs" then "emacs_tutorial.srt"
  end
end

def ray_filename(md)
  "handmade_ray_#{md[2]}.srt"
end

def parse_name(line, filename)
  md = /(\D*)(\d*)(\D*)$/.match(line.split('"')[1])
  case filename
  when "code.index" then hero_filename(md)
  when "intro-to-c.index" then intro_filename(md)
  when "chat.index" then chat_filename(md)
  when "misc.index" then misc_filename(md)
  when "ray.index" then ray_filename(md)
  end
end

def new_srt_file(filename)
  @curr_file&.close
  @curr_file = File.open(File.join(SRT_DIR, filename), "w")
  @index = 1
  puts "create file: #{filename}"
end

def seconds_to_time(seconds)
  hours = seconds / 3600
  minutes = (seconds / 60) % 60
  seconds = seconds % 60
  "%02d:%02d:%02d,000" % [hours, minutes, seconds]
end

def write_subtitle(start, finish, msg)
  @curr_file.puts "#{@index}"
  @curr_file.puts "#{seconds_to_time(start)} --> #{seconds_to_time(finish)}"
  @curr_file.puts msg # use the argument, not the @msg instance variable
  @curr_file.puts ""

  @index += 1
end

def next_marker(line)
  _j, time, _c, *msg = line.chop.split('"') # splat/join for embedded "

  write_subtitle(@marker || 0, time.to_i, @msg) if @marker

  @marker = time.to_i
  @msg = msg.join('"')
end

def final_marker
  return unless @marker
  # No marker follows the last one, so show it for a further 3000 seconds.
  write_subtitle(@marker, @marker + 3000, @msg)
  @marker = nil
end

FILES_TO_CONVERT.each do |filename|
  # File.foreach reads line by line and closes the input file when done
  File.foreach(filename) do |line|
    if line.start_with?("name:")
      new_srt_file(parse_name(line, filename))
    elsif line.start_with?("\"")
      next_marker(line)
    elsif line.start_with?("---")
      final_marker
    end
  end
end

@curr_file&.close # close the last srt file that was opened
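
The header comment leaves showing each annotation for a shorter or fixed time as an exercise. One possible way to do it, as a rough sketch only: end each subtitle at the next marker or after a maximum duration, whichever comes first. `MAX_DURATION` and `capped_finish` are names I made up here, and the 10 second cap is an arbitrary choice.

```ruby
MAX_DURATION = 10 # seconds; arbitrary value picked for illustration

# Returns the second at which a subtitle should end: the start of the next
# marker, or start + max_duration, whichever is earlier.
def capped_finish(start, next_start, max_duration = MAX_DURATION)
  [next_start, start + max_duration].min
end

puts capped_finish(5, 100) # next marker is far away, so the cap applies
puts capped_finish(5, 8)   # next marker arrives before the cap
```

In `next_marker`, that would mean passing `capped_finish(@marker, time.to_i)` as the finish time instead of `time.to_i`. The same idea could replace the `@marker + 3000` in `final_marker`.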

Edited by Joe Straitiff on
Nice one, @Joe!

Just a heads up that the .index file format may have to change slightly as a part of the stuff I'm currently doing on ~Cinera's config file / multiple projects per instance.
No problem. It'll be easy enough to tweak when that happens. Thanks for the heads up!