Archive for April, 2006

Improving online presentations:Larry Wall at OSDC

Tuesday, April 18th, 2006

The short version

I’ve posted a iPod-compatible video representation (68 MB, mirror, original; or original flickery QuickTime 55 MB, mirror, original) of Larry Wall’s Perl6 presentation at the OSDC::Israel::2006 conference in February. Enjoy.

The long version

Yesterday, an announcement appeared on offering an electronic representation of Larry Wall’s Perl6 presentation at the OSDC::Israel::2006 conference. That representation consisted of three parts: an MP3 recording of Larry speaking, Larry’s original PowerPoint slides, and a text transcript of the MP3.

One of the first comments to appear said that it was difficult to follow along with the slides since the MP3 had little indication of when Larry switched slides. After listening to the audio for a little while, I wholeheartedly agreed and realized that I could help make the experience better with some hard work and some of the tools I’ve built for my company’s MediaLandscape project. What follows are the steps I took to create the movie linked above.

Creating the source presentation

The MediaLandscape tools work with a collection of media files and metadata to describe them to construct a rich presentation. Usually the media files consist of a single video or audio file plus a collection of slides as either image files, a slide show (PPT, Keynote, etc) or as a screen capture. The critical metadata is the date and title of the presentation, the name and optional photo of the speaker (or speakers), the duration of the primary media, and the times at which the slides were displayed. That metadata should be collected in one of several supported XML formats.

Of that data, the only thing I was missing was the slide timing. I figured I could reconstruct that one lacking type of data. With that information, I would have a complete source presentation that I could then convert into one of a wide variety of formats.

My first step was to divide the PowerPoint into PNG files. Fortunately, PowerPoint supports a simple “Save As…” option to export the slides as a directory full of bitmap images. I chose the default of 72 dpi for rasterizing the slides.

Saving PowerPoint slides as PNG files

Next, I moved all of the files into a single folder. I added a mug shot of Larry from the OSDC::IL site and started creating the presentation.xml file (presoxml[view] the final version). I got the presentation date and time from the OSDC schedule, and googled for the time zone information (I hope I got that right). I added Larry’s info and then attacked the slides.

All of the slide information except the timing would be straightforward, so I cobbled together a short Perl program, shown below, to type my XML for me. I put in placeholder times in millisec for each slide, just so they’d be in the right order.

#!/usr/bin/perl -w
for my $i (1..164) {
   print <<"EOF";

Finding the slide times

This next step was the most tedious. I opened all of the slides in via open *.png, loaded the presentation.xml file in Emacs and started playing the MP3. Whenever I thought the slide had changed, I paused the MP3 and typed the time into the XML file. About 25% of the time, I could hear when Larry clicked his keyboard to advance the slide. Another 50% of the time, I could tell by Larry’s pauses when he switched slides. The last 25% came from the context of the talk and required me to occasionally rewind the audio a bit to figure out when the slide transition actually happened. In retrospect, the most tiresome part was doing mental math to transform the hh:mm:ss in iTunes to milliseconds for the XML. I probably should have transcribed the hh:mm:ss directly and used code to do the math.

This process took a bit over two hours to index the one hour, twelve minute talk. But then I was done! The rest was just running programs.

Creating the output

I then ran the MediaLandscape Publish program on the source presentation, targeting a video output. At the heart of the process is a command line utility I wrote called qtcomposite which takes the XML slide manifest and the media files and outputs a QuickTime reference movie with the slides placed on the timeline of the movie at the appropriate points. The program takes only 1-2 seconds to run, and the output is playable directly in QuickTime (less a few stutters at slide transitions when the PNGs are loading). This let me check and tweak some of the slide times.

Finally, I used QuickTime Pro and the built-in iPod (H.264) codec to export the reference movie as a monolithic, compressed video file. That export took about 10 minutes. I previously had tried using the DivX codec, but that codec mangled the slide timing when erroneously trying to normalize the framerate.

Screenshot of the iPod export dialog in QuickTime Pro

Lastly, I uploaded the video to my webserver and wrote this summary.


Definitely, the most time-consuming step was creating the slide timing. I would have spent over an hour listening to the presentation anyway, since I like hearing Larry speak, so it wasn’t wasted time.

It would have been much more pleasant and accurate if I had computer-recorded timing of the slide transitions from Larry’s original talk. There are several presentation capture solutions that can do just that, and I plan to talk about them in a future post.

Also, this output did not incorporate the transcript. Our engine supports subtitle tracks (optionally as XML or burned into the video) but I was a little too burned out from doing the slide timing manually to also partition the transcript into time-tagged subtitles. Any volunteers? :-) We can accept any of the subtitle formats that the module supports.

Editorial Notes

After publication, I edited this post a couple of times because I belatedly discovered that my DivX version of the video had the slide timings all wrong in the second half of the presentation. The iPod export codec got everything right. Apologies to readers who don’t have easy access to an H.264 decoder…