Navigating Video

One of the things that adopting SMIL for the DAISY talking book in the 1990s enabled was the idea that structured navigation of audio could extend to other content, such as video. The NCX navigation layer was, in my mind, ideally media agnostic: NCX navigation targets could point to audio, video, or various combinations, or fragments, of audio, video, image, and text. At WWW2002 I presented a poster on gNCX (the generalized NCX) and began work on some unpublished drafts of a potential W3C submission (I will dig those out and post links here as time permits). DAISY's focus on digital talking books precluded dedicated effort to push the "NCX for anything" model, though I believe some within DAISY get the NCX-for-video idea.
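
To make the idea concrete, here is a minimal sketch of a media-agnostic navMap, written in standard NCX markup (the Z39.86-2005 vocabulary, for familiarity). The element names are real NCX; the file names, ids, and the convention of pointing each navPoint at a SMIL fragment that resolves to a video clip are my illustration, not taken from the unpublished gNCX drafts.

    <?xml version="1.0" encoding="UTF-8"?>
    <ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
      <head>
        <meta name="dtb:uid" content="example-uid"/>
      </head>
      <docTitle><text>A Play, Scene by Scene</text></docTitle>
      <navMap>
        <!-- Each navPoint targets a SMIL fragment. The SMIL file, not
             the NCX, determines whether that fragment plays audio,
             video, text, or some combination; the navigation layer
             itself stays media agnostic. -->
        <navPoint id="nav-act1" playOrder="1">
          <navLabel><text>Act I</text></navLabel>
          <content src="play.smil#act1"/>
          <navPoint id="nav-act1sc1" playOrder="2">
            <navLabel><text>Act I, Scene 1</text></navLabel>
            <content src="play.smil#act1sc1"/>
          </navPoint>
        </navPoint>
      </navMap>
    </ncx>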

The work on the gNCX led to various concept prototypes: taking videos, adding text tracks (captions), and overlaying an NCX to allow navigation into the video presentation. The prototypes were specific to the Windows platform, using various pieces of rudimentary SMIL engines and UIs that I had lying around, hacked together during the DAISY standards development. I started with a simple demo using video from the UN concerning the aftermath of the Gujarat earthquake, with all of the SMIL and NCX hand-coded. A more ambitious demo, a hypothetical textbook for cinema students, used two full-length motion picture versions of the same Shakespeare play by two different directors, and provided a user interface that let the student easily compare how each director interpreted any scene in the play, all in a fully accessible manner. I presented a somewhat functional version of this demo at a DAISY technical meeting in Amsterdam in 2004, but the daunting amount of hand coding needed to complete the effort forced it to the back burner, until recently.
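
For a sense of what the hand coding involved, the SMIL for a single scene might have looked roughly like this (SMIL 2.0 Language profile; the file names, clip times, and caption format are invented for illustration):

    <smil xmlns="http://www.w3.org/2001/SMIL20/Language">
      <body>
        <seq>
          <!-- One scene: a clip out of the full-length video, played
               in parallel with its timed caption stream. The id is
               the anchor that the NCX navPoint targets. -->
          <par id="act1sc1">
            <video src="director_a.mpg" clipBegin="0:01:12" clipEnd="0:06:45"/>
            <textstream src="director_a_captions.rt"/>
          </par>
          <!-- ...one par per scene, for every scene in the play... -->
        </seq>
      </body>
    </smil>

Mirror the same scene ids in a second SMIL file for the other director's film, and one navigation structure can offer the student either interpretation of any scene. That is one plausible way the comparison interface could have been wired up, and multiplying it across every scene of two feature films suggests why the hand coding piled up so quickly.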

To be continued.
