By Paul Shen, TVU Networks
September 16th, 2020
Not that long ago, terms like “linear,” “non-linear” and “random access” were thrown around a lot when it came to video, especially video editing.
Being “linear” simply meant that frame after frame of video was recorded sequentially on a line-like medium, such as videotape or film. Then came “non-linear,” as in non-linear editing, which recorded frames onto a randomly accessible medium like a spinning computer disk.
The difference became painfully obvious when it came to accessing a desired video clip for an edit. Getting to the clip on videotape took time as the linear medium shuttled in one direction or the other. Once found, the precise clip was identified by the frame-accurate timecode information associated with its in and out points.
For a time, these timecode IDs for clips were just as important in non-linear editing. While the clips were randomly accessed on non-linear editing systems, conforming a final edit in post-production still required an edit decision list of clips identified by timecode numbers from the NLE that corresponded to the same timecode IDs on a linear source medium.
Of course, the conforming process has faded away as the distinction between offline and online editing has become almost archaic.
What hasn’t become an anachronism is the process of scrubbing through video content until the desired clip for an edit is found—whether that’s on an NLE system, a slow motion instant replay server or some other type of production storage. But that’s all changing thanks to the power of artificial intelligence (AI) algorithms and clever cloud technology implementation.
Find The Right Clip Instantly
Put yourself in the shoes of a news editor, multimedia journalist (MMJ) or reporter. You have several 10-minute clips of source footage from interviews you’ve conducted with newsmakers. You have to find the exact clips needed to tell a great story, and you are on deadline.
With the speech-to-text algorithm that is part of our TVU MediaMind AI engine, transcripts of those interviews are presented to you in mere moments. Reading through the transcripts, you identify the exact sentences spoken by your sources that you want to include in your story.
Because every frame of video recorded includes rich metadata generated by MediaMind AI algorithms, including the speech-to-text algorithm, it is now possible to jump instantly and precisely to the desired video clips that correspond to the quotes from the transcript.
Gone is the need to scrub back and forth through video to get to the precise in and out points of desired video clips. Gone too is any concern about timecode (or, in this instance, timestamps) and frames. It all happens in the background, enabling you, the journalist, to focus your efforts on storytelling.
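To make the idea concrete, here is a minimal sketch of how a quote picked from a transcript could be turned into in and out points. It is not how MediaMind works internally. It assumes the speech-to-text step produces a start and end time for every word, and the names `Word`, `find_quote` and the sample transcript are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Word:
    """One transcribed word with its timing (hypothetical metadata layout)."""
    text: str
    start: float  # seconds from the start of the source clip
    end: float    # seconds from the start of the source clip


def _normalize(token: str) -> str:
    # Drop punctuation and case so "bridge." matches "bridge".
    return re.sub(r"[^\w']", "", token).lower()


def find_quote(words: List[Word], quote: str) -> Optional[Tuple[float, float]]:
    """Return (in_point, out_point) in seconds for the first match of quote."""
    target = [t for t in (_normalize(t) for t in quote.split()) if t]
    if not target:
        return None
    spoken = [_normalize(w.text) for w in words]
    n = len(target)
    # Slide a window over the transcript and compare word by word.
    for i in range(len(spoken) - n + 1):
        if spoken[i:i + n] == target:
            return (words[i].start, words[i + n - 1].end)
    return None


# Made-up sample data for illustration only.
transcript = [
    Word("We", 0.0, 0.2),
    Word("will", 0.2, 0.45),
    Word("rebuild", 0.45, 0.95),
    Word("the", 0.95, 1.05),
    Word("bridge.", 1.05, 1.6),
    Word("Soon.", 1.9, 2.4),
]

clip = find_quote(transcript, "rebuild the bridge")
```

In this sketch the journalist never touches a timecode: selecting the sentence is enough, and the in and out points fall out of the word timings. A real system would likely need fuzzier matching than this exact word-by-word comparison.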
Cloud Video Production
Making this happen in the cloud could present challenges, especially when full-resolution video files are involved. However, we have applied several techniques to make this cloud-based clip retrieval happen without noticeable delay.
One important strategy has been using proxies tied by their timestamps to the corresponding full-resolution video frames. Doing so makes it easy to scan frame-by-frame backwards and forwards through clips if it’s necessary to deviate from the originally text-identified in and out points of clips.
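The proxy idea can be sketched in a few lines. This is an illustration under stated assumptions, not the actual implementation: it assumes proxy and full-resolution frames carry timestamps on the same clock, that the proxy runs at half the frame rate, and the helper names `frame_timestamps`, `nearest_frame` and `step` are hypothetical.

```python
from bisect import bisect_left
from fractions import Fraction
from typing import List


def frame_timestamps(count: int, fps: Fraction) -> List[Fraction]:
    """Exact timestamp, in seconds, of each frame in a constant-rate stream."""
    return [Fraction(i) / fps for i in range(count)]


def nearest_frame(timestamps: List[Fraction], ts: Fraction) -> int:
    """Index of the frame whose timestamp is closest to ts (list must be sorted)."""
    if not timestamps:
        raise ValueError("no frames to search")
    pos = bisect_left(timestamps, ts)
    if pos == 0:
        return 0
    if pos == len(timestamps):
        return len(timestamps) - 1
    before, after = timestamps[pos - 1], timestamps[pos]
    # On a tie, keep the earlier frame.
    return pos if after - ts < ts - before else pos - 1


def step(timestamps: List[Fraction], index: int, delta: int) -> int:
    """Move delta frames forwards or backwards, clamped to the clip."""
    return max(0, min(len(timestamps) - 1, index + delta))


# Assumed rates for illustration: 29.97 fps full resolution, half-rate proxy.
full_res = frame_timestamps(300, Fraction(30000, 1001))
proxy = frame_timestamps(150, Fraction(15000, 1001))

# The editor parks on proxy frame 10; look up the matching full-res frame.
full_index = nearest_frame(full_res, proxy[10])
# Nudge the in point one full-resolution frame earlier.
trimmed_index = step(full_res, full_index, -1)
```

Using `Fraction` keeps rates such as 30000/1001 exact, so the lookup does not drift over long clips. Because the link between the two streams is the timestamp rather than a frame count, the proxy is free to use a different resolution or frame rate.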
In a sense, this TVU MediaMind AI engine-based clipping method brings things full circle, relying on timecode, random access and non-linear editing in a whole new way. However, enhanced by AI-driven speech-to-text transcription, this new way to find clips is far easier and much more natural than anything that’s come before.
For more information on this topic, or to contact the blog author, contact us.