If you ever feel like you need a primer on artificial intelligence, you might like to watch a speech that Demis Hassabis gave at the British Museum in November 2015. Hassabis, the founder and CEO of DeepMind (which was acquired by Google in 2014), talks about the differences between narrow and general AI; creativity, memory and the human brain; and how, in his opinion, content recommendation systems based on collaborative filtering are “primitive” and don’t really work.
Look out for the spectacular Space Invaders demonstration at around the 21-minute mark (after a few hundred attempts, the AI player effectively becomes immortal!).
And later, speaking about memory and learning, Hassabis says,
“If you remember this lecture tomorrow, it won’t be stored like a video tape in your mind. Actually, you’ll be combining it with other components of things that you’ve experienced before e.g. other lectures, other specific pieces of content. The hippocampus pulls all of these things together as a coherent whole and represents it as an episodic memory.”
It’s fascinating stuff, and yesterday I did remember parts of his lecture because I’d been on calls with two industry vendors, Accedo and Make.TV, during which we discussed machine learning and their use of Amazon’s Rekognition solution.
Rekognition, if you don’t know it, identifies objects, people, celebrities, text, scenes and activities in videos and images, and outputs metadata files. It also provides facial recognition and sentiment analysis. And depending on your viewpoint, it’s potentially terrifying (from a Minority Report, dystopian surveillance perspective), incredibly useful, or both. I’m gonna go with the latter interpretation for now (!) and talk about three of the VOD and TV use cases for the technology.
#1. Accedo & Increasing User Engagement by Replacing Artwork
Last August, Accedo, the video experience company, together with their client, ITV (the UK’s biggest commercial broadcaster), worked on a project which leveraged the emotional triggers of viewers to deliver personalised cover art for VOD titles – and in turn increase user engagement and video consumption. The prototype was then tested in focus groups with the aim of delivering a set of rules to preselect relevant thumbnails, such as key characters in the episode, key elements in the episode’s description, or key scenes.
ITV’s Head of Product, Lee Marshall, commented at the time:
“Improving user engagement is extremely important to us, helping to ensure that we are delivering relevant and interesting content to all of our users. This project opens up huge potential to offer our viewers a much more personalized, and therefore engaging, experience.”
#2. Make.TV and Analysing Live Video Feeds
Make.TV is a really interesting company because it allows its clients to essentially build a virtual newsroom in the cloud. Founded by Andreas Jacobi in 2014, and with its head office in Seattle, Make.TV can acquire unlimited concurrent live feeds from professional cameras, encoders, mobiles, drones, and online sources such as social media; route them through a curatable, web-based multi-view; and finally distribute live signals simultaneously to unlimited online and broadcast destinations.
Jacobi told me that his team had used Rekognition in some of their Tour de France coverage. During the competition, Make.TV was ingesting hundreds of live feeds and, with Rekognition running on-the-fly analysis and outputting relevant metadata in the background, they could recognise those that focused only on, for example, Team Italy (because of the colour of their jerseys) or Team Sky. Pretty cool if you want to provide rich, personalised coverage.
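To make the idea of picking feeds by their metadata a bit more concrete, here’s a minimal Python sketch. I don’t know how Make.TV actually wires this up, so everything in it is an assumption: the feed IDs, the tag names, the confidence scores and the `feeds_matching` helper are all invented for illustration, and the data shape is not Rekognition’s real output format.

```python
def feeds_matching(feed_tags, wanted_tag, min_confidence=80.0):
    """Return the IDs of feeds carrying `wanted_tag` with enough confidence.

    `feed_tags` maps a feed ID to a list of (tag, confidence) pairs.
    This shape is hypothetical, chosen only to keep the example small.
    """
    wanted = wanted_tag.lower()
    return sorted(
        feed_id
        for feed_id, tags in feed_tags.items()
        if any(
            name.lower() == wanted and confidence >= min_confidence
            for name, confidence in tags
        )
    )


# Invented sample metadata for three live feeds.
feed_tags = {
    "feed-01": [("blue jersey", 93.5), ("bicycle", 99.0)],
    "feed-02": [("black jersey", 88.2), ("bicycle", 97.4)],
    "feed-03": [("blue jersey", 61.0), ("crowd", 90.3)],
}

print(feeds_matching(feed_tags, "blue jersey"))
```

With this made-up data only `feed-01` is selected: `feed-03` also carries the tag, but its confidence falls below the 80% threshold.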
#3. Searchable Video Libraries
Back in 2009, I was working on a project at ITN Source (now part of Getty Images) for which the goal was to deliver 200 hours of historical footage to our client, the UK’s Department of Education. Some of that content was already digitised, some had to be encoded or transcoded but either way, each of the many videos that made up the consignment had to be watched by a team of dedicated human “shotlisters”. The shotlisters literally listed each shot (along with a time stamp) and described the action in any given film clip, scene by scene. The aim, of course, was to create individual metadata files which could then be searched. I wonder how much easier their jobs could have been if they’d had access to something like Rekognition?
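For anyone who hasn’t seen a shotlist, this is roughly what “searchable metadata” means in practice. It’s only a sketch in Python: the `Shot` record, the `search_shots` helper and the sample entries are all made up for illustration, and they aren’t taken from ITN Source’s systems or from Rekognition’s output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shot:
    clip_id: str
    start_seconds: int   # time stamp of the shot within the clip
    description: str     # what the shotlister (or a machine) saw


def search_shots(shots, query):
    """Return shots whose description contains every word in `query`."""
    terms = query.lower().split()
    return [
        shot for shot in shots
        if all(term in shot.description.lower() for term in terms)
    ]


# Invented sample entries, one per shot.
shotlist = [
    Shot("clip-001", 0, "Wide shot of crowd outside parliament"),
    Shot("clip-001", 14, "Close-up of speaker at podium"),
    Shot("clip-002", 3, "Children in classroom, teacher at blackboard"),
]

for shot in search_shots(shotlist, "teacher classroom"):
    print(shot.clip_id, shot.start_seconds, shot.description)
```

The point is simply that once every shot has a time stamp and a description, finding the right few seconds of footage becomes a text search rather than a viewing session.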
Could this work for big archives? The Rekognition charging structure isn’t particularly cheap: the first 1,000 minutes of video analysis per month are free, after which it’s $0.10 per minute of archived video and $0.12 per minute of live stream video. That said, the major advantage here is scale: an automated service can work through far more footage than a team of human shotlisters.
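To put those prices in context, here’s a rough back-of-the-envelope sketch in Python for a 200-hour archive like the one I described above. It assumes the rates quoted in this post (which may well have changed), and that the whole archive is processed in a single month so the free allowance applies only once. The function and constant names are my own, not anything from AWS.

```python
FREE_MINUTES_PER_MONTH = 1_000   # free allowance quoted in this post
ARCHIVED_RATE_CENTS = 10         # $0.10 per minute of archived video
LIVE_RATE_CENTS = 12             # $0.12 per minute of live stream video


def monthly_cost_usd(minutes, rate_cents, free_minutes=FREE_MINUTES_PER_MONTH):
    """Estimate one month's bill in US dollars.

    Works in whole cents to avoid floating-point rounding surprises.
    """
    billable_minutes = max(0, minutes - free_minutes)
    return billable_minutes * rate_cents / 100


archive_minutes = 200 * 60  # 200 hours of footage
print(monthly_cost_usd(archive_minutes, ARCHIVED_RATE_CENTS))
```

On those assumptions, 200 hours (12,000 minutes) of archived footage works out at 11,000 billable minutes, or about $1,100.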
Rekognition is something I’m planning to keep my eye on this year so please do drop me a line if you’re working on, or already using, a related solution.
On a related note, I’ll be posting a piece later this week on how UKTV has used machine learning in its new suite of VOD apps.
Finally, if you want to read more about AI, check out Tim Urban’s superb (and very detailed, kinda scary) guide to the AI Revolution or this list of books, articles and blogs from James Vincent at The Verge.
Photo by Noah Buscher on Unsplash