The bicycle’s top left. I’m not an AI… you are. Stop hitting yourself
Written in Python and powered by the Caffe2 deep learning framework, the codebase – which implements object-sniffing algos such as Mask R-CNN and RetinaNet – is available under the Apache 2.0 licence.
CAPTCHA jokes aside – it’s about more than just identifying what’s in an image. Normally, when we look at a still pic or video, we see multiple overlapping things and work out their relative distances, their relationship to each other, how foreground and background objects are affected by the focal length of the camera (or the human eye), and where one object ends and the other begins. If your training data includes moving pics, you might also need to look at relative size, positioning, or partial obstruction from frame to frame. These are some of the problems that ML researchers in the computer vision game – and specifically in object detection – are attempting to solve.
Fbook’s framework includes implementations of algos that can not only detect and classify objects but can also simultaneously generate segmentation masks. Identifying selected regions of images as members of a group helps train a model to identify other objects of said group. Judging by Facebook’s illustration in the blog (depicting a Mask R-CNN output), some researchers are achieving a decent hit rate.
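For the curious, here is a rough idea of what "detect, classify and mask" hands back per object, sketched in plain Python. To be clear, this is not Detectron's API or output format: the `Detection` record, `box_to_mask` and `mask_iou` are hypothetical names made up for illustration, and the "masks" are just filled rectangles standing in for the per-pixel shapes a trained Mask R-CNN would predict.

```python
from dataclasses import dataclass
from typing import List, Tuple


# Hypothetical record type for illustration; NOT Detectron's real output format.
@dataclass
class Detection:
    label: str                      # predicted class, e.g. "bicycle"
    score: float                    # classifier confidence in [0, 1]
    box: Tuple[int, int, int, int]  # (x0, y0, x1, y1), with x1/y1 exclusive
    mask: List[List[int]]           # 0/1 grid the size of the whole image


def box_to_mask(box, width, height):
    """Fill a rectangle: a crude stand-in for a learned segmentation mask."""
    x0, y0, x1, y1 = box
    return [[1 if x0 <= x < x1 and y0 <= y < y1 else 0 for x in range(width)]
            for y in range(height)]


def mask_iou(a, b):
    """Intersection over union of two same-sized binary masks."""
    inter = sum(pa & pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    union = sum(pa | pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return inter / union if union else 0.0


# Toy 8x6 "image" with two made-up, partially overlapping objects.
W, H = 8, 6
bike = Detection("bicycle", 0.97, (0, 0, 4, 3), box_to_mask((0, 0, 4, 3), W, H))
rider = Detection("person", 0.91, (2, 1, 6, 5), box_to_mask((2, 1, 6, 5), W, H))

# How much do the two objects overlap, pixel for pixel?
overlap = mask_iou(bike.mask, rider.mask)
print(f"{bike.label} vs {rider.label}: mask IoU = {overlap:.3f}")
```

Intersection over union is the usual yardstick for scoring a predicted mask against a hand-labelled one, which is why it turns up here. A real model's masks follow the object's outline rather than its bounding box, so they say far more about where one thing ends and the next begins.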
Detectron is the basis for many research projects at Facebook AI Research – aka FAIR. The Zuckerborgs said researchers will be able to train models and then deploy them in the cloud and on smartmobes and other mobile devices.
We can’t help but picture a future where Zuckerbots and Googleborgs attempt to trick each other’s computer vision while we human users arm ourselves with blurry spatulas/chairs/lasers in hopes of fighting off the Machines. ®