RoboCop-ter: Boffins build drone to pinpoint brutal thugs in crowds

‘Violent behavior’ identified and highlighted by surveillance system destined for a police force near you

The artificially intelligent technology uses a video camera on a hovering quadcopter to study the body movements of everyone in view. It then raises an alert when it identifies aggressive actions, such as punching, stabbing, shooting, kicking, and strangling, with an accuracy of about 85 per cent. It doesn’t perform any facial recognition – it merely detects possible violence between folks.

And its designers believe the system could be expanded to automatically spot people crossing borders illegally, detect kidnappings in public areas, and set off alarms when vandalism is observed.

The inventors are based at the University of Cambridge in England, India’s National Institute of Technology, and the Indian Institute of Science. They hope their autonomous spy drones will help cops crush crime and soldiers expose enemies hiding in groups of innocents.

“Law enforcement agencies have been motivated to use aerial surveillance systems to surveil large areas,” the team of researchers noted in their paper detailing the technology, which was made public this week.

“Governments have recently deployed drones in war zones to monitor hostiles, to spy on foreign drug cartels, conducting border control operations, as well as finding criminal activity in urban and rural areas.

“One or more soldiers pilot most of these drones for long durations which makes these systems prone to mistakes due to the human fatigue.”

The model works in two steps. First, the feature pyramid network, a convolutional neural network, detects individual humans in images from the drone’s camera. Second, it uses a scatternet, also a convolutional neural network, tacked onto a regression network to analyze and ascertain the pose of each human in the image.

It breaks down the outline of the body into 14 key points to work out the position of the person’s arms, legs, and face in order to identify the different classes of violent behavior specified in the training process.
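As a rough illustration of the two-step flow described above, here is a minimal Python sketch that chains a person detector to a pose estimator and turns the 14 key points into one simple feature. Everything in it is an assumption made for illustration: the function names (`analyse_frame`, `elbow_angle`), the stub detector and pose estimator, the keypoint ordering, and the choice of an elbow angle as a feature are hypothetical and are not taken from the researchers' code or paper.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Box = Tuple[int, int, int, int]  # x, y, width, height

NUM_KEYPOINTS = 14  # the article says each body is reduced to 14 key points

# Hypothetical keypoint indices; the real ordering is not given in the article.
R_SHOULDER, R_ELBOW, R_WRIST = 2, 3, 4


@dataclass
class Person:
    box: Box
    keypoints: List[Point]


def joint_angle(a: Point, b: Point, c: Point) -> float:
    """Angle in degrees at point b, between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        return 0.0
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def analyse_frame(
    frame: object,
    detect_people: Callable[[object], List[Box]],
    estimate_pose: Callable[[object, Box], List[Point]],
) -> List[Person]:
    """Step 1: find each person. Step 2: estimate that person's key points."""
    people = []
    for box in detect_people(frame):
        keypoints = estimate_pose(frame, box)
        if len(keypoints) != NUM_KEYPOINTS:
            raise ValueError("expected 14 key points per person")
        people.append(Person(box, keypoints))
    return people


def elbow_angle(person: Person) -> float:
    """One illustrative pose feature a downstream classifier might consume."""
    k = person.keypoints
    return joint_angle(k[R_SHOULDER], k[R_ELBOW], k[R_WRIST])


# Stand-ins for the two neural networks, so the sketch runs on its own.
def fake_detector(frame: object) -> List[Box]:
    return [(0, 0, 10, 20)]


def fake_pose(frame: object, box: Box) -> List[Point]:
    keypoints = [(0.0, 0.0)] * NUM_KEYPOINTS
    keypoints[R_SHOULDER] = (0.0, 0.0)
    keypoints[R_ELBOW] = (1.0, 0.0)
    keypoints[R_WRIST] = (1.0, 1.0)  # forearm at a right angle to the upper arm
    return keypoints


people = analyse_frame(None, fake_detector, fake_pose)
print(len(people), "person detected; elbow angle:", round(elbow_angle(people[0]), 1))
```

In a real system the two stubs would be replaced by trained networks, and a classifier would map many such features, or the raw key points, to the violence classes defined during training.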

Here’s a video of how it works.

[YouTube video]

The system was trained on the Aerial Violent Individual dataset compiled by the researchers. Twenty-five people were called in to act out attempts at punching, stabbing, shooting, kicking, and strangling to create 2,000 annotated images. Each image featured two to ten people, so this system isn’t, right now, equipped to handle very large crowds.

The accuracy level is highest when the system has to deal with fewer people. With one person in the image, the accuracy was 94.1 per cent, but it drops to 84 per cent for five people, and goes down to 79.8 per cent for ten people. “The fall in accuracy was mainly because some humans were not detected by the system,” Amarjot Singh, a coauthor of the paper, said.

It’s difficult to judge how accurate the drone system is, considering it hasn’t yet been tested on normal people in real settings – just volunteers hired by the research team. In other words, it was trained on people pretending to attack each other, and was tested by showing it people, well, pretending to attack each other. On the other hand, it is a research project, rather than a commercial offering (yet).

The images fed into the system were also recorded when the drone was two, four, six, and eight metres away. So, that gives you an idea of how close it had to be. And considering how loud the quadcopters are, it’s probably not very discreet. In real crowds and brawls, the gizmos would be a few hundred feet away, reducing visibility somewhat.

The live video analysis was carried out on Amazon’s cloud service, in real time, using two Nvidia Tesla GPUs, while the drone’s built-in hardware directed its flight movements. The tech was trained using a single Tesla GPU on a local machine.

“The system detected the violent individuals at five frames per second to 16 frames per second for a maximum of ten and a minimum of two people, respectively, in the aerial image frame,” the paper stated.
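To put those frame rates in perspective, the short Python sketch below converts frames per second into the time available to process each frame. The helper name `frame_budget_ms` is made up for this illustration; only the two frame rates and crowd sizes come from the paper's quoted figures.

```python
def frame_budget_ms(fps: float) -> float:
    """Milliseconds available per frame at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Frame rates quoted in the paper for the largest and smallest groups tested.
for people_in_frame, fps in [(10, 5.0), (2, 16.0)]:
    budget = frame_budget_ms(fps)
    print(f"{people_in_frame} people: {fps:.0f} fps, {budget:.1f} ms per frame")
```

That works out to 200 ms per frame with ten people in view and 62.5 ms with two, so the per-frame processing time roughly triples as the group grows.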

Ethical concerns

Performing inference in the cloud is a potential security and privacy hazard, seeing as you’re streaming footage of people into a third-party computer system. To mitigate any legal headaches, the trained neural network processes each frame received from the drone in the cloud and, apparently, deletes the image once it has been processed.

“This adds [a] data safety layer as we retain the data in the cloud only for the time it is required,” Singh, a PhD student at the University of Cambridge, told The Register.

The use of AI for surveillance is concerning. Similar technologies involving actual facial recognition, such as Amazon’s Rekognition service, have been employed by the police. These systems often suffer from high false-positive rates, and aren’t very accurate at all, so it’ll be a while before something like this can be combined with drones.

In this case, an overly sensitive system could produce false positives for people playing football together, and think that they were kicking one another. At the moment, the technology identifies what the team labels violence – but this definition could be expanded to praying, kissing, holding hands, whatever a harsh or authoritarian government wants to detect and stamp out.

Singh said he is a little worried the final form of the technology could be widened by its users to track more than thugs causing trouble.

“The system [could potentially] be used to identify and track individuals who the government thinks is violent but in reality might not be,” Singh said. “The designer of the [final] system decides what is ‘violent’ which is one concern I can think of.”

Interestingly, Google and Facebook have published studies showing how neural networks can be used to track poses, and experts have previously raised concerns about how such pose tracking could be used for digital surveillance or military purposes. Now, the drone paper is proof that it’s possible.

The researchers used the Parrot AR Drone, a fairly cheap gizmo, to carry out their experiments. Jack Clark, strategy and communications director at OpenAI, previously told The Register that he believed commercially available drones could in the future be reprogrammed using “off-the-shelf software, such as open-source pre-trained models,” to create “machines of terror.”

It’s also pretty cheap to run: the experiment cost about $0.10 per hour on Amazon’s platform once the system had been trained.

Oversight needed

Singh admitted that “one could potentially use this system for malicious applications, but training this system for those applications will require large amounts of data which requires numerous resources. I am hoping that some oversight would come with those resources which can prevent the misuse of this technology.”

But he believed the concerns of hobbyists reprogramming drones for nefarious reasons were unwarranted. “Buying the drone is indeed easy but designing an algorithm which can identify the violent individuals requires certain expertise related to designing deep systems which are not easy to acquire. I don’t think that these systems are easy to implement,” he said.

The researchers are planning to test their system in real settings at two music festivals, and to use it to monitor national borders in India. If it performs well, they said they hoped to commercialize it.

‘Pernicious’

“Artificial intelligence is a potent technology which has been used to design several systems such as Google Maps, face detection in cameras, etc., which have significantly improve[d] our lives,” Singh said.

“One such application of AI is in surveillance systems! AI can help develop powerful surveillance systems which can assist in identifying pernicious individuals which will make the society a safer place. Therefore I think it is a good thing and is also necessary.

“That being said, I also think that AI is extremely powerful and should be regulated for specific applications like defense, similar to nuclear technology.”

The research will be presented at a workshop during the International Conference on Computer Vision and Pattern Recognition (CVPR) 2018 in Salt Lake City, Utah, USA, in June. ®