Artificial intelligence is revolutionizing industries by automating manual tasks and improving pattern matching at scale. In healthcare, AI has been able to detect early-stage lung cancer 94% of the time, outperforming veteran radiologists. AI-powered predictive maintenance is increasing the life of heavy machinery by 20 to 40%. And in cybersecurity, 83% of organizations report that they wouldn’t be able to deal with the sheer volume of cyberattacks without the aid of AI.
Now, AI is having a meaningful impact on physical security. Forward-thinking security and operations teams are using AI to build on the foundation created by video surveillance and access control systems. The added intelligence means that security officers no longer need to stare at camera walls or spend hours dispositioning nuisance alarms. Instead, they can focus on verified security alerts, responding to incidents 10X faster and intervening before situations escalate.
Recent breakthroughs have made it possible for AI to meaningfully improve physical security. These breakthroughs were preceded by incremental improvements that helped pave the way but ultimately did not significantly improve physical security or perimeter control. In this blog, we’ll look at three of the most important of those earlier technologies (motion detection, computer vision, and video analytics) and how they helped pave the way for computer vision intelligence, which delivers on the promise of AI for security and operations teams.
Motion detection
While AI started to take shape as a field back in the 1950s, motion detection is an even older concept, originating in the 1940s during World War II. Motion detection started with radar sensors but became more sophisticated over the years, to the point where algorithms compared pixel changes frame by frame in video.
These early systems were slow and lacked accuracy. More advanced techniques and algorithms were introduced, including background subtraction. Background subtraction identifies which parts of a scene stay static over the series of frames in a video. The background can then be subtracted from the original frames to identify dynamic objects in the scene.
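The idea behind background subtraction can be sketched in a few lines of Python. This is a minimal illustration, not a production algorithm: it assumes grayscale frames stored as nested lists, uses a per-pixel median as the background estimate, and the function names and the threshold of 30 are hypothetical choices made for this example.

```python
from statistics import median

def estimate_background(frames):
    """Estimate the static background as the per-pixel median over all frames."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]

def foreground_mask(frame, background, threshold=30):
    """Mark pixels whose brightness differs from the background by more than threshold."""
    return [
        [abs(pixel - bg) > threshold for pixel, bg in zip(row, bg_row)]
        for row, bg_row in zip(frame, background)
    ]

# Five 3x4 frames of a dark, static scene (brightness 10).
frames = [[[10] * 4 for _ in range(3)] for _ in range(5)]
# A bright object (brightness 200) appears at row 1, column 2 in the last frame only.
frames[4][1][2] = 200

background = estimate_background(frames)
mask = foreground_mask(frames[4], background)
moving_pixels = [(y, x) for y, row in enumerate(mask) for x, hit in enumerate(row) if hit]
print(moving_pixels)
```

Because the object shows up in only one of the five frames, the median ignores it, and subtracting the background leaves just that pixel flagged as dynamic.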
Even with advancements like background subtraction, motion detection algorithms have a major shortcoming when applied to physical security. The algorithms are prone to making incorrect predictions that trigger false alerts, especially during less than optimal visual conditions, like at night or when it’s raining. Shadows, a change in lighting, or even moving leaves easily disrupt these algorithms.
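To see why a lighting change is enough to fool this kind of algorithm, consider a rough sketch. The scene values, the function name, and the 20% alert threshold are all assumptions made up for illustration.

```python
def changed_fraction(frame, background, threshold=30):
    """Return the fraction of pixels that differ from the background by more than threshold."""
    diffs = [
        abs(pixel - bg) > threshold
        for row, bg_row in zip(frame, background)
        for pixel, bg in zip(row, bg_row)
    ]
    return sum(diffs) / len(diffs)

# An empty scene at dusk (brightness 50 everywhere).
background = [[50] * 4 for _ in range(3)]
# The same empty scene after the lights switch on: every pixel is 60 brighter.
lights_on = [[pixel + 60 for pixel in row] for row in background]

ALERT_FRACTION = 0.2  # hypothetical rule: alert when over 20% of pixels change
fraction = changed_fraction(lights_on, background)
false_alert = fraction > ALERT_FRACTION
print(fraction, false_alert)
```

Nothing moved in the scene, yet every pixel crosses the threshold and the detector raises an alert.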
Systems that rely on motion detection generate a lot of false alarms. If you have a motion detection system at your home and also own a dog, for instance, you know exactly what I’m talking about. In addition, these systems lack security awareness. The burden is still on the security officer to investigate and figure out what is really happening.
Computer vision
In 2012, AI became even more applicable to physical security with a huge leap forward in computer vision, a branch of AI. Advances in deep learning enabled computer vision to detect objects and even the different attributes within those objects with a high level of accuracy. This represented a big advancement over motion detection.
Inspired by the human visual cortex, a convolutional neural network (CNN) receives input data, like an image, and applies filters to that image whose values it has learned from training data. By sliding the same filter across the whole image, it builds a feature map of the image, and by stacking many such filters in layers it ‘understands’ what the image represents. AlexNet, one of the first deep CNN architectures to gain wide attention, achieved 84.7% top-5 accuracy in the 2012 ImageNet challenge.
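As a rough illustration of how applying one filter across an image produces a feature map, here is a small sketch in plain Python. The filter values are written by hand for this example; in a real CNN they would be learned from training data, and a network would stack many such filters across multiple layers.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and return the feature map.

    As in most deep learning libraries, the kernel is not flipped,
    so this is technically a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[y + i][x + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]

# A 4x5 image: dark (0) on the left, bright (1) on the right.
image = [[0, 0, 1, 1, 1] for _ in range(4)]

# Hand-written filter that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)
```

The feature map is non-zero only where the filter straddles the boundary between the dark and bright columns, which is how a single filter comes to "detect" one kind of visual pattern.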
After strides in classifying entire images with deep neural nets, the next extension was to be able to detect objects localized in regions of an image. RCNN (Region-Based Convolutional Neural Network) models identify regions in an image that contain objects of interest and group together semantically contiguous regions based on scale, color, enclosures, and textures.
RCNN models use this information to extract patterns, objects, and features from an image and are commonly used for object detection in surveillance camera images.
But as huge as these advancements in computer vision were, the technology still wasn’t a great solution for security and operations teams. Although identifying objects is more useful than identifying motion alone, detecting objects doesn’t tell teams much about what is actually happening or how they should respond. Critical context is missing.
For example, someone holding a knife in a kitchen area is not actually a security threat. Someone forcing a door open and then brandishing a knife in a secure office area is a security threat. Time, location, environment, activity, and movement, along with the risk history of the space, are all vital to identifying real security threats. Without that context, these systems generate a lot of false alarms, and the burden is still on security and operations teams to verify alerts and determine what action, if any, needs to be taken.
Video analytics
Video analytics examines the pixels within an image along X and Y coordinates and runs algorithms on them to determine what is in the image. For example, by identifying a head, shoulders, arms, and legs, the analytics ascertains that a person is present.
Video analytics has been adapted for security to the point where it can not only tell you that a person is in the frame but also that the person has moved to another location, by tracking the person or object across multiple cameras. This was a further improvement over computer vision alone and a solution more purpose-built for security and operations teams.
Despite these advances, however, video analytics is still plagued by false alerts. Rarely is a person or object ever in complete view, which makes it difficult for video analytics to accurately determine what it is looking at. If there is a group of 10 people, for example, very few of them will be in full view and the accuracy of the video analytics suffers.
Movement also causes issues, even with static cameras. A slight shift in the camera position moves the pixels on the X and Y axes and requires recalibration, either manually or across the system. These and other issues, including poor performance in low light, lead to false positives. The solution up until this point has been to continually upgrade cameras: the more megapixels available, the better the accuracy of the video analytics. As a result, teams that leverage video analytics find themselves in a continual cycle of upgrading cameras without ever solving the false alarm problem.
Finally, because video analytics is relatively slow and prone to false positives, it is most often applied to forensics, helping to determine what happened after a security incident has occurred. While forensics is an important tool for security and operations teams, it does not meaningfully improve perimeter safety and security.
Computer vision intelligence
Computer vision intelligence adds near-human levels of perception and real-time security awareness. As the name suggests, computer vision intelligence builds on the advancements made in computer vision and AI and adds the ability to comprehend contextual aspects of an image or video and apply security awareness to that information.
For instance, where video analytics struggles to identify partially obscured persons and objects, computer vision intelligence can infer what is in the scene without needing 4K cameras. Because computer vision intelligence isn’t reliant on counting pixels, it also doesn’t suffer from lower accuracy under poor conditions or with lower-megapixel cameras. It is just as effective with lower-resolution cameras as it is with higher-resolution ones.
On top of that, rather than simply identifying motion or objects, computer vision intelligence identifies behaviors based on numerous aspects of the scene and can accurately distinguish threat behaviors from normal behavior. For example, in most cases, people walking around with laptops in an office setting is normal behavior. If someone tailgates to enter a secure area and then leaves with a laptop, that is likely a security threat.
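The laptop example shows why the same object can be harmless or threatening depending on context. The toy sketch below makes that concrete with a hand-written rule over a sequence of events. It is purely illustrative and is not how computer vision intelligence is implemented: the `Event` fields, action names, and zone names are all hypothetical, and a real system would infer behaviors from video rather than from pre-labeled events.

```python
from dataclasses import dataclass

@dataclass
class Event:
    person_id: str
    action: str         # hypothetical labels: "badge_entry", "tailgate_entry", "exit"
    zone: str           # hypothetical labels: "open_office", "secure_lab"
    carrying: str = ""  # object the person is carrying, if any

def is_threat(events):
    """Flag a person who tailgated into a secure zone and later left it with a laptop."""
    tailgated = set()
    for event in events:
        key = (event.person_id, event.zone)
        if event.action == "tailgate_entry" and event.zone.startswith("secure"):
            tailgated.add(key)
        elif event.action == "exit" and event.carrying == "laptop" and key in tailgated:
            return True
    return False

# Normal behavior: an employee badges in and leaves the open office with a laptop.
normal = [
    Event("a", "badge_entry", "open_office", "laptop"),
    Event("a", "exit", "open_office", "laptop"),
]
# Suspicious behavior: someone tailgates into a secure area, then leaves with a laptop.
suspicious = [
    Event("b", "tailgate_entry", "secure_lab"),
    Event("b", "exit", "secure_lab", "laptop"),
]

print(is_threat(normal), is_threat(suspicious))
```

An object detector alone would report "person with laptop" in both cases. Only the surrounding context (how the person entered and where) separates the two.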
Computer vision intelligence offers real-time situational awareness that wasn’t possible with motion detection, computer vision, or video analytics. This new era of AI provides security and operations teams with accurate 24/7 threat detection, automated verification of access control alerts, and the information necessary for making smarter operational decisions.
The new era of computer vision intelligence
There is a misconception that computer vision and AI are primarily used for managing false positives and other uncontextualized alarms or monitoring sensors. Perhaps that’s because until now, these technologies have largely been sources of false alarms and therefore the next logical step is to better manage these alarms.
This misconception undermines the potential of the breakthrough in computer vision and AI. What we have always wanted from AI was a way to fundamentally improve physical security. The only reason we settled for inaccurate AI technology is that the technology simply had not advanced enough. But the technology has advanced by leaps and bounds, and computer vision intelligence is delivering on the original promise of AI in physical security.
With computer vision intelligence, companies including VMware, Impossible Foods and NorCal Cannabis are empowering their security and operations teams to take a proactive approach and respond to security incidents before they escalate. For a deeper dive into the evolution of AI technology in physical security, download the whitepaper Entering the Era of Computer Vision Intelligence.