Meta Updates SEER System, A Key Feature Of Its AR Push
According to Meta, its revised SEER system is now the largest and most advanced computer vision model available, with its latest improvements focused on automated object detection within images.
SEER (short for 'SElf-supERvised') can learn from any random group of images on the internet, with no manual curation or labeling, which accelerates its ability to identify a wide range of objects within a frame. It now outperforms industry-standard computer vision systems in accuracy, and it's only going to get better. The original version of SEER, which Meta first disclosed last year, was a one-billion-parameter model; this new edition increases that scale tenfold.
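The core idea behind learning without labels is that two augmented views of the same image should map to similar embeddings, while views of different images should not. Meta's own training objective for SEER is more elaborate (a SwAV-style online clustering approach), but as an illustrative stand-in, here is a minimal NumPy sketch of a contrastive (InfoNCE-style) loss; all the array shapes and the temperature value are invented for the example:

```python
import numpy as np

def info_nce_loss(z_a, z_b, temperature=0.1):
    """Minimal InfoNCE: each row of z_a should match the same row of z_b."""
    # L2-normalize embeddings so the dot product becomes cosine similarity
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature  # pairwise similarity matrix
    # Row i's "positive" is column i; every other column acts as a negative
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
# Pretend embeddings of two augmented "views" of 4 images (8-dim each)
base = rng.normal(size=(4, 8))
view_a = base + 0.01 * rng.normal(size=(4, 8))  # small augmentation noise
view_b = base + 0.01 * rng.normal(size=(4, 8))

loss_matched = info_nce_loss(view_a, view_b)
loss_random = info_nce_loss(view_a, rng.normal(size=(4, 8)))
print(loss_matched < loss_random)  # matched views yield the lower loss
```

Minimizing a loss like this over billions of uncurated images is what lets a model learn useful visual features with no human labeling at all.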
According to Meta:
"When we first announced SEER last spring, it outperformed state-of-the-art systems, demonstrating that self-supervised learning can excel at computer vision tasks in real-world settings. We've now scaled SEER from 1 billion to 10 billion dense parameters, making it to our knowledge the largest dense computer vision model of its kind."
"Traditional computer vision systems are trained primarily on examples from the U.S. and wealthy countries in Europe, so they often don't work well for images from other places with different socioeconomic characteristics. But SEER delivers strong results for images from all around the globe – including non-U.S. and non-Europe regions with a wide range of income levels."
The system's ability to recognize images of people and objects from different cultures, and to attribute meaning to items from regions around the world, is noteworthy. It broadens the system's awareness of what objects look like in varied settings, improving accuracy and enabling better automated descriptions of what's in a frame. Combined with product matching, signage recognition, branding alerts, and other signals, this can provide far more context for visually impaired users.
Meta also notes that the system is integral to its broader metaverse transition:
"Advancing computer vision is an important part of building the Metaverse. For example, to build AR glasses that can guide you to your misplaced keys or show you how to make a favorite recipe, we will need machines that understand the visual world as people do. They will need to work well in kitchens in Kansas and Kyoto and Kuala Lumpur, Kinshasa, and various other places worldwide. It means recognizing different variations of everyday objects like house keys, stoves, or spices. SEER breaks new ground in achieving this robust performance."
Meta has been improving object recognition for years, making significant progress in automatic captions, screen-reader descriptions, and other areas. It's also working on the next stage: detecting objects within video. While that isn't practical at scale just yet, it could eventually unlock a slew of new data insights, helping advertisers understand what each user posts about and how to reach them with relevant promotions.
Even the current image-level capability is useful. For example, if you knew that a specific subset of Instagram users was more likely to share pictures of their lunch, based on previous posting habits, you could target your ads accordingly. Extrapolate that to any subject, with a high degree of data-matching precision, and you have a powerful way to sharpen your ad strategy.
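That kind of segmentation is conceptually simple once a classifier has labeled each post. As a hypothetical sketch (the label names, users, and threshold are all invented for illustration), a platform could bucket users by how often their posts are tagged with a given label:

```python
from collections import Counter

# Hypothetical classifier output: predicted label for each user's recent posts
user_posts = {
    "user_a": ["food", "food", "pet", "food", "selfie"],
    "user_b": ["travel", "landscape", "travel"],
    "user_c": ["food", "selfie", "food"],
}

def label_segment(posts_by_user, label, threshold=0.5):
    """Return users for whom `label` exceeds `threshold` of recent posts."""
    segment = []
    for user, labels in posts_by_user.items():
        counts = Counter(labels)
        if counts[label] / len(labels) > threshold:
            segment.append(user)
    return segment

print(label_segment(user_posts, "food"))  # ['user_a', 'user_c']
```

The real pipelines would operate at vastly larger scale, but the principle is the same: visual content becomes structured behavioral data that ad targeting can key on.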
And that's before contemplating more sophisticated uses, such as augmented reality overlays, or refining video algorithms to show people more of the content they're likely to engage with based on what's actually in each frame, as Meta points out.
The next stage is approaching, and technologies like these will support significant changes in the online world.
You can find more information about Meta's SEER system here.