
Meta's Multilingual AI Models: Open-Source and Bible-Powered.
Meta AI Research has open-sourced DINOv2, a pretrained foundation model for computer vision tasks, including image classification, video action recognition, semantic segmentation, and depth estimation. DINOv2 is based on the Vision Transformer architecture and is trained on a curated dataset of 142M images. It outperforms other self-supervised learning models and matches or exceeds the performance of weakly supervised models. The model is available on GitHub, and the project site hosts an interactive demo of several computer vision tasks using DINOv2.
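To make the Vision Transformer reference concrete: a ViT does not process an image pixel by pixel, but first cuts it into fixed-size, non-overlapping patches and treats each patch as one input token. The sketch below illustrates only that patch-splitting step in plain Python. It is not DINOv2's code; the `patchify` helper, the toy 4x4 image, and the patch size of 2 are all hypothetical choices made for illustration.

```python
def patchify(image, patch_size):
    """Split an H x W image (a list of rows) into non-overlapping square patches.

    Each patch is flattened in row-major order, and patches are returned
    left-to-right, top-to-bottom. This is a hypothetical helper for
    illustration, not part of any DINOv2 release.
    """
    height = len(image)
    width = len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Collect the pixels of one patch into a flat list (one "token").
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# Toy 4x4 single-channel "image" whose pixel values are 0..15.
image = [[row * 4 + col for col in range(4)] for row in range(4)]

# Assumed patch size of 2 gives a 2x2 grid of patches, i.e. 4 tokens.
tokens = patchify(image, 2)
print(len(tokens))  # number of patches
print(tokens[0])    # pixels of the top-left patch
```

In a real Vision Transformer each flattened patch would then be projected to an embedding vector and passed through transformer layers; that part is omitted here.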