The recent enthusiasm for open-world vision systems shows the community's strong interest in performing perception tasks outside the closed-vocabulary benchmark setups that have been so popular until now. Being able to discover objects in images or videos without knowing in advance which objects populate the dataset is an exciting prospect. But how can objects be found without knowing anything about them? Recent works show that it is possible to perform class-agnostic unsupervised object localization by exploiting self-supervised pre-trained features. We propose here a survey of unsupervised object localization methods that discover objects in images without requiring any manual annotation, in the era of self-supervised ViTs. We gather links to the discussed methods in the repository https://github.com/valeoai/Awesome-Unsupervised-Object-Localization.

Unsupervised Object Localization in the Era of Self-Supervised ViTs: A Survey