Amazon’s papers at this year’s Computer Vision and Pattern Recognition Conference (CVPR), sorted by research topic.
3-D perception
Implicit surface contrastive clustering for LiDAR point clouds
Zaiwei Zhang, Min Bai, Erran Li
Anomaly classification
WinCLIP: Zero-/few-shot anomaly classification and segmentation
Jongheon Jeong, Yang Zou, Taewan Kim, Dongqing Zhang, Avinash Ravichandran, Onkar Dabeer
Data annotation
CVPR highlight*:
HandsOff: Labeled dataset generation with no additional human annotations
Austin Xu, Mariya Vasileva, Achal Dave, Arjun Seshadri
Knowledge combination to learn rotated detection without rotated annotation
Tianyu Zhu, Bryce Ferenczi, Pulak Purkait, Tom Drummond, Hamid Rezatofighi, Anton van den Hengel
Image generation
FlexNeRF: Photorealistic free-viewpoint rendering of moving humans from sparse views
Vinoj Jayasundara, Amit Agrawal, Nicolas Heron, Abhinav Shrivastava, Larry Davis
LEMaRT: Label-efficient masked region transform for image harmonization
Sheng Liu, Cong Phuoc Huynh, Cong Chen, Maxim Arap, Raffay Hamid
Image segmentation
Network-free, unsupervised semantic segmentation with synthetic images
Qianli Feng, Raghudeep Gadde, Wentong Liao, Eduard Ramon Maldonado, Aleix Martinez
PolyFormer: Referring image segmentation as sequential polygon generation
Jiang Liu, Hui Ding, Zhaowei Cai, Yuting Zhang, Ravi Kumar Satzoda, Vijay Mahadevan, R. Manmatha
Spatio-temporal pixel-level contrastive learning-based source-free domain adaptation for video semantic segmentation
Shao-Yuan Lo, Poojankumar Oza, Sumanth Chennupati, Alejandro Galindo, Vishal M. Patel
Machine learning
A meta-learning approach to predicting performance and data requirements
Achin Jain, Gurumurthy Swaminathan, Paolo Favaro, Hao Yang, Avinash Ravichandran, Hrayr Harutyunyan, Alessandro Achille, Onkar Dabeer, Bernt Schiele, Ashwin Swaminathan, Stefano Soatto
Leveraging inter-rater agreement for classification in the presence of noisy labels
Maria Sofia Bucarelli, Lucas Cassano, Federico Siciliano, Amin Mantrach, Fabrizio Silvestri
Train/test-time adaptation with retrieval
Luca Zancato, Alessandro Achille, Tian Yu Liu, Matthew Trager, Pramuditha Perera, Stefano Soatto
Multimodal models
Dynamic inference with grounding-based vision and language models
Burak Uzkent, Amanmeet Garg, Wentao Zhu, Keval Doshi, Jingru Yi, Andy Wang, Mohamed Omar
GIVL: Improving geographical inclusivity of vision-language models with pre-training methods
Da Yin, Feng Gao, Govind Thattai, Michael Johnston, Kai-Wei Chang
Grounding counterfactual explanation of image classifiers to textual concept space
Siwon Kim, Jinoh Oh, Sungjin Lee, Seunghak Yu, Jae Do, Tara Taghavi
Understanding and constructing latent modality structures in multimodal representation learning
Qian Jiang, Changyou Chen, Han Zhao, Liqun Chen, Qing Ping, Son Tran, Yi Xu, Belinda Zeng, Trishul Chilimbi
Object detection
ScaleDet: A scalable multi-dataset object detector
Yanbei Chen, Manchen Wang, Abhay Mittal, Zhenlin Xu, Paolo Favaro, Joe Tighe, Davide Modolo
Product characterization
Learning attribute and class-specific representation duet for fine-grained fashion analysis
Yang (Andrew) Jiao, Yan Gao, Jingjing Meng, Jin Shang, Yi Sun
SKiLL: Skipping color and label landscape: Self-supervised design representations for products in e-commerce
Vinay Kumar Verma, Dween Rabius Sanny, Prateek Sircar, Shreyas Sunil Kulkarni, Deepak Gupta, Abhishek Singh
Video understanding
Movies2Scenes: Using movie metadata to learn scene representation
Shixing Chen, Chun-Hao Liu, Xiang Hao, Xiaohan Nie, Maxim Arap, Raffay Hamid
Selective structured state-spaces for long-form video understanding
Jue Wang, Wentao Zhu, Pichao Wang, Xiang Yu, Linda Liu, Mohamed Omar, Raffay Hamid
* Distinction awarded to the top 10% of papers accepted to the conference