A quick guide to Amazon’s papers at ICCV 2023

Amazon’s papers at this year’s International Conference on Computer Vision, arranged by topic.

3-D

HAL3D: Hierarchical active learning for fine-grained 3D part labeling
Fenggen Yu, Yiming Qian, Francisca Gil-Ureta, Brian Jackson, Eric Bennett, Richard Zhang

ImGeoNet: Image-induced geometry-aware voxel representation for multi-view 3D object detection
Tao Tu, Shun-Po Chuang, Yu-Lun Liu, Cheng Sun, Ke Zhang, Donna Roy, Cheng-Hao Kuo, Min Sun

Recognition

SkeleTR: Towards skeleton-based action recognition in the wild
Haodong Duan, Mingze Xu, Bing Shuai, Davide Modolo, Zhuowen Tu, Joe Tighe, Alessandro Bergamo

Data representation

Linear spaces of meanings: Compositional structures in vision-language models
Matthew Trager, Pramuditha Perera, Luca Zancato, Alessandro Achille, Parminder Bhatia, Stefano Soatto

Motion-guided masking for spatiotemporal representation learning
David Fan, Jue Wang, Leo Liao, Yi Zhu, Vimal Bhat, Hector Santos, Rohith Mysore Vijaya Kumar, Xinyu (Arthur) Li

Dubbed-video generation

SIDGAN: High-resolution dubbed video generation via shift-invariant learning
Urwa Muaz, Wondong Jang, Rohun Tripathi, Santhosh Mani, Wenbin Ouyang, Ravi Teja Gadde, Baris Gecer, Sergio Elizondo, Reza Madad, Naveen Nair

Geospatial foundation models

Towards geospatial foundation models via continual pretraining
Matias Mendieta, Boran Han, Xingjian Shi, Yi Zhu, Chen Chen

Graph neural networks

Learning adaptive neighborhoods for graph neural networks
Avi Saha, Oscar Mendez, Chris Russell, Richard Bowden

Image retrieval

FashionNTM: Multi-turn fashion image retrieval via cascaded memory
Anwesan Pal, Sahil Wadhwa, Ayush Jaiswal, Xu Zhang, Yue Wu, Rakesh Chada, Pradeep Natarajan, Henrik I. Christensen

Image segmentation

Coarse-to-fine amodal segmentation with shape prior
Jianxiong Gao, Xuelin Qian, Yikai Wang, Tianjun Xiao, Tong He, Zheng Zhang, Yanwei Fu

LD-ZNet: A latent diffusion approach for text-based image segmentation
Koutilya PNVR, Bharat Singh, Pallabi Ghosh, Behjat Siddiquie, David Jacobs

Rethinking amodal video segmentation from learning supervised signals with object-centric representation
Ke Fan, Jingshi Lei, Xuelin Qian, Miaopeng Yu, Tianjun Xiao, Tong He, Zheng Zhang, Yanwei Fu

Information extraction

DocTr: Document transformer for structured information extraction in documents
Haofu Liao, Aruni Roychowdhury, Weijian Li, Ankan Bansal, Yuting Zhang, Zhuowen Tu, Ravi Kumar Satzoda, R. Manmatha, Vijay Mahadevan

Machine unlearning

SAFE: Machine unlearning with shard graphs
Yonatan Dukler, Ben Bowman, Alessandro Achille, Aditya Golatkar, Ashwin Swaminathan, Stefano Soatto

Object detection

Bidirectional alignment for domain adaptive detection with transformers
Liqiang He, Wei Wang, Albert Chen, Min Sun, Cheng-Hao Kuo, Sinisa Todorovic

Unsupervised open-vocabulary object localization in videos
Ke Fan, Zechen Bai, Tianjun Xiao, Dominik Zietlow, Max Horn, Zixu Zhao, Carl-Johann Simon-Gabriel, Mike Zheng Shou, Francesco Locatello, Bernt Schiele, Thomas Brox, Zheng Zhang, Yanwei Fu, Tong He

Object tracking

Object-centric multiple object tracking
Zixu Zhao, Jiaze Wang, Max Horn, Yizhuo Ding, Tong He, Zechen Bai, Dominik Zietlow, Carl-Johann Simon-Gabriel, Bing Shuai, Zhuowen Tu, Thomas Brox, Bernt Schiele, Yanwei Fu, Francesco Locatello, Zheng Zhang, Tia

Scene text recognition

CLIPTER: Looking at the bigger picture in scene text recognition
Aviad Aberdam, David Haim Bensaid, Alona Golts, Roy Ganz, Oren Nuriel, Royee Tichauer, Shai Mazor, Ron Litman

Towards models that can see and read
Roy Ganz, Oren Nuriel, Aviad Aberdam, Yair Kittenplon, Shai Mazor, Ron Litman

Transfer learning

PADCLIP: Pseudo-labeling with adaptive debiasing in CLIP for unsupervised domain adaptation
Zhengfeng Lai, Sol Vesdapunt, Ning Zhou, Jun Wu, Cong Phuoc Huynh, Xuelu Li, Kah Kuen Fu, Chen-Nee Chuah

Video retrieval

Audio-enhanced text-to-video retrieval using text-conditioned feature alignment
Sarah Ibrahimi, Xiaohang Sun, Pichao Wang, Amanmeet Garg, Ashutosh Sanan, Mohamed Omar

Video segmentation

MEGA: Multimodal alignment aggregation and distillation for cinematic video segmentation
Najmeh Sadoughi, Xinyu (Arthur) Li, Avijit Vajpayee, David Fan, Bing Shuai, Hector Santos, Vimal Bhat, Rohith Mysore Vijaya Kumar
