A paper accepted to the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024