DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries

¹Wuhan University, ²Skywork AI
DVIS-DAQ uses dynamic anchor queries to handle both newly emerging and disappearing objects better than prior methods, and achieves state-of-the-art (SOTA) performance on five benchmarks: OVIS, YTVIS19, YTVIS21, YTVIS22, and VIPSeg. On OVIS, the most challenging of these, it reaches 57.1 AP.

Abstract

Modern video segmentation methods adopt object queries to perform inter-frame association and demonstrate satisfactory performance in tracking continuously appearing objects despite large-scale motion and transient occlusion. However, they all underperform on newly emerging and disappearing objects, which are common in the real world, because they attempt to model object emergence and disappearance through feature transitions between background and foreground queries that have significant feature gaps. We introduce Dynamic Anchor Queries (DAQ) to shorten the transition gap between the anchor and target queries by dynamically generating anchor queries based on the features of potential candidates. Furthermore, we introduce a query-level object Emergence and Disappearance Simulation (EDS) strategy, which unleashes the potential of DAQ without any additional cost. Finally, we combine our proposed DAQ and EDS with DVIS (Zhang et al., 2023) to obtain DVIS-DAQ. Extensive experiments demonstrate that DVIS-DAQ achieves new SOTA performance on five mainstream video segmentation benchmarks.
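
As a rough illustration of what a query-level emergence and disappearance simulation could look like, the sketch below randomly hides an object's query from either the previous-frame set or the current-frame candidate set. This is an assumed reading of EDS, not the authors' implementation: the function name `simulate_emergence_disappearance`, the dictionary layout, and the probabilities `p_emerge` and `p_disappear` are all hypothetical.

```python
import random


def simulate_emergence_disappearance(prev_queries, cur_candidates,
                                     p_emerge=0.2, p_disappear=0.2, rng=None):
    """Hypothetical query-level simulation (an assumption, not the paper's code).

    prev_queries / cur_candidates map an object id to its feature vector.
    Dropping an id from the previous frame makes the object look newly
    emerging; dropping it from the current frame makes it look disappeared.
    """
    rng = rng or random.Random()
    prev_out = dict(prev_queries)
    cur_out = dict(cur_candidates)
    events = {}
    # Only objects visible in both frames can be perturbed.
    for obj_id in sorted(set(prev_queries) & set(cur_candidates)):
        r = rng.random()
        if r < p_emerge:
            del prev_out[obj_id]      # object now "emerges" in the current frame
            events[obj_id] = "emerge"
        elif r < p_emerge + p_disappear:
            del cur_out[obj_id]       # object now "disappears" from the current frame
            events[obj_id] = "disappear"
    return prev_out, cur_out, events


prev = {1: [0.1, 0.2], 2: [0.3, 0.4]}
cur = {1: [0.1, 0.25], 2: [0.35, 0.4]}

# Force every shared object to be treated as newly emerging.
prev_e, cur_e, events_e = simulate_emergence_disappearance(
    prev, cur, p_emerge=1.0, p_disappear=0.0)

# Force every shared object to be treated as disappearing.
prev_d, cur_d, events_d = simulate_emergence_disappearance(
    prev, cur, p_emerge=0.0, p_disappear=1.0)
```

Because the perturbation only edits which queries are present, a simulation of this kind needs no extra forward passes, which is one way to read the "without any additional cost" claim.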

Model Scope Comparison

Performance on OVIS dataset.

Performance on YTVIS19 and YTVIS21 datasets.

We compare DVIS-DAQ against current video instance segmentation methods on OVIS, YTVIS19, and YTVIS21; it outperforms all of them.

Video

Method: DVIS-DAQ

We dynamically generate anchor queries for newly emerging and disappearing objects, and utilize these anchor queries to capture the appearance and disappearance of objects.
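
The idea of shortening the gap between anchor and target queries can be illustrated with a toy example. The sketch below is a simplification under stated assumptions, not the DVIS-DAQ architecture: it treats an emergence anchor as a plain copy of an untracked candidate's feature, and the helper names (`l2_distance`, `emergence_anchors`), the `match_threshold` parameter, and all feature values are hypothetical.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def emergence_anchors(candidate_features, tracked_features, match_threshold=0.5):
    """Build one anchor per candidate that no tracked query already explains.

    Assumption: a candidate counts as tracked if some tracked feature lies
    within match_threshold of it, and an anchor is a copy of the candidate.
    """
    anchors = []
    for cand in candidate_features:
        already_tracked = any(
            l2_distance(cand, t) < match_threshold for t in tracked_features)
        if not already_tracked:
            anchors.append(list(cand))
    return anchors


tracked = [[1.0, 0.0, 0.0]]                        # queries carried from earlier frames
candidates = [[1.0, 0.1, 0.0], [0.0, 2.0, 2.0]]    # current-frame candidates
anchors = emergence_anchors(candidates, tracked)   # only the second is new

target = [0.1, 2.0, 1.9]          # hypothetical query the new object should reach
static_anchor = [0.0, 0.0, 0.0]   # stand-in for a fixed background query

static_gap = l2_distance(static_anchor, target)
dynamic_gap = l2_distance(anchors[0], target)
print(f"static gap: {static_gap:.3f}, dynamic gap: {dynamic_gap:.3f}")
```

In this toy setting the candidate-derived anchor starts much closer to the target than the fixed background stand-in, which is the intuition behind generating anchors dynamically.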

Demos

BibTeX

@article{dvisdaq,
  title={DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries},
  author={Zhou, Yikang and Zhang, Tao and Ji, Shunping and Yan, Shuicheng and Li, Xiangtai},
  journal={arXiv},
  year={2024}
}