Introduction
The 7th LSVOS challenge will be held in conjunction with ICCV 2025 in Honolulu, Hawai'i. This year, we keep the same setup as last year, with two tracks: VOS and RVOS. In the Video Object Segmentation (VOS) track, we will use LVOS and MOSE to study VOS in more challenging, complex environments. LVOS is designed for long-term videos, dealing with complex object motion and long-term reappearance, while MOSE focuses on complex scenes, covering aspects such as object disappearance and reappearance, inconspicuous small objects, heavy occlusions, and crowded environments. For the Referring Video Object Segmentation (RVOS) track, we will continue to use MeViS. MeViS focuses on identifying the target object in a video based on motion-related descriptions rather than static attributes. This approach challenges the foundational design principles of existing RVOS methods, compelling researchers to explore and reevaluate motion modeling in greater depth. In addition, we will hold a series of talks by leading experts in video understanding and embodied intelligence. This year, the following topics will be covered:
- Semantic/panoptic segmentation for images/videos
- Video Object Segmentation in Complex Scenes
- Long-term Video Object Segmentation
- Referring Video Object Segmentation
- Video Segmentation with Motion Expressions
- Vision and Language
- Cognitive Models of Object Perception
- Real-world Understanding and embodied intelligence
Call for Papers
We invite authors to submit unpublished papers to our workshop, to be presented at a poster session upon acceptance. All submissions will go through a double-blind review process. All contributions must be submitted (along with supplementary materials, if any).
Accepted papers will be published in the official ICCV Workshops proceedings and the Computer Vision Foundation (CVF) Open Access archive.
Dates
Challenge Dates
TBD
Paper Submission Dates
TBD
Speakers
TBD
Tracks & Submission
The 7th LSVOS challenge includes two tracks: VOS and RVOS.
Below are the links and task descriptions for the two tracks:
Track 1: Video Object Segmentation (VOS)
The video object segmentation task aims to segment a particular object instance throughout the entire video sequence, given only the object mask in the first frame.
Track 2: Referring Video Object Segmentation (RVOS)
The referring video object segmentation task aims to segment an object in a video given a language expression that describes it.
Leaderboard
TBD
Schedule
TBD
Organizers

Lingyi Hong
Fudan University
Henghui Ding
Fudan University
Chang Liu
Nanyang Technological University
Ning Xu
Apple Inc.
Linjie Yang
ByteDance Inc.
Yuchen Fan
Meta Reality Labs
Contact
Feel free to contact us:
henghui.ding@gmail.com
honglyhly@gmail.com