Large-scale Video Object Segmentation

Workshop in conjunction with ICCV 2025

October 20 Morning, Half-day

Honolulu Convention Center, Hawaii

Latest News


[🔥Update - Aug 8] We are glad to announce that a new track, MOSEv2, will be added to this year's challenge! Please check here for details.


[🔥Update - Aug 2] The Evaluation Servers for validation are now available!


[Update - Jul 27] Our distinguished list of Speakers has been released! Get ready to be inspired by their valuable insights and expertise!


[Update - Jul 25] The Challenge Timeline has been released! The link to the evaluation server will be released on the challenge start date.


[Update - Jun 3] LSVOS 2025 will be using [OpenReview] to manage submissions. We are looking forward to your work and engaging discussions at the workshop! Please check the Paper Submission Timeline section for the submission opening and deadline dates.

Introduction

The 7th LSVOS challenge will be held in conjunction with ICCV 2025 in Honolulu, Hawai'i. This year, we keep last year's two tracks, VOS and RVOS, and add a new Complex VOS track based on MOSEv2. In the Video Object Segmentation (VOS) track, we will use LVOS and MOSE to study VOS in more challenging, complex environments. LVOS is designed for long-term videos, dealing with complex object motion and long-term reappearance, while MOSE focuses on complex scenes, covering aspects such as object disappearance and reappearance, inconspicuous small objects, heavy occlusions, and crowded environments. For the Referring Video Object Segmentation (RVOS) track, we will continue to use MeViS. MeViS focuses on identifying the target object in a video based on motion-related descriptions rather than static attributes. This approach challenges the foundational design principles of existing RVOS methods, compelling researchers to explore and reevaluate motion modeling in greater depth.

In addition, we will hold a series of talks by leading experts in video understanding and embodied intelligence. This year, the following topics will be covered:

  • Semantic/Panoptic Segmentation for Images/Videos
  • Video Object Segmentation in Complex Scenes
  • Long-term Video Object Segmentation
  • Referring Video Object Segmentation
  • Video Segmentation with Motion Expressions
  • Vision and Language
  • Cognitive Models of Object Perception
  • Real-world Understanding and Embodied Intelligence

Challenge Timeline

Event                      Date
Challenge Release          Aug 10, 2025
Validation Server Online   Aug 10, 2025
Test Server Online         Sep 01, 2025
Submission Deadline        Sep 07, 2025
Notification of Results    Sep 12, 2025
*All dates refer to 23:59 UTC on the specified day.

Call for Papers



We invite authors to submit unpublished papers (8-page ICCV format) to our workshop, to be presented at a poster session upon acceptance. All submissions will go through a double-blind review process. All contributions must be submitted (along with supplementary materials, if any) through the paper submission portal.


Accepted papers will be published in the official ICCV Workshops proceedings and the Computer Vision Foundation (CVF) Open Access archive.

Paper Submission Timeline





Event                               Date
Submission portal open              Jun 10, 2025
Regular paper submission deadline   Jul 01, 2025
Supplemental material deadline      Jul 01, 2025
Notification of paper acceptance    Jul 07, 2025
Camera ready deadline               Aug 18, 2025
*All dates refer to 23:59 UTC on the specified day.

Challenge Tracks & Submission

The 7th LSVOS challenge includes three tracks: Complex VOS (MOSEv2, new!), Classic VOS and RVOS.

Below are the links and task descriptions for the three tracks:

[NEW🔥] Track 1: Complex Video Object Segmentation (MOSEv2)

MOSEv2 focuses on VOS in complex scenes with frequent object disappearance and reappearance, severe occlusions, smaller targets, and new challenges including adverse weather, low light, multi-shot sequences, camouflage, non-physical targets, and knowledge-dependent scenarios. Submission server [click here]

Track 2: Video Object Segmentation (Classic VOS)

The video object segmentation task aims to segment a particular object instance throughout the entire video sequence, given only the object mask in the first frame. The dataset is a mixture of the MOSEv1 and LVOS datasets. Submission server [click here]

Track 3: Referring Video Object Segmentation (RVOS)

Referring video object segmentation aims to segment an object in a video given a language expression that describes it. Submission server [click here]

Speakers

Ming-Hsuan Yang

UC Merced

Yutong Bai

UC Berkeley (BAIR)

Shuangrui Ding

Chinese University of Hong Kong

Organizers

Lingyi Hong

Fudan University

Henghui Ding

Fudan University

Chang Liu

TikTok Inc.

Ning Xu

Apple Inc.

Linjie Yang

ByteDance Inc.

Yuchen Fan

Meta Reality Labs