CPSeg - Staff at Biomedicinsk Institut

CPSeg: Finer-grained Image Semantic Segmentation via Chain-of-Thought Language Prompting

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed

Documents

CPSeg
Accepted manuscript, 1.67 MB, PDF document

Li, Lei

Natural scene analysis and remote sensing imagery offer considerable potential for large-scale, language-guided, context-aware data utilization. This potential is particularly significant for improving performance in downstream tasks such as object detection and segmentation through designed language prompting. We therefore introduce CPSeg (Chain-of-Thought Language Prompting for Finer-grained Semantic Segmentation), a framework designed to improve image segmentation performance by integrating a novel "Chain-of-Thought" process that harnesses textual information associated with images. The approach is applied to a flood disaster scenario. CPSeg encodes prompt texts derived from various sentences to formulate a coherent chain-of-thought. We use a new vision-language dataset, FloodPrompt, which includes images, semantic masks, and corresponding text information. This not only strengthens the semantic understanding of the scenario but also aids the key task of semantic segmentation through an interplay of pixel and text matching maps. Our qualitative and quantitative analyses validate the effectiveness of CPSeg.
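
The abstract mentions pixel and text matching maps without specifying how they are computed. As a rough illustration only, one common way to build such a map is to take the cosine similarity between each pixel embedding and each prompt embedding, then label each pixel with its best-matching prompt. The sketch below assumes that formulation; the function names, toy feature values, and example prompt labels are hypothetical and are not taken from the CPSeg paper.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def pixel_text_matching_map(pixel_feats, text_feats):
    """Cosine similarity between every pixel embedding and every text embedding.

    pixel_feats: H x W x D nested lists (hypothetical image-encoder output)
    text_feats:  K x D nested lists (hypothetical prompt-encoder output)
    returns:     H x W x K nested lists of scores in [-1, 1]
    """
    text_unit = [l2_normalize(t) for t in text_feats]
    score_map = []
    for row in pixel_feats:
        score_row = []
        for pix in row:
            p = l2_normalize(pix)
            score_row.append([sum(a * b for a, b in zip(p, t)) for t in text_unit])
        score_map.append(score_row)
    return score_map

def argmax_labels(score_map):
    """Pick the index of the best-matching prompt for each pixel."""
    return [[max(range(len(s)), key=s.__getitem__) for s in row] for row in score_map]

# Toy example: a 2x2 "image" with 2-D features and two prompts (all values made up).
pixels = [[[1.0, 0.0], [0.9, 0.1]],
          [[0.1, 0.9], [0.0, 2.0]]]
prompts = [[1.0, 0.0],   # hypothetical embedding for prompt 0, e.g. "flooded road"
           [0.0, 1.0]]   # hypothetical embedding for prompt 1, e.g. "building"

scores = pixel_text_matching_map(pixels, prompts)
labels = argmax_labels(scores)
print(labels)  # [[0, 0], [1, 1]]
```

In a real vision-language model the embeddings would come from learned image and text encoders, and the score map would typically feed a segmentation decoder rather than a plain argmax.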

Original language	English
Title	2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
Publisher	IEEE
Publication date	2024
Pages	502-511
DOI	https://doi.org/10.1109/WACV57701.2024.00057
Status	Published - 2024
Event	WACV 2024 - IEEE/CVF Winter Conference on Applications of Computer Vision - Waikoloa, Hawaii, USA Duration: 4 Jan 2024 → 8 Jan 2024

Conference

Conference	WACV 2024 - IEEE/CVF Winter Conference on Applications of Computer Vision
Country	USA
City	Waikoloa, Hawaii
Period	04/01/2024 → 08/01/2024

ID: 378943255