TolerantGAN: Text-Guided Image Manipulation Tolerant to Real-World Image

Abstract

Although text-guided image manipulation approaches edit the appearance of images with high accuracy in virtual or simple scenarios, their real-world application faces significant challenges. The primary cause is the mismatch between the distributions of training data and real-world data, which makes text-guided image manipulation unstable. In this work, we propose a novel framework called TolerantGAN and tackle the new task of real-world text-guided image manipulation independent of the training data. To achieve this, we introduce two key modules: a border smoothly connection module (BSCM) and a manipulation direction-based attention module (MDAM). BSCM smooths the mismatch between the distributions of training data and real-world data. MDAM extracts only the regions highly relevant to the image manipulation and assists in reconstructing objects that were not observed in the training data. For in-the-wild input images of various classes, TolerantGAN robustly outperforms state-of-the-art methods.

Cite

Watanabe, Y., Togo, R., Maeda, K., Ogawa, T., & Haseyama, M. (2024). TolerantGAN: Text-Guided Image Manipulation Tolerant to Real-World Image. IEEE Open Journal of Signal Processing, 5, 150–159. https://doi.org/10.1109/OJSP.2023.3343335
