EmotiCrafter: Text-to-Emotional-Image Generation based on Valence-Arousal Model (ICCV2025)

Recent research shows that emotions can enhance users' cognition and influence information communication. While research on visual emotion analysis is extensive, limited work has been done on helping users generate emotionally rich image content. Existing work on emotional image generation relies on discrete emotion categories, making it challenging to capture complex and subtle emotional nuances accurately. Additionally, these methods struggle to control the specific content of generated images based on text prompts. In this paper, we introduce the task of continuous emotional image content generation (C-EICG) and present EmotiCrafter, a general emotional image generation model that generates images based on free text prompts and Valence-Arousal (V-A) values. It leverages a novel emotion-embedding mapping network to fuse V-A values into textual features, enabling the capture of emotions in alignment with intended input prompts. A novel loss function is also proposed to enhance emotion expression. The experimental results show that our method effectively generates images representing specific emotions with the desired content and outperforms existing techniques.

💡 What is EmotiCrafter?

We introduce the task of Continuous Emotional Image Content Generation (C-EICG) and present EmotiCrafter, a novel emotional image generation model that:

Accepts free-form text prompts
Conditions on Valence-Arousal (V-A) values
Leverages a new emotion-embedding mapping network to fuse V-A signals into text features
Uses a custom loss function to improve emotional fidelity
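
To make the idea of fusing V-A signals into text features concrete, here is a toy sketch in plain Python. It is not the EmotiCrafter architecture: it assumes a simple additive (residual) fusion, and the random weights stand in for a learned mapping network. The function names `make_projection` and `fuse_va` are hypothetical.

```python
import random

def make_projection(dim, seed=0):
    """Random 2 -> dim linear map; a stand-in for learned weights (hypothetical)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(2)] for _ in range(dim)]

def fuse_va(text_features, valence, arousal, weights):
    """Add a V-A-dependent offset to every token feature (assumed residual fusion)."""
    offset = [w_v * valence + w_a * arousal for w_v, w_a in weights]
    return [[x + o for x, o in zip(token, offset)] for token in text_features]

# Toy sizes: 3 tokens with 4-dimensional features.
tokens = [[0.0] * 4 for _ in range(3)]
weights = make_projection(dim=4)

neutral = fuse_va(tokens, 0.0, 0.0, weights)   # V-A of (0, 0) leaves features unchanged
shifted = fuse_va(tokens, -2.0, 2.5, weights)  # non-zero V-A shifts every token
```

The point of the sketch is only that the prompt content (the token features) and the emotion (the V-A offset) enter as separate inputs, so the same prompt can be rendered under different emotions.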

👉 Try EmotiCrafter Demo on Hugging Face 🤗

🛠️ Setup Guide

1. Clone the Repository

git clone https://github.com/idvxlab/EmotiCrafter
cd EmotiCrafter

2. Create Conda Environment

Run this from the repository root, since environment.yml ships with the repository.

conda env create -f environment.yml

3. Download the SDXL Base Model

Download the Stable Diffusion XL Base 1.0 model and note where you saved it; the preprocessing and inference scripts take this location via --sdxl_path.
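
One possible way to fetch the weights is the `huggingface_hub` package. The sketch below assumes the model is pulled from the `stabilityai/stable-diffusion-xl-base-1.0` repository on Hugging Face and that the scripts accept a local snapshot directory as the SDXL path; neither is confirmed by this README. The helper name `download_sdxl` and the default directory are hypothetical.

```python
SDXL_REPO_ID = "stabilityai/stable-diffusion-xl-base-1.0"  # assumed source of the weights

def download_sdxl(local_dir="./sdxl-base-1.0"):
    """Download the SDXL Base 1.0 snapshot and return the local folder path."""
    # Imported lazily so this file loads even without the package installed.
    from huggingface_hub import snapshot_download  # pip install huggingface_hub
    return snapshot_download(repo_id=SDXL_REPO_ID, local_dir=local_dir)

# Usage (downloads several gigabytes):
#   sdxl_path = download_sdxl()
#   print(sdxl_path)
```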

4. Download Pretrained Model or Train Your Own

Option A: Use Pretrained Model

Download the pretrained model from this url and note where you saved it; the inference scripts take this location via --ckpt_path.

Option B: Train Your Own Model

a. Preprocess the Data

python preprocess.py --sdxl_path [pretrained SDXL]

b. Start Training

python train.py \
  --batch_size 768 \
  --lr 0.001 \
  --epochs 200 \
  --save_dir ./ckpt \
  --scale_factor 1.5 \
  --enable_density True

🖼️ Inference

Make sure you have your environment activated and model paths ready.

conda activate emotion

Single Image Inference

python inference.py \
  --prompt "A man is running fast" \
  --arousal 2.5 \
  --valence -2 \
  --ckpt_path [pretrained_eit] \
  --sdxl_path [pretrained_sdxl] \
  --seed 0
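
If you want to call the script from Python, for example to sweep several emotions, a small wrapper can assemble the same command. This is a sketch: the flags are the ones shown above, but the V-A bounds of [-3, 3] are an assumption not stated in this README, and `build_inference_cmd` and the example paths are hypothetical.

```python
import sys

# ASSUMPTION: valence and arousal are expected in [-3, 3]; adjust if the model differs.
VA_MIN, VA_MAX = -3.0, 3.0

def build_inference_cmd(prompt, valence, arousal, ckpt_path, sdxl_path, seed=0):
    """Return the argument list for inference.py, after a basic range check."""
    for name, value in (("valence", valence), ("arousal", arousal)):
        if not VA_MIN <= value <= VA_MAX:
            raise ValueError(f"{name}={value} is outside [{VA_MIN}, {VA_MAX}]")
    return [
        sys.executable, "inference.py",
        "--prompt", prompt,
        "--arousal", str(arousal),
        "--valence", str(valence),
        "--ckpt_path", ckpt_path,
        "--sdxl_path", sdxl_path,
        "--seed", str(seed),
    ]

cmd = build_inference_cmd(
    "A man is running fast", valence=-2, arousal=2.5,
    ckpt_path="path/to/pretrained_eit", sdxl_path="path/to/pretrained_sdxl",
)
# To actually run it from the repository root:
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain spaces or quotes.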

5x5 Grid Inference

python inference5x5.py \
  --prompt "A man is running fast" \
  --ckpt_path [pretrained_eit] \
  --sdxl_path [pretrained_sdxl] \
  --seed 0
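
The grid command takes no V-A arguments, so it presumably sweeps both axes itself. As an illustration of what a 5x5 sweep could look like, the sketch below enumerates evenly spaced (valence, arousal) pairs. The range of [-3, 3], the even spacing, and the row and column ordering are all assumptions; check inference5x5.py for the values it actually uses. `va_grid` is a hypothetical helper.

```python
def va_grid(low=-3.0, high=3.0, steps=5):
    """Return rows of (valence, arousal) pairs covering a steps x steps grid.

    Rows run from high arousal (top) to low arousal (bottom);
    columns run from low valence (left) to high valence (right).
    """
    step = (high - low) / (steps - 1)
    values = [low + i * step for i in range(steps)]
    return [[(v, a) for v in values] for a in reversed(values)]

grid = va_grid()
for row in grid:
    print("  ".join(f"({v:+.1f},{a:+.1f})" for v, a in row))
```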

The raw image data has been uploaded to this url. However, EmotiCrafter did not use image data for model training.

We thank the authors of Stable Diffusion XL (SDXL), FindingEmo, OASIS, and Emotic for their excellent work, which made this project possible. If you use EmotiCrafter in your research or applications, please cite our work.

@inproceedings{dang2025emoticrafter,
  title={Emoticrafter: Text-to-emotional-image generation based on valence-arousal model},
  author={Dang, Shengqi and He, Yi and Ling, Long and Qian, Ziqing and Zhao, Nanxuan and Cao, Nan},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={15218--15228},
  year={2025}
}
