FaceChain-FACT: Face Adapter for Human AIGC

Haoyu Xie, Yang Liu, Lei Shang, Cheng Yu, Jun Dan, Chao Xu, Fei Wang, Xuansong Xie, Baigui Sun
Alibaba Group

Highlight

We support zero-shot portrait generation.
We trained the model on millions of exquisite human portraits.
We provide 100+ ready-made haute couture templates.
Our model supports CPU inference and generates portraits within seconds.

Portrait photos generated by FaceChain-FACT within seconds.

Abstract

Face customization remains a major challenge for image generation because human faces contain a high level of fine detail.

FaceChain, an outstanding work in personalized portrait generation with facial identity preservation, has received widespread attention in the community. It trains a LoRA model for each person to integrate facial information and generate customized portraits. However, because a LoRA model must be trained for every user, FaceChain's pipeline is divided into two stages, training and inference, which increases the cost to the user. We propose a zero-shot version called FaceChain-FACT that requires no face LoRA training and needs only a single photo of the user to generate customized portraits. It generates images 100 times faster than SOTA commercial applications, producing a portrait within seconds. Our contributions are threefold. First, we integrate a transformer-based face feature extractor whose structure is similar to that of Stable Diffusion, which enables Stable Diffusion to make better use of face information. Second, we use dense fine-grained features as the face condition, which reproduces the subject's identity more faithfully. Third, FaceChain-FACT is plug-and-play and seamlessly compatible with ControlNet and LoRA plugins.

Method

We build the training dataset by applying a series of image preprocessing steps, including face segmentation, face cropping and alignment, hand detection, and face quality screening, and keeping only the images that pass. We then use a transformer-based face feature extractor and take the dense fine-grained features from its penultimate layer as the face condition. Stable Diffusion receives the face condition through the FACT-Adapter and combines it with the text embeddings to generate portrait images. By fusing various LoRA models from FaceChain, the model can generate portraits in a variety of styles.
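The conditioning path described above can be illustrated with a toy sketch. This is not the released implementation: the encoder here is a small ReLU network standing in for the transformer, and all names (`toy_face_encoder`, `face_condition`, `build_context`) and sizes are hypothetical. The fusion step is also an assumption; the sketch simply concatenates projected face tokens with the text tokens along the token axis, which is one common way to feed an extra condition to cross-attention. It shows only the two ideas stated in the text: reading per-token features from the penultimate layer rather than a pooled vector, and combining them with text embeddings.

```python
import random

def linear(tokens, weight):
    """Project each token with a (d_in x d_out) weight matrix."""
    columns = list(zip(*weight))
    return [[sum(x * w for x, w in zip(tok, col)) for col in columns] for tok in tokens]

def toy_face_encoder(patch_tokens, layer_weights):
    """Hypothetical stand-in for a transformer face encoder.

    Returns the hidden states after every layer so the caller can pick one.
    A real encoder would use attention blocks; a ReLU layer keeps this short.
    """
    hidden, states = patch_tokens, []
    for weight in layer_weights:
        hidden = [[max(0.0, v) for v in tok] for tok in linear(hidden, weight)]
        states.append(hidden)
    return states

def face_condition(patch_tokens, layer_weights, adapter_weight):
    """Take dense per-token features from the penultimate layer and project them."""
    states = toy_face_encoder(patch_tokens, layer_weights)
    dense_features = states[-2]  # one vector per patch token, not a pooled vector
    return linear(dense_features, adapter_weight)

def build_context(text_embeddings, face_tokens):
    """Assumed fusion: concatenate along the token axis for cross-attention."""
    return text_embeddings + face_tokens

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
patch_tokens = rand_matrix(rng, 4, 8)                       # 4 face patches, width 8
layer_weights = [rand_matrix(rng, 8, 8) for _ in range(3)]  # 3 toy encoder layers
adapter_weight = rand_matrix(rng, 8, 6)                     # encoder width 8 -> text width 6
text_embeddings = rand_matrix(rng, 5, 6)                    # 5 text tokens, width 6

face_tokens = face_condition(patch_tokens, layer_weights, adapter_weight)
context = build_context(text_embeddings, face_tokens)
print(len(context), len(context[0]))  # 9 6
```

In this sketch the context grows from 5 text tokens to 9 tokens, so every face patch contributes its own key and value to cross-attention. Keeping one token per patch is what distinguishes a dense condition from a single global identity embedding.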


Results on Various Characters

[Images: portraits generated for various characters]

BibTeX

@article{liu2023facechain,
  author    = {Yang Liu and Cheng Yu and Lei Shang and Yongyi He and Ziheng Wu and Xingjun Wang and Chao Xu and Haoyu Xie and Weida Wang and Yuze Zhao and Lin Zhu and Chen Cheng and Weitao Chen and Yuan Yao and Wenmeng Zhou and Jiaqi Xu and Qiang Wang and Yingda Chen and Xuansong Xie and Baigui Sun},
  title     = {FaceChain: A Playground for Human-centric Artificial Intelligence Generated Content},
  journal   = {arXiv},
  year      = {2023},
}