Research on dynamic generation and optimization model of AI virtual performance role based on generative adversarial networks

Conference proceedings article

Authors/Editors

PETER NIGEL POWER

Strategic Research Themes

Publication Details

Author list: HUI WANG, NIGEL POWER

Publication year: 2025

Series title: IET Conference Proceedings

Volume number: 25

Start page: 63

End page: 68

Number of pages: 6

ภาษา: English-United States (EN-US)

Abstract

Generative Adversarial Networks (GANs) generate high-quality image and video samples through competition between a generator and a discriminator, which offers a new approach to virtual character generation. This paper proposes a GAN-based dynamic generation model for AI virtual performance roles, consisting of a generator and a discriminator. The generator is a deep convolutional neural network (CNN) that transforms random noise vectors into realistic virtual character images through convolutional, upsampling, and fully connected layers; the discriminator judges the authenticity of an input image through convolutional and downsampling layers. To generate a virtual character with a specific posture and expression, the generator takes posture and expression parameters as conditional inputs that steer it toward images with the specified characteristics. For model optimization, the loss function combines an adversarial loss, implemented as binary cross-entropy, with a content loss, implemented as L1 loss, to improve the fidelity and diversity of the generated images. A convolutional block attention module (CBAM) is added to the network to strengthen its ability to extract important features. Training follows a multi-scale strategy: images are generated and discriminated at several scales so that the model captures both fine detail and global structure. Experimental results show that the proposed model outperforms the baseline model in fidelity and diversity, achieving an Inception Score (IS) of 4.23 and a Fréchet Inception Distance (FID) of 12.56. The model can generate realistic and diverse AI virtual performance role images, providing technical support for work in related fields.
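
The combined objective described in the abstract (binary cross-entropy for the adversarial term, L1 for the content term) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the function names, the flat-list representation of images, and the `lambda_content` weight are all assumptions made for illustration, and the abstract does not state how the two terms are weighted.

```python
import math


def bce(pred, target, eps=1e-12):
    """Mean binary cross-entropy between probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred)


def l1(generated, reference):
    """Mean absolute error between two flattened images."""
    return sum(abs(g - r) for g, r in zip(generated, reference)) / len(generated)


def generator_loss(d_fake, generated, reference, lambda_content=1.0):
    """Adversarial loss plus weighted content loss.

    d_fake: discriminator probabilities for generated images.
    lambda_content: hypothetical weight; not given in the abstract.
    """
    # The generator is rewarded when the discriminator labels its output as real (1).
    adversarial = bce(d_fake, [1.0] * len(d_fake))
    content = l1(generated, reference)
    return adversarial + lambda_content * content


def discriminator_loss(d_real, d_fake):
    """Real images should score 1, generated images should score 0."""
    return bce(d_real, [1.0] * len(d_real)) + bce(d_fake, [0.0] * len(d_fake))
```

In a real training loop these terms would be computed on tensors by a deep learning framework; the sketch only shows how the two losses combine.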

Keywords

No related information found