Camera Focal Length Prediction for Neural Novel View Synthesis from Monocular Video
Conference proceedings article
Authors/Editors
Strategic Research Area
Publication Details
Author list: Dipanita Chakraborty, Werapon Chiracharit, Kosin Chamnongthai, and Minoru Okada
Publication year (A.D.): 2024
Series title: The 16th APSIPA Annual Summit and Conference 2024 (APSIPA ASC 2024)
First page: 1
Last page: 5
Number of pages: 5
Language: English-United States (EN-US)
Abstract
Novel view synthesis is a challenging task that generates multi-view images from a single-view object by reconstructing the depth and spatial information between the camera and the object. In particular, it enables rendering 2D objects into 3D representations from monocular video scenes. Existing methods suffer from depth information loss between the camera and the object when given limited single-view input images, resulting in poor reconstruction accuracy in 3D space, and they lack a high-level depth feature map representation of scene information. We therefore propose a network based on a multilayer encoder-decoder architecture that efficiently predicts the focal length between the object and the camera from a single-view image. Additionally, we employ a combined feature extraction strategy that draws on both the estimated depth feature map and the RGB input image to synthesize novel views. The encoder network extracts semantic high-level features at multiple scales, while the decoder network refines these combined features for synthesis. Our method improves depth information retention while achieving good reconstruction performance. Experimental results on a benchmark dataset demonstrate the efficacy of the proposed method.
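To make the described pipeline concrete, the sketch below shows one plausible reading of the abstract's architecture: a multilayer encoder-decoder that regresses a scalar focal length from a single RGB frame, produces a depth feature map, and fuses that map with the RGB input as the combined feature for view synthesis. All layer widths, module names, and the concatenation-based fusion are assumptions for illustration, not the authors' published implementation.

```python
# Hypothetical sketch of an encoder-decoder focal-length predictor with
# RGB + depth feature fusion; layer sizes and fusion choice are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FocalLengthPredictor(nn.Module):
    """Regresses focal length and emits a depth feature map from one RGB frame."""

    def __init__(self):
        super().__init__()
        # Encoder: semantic features at multiple scales (assumed channel widths).
        self.enc1 = nn.Sequential(nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.enc3 = nn.Sequential(nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
        # Decoder: refines encoded features back toward input resolution.
        self.dec1 = nn.Sequential(nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU())
        self.dec2 = nn.Sequential(nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU())
        self.depth_head = nn.Conv2d(32, 1, 3, padding=1)  # depth feature map
        self.focal_head = nn.Sequential(                  # scalar focal length
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, 1)
        )

    def forward(self, rgb):
        f1 = self.enc1(rgb)
        f2 = self.enc2(f1)
        f3 = self.enc3(f2)
        focal = self.focal_head(f3)            # predicted focal length
        depth = self.depth_head(self.dec2(self.dec1(f3)))
        # Combined feature: depth map upsampled to the input size and
        # concatenated with the RGB frame (one possible fusion strategy).
        depth_up = F.interpolate(depth, size=rgb.shape[-2:],
                                 mode="bilinear", align_corners=False)
        combined = torch.cat([rgb, depth_up], dim=1)
        return focal, depth_up, combined

# Example: a 256x256 frame yields a scalar focal estimate, a 1-channel
# depth map, and a 4-channel fused feature for the synthesis stage.
model = FocalLengthPredictor()
focal, depth_map, fused = model(torch.randn(1, 3, 256, 256))
```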
Keywords
Artificial Intelligence