A research method for reconstructing a detailed 3D model of a clothed human from a single image. The approach first predicts the body shape with PIFuHD and synthesizes the back-view appearance with Pose-with-Style, then uses a diffusion model to iteratively generate multi-view images guided by the 3D surface normals and contours of the predicted shape. The multi-view images are consolidated into a single consistent UV texture map via differentiable rendering. Compared with prior methods such as PIFu, TEXTure, Magic123, and TeCH, the approach produces fewer artifacts and more consistent full-body textures. Its limitations include lighting baked into the texture and an inability to refine existing 3D shapes.
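The consolidation step can be illustrated with a much simpler stand-in. The sketch below is not the method's differentiable-rendering optimization: it assumes each generated view has already been resampled into UV space, and it fuses the views with a per-texel weighted average. The function name `fuse_views_into_texture` and the weighting scheme (for example, the cosine between the surface normal and the view direction, with 0 for texels a view cannot see) are hypothetical choices made for illustration only.

```python
import numpy as np


def fuse_views_into_texture(view_colors, view_weights, eps=1e-8):
    """Fuse per-view UV-space colors into one texture by weighted averaging.

    view_colors:  (V, H, W, 3) colors from V views, already resampled to UV space
                  (assumed to be done beforehand; not shown here).
    view_weights: (V, H, W) non-negative confidences, 0 where a view does not
                  see the texel (hypothetical weighting, e.g. normal/view cosine).
    Returns the (H, W, 3) fused texture and an (H, W) boolean coverage mask.
    """
    view_colors = np.asarray(view_colors, dtype=np.float64)
    view_weights = np.asarray(view_weights, dtype=np.float64)

    # Total confidence each texel receives across all views.
    total = view_weights.sum(axis=0)

    # Confidence-weighted sum of colors per texel.
    weighted = (view_colors * view_weights[..., None]).sum(axis=0)

    # Normalize; texels seen by no view are left black and flagged as uncovered.
    texture = weighted / np.maximum(total, eps)[..., None]
    covered = total > eps
    texture[~covered] = 0.0
    return texture, covered


if __name__ == "__main__":
    red = np.tile([1.0, 0.0, 0.0], (1, 3, 1))   # view 0: a 1x3 strip, all red
    blue = np.tile([0.0, 0.0, 1.0], (1, 3, 1))  # view 1: a 1x3 strip, all blue
    colors = np.stack([red, blue])              # shape (2, 1, 3, 3)
    weights = np.array([[[1.0, 0.0, 0.0]],      # view 0 sees only texel 0
                        [[1.0, 1.0, 0.0]]])     # view 1 sees texels 0 and 1
    texture, covered = fuse_views_into_texture(colors, weights)
    print(texture[0], covered[0])
```

A weighted average is the closed-form minimizer of a weighted squared color error per texel, which is why it serves as a rough proxy here. An optimization through a differentiable renderer, as the summary describes, would additionally account for rasterization, occlusion, and inconsistencies between views.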