3D Mesh Hollowing: Solid Mesh to Hollow Mesh Conversion Using Multi-View Data and Neural Networks
Abstract
This paper presents a novel approach for converting solid meshes into hollow meshes, a task of significant importance in 3D modeling with wide applications in manufacturing, product design, medical modeling, and virtual reality. Our approach builds on the Convolutional Reconstruction Model (CRM) within the CraftsMan system, leveraging pre-trained checkpoints and advanced geometric representations. To model the hollowing process, we first tackle the complex problem of hollow mesh generation using Flexicubes, a flexible 3D grid representation that enables end-to-end optimization of both the outer and inner structures. Integrating Signed Distance Fields (SDFs) and a Dual Marching Cubes algorithm into CRM allows the model to recover both the outer surfaces and the inner voids of the generated 3D object. A triplane representation spatially aligns four orthographic views of the 3D object, enabling accurate geometry extraction, while a U-Net-based backbone provides pixel-aligned features in the 2D plane. Canonical Coordinate Maps (CCMs) further help the model interpret spatial structure, giving it the ability to produce hollow structures from a single image. Trained responsibly on a filtered version of the Objaverse 3D datasets, CRM effectively synthesizes hollow meshes. With the pre-trained CraftsMan checkpoints, hollow mesh synthesis is fast while remaining accurate and robust. Collectively, our system scales solid-to-hollow mesh conversion into a single fast and efficient pipeline applicable to 3D modeling, gaming, and virtual reality.
1. INTRODUCTION
Converting solid meshes into hollow meshes is a critical process in 3D modeling, with significant applications in manufacturing, product design, medical modeling, and virtual reality. Hollow mesh models are essential for simulating physical properties and optimizing material usage. Despite the growing demand, this problem is complex due to the need to preserve both external surface fidelity and internal geometric details. Traditional techniques often fail to achieve accurate internal void generation while maintaining external surface quality, resulting in either geometrically inconsistent models or computationally expensive solutions.
Existing methods for converting solid to hollow meshes rely on geometric manipulation, iterative refinement, or intensive simulations. These techniques often struggle with scalability and handling complex internal structures, making it difficult to create smooth transitions between solid and hollow regions. Issues such as self-intersections, mesh distortions, and loss of detail are common, highlighting the need for a more robust and efficient solution.
In this paper, we propose a novel approach for converting solid meshes into hollow meshes using the CraftsMan system. Our system utilizes the Convolutional Reconstruction Model (CRM), which employs specialized geometric techniques to capture both internal and external structures. This approach uses a flexible 3D grid representation called Flexicubes, enabling end-to-end optimization for precise modeling of hollow internal structures while preserving external surface quality. Additionally, Signed Distance Fields (SDFs) are used to encode surface geometry, and the Dual Marching Cubes algorithm is applied to ensure smooth and consistent mesh generation.
To accurately represent complex geometries, the system uses a triplane-based method, aligning four orthographic views of the object (front, back, left, and right) to extract external and internal features. Canonical Coordinate Maps (CCMs) are also employed to improve spatial understanding, allowing the model to interpret internal and external geometry simultaneously. This combination of geometric representation and spatial mapping ensures that the generated hollow meshes are geometrically consistent.
The proposed method is trained on the Objaverse 3D datasets, which provide detailed supervision for internal structures. Leveraging pre-trained checkpoints and a streamlined inference pipeline, the system generates hollow meshes quickly, with high accuracy and robustness. The resulting hollow meshes exhibit smooth transitions between solid and hollow regions, making them ideal for diverse applications in 3D modeling, CAD, and digital content creation.
2. RELATED WORK
3D Generation with 2D Supervision. Mildenhall et al. [2020] and Chan et al. [2021a] adapt 2D image priors using NeRF- and GAN-based architectures. Poole et al. [2022] use 2D diffusion models to optimize 3D shapes, which requires expensive per-shape optimization. More recent work targets multi-view generation with 2D models [Long et al., 2023], but suffers from view inconsistency and complex lighting issues.
Native 3D Generative Models. Other approaches rely on 3D-native generative models with representations such as point clouds, meshes, or implicit fields [Li et al., 2018; Nash et al., 2020], trained with GANs or autoregressive models [Goodfellow et al., 2014; Sun et al., 2020]. Recent work in this area is moving toward diffusion models for 3D objects, which remain computationally expensive and fragile with respect to fine geometric details [Ho et al., 2020; Chou et al., 2023].
Score Distillation Sampling (SDS) [Poole et al., 2022] and multi-view diffusion models [Liu et al., 2023; Shi et al., 2023] advanced single-view 3D reconstruction. Optimizing 3D shapes with 2D priors via SDS is computationally expensive and prone to inconsistency between views. Recent works such as Zero123 [Liu et al., 2023] and SyncDreamer [Liu et al., 2023b] incorporate multi-view diffusion, which improves consistency but remains suboptimal in terms of geometry.
Direct 3D generative models using point clouds or implicit fields [Luo and Hu, 2021; Liu et al., 2023] are faster at inference but generalize poorly because the available datasets are small. As a remedy, Wonder3D uses a cross-domain diffusion model with cross-domain attention and geometry-aware normal fusion to generate high-quality 3D reconstructions from single images efficiently, outperforming prior methods in both speed and detail [Long et al., 2023].
3. METHOD
Converting solid meshes into hollow meshes is significant in 3D modeling, with wide applications in manufacturing, product design, medical modeling, and virtual reality. Hollow mesh models enable the testing of physical properties, lightweight structures, and material optimization. The problem is hard because exterior surface fidelity and interior geometric detail work against each other: an operation that improves one tends to undermine the other. Traditional methods fail to achieve exact internal void generation together with surface quality, yielding either geometrically inconsistent models or computationally expensive solutions.
We introduce a method for mesh hollowing built on the CraftsMan system, integrating advanced geometric representations with purpose-built architectures. Our system applies the Convolutional Reconstruction Model (CRM), employing specialized geometric techniques to reconstruct the internal and external structures. The technique uses Flexicubes, a flexible 3D grid representation, to allow end-to-end optimization for precise modeling of hollow internal structures while maintaining surface quality. In addition, Signed Distance Fields encode the surface geometry, and the Dual Marching Cubes algorithm is used to ensure consistent mesh generation.
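To make the role of the SDF in hollowing concrete, the sketch below applies the basic shelling operation to an analytic sphere SDF: the hollow shape is the CSG intersection of the solid with the complement of its inward offset. This is only a minimal illustration under simplified assumptions; in CRM the SDF values are predicted by the network, and the dual marching cubes algorithm (rather than the standard marching cubes used here as a stand-in) extracts the surface.

```python
import numpy as np
from skimage import measure  # standard marching cubes as a stand-in for dual marching cubes

# Sample a signed distance field on a regular grid. A sphere is used here;
# in CRM the SDF values are decoded from triplane features by an MLP.
res = 64
xs = np.linspace(-1.0, 1.0, res)
x, y, z = np.meshgrid(xs, xs, xs, indexing="ij")
sdf_solid = np.sqrt(x**2 + y**2 + z**2) - 0.8  # negative inside, positive outside

# Hollowing as a CSG operation on SDFs: intersect the solid with the
# complement of its inward offset. `wall` is the shell thickness.
wall = 0.15
sdf_hollow = np.maximum(sdf_solid, -(sdf_solid + wall))

# Extract the zero level set; the resulting mesh contains both the outer
# surface and the boundary of the inner void.
verts, faces, _, _ = measure.marching_cubes(sdf_hollow, level=0.0)
print(verts.shape, faces.shape)
```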
3.1 Multi-view Diffusion Model
First, we explain the design of the multi-view diffusion model, which generates four orthographic view images from a single input image. Instead of training from scratch, which is typically extremely expensive, we initialize the diffusion model from the checkpoint of CraftsMan, a high-performance diffusion model for multi-view image generation from a single image. The original CraftsMan generates six views; we reduce this to four by removing the two extra perspectives (top and bottom).
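A minimal sketch of this inference step is shown below. The `MultiViewPipeline` wrapper, the `multiview` module, and the checkpoint name are hypothetical stand-ins (the actual CraftsMan loading code differs); the point is simply that generation is restricted to the four orthographic side views.

```python
import torch
from PIL import Image

# Hypothetical wrapper around the pre-trained CraftsMan multi-view diffusion
# checkpoint; the real loading API may differ.
from multiview import MultiViewPipeline  # assumed helper module

# The original model generates six views; we request only the four
# orthographic side views and drop the top/bottom perspectives.
SIDE_VIEWS = ["front", "back", "left", "right"]

pipe = MultiViewPipeline.from_pretrained("craftsman-multiview")  # assumed checkpoint name
pipe.to("cuda" if torch.cuda.is_available() else "cpu")

image = Image.open("input.png").convert("RGB")
views = pipe(image, views=SIDE_VIEWS, num_inference_steps=50)

for name, view in zip(SIDE_VIEWS, views):
    view.save(f"view_{name}.png")
```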
3.2 Convolutional Reconstruction Model
We now describe the architecture of CRM in detail. As shown in Figure 4, CRM takes as input four orthographic images along with Canonical Coordinate Maps (CCMs) and processes them with a convolutional U-Net, transforming the combined inputs into a rolled-out triplane representation. This representation is then reshaped into a standard triplane format. To recover geometric and texture information from the triplane features, small MLPs decode them into Signed Distance Field values, texture colors, and Flexicubes parameters. Finally, the decoded values are used to create a textured 3D mesh with the dual marching cubes algorithm. Below, we give a detailed overview of the core components of CRM.
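The following PyTorch sketch mirrors this data flow at a toy scale. The channel counts, layer depths, input layout, and the Flexicubes parameter count are illustrative assumptions, not the paper's configuration; the decoder heads operate on per-point features sampled from the triplane (the sampling step is sketched after the next paragraph).

```python
import torch
import torch.nn as nn

class CRMSketch(nn.Module):
    """Toy-scale sketch of the CRM data flow (sizes are assumptions)."""

    def __init__(self, tri_ch=32, hidden=64):
        super().__init__()
        # Stand-in for the convolutional U-Net: the four views and their CCMs
        # are stacked channel-wise, 4 views x (3 RGB + 3 CCM) = 24 channels,
        # with the spatial width assumed to match the rolled-out triplane.
        self.backbone = nn.Sequential(
            nn.Conv2d(24, hidden, 3, padding=1), nn.SiLU(),
            nn.Conv2d(hidden, tri_ch, 3, padding=1),
        )
        # Small MLP heads decoding per-point triplane features into SDF
        # values, texture colors, and Flexicubes parameters (21 per cell
        # here is illustrative).
        self.sdf_head = nn.Sequential(nn.Linear(tri_ch, hidden), nn.SiLU(), nn.Linear(hidden, 1))
        self.color_head = nn.Sequential(nn.Linear(tri_ch, hidden), nn.SiLU(), nn.Linear(hidden, 3))
        self.flex_head = nn.Sequential(nn.Linear(tri_ch, hidden), nn.SiLU(), nn.Linear(hidden, 21))

    def forward(self, views_and_ccms):
        # views_and_ccms: (B, 24, H, 3*H); the wide spatial axis carries the
        # rolled-out triplane layout.
        rolled = self.backbone(views_and_ccms)     # (B, C, H, 3*H)
        planes = torch.chunk(rolled, 3, dim=-1)    # three (B, C, H, H) planes
        return torch.stack(planes, dim=1)          # (B, 3, C, H, H) triplane

    def decode(self, point_feats):
        # point_feats: (N, C) features sampled from the triplane at 3D query
        # points; returns per-point SDF, color, and Flexicubes parameters.
        return (self.sdf_head(point_feats),
                self.color_head(point_feats),
                self.flex_head(point_feats))
```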
Triplane Representation: The system uses a triplane-based approach to represent complex geometries accurately. Four orthographic views of the object (front, back, left, and right) are first aligned so that both outer and inner features can be extracted. This combination is further refined with Canonical Coordinate Maps (CCMs), which provide the spatial grounding from which the model can infer the outer and inner geometries jointly.
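As a concrete illustration of how a triplane answers 3D queries, the sketch below projects each query point onto the three axis-aligned feature planes, bilinearly samples each plane, and sums the results. The aggregation choice (summation) and plane ordering are assumptions; conventions vary between implementations. The decoder MLPs from the previous sketch then map these per-point features to SDF values, colors, and Flexicubes parameters.

```python
import torch
import torch.nn.functional as F

def sample_triplane(planes, points):
    """Query triplane features at 3D points (a minimal sketch).

    planes: (3, C, H, W) feature planes, assumed ordered XY, XZ, YZ.
    points: (N, 3) coordinates in [-1, 1]^3.
    Returns (N, C) features, aggregated here by summation.
    """
    # Project each 3D point onto the three axis-aligned planes.
    coords = [points[:, [0, 1]], points[:, [0, 2]], points[:, [1, 2]]]
    feats = 0.0
    for plane, uv in zip(planes, coords):
        grid = uv.view(1, 1, -1, 2)                       # (1, 1, N, 2) for grid_sample
        f = F.grid_sample(plane.unsqueeze(0), grid,
                          mode="bilinear", align_corners=True)
        feats = feats + f.view(plane.shape[0], -1).t()    # (N, C)
    return feats

# Example usage with the CRM sketch above:
#   planes = model(inputs)[0]                       # (3, C, H, H)
#   sdf, rgb, flex = model.decode(sample_triplane(planes, pts))
```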
The proposed method is trained on the Objaverse 3D datasets with precise supervision of internal structures. By utilizing the pre-trained checkpoint of the CraftsMan system and a streamlined inference pipeline, hollow meshes are generated very rapidly, with high accuracy and robustness. The resulting hollow meshes exhibit smooth transitions between the solid and hollow regions, making them suitable for diverse applications in 3D modeling, CAD, and digital content creation.
4. RESULTS
The model was trained on the Objaverse 3D datasets, whose high-quality solid meshes span widely varying shapes and sizes and thus provide robust geometric coverage. Model quality was assessed with standard mesh quality metrics, Chamfer Distance and Intersection over Union (IoU). The experimental results indicate that the model delivers superior performance in hollow mesh generation. Using the checkpoints of the CraftsMan system enables real-time hollow mesh synthesis, opening up new possibilities for interactive 3D modeling.
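For reference, minimal NumPy versions of the two reported metrics are sketched below. These are brute-force implementations for illustration only; production evaluation typically uses KD-trees and mesh-aware point sampling.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer Distance between point sets a (N, 3) and b (M, 3).

    Brute-force O(N*M) pairwise version; use a KD-tree for large sets.
    """
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (N, M) distances
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def voxel_iou(vox_a, vox_b):
    """Intersection over Union of two boolean occupancy grids."""
    inter = np.logical_and(vox_a, vox_b).sum()
    union = np.logical_or(vox_a, vox_b).sum()
    return inter / max(union, 1)
```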
[Figure: input solid mesh and the corresponding generated hollow mesh.]
5. DISCUSSION
CRM has thus proved successful at hollow mesh generation when combined with SDFs and the dual-surface extraction algorithm. End-to-end optimization between the inner and outer structures allows the model to achieve geometric transformations that other methods cannot carry out reliably. In addition, incorporating the pre-trained models of the CraftsMan system makes the hollowing process much faster and more robust, making it suitable for real-world applications in 3D modeling, gaming, and virtual reality.
The use of CCMs ensures that CRM remains spatially coherent even for heterogeneous and complex objects. This structured methodology leads to a more granular understanding of spatial relations within 3D objects, and thus to higher-quality hollow outputs.
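To make the CCM input concrete, here is a minimal sketch of how such a map could be rendered for a single orthographic view. The camera convention, axis orientation, and normalization are assumptions for illustration, not the system's actual rendering code.

```python
import numpy as np

def ccm_from_orthographic_depth(depth, mask):
    """Render a Canonical Coordinate Map from a front orthographic depth map.

    Each foreground pixel stores the canonical-space (x, y, z) position of
    the surface point it sees, normalized to [0, 1] and encoded as RGB.
    A real pipeline would render CCMs for all four views with its own
    camera conventions.
    """
    h, w = depth.shape
    ys, xs = np.meshgrid(np.linspace(0.0, 1.0, h), np.linspace(0.0, 1.0, w),
                         indexing="ij")
    # For an orthographic front view, x and y come straight from the pixel
    # grid and z from the normalized depth value.
    z = (depth - depth[mask].min()) / (np.ptp(depth[mask]) + 1e-8)
    ccm = np.stack([xs, 1.0 - ys, z], axis=-1)  # (H, W, 3) in [0, 1]
    ccm[~mask] = 0.0                            # background pixels stay black
    return ccm
```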
6. CONCLUSION
This paper addresses the problem of transforming solid meshes into hollow meshes using the CraftsMan system. By using the Convolutional Reconstruction Model (CRM), which integrates state-of-the-art techniques such as Flexicubes, Signed Distance Fields (SDFs), Dual Marching Cubes, and Canonical Coordinate Maps (CCMs), we present a fast and robust method for generating hollow meshes. By using the checkpoints of the CraftsMan system, CRM sets a new standard for hollow mesh synthesis, demonstrating its feasibility across a range of industries including 3D modeling, gaming, and virtual reality.
REFERENCES
- Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R.: Nerf: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM 65(1), 99–106 (2021)
- Poole, B., Jain, A., Barron, J.T., Mildenhall, B.: Dreamfusion: Text-to-3d using 2d diffusion. arXiv preprint arXiv:2209.14988 (2022)
- Chan, E.R., Lin, C.Z., Chan, M.A., Nagano, K., Pan, B., De Mello, S., Gallo, O., Guibas, L.J., Tremblay, J., Khamis, S., et al.: Efficient geometry-aware 3d generative adversarial networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 16123–16133 (2022)
- Shi, Y., Wang, P., Ye, J., Long, M., Li, K., Yang, X.: Mvdream: Multi-view diffusion for 3d generation. arXiv preprint arXiv:2308.16512 (2023)
- Chou, G., Bahat, Y., Heide, F.: Diffusion-SDF: Conditional generative modeling of signed distance functions. In: International Conference on Computer Vision (ICCV). pp. 2262–2272 (2023)
- Luo, S., Hu, W.: Diffusion probabilistic models for 3d point cloud generation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2021)
- Liu, Y., Lin, C., Zeng, Z., Long, X., Liu, L., Komura, T., Wang, W.: Syncdreamer: Generating multiview-consistent images from a single-view image. arXiv preprint arXiv:2309.03453 (2023)
- Long, X., Guo, Y.C., Lin, C., Liu, Y., Dou, Z., Liu, L., Ma, Y., Zhang, S.H., Habermann, M., Theobalt, C., et al.: Wonder3d: Single image to 3d using cross-domain diffusion. arXiv preprint arXiv:2310.15008 (2023)
- Deitke, M., Liu, R., Wallingford, M., Ngo, H., Michel, O., Kusupati, A., Fan, A., Laforte, C., Voleti, V., Gadre, S.Y., et al.: Objaverse-xl: A universe of 10m+ 3d objects. Advances in Neural Information Processing Systems 36 (2024)
- Deitke, M., Schwenk, D., Salvador, J., Weihs, L., Michel, O., VanderBilt, E., Schmidt, L., Ehsani, K., Kembhavi, A., Farhadi, A.: Objaverse: A universe of annotated 3d objects. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 13142–13153 (2023)