View-consistent Object Removal in Radiance Fields

Case Western Reserve University
ACM MM 2024

Radiance Fields (RFs) have emerged as a crucial technology for 3D scene representation, enabling the synthesis of novel views with remarkable realism. However, as RFs become more widely used, the need for effective editing techniques that maintain coherence across different perspectives becomes evident. Current methods primarily depend on per-frame 2D image inpainting, which often fails to maintain consistency across views, thus compromising the realism of edited RF scenes. In this work, we introduce a novel RF editing pipeline that significantly enhances consistency by requiring the inpainting of only a single reference image. This image is then projected across multiple views using a depth-based approach, effectively reducing the inconsistencies observed with per-frame inpainting. However, projections typically assume photometric consistency across views, which is often impractical in real-world settings. To accommodate realistic variations in lighting and viewpoint, our pipeline adjusts the appearance of the projected views by generating multiple directional variants of the inpainted image, thereby adapting to different photometric conditions. Additionally, we present an effective and robust multi-view object segmentation approach as a valuable byproduct of our pipeline. Extensive experiments demonstrate that our method significantly surpasses existing frameworks in maintaining content consistency across views and enhancing visual quality.
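As a rough illustration of the depth-based projection step mentioned in the abstract, the sketch below warps every pixel of a reference view into a target view. It is not the authors' implementation: the function name `reproject_pixels`, the pinhole camera model, the shared intrinsics matrix `K`, and the 4x4 rigid transform `T_ref_to_tgt` are all assumptions made for this example.

```python
import numpy as np

def reproject_pixels(depth_ref, K, T_ref_to_tgt):
    """Map each reference-view pixel to its location in a target view.

    Hypothetical helper. Assumes a pinhole camera with intrinsics K (3x3)
    shared by both views, a per-pixel depth map for the reference view,
    and a 4x4 rigid transform from reference to target camera coordinates.
    Returns target pixel coordinates (H, W, 2) and target depth (H, W).
    """
    h, w = depth_ref.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Homogeneous pixel coordinates, one row per pixel.
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(float)
    # Back-project: scale each camera ray by its depth.
    rays = pix @ np.linalg.inv(K).T
    pts_ref = rays * depth_ref.reshape(-1, 1)
    # Move the 3D points into the target camera frame.
    pts_h = np.concatenate([pts_ref, np.ones((pts_ref.shape[0], 1))], axis=1)
    pts_tgt = (pts_h @ T_ref_to_tgt.T)[:, :3]
    # Project into the target image plane.
    proj = pts_tgt @ K.T
    with np.errstate(divide="ignore", invalid="ignore"):
        uv = proj[:, :2] / proj[:, 2:3]
    return uv.reshape(h, w, 2), pts_tgt[:, 2].reshape(h, w)
```

In such a sketch, the inpainted reference colors would then be written to the returned target coordinates. Occlusion handling and the appearance adjustment for lighting and viewpoint changes described in the abstract are not covered here.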

Scene Object Removal Results

Comparison with Previous Methods

Panels: Original Scene, SPIn-NeRF, NeRFiller, Ours

Multi-view Consistency

Here we show how our pipeline maintains multi-view consistency by counting keypoint matches across different rendered views. The matches are produced by SuperGlue, and we consider only those within the inpainted region. As the results show, our pipeline yields significantly more matches within the inpainted region, which indicates better multi-view consistency.

Panels: SPIn-NeRF, NeRFiller, Ours
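The consistency measure described in this section, counting keypoint matches that fall inside the inpainted region, can be sketched as follows. This is a minimal example under stated assumptions rather than the evaluation code used for the results: the function name `count_matches_in_mask` is hypothetical, the inputs are assumed to be already-matched pixel coordinates from any matcher (such as SuperGlue), and a match is counted only when both endpoints lie inside their view's inpainting mask.

```python
import numpy as np

def count_matches_in_mask(pts_a, pts_b, mask_a, mask_b):
    """Count matches whose endpoints both lie inside the inpainted region.

    Hypothetical helper. pts_a and pts_b are (N, 2) arrays of matched
    (x, y) pixel coordinates in views A and B; mask_a and mask_b are
    boolean (H, W) arrays marking the inpainted region in each view.
    """
    pts_a = np.asarray(pts_a, dtype=float).reshape(-1, 2)
    pts_b = np.asarray(pts_b, dtype=float).reshape(-1, 2)

    def inside(pts, mask):
        h, w = mask.shape
        x = np.round(pts[:, 0]).astype(int)
        y = np.round(pts[:, 1]).astype(int)
        # Keypoints outside the image bounds never count as inside the mask.
        valid = (x >= 0) & (x < w) & (y >= 0) & (y < h)
        hit = np.zeros(len(pts), dtype=bool)
        hit[valid] = mask[y[valid], x[valid]]
        return hit

    return int(np.sum(inside(pts_a, mask_a) & inside(pts_b, mask_b)))
```

Requiring both endpoints to lie in the mask is one possible convention. Counting matches with at least one endpoint inside the region would be a looser alternative.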

Mask Consistency

Our multi-view segmentation approach not only maintains mask consistency across different views, but also accounts for regions without semantic meaning (e.g., the shadow on the left side of the book).

Panels: Original Scene, Masks

Citation

If you find our work helpful, please consider citing us:

@inproceedings{lu2024view,
  title={View-consistent Object Removal in Radiance Fields},
  author={Lu, Yiren and Ma, Jing and Yin, Yu},
  booktitle={Proceedings of the 32nd ACM International Conference on Multimedia},
  pages={3597--3606},
  year={2024}
}
