Embedding Shepard’s Interpolation into CNN Models for Unguided Depth Completion

Mengistu, Shambel Fente; Pistellato, Mara; Bergamasco, Filippo

doi:10.1007/978-3-031-47546-7_23

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14318)

Included in the following conference series:

International Conference of the Italian Association for Artificial Intelligence

Abstract

When data are acquired as sparse samples, an interpolation method is often needed to fill in the missing information. One such application, known as “depth completion”, consists of estimating dense depth maps from sparse observations (e.g. LiDAR acquisitions). Algorithmic methods fill the depth image by applying a sequence of basic image processing operations, while recent approaches propose data-driven solutions, mostly based on Convolutional Neural Networks (CNNs), to predict the missing information. In this work, we combine learning-based and classical algorithmic approaches, aiming to pair the performance of the former with the generalization ability of the latter. First, we define a novel architecture block called IDWBlock. This component makes it possible to embed Shepard’s interpolation (or Inverse Distance Weighting, IDW) into a CNN model, with the advantage of requiring a small number of parameters regardless of the kernel size. Second, we propose two network architectures that combine the IDWBlock with learning-based depth completion techniques. In the experimental section, we test the models on the KITTI depth completion benchmark and the NYU-depth-v2 dataset, showing that they are robust to input sparsity across different densities and sampling patterns.
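
To make the idea of running Shepard’s interpolation as a convolution more concrete, the following is a minimal NumPy sketch of classical IDW written as a normalized convolution over a sparse depth map. It is not the authors’ IDWBlock: the function names (`idw_kernel`, `correlate2d_same`, `idw_complete`), the convention that a depth of 0 marks a missing pixel, and the use of a single `power` exponent as the only tunable quantity are assumptions made for illustration.

```python
import numpy as np


def idw_kernel(radius, power=2.0):
    """Inverse-distance weights on a (2*radius+1)^2 window (hypothetical helper)."""
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    dist = np.hypot(ys, xs)
    kernel = np.zeros_like(dist, dtype=float)
    nonzero = dist > 0
    kernel[nonzero] = dist[nonzero] ** (-power)
    # The center weight stays 0: observed pixels are copied through separately.
    return kernel


def correlate2d_same(image, kernel):
    """Zero-padded 2D correlation with 'same' output size, NumPy only."""
    r = kernel.shape[0] // 2
    padded = np.pad(image, r, mode="constant")
    h, w = image.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def idw_complete(sparse_depth, radius=3, power=2.0, eps=1e-8):
    """Fill missing pixels (assumed to be encoded as 0) by Shepard's interpolation."""
    sparse_depth = np.asarray(sparse_depth, dtype=float)
    mask = (sparse_depth > 0).astype(float)
    kernel = idw_kernel(radius, power)
    num = correlate2d_same(sparse_depth * mask, kernel)  # sum of w_i * d_i
    den = correlate2d_same(mask, kernel)                 # sum of w_i
    filled = np.where(den > eps, num / np.maximum(den, eps), 0.0)
    # Keep the original measurements where they exist.
    return np.where(mask > 0, sparse_depth, filled)


# Usage: two samples on the middle row of a 5x5 map.
sparse = np.zeros((5, 5))
sparse[2, 0] = 2.0
sparse[2, 4] = 4.0
dense = idw_complete(sparse, radius=4, power=2.0)
print(dense[2])
```

In this sketch the whole window is determined by `power`, so enlarging `radius` adds no free parameters; this is one plausible reading of the abstract’s remark about parameter count, not a description of the published block. Pixels with no sample inside the window are left at 0.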

Notes

1. We can assume without loss of generality that \(I\) is square and that \(S=2a+1,\ a \in \mathbb{N}\). If that is not the case, \(I\) can be padded with zeros to meet this condition.

Author information

Authors and Affiliations

DAIS, Università Ca’ Foscari Venezia, 155, via Torino, Venezia, Italy
Shambel Fente Mengistu, Mara Pistellato & Filippo Bergamasco

Corresponding author

Correspondence to Mara Pistellato.

Editor information

Editors and Affiliations

University of Rome Tor Vergata, Rome, Italy
Roberto Basili
Sapienza University of Rome, Rome, Italy
Domenico Lembo
Roma Tre University, Rome, Italy
Carla Limongelli
National Research Council, Rome, Italy
Andrea Orlandini

About this paper

Cite this paper

Mengistu, S.F., Pistellato, M., Bergamasco, F. (2023). Embedding Shepard’s Interpolation into CNN Models for Unguided Depth Completion. In: Basili, R., Lembo, D., Limongelli, C., Orlandini, A. (eds) AIxIA 2023 – Advances in Artificial Intelligence. AIxIA 2023. Lecture Notes in Computer Science, vol 14318. Springer, Cham. https://doi.org/10.1007/978-3-031-47546-7_23

DOI: https://doi.org/10.1007/978-3-031-47546-7_23
Published: 02 November 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47545-0
Online ISBN: 978-3-031-47546-7
eBook Packages: Computer Science, Computer Science (R0)
