Abstract
Purpose
In bronchoscopy, computer vision systems for navigation assistance are an attractive low-cost solution to guide the endoscopist to target peripheral lesions for biopsy and histological analysis. We propose a decoupled deep learning architecture that projects input frames onto the domain of CT renderings, thus allowing offline training from patient-specific CT data.
Methods
A fully convolutional network architecture is implemented on GPU and tested on a phantom dataset comprising 32 video sequences and \(\sim \)60k frames with aligned ground truth and renderings. The dataset is made available as the first public dataset for bronchoscopy navigation.
Results
An average estimated depth accuracy of 1.5 mm was obtained, outperforming conventional direct depth estimation from input frames by 60%, and with a computational time of \(\le \)30 ms on modern GPUs. Qualitatively, the estimated depth and renderings closely resemble the ground truth.
Conclusions
The proposed method introduces a novel architecture for real-time monocular depth estimation in bronchoscopy that retains patient specificity. Future work will include integration within SLAM systems and collection of in vivo datasets.
Notes
While in our experiments the material \(\textit{violet-rubber}\) was uniformly used, any other physically meaningful BRDF can be used for the purpose.
Acknowledgements
M.V.S. was supported by the Toshiba Fellowship Programme.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest.
Ethical standards
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed consent
This article does not contain any patient data.
Cite this article
Visentini-Scarzanella, M., Sugiura, T., Kaneko, T. et al. Deep monocular 3D reconstruction for assisted navigation in bronchoscopy. Int J CARS 12, 1089–1099 (2017). https://doi.org/10.1007/s11548-017-1609-2
DOI: https://doi.org/10.1007/s11548-017-1609-2