Squeeze-EnGAN: Memory Efficient and Unsupervised Low-Light Image Enhancement for Intelligent Vehicles
Abstract
1. Introduction
- General LLIE models struggle to adequately reflect the characteristics of driving environments. Issues such as noise, overexposure, and blur make it difficult to reliably apply these models to autonomous driving.
- Most deep learning-based LLIE approaches rely on supervised learning methods that require paired normal and low-light images for training. However, obtaining natural image pairs for the same scene is challenging, particularly for driving scenarios where creating such pairs is even more constrained.
- For LLIE to be applied to autonomous driving, computational resource usage must be minimized, but few models prioritize this requirement.
2. Related Works
2.1. Traditional Methods
2.2. Deep Learning Approaches
2.3. Adversarial Learning Without Supervision
2.4. U-net with Fire Module
3. Methods
3.1. Generator with Fire Modules
3.2. Global-Local Discriminator
3.3. Perceptual Loss
4. Experiments
4.1. Datasets and Two Training Sessions
4.2. Implementation Details
4.3. Results
4.4. Performance on Edge System
4.5. Ablation Study
- To investigate the effect of the local discriminator, we conducted training without it.
- All layers of the generator network shown in Figure 2, except for the final convolution layer, were replaced with fire modules.
- Fire modules in the decoder were removed and replaced with standard convolution layers. Therefore, fire modules are used only in the encoder.
- Fire modules in the encoder were removed and replaced with standard convolution layers. Therefore, fire modules are used only in the decoder. A sketch of the fire module underlying these variants is given after this list.
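
For reference, the sketch below shows the fire module assumed throughout these variants in PyTorch-style code. It follows the SqueezeNet/Squeeze U-Net design, where a 1 × 1 squeeze convolution reduces the input to Cmid channels and parallel 1 × 1 and 3 × 3 expand convolutions are concatenated to Cout channels (the column names used in Appendix A); the exact normalization and activation placement may differ from the implementation used in the experiments.

```python
import torch
import torch.nn as nn

class FireModule(nn.Module):
    """Sketch of a SqueezeNet-style fire module: squeeze to c_mid channels,
    then expand with parallel 1x1 and 3x3 convolutions whose outputs are
    concatenated to c_out channels."""

    def __init__(self, c_in: int, c_mid: int, c_out: int):
        super().__init__()
        self.squeeze = nn.Conv2d(c_in, c_mid, kernel_size=1)
        self.bn_squeeze = nn.BatchNorm2d(c_mid)
        # Each expand branch produces half of the output channels.
        self.expand1x1 = nn.Conv2d(c_mid, c_out // 2, kernel_size=1)
        self.expand3x3 = nn.Conv2d(c_mid, c_out // 2, kernel_size=3, padding=1)
        self.bn_out = nn.BatchNorm2d(c_out)
        self.act = nn.LeakyReLU(0.2, inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s = self.act(self.bn_squeeze(self.squeeze(x)))
        out = torch.cat([self.expand1x1(s), self.expand3x3(s)], dim=1)
        return self.act(self.bn_out(out))
```

Under this assumption, the encoder-only and decoder-only variants above simply swap each fire module for a standard 3 × 3 convolution with the same input and output channel counts.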
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A

Encoder of the generator:

Layer | Cin | Cmid | Cout | Kernel Size
---|---|---|---|---
Convolution 2D | 3 | - | 32 | 3 × 3 |
Convolution 2D | 32 | - | 32 | 3 × 3 |
MaxPooling | - | - | - | 2 × 2 |
FireModule | 32 | 32 | 64 | 3 × 3 |
FireModule | 64 | 32 | 64 | 3 × 3 |
MaxPooling | - | - | - | 2 × 2 |
FireModule | 64 | 48 | 128 | 3 × 3 |
FireModule | 128 | 48 | 128 | 3 × 3 |
MaxPooling | - | - | - | 2 × 2 |
FireModule | 128 | 64 | 256 | 3 × 3 |
FireModule | 256 | 64 | 256 | 3 × 3
MaxPooling | - | - | - | 2 × 2
FireModule | 256 | 80 | 512 | 3 × 3
FireModule | 512 | 80 | 512 | 3 × 3

Decoder of the generator:

Layer | Cin | Cmid | Cout | Kernel Size
---|---|---|---|---
Upsampling | - | - | - | - |
FireModule | 512 | 80 | 256 | 3 × 3 |
FireModule | 512 | 64 | 256 | 3 × 3 |
FireModule | 256 | 64 | 256 | 3 × 3 |
Upsampling | - | - | - | - |
FireModule | 256 | 64 | 128 | 3 × 3 |
FireModule | 256 | 48 | 128 | 3 × 3 |
FireModule | 128 | 48 | 128 | 3 × 3 |
Upsampling | - | - | - | - |
FireModule | 128 | 48 | 64 | 3 × 3 |
FireModule | 128 | 32 | 64 | 3 × 3 |
FireModule | 64 | 32 | 64 | 3 × 3 |
Upsampling | - | - | - | - |
Convolution 2D | 64 | - | 32 | 3 × 3 |
Convolution 2D | 64 | - | 32 | 3 × 3 |
Convolution 2D | 32 | - | 3 | 3 × 3 |
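
To illustrate how these rows compose, the hedged sketch below assembles the first downsampling stage of the encoder (DS1 in the layer-wise comparison further below), reusing the FireModule sketch from Section 4.5: 2 × 2 max pooling followed by the two fire modules listed in the encoder table.

```python
import torch.nn as nn

# Illustrative only: encoder stage DS1 built from the rows of the encoder
# table above. FireModule refers to the sketch given in Section 4.5.
ds1 = nn.Sequential(
    nn.MaxPool2d(kernel_size=2),             # 2 × 2 max pooling
    FireModule(c_in=32, c_mid=32, c_out=64),
    FireModule(c_in=64, c_mid=32, c_out=64),
)
```

Each such stage halves the spatial resolution while the channel count follows the 32 → 64 → 128 → 256 → 512 progression in the table.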
References
- Gruyer, D.; Magnier, V.; Hamdi, K.; Claussmann, L.; Orfila, O.; Rakotonirainy, A. Perception, information processing and modeling: Critical stages for autonomous driving applications. Annu. Rev. Control 2017, 44, 323–341.
- Wang, P. Research on comparison of lidar and camera in autonomous driving. J. Phys. Conf. Ser. 2021, 2093, 012032.
- Janai, J.; Güney, F.; Behl, A.; Geiger, A. Computer vision for autonomous vehicles: Problems, datasets and state of the art. Found. Trends Comput. Graph. Vis. 2020, 12, 1–308.
- Song, H.; Cho, J.; Ha, J.; Park, J.; Jo, K. Panoptic-FusionNet: Camera-LiDAR fusion-based point cloud panoptic segmentation for autonomous driving. Expert Syst. Appl. 2024, 251, 123950.
- Yeong, D.J.; Velasco-Hernandez, G.; Barry, J.; Walsh, J. Sensor and sensor fusion technology in autonomous vehicles: A review. Sensors 2021, 21, 2140.
- Xiao, Y.; Jiang, A.; Ye, J.; Wang, M.W. Making of night vision: Object detection under low-illumination. IEEE Access 2020, 8, 123075–123086.
- Muhammad, K.; Ullah, A.; Lloret, J.; Del Ser, J.; de Albuquerque, V.H.C. Deep learning for safe autonomous driving: Current challenges and future directions. IEEE Trans. Intell. Transp. Syst. 2020, 22, 4316–4336.
- Liu, S.; Liu, L.; Tang, J.; Yu, B.; Wang, Y.; Shi, W. Edge computing for autonomous driving: Opportunities and challenges. Proc. IEEE 2019, 107, 1697–1716.
- Liu, L.; Lu, S.; Zhong, R.; Wu, B.; Yao, Y.; Zhang, Q.; Shi, W. Computing systems for autonomous driving: State of the art and challenges. IEEE Internet Things J. 2020, 8, 6469–6486.
- Jiang, Y.; Gong, X.; Liu, D.; Cheng, Y.; Fang, C.; Shen, X.; Wang, Z. EnlightenGAN: Deep light enhancement without paired supervision. IEEE Trans. Image Process. 2021, 30, 2340–2349.
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Bengio, Y. Generative adversarial nets. arXiv 2014, arXiv:1406.2661.
- Iandola, F.N. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv 2016, arXiv:1602.07360.
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI 2015), Munich, Germany, 5–9 October 2015.
- Pizer, S.M.; Amburn, E.P.; Austin, J.D.; Cromartie, R.; Geselowitz, A.; Greer, T.; Zuiderveld, K. Adaptive histogram equalization and its variations. Comput. Vis. Graph. Image Process. 1987, 39, 355–368.
- Land, E.H.; McCann, J.J. Lightness and retinex theory. J. Opt. Soc. Am. 1971, 61, 1–11.
- Lore, K.G.; Akintayo, A.; Sarkar, S. LLNet: A deep autoencoder approach to natural low-light image enhancement. Pattern Recognit. 2017, 61, 650–662.
- Radford, A. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv 2015, arXiv:1511.06434.
- Ledig, C.; Theis, L.; Huszar, F.; Caballero, J.; Cunningham, A.; Acosta, A.; Aitken, A.; Tejani, A.; Totz, J.; Wang, Z.; et al. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 4681–4690.
- Chen, Z.; Zeng, Z.; Shen, H.; Zheng, X.; Dai, P.; Ouyang, P. DN-GAN: Denoising generative adversarial networks for speckle noise reduction in optical coherence tomography images. Biomed. Signal Process. Control 2020, 55, 101632.
- Li, R.; Pan, J.; Li, Z.; Tang, J. Single Image Dehazing via Conditional Generative Adversarial Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 8202–8211.
- Yuan, Y.; Liu, S.; Zhang, J.; Zhang, Y.; Dong, C.; Lin, L. Unsupervised Image Super-Resolution Using Cycle-in-Cycle Generative Adversarial Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Salt Lake City, UT, USA, 18–22 June 2018; pp. 701–710.
- Chen, Y.S.; Wang, Y.C.; Kao, M.H.; Chuang, Y.Y. Deep Photo Enhancer: Unpaired Learning for Image Enhancement from Photographs with GANs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 6306–6314.
- Beheshti, N.; Johnsson, L. Squeeze U-Net: A Memory and Energy Efficient Image Segmentation Network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Virtual Event, 14–19 June 2020; pp. 364–365.
- Odena, A.; Dumoulin, V.; Olah, C. Deconvolution and checkerboard artifacts. Distill 2016, 1, e3.
- Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125.
- Lim, J.S.; Astrid, M.; Yoon, H.J.; Lee, S.I. Small Object Detection Using Context and Attention. In Proceedings of the 2021 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), Jeju Island, Republic of Korea, 13–16 April 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 181–186.
- Jolicoeur-Martineau, A. The relativistic discriminator: A key element missing from standard GAN. arXiv 2018, arXiv:1807.00734.
- Mao, X.; Li, Q.; Xie, H.; Lau, R.Y.; Wang, Z.; Smolley, S.P. Least Squares Generative Adversarial Networks. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2794–2802.
- Johnson, J.; Alahi, A.; Li, F.-F. Perceptual Losses for Real-Time Style Transfer and Super-Resolution. In Proceedings of the 14th European Conference on Computer Vision (ECCV 2016), Amsterdam, The Netherlands, 11–14 October 2016; Springer International Publishing: Berlin/Heidelberg, Germany, 2016; pp. 694–711.
- Simonyan, K. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
- RichardWebster, B.; Anthony, S.E.; Scheirer, W.J. PsyPhy: A psychophysics driven evaluation framework for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 41, 2280–2286.
- AIHub. High-Precision Data Collection Vehicle Night City Road Data. Available online: https://www.aihub.or.kr/aihubdata/data/view.do?dataSetSn=71580 (accessed on 13 January 2025).
- AIHub. High-Precision Data Collection Vehicle Daytime City Road Data. Available online: https://www.aihub.or.kr/aihubdata/data/view.do?dataSetSn=71577 (accessed on 13 January 2025).
- Wei, C.; Wang, W.; Yang, W.; Liu, J. Deep Retinex Decomposition for Low-Light Enhancement. arXiv 2018, arXiv:1808.04560.
- Dang-Nguyen, D.T.; Pasquini, C.; Conotter, V.; Boato, G. RAISE: A Raw Images Dataset for Digital Image Forensics. In Proceedings of the 6th ACM Multimedia Systems Conference, Portland, OR, USA, 18–20 March 2015; pp. 219–224.
- Kalantari, N.K.; Ramamoorthi, R. Deep high dynamic range imaging of dynamic scenes. ACM Trans. Graph. 2017, 36, 144.
- Cai, J.; Gu, S.; Zhang, L. Learning a deep single image contrast enhancer from multi-exposure images. IEEE Trans. Image Process. 2018, 27, 2049–2062.
- Lee, C.; Lee, C.; Kim, C.S. Contrast enhancement based on layered difference representation of 2D histograms. IEEE Trans. Image Process. 2013, 22, 5372–5384.
- Ma, K.; Zeng, K.; Wang, Z. Perceptual quality assessment for multi-exposure image fusion. IEEE Trans. Image Process. 2015, 24, 3345–3356.
- Wang, S.; Zheng, J.; Hu, H.M.; Li, B. Naturalness preserved enhancement algorithm for non-uniform illumination images. IEEE Trans. Image Process. 2013, 22, 3538–3548.
- Guo, X.; Li, Y.; Ling, H. LIME: Low-light image enhancement via illumination map estimation. IEEE Trans. Image Process. 2016, 26, 982–993.
- Vonikakis, V. Busting Image Enhancement and Tone-Mapping Algorithms. Available online: https://sites.google.com/site/vonikakis/datasets (accessed on 13 January 2025).
- Kingma, D.P. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
- Wu, W.; Weng, J.; Zhang, P.; Wang, X.; Yang, W.; Jiang, J. URetinex-Net: Retinex-Based Deep Unfolding Network for Low-Light Image Enhancement. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 19–24 June 2022; pp. 5901–5910.
- Yan, Q.; Feng, Y.; Zhang, C.; Wang, P.; Wu, P.; Dong, W.; Zhang, Y. You only need one color space: An efficient network for low-light image enhancement. arXiv 2024, arXiv:2402.05809.
- Wang, Y.; Wan, R.; Yang, W.; Li, H.; Chau, L.P.; Kot, A. Low-Light Image Enhancement with Normalizing Flow. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual Event, 22 February–1 March 2022; Volume 36, No. 3, pp. 2604–2612.
- Guo, C.; Li, C.; Guo, J.; Loy, C.C.; Hou, J.; Kwong, S.; Cong, R. Zero-Reference Deep Curve Estimation for Low-Light Image Enhancement. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual Event, 14–19 June 2020; pp. 1780–1789.
- Fu, Z.; Yang, Y.; Tu, X.; Huang, Y.; Ding, X.; Ma, K.K. Learning a Simple Low-Light Image Enhancer from Paired Low-Light Instances. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 18–22 June 2023; pp. 22252–22261.
- Mittal, A.; Soundararajan, R.; Bovik, A.C. Making a “completely blind” image quality analyzer. IEEE Signal Process. Lett. 2012, 20, 209–212.
- Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
- Jocher, G.; Qiu, J. Ultralytics YOLO11. Available online: https://github.com/ultralytics/ultralytics (accessed on 13 January 2025).
- Shaheen, K.; Hanif, M.A.; Hasan, O.; Shafique, M. Continual learning for real-world autonomous systems: Algorithms, challenges and frameworks. J. Intell. Robot. Syst. 2022, 105, 9.

Layer Name | Components (Ours) | #MACs (M) (Ours) | #Params (Ours) | Components (EnlightenGAN *) | #MACs (M) (EnlightenGAN *) | #Params (EnlightenGAN *)
---|---|---|---|---|---|---
Conv1 | Conv2D × 2 | 2646.016 | 10272 | Conv2D × 2 | 2719.744 | 10560
DS1 | MaxPooling × 1, Fire module × 2 | 1556.480 | 24128 | MaxPooling × 1, Conv2D × 2 | 3571.712 | 55680
DS2 | MaxPooling × 1, Fire module × 2 | 1153.024 | 71712 | MaxPooling × 1, Conv2D × 2 | 3555.328 | 221952
DS3 | MaxPooling × 1, Fire module × 2 | 763.904 | 190336 | MaxPooling × 1, Conv2D × 2 | 3547.136 | 886272
DS4 | MaxPooling × 1, Fire module × 2 | 475.776 | 474592 | MaxPooling × 1, Conv2D × 2 | 3543.040 | 3542016
US1 | Upsampling × 1, Fire module × 3 | 1459.520 | 358608 | Upsampling × 1, Conv2D × 3 | 11828.224 | 2950912
US2 | Upsampling × 1, Fire module × 3 | 2266.112 | 138464 | Upsampling × 1, Conv2D × 3 | 11859.968 | 738176
US3 | Upsampling × 1, Fire module × 3 | 3226.624 | 48816 | Upsampling × 1, Conv2D × 3 | 11923.456 | 184768
US4 | Upsampling × 1, Conv2D × 3 | 12034.048 | 46240 | Upsampling × 1, Conv2D × 3 | 12034.048 | 46240
Conv2 | Conv2D × 1 | 25.344 | 99 | Conv2D × 1 | 25.344 | 99
Total | | 25606.848 | 1363267 | | 64608.000 | 8636675
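
The source of the parameter savings can be verified by hand for a single stage. The sketch below reproduces the DS4 counts in the table above, assuming batch normalization after every convolution stage and the fire-module structure sketched in Section 4.5 (a 1 × 1 squeeze to Cmid, then parallel 1 × 1 and 3 × 3 expands of Cout/2 channels each); it is an illustrative check, not the training code.

```python
def conv_params(c_in, c_out, k):
    """Weights + bias of a k x k convolution plus BatchNorm (2 * c_out)."""
    return k * k * c_in * c_out + c_out + 2 * c_out

def fire_params(c_in, c_mid, c_out):
    """Fire module: 1x1 squeeze (with BN), parallel 1x1 and 3x3 expands of
    c_out // 2 channels each, and BN on the concatenated output."""
    squeeze = 1 * 1 * c_in * c_mid + c_mid + 2 * c_mid
    expand1 = 1 * 1 * c_mid * (c_out // 2) + c_out // 2
    expand3 = 3 * 3 * c_mid * (c_out // 2) + c_out // 2
    return squeeze + expand1 + expand3 + 2 * c_out

# DS4 of EnlightenGAN: two standard 3x3 convolutions, 256 -> 512 -> 512.
print(conv_params(256, 512, 3) + conv_params(512, 512, 3))    # 3542016
# DS4 of Squeeze-EnGAN: FireModule(256, 80, 512) + FireModule(512, 80, 512).
print(fire_params(256, 80, 512) + fire_params(512, 80, 512))  # 474592
```

The 3 × 3 kernels operate on the narrow Cmid = 80 squeeze output rather than the full 256- or 512-channel input, which is where most of the roughly 7.5× reduction at DS4 comes from.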

Method | DICM | MEF | NPE | LIME | VV | HP-NCR
---|---|---|---|---|---|---
Input | 4.25 | 4.27 | 4.32 | 4.35 | 3.52 | 7.02 |
URetinex-Net | 4.20 | 3.79 | 4.69 | 4.34 | 3.03 | 6.34 |
CIDNet | 3.79 | 3.56 | 3.74 | 4.13 | 3.21 | 5.82 |
LLFlow | 3.63 | 3.46 | 4.09 | 3.98 | 3.01 | 5.69 |
ZeroDCE | 4.58 | 4.93 | 4.53 | 5.82 | 4.81 | 5.95 |
PairLIE | 4.09 | 4.18 | 4.21 | 4.51 | 3.66 | 5.91 |
EnlightenGAN | 3.50 | 3.23 | 4.11 | 3.72 | 2.58 | 5.48 |
Ours | 3.74 | 3.37 | 4.29 | 3.89 | 2.66 | 5.68 |

PSNR and SSIM on the LOL dataset (SL: supervised learning; UL: unsupervised learning):

Type | Method | PSNR | SSIM
---|---|---|---
SL | URetinex-Net | 21.328 | 0.835
SL | CIDNet | 23.809 | 0.857
SL | LLFlow | 21.149 | 0.854
UL | PairLIE | 19.510 | 0.736
UL | ZeroDCE | 14.861 | 0.559
UL | EnlightenGAN | 17.480 | 0.651
UL | Ours | 16.174 | 0.658
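
For reference, PSNR and SSIM are standard full-reference metrics and can be computed with scikit-image as in the minimal sketch below; the file paths are placeholders, and the evaluation protocol behind the table (e.g., resizing or data-range handling) may differ.

```python
import numpy as np
from skimage import io
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Placeholder paths: one enhanced output and its normal-light ground truth
# from the LOL test split.
pred = io.imread("enhanced/0001.png").astype(np.float64) / 255.0
gt = io.imread("gt/0001.png").astype(np.float64) / 255.0

psnr = peak_signal_noise_ratio(gt, pred, data_range=1.0)
ssim = structural_similarity(gt, pred, data_range=1.0, channel_axis=-1)
print(f"PSNR: {psnr:.3f} dB, SSIM: {ssim:.3f}")
```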

Detector | Metric | Original | EnlightenGAN | Squeeze-EnGAN | Squeeze-EnGAN-R
---|---|---|---|---|---
YOLO11-s | Precision | 0.364 | 0.536 | 0.377 | 0.537
YOLO11-s | Recall | 0.369 | 0.329 | 0.436 | 0.375
YOLO11-s | mAP50 | 0.335 | 0.372 | 0.387 | 0.399
YOLO11-s | mAP50-95 | 0.183 | 0.192 | 0.195 | 0.208
YOLO11-m | Precision | 0.622 | 0.645 | 0.710 | 0.623
YOLO11-m | Recall | 0.394 | 0.418 | 0.410 | 0.453
YOLO11-m | mAP50 | 0.450 | 0.485 | 0.490 | 0.497
YOLO11-m | mAP50-95 | 0.264 | 0.282 | 0.288 | 0.290
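
A hedged sketch of how such a detection evaluation can be scripted with the Ultralytics package is given below; the dataset YAML and the choice of pretrained weights are placeholders rather than the configuration used for the table.

```python
from ultralytics import YOLO

# Placeholder dataset config: a YAML pointing at images enhanced by the
# model under test (e.g., Squeeze-EnGAN outputs) with the original labels.
model = YOLO("yolo11s.pt")            # or "yolo11m.pt" for the larger variant
metrics = model.val(data="night_road_enhanced.yaml")

print("Precision:", metrics.box.mp)   # mean precision over classes
print("Recall:   ", metrics.box.mr)   # mean recall over classes
print("mAP50:    ", metrics.box.map50)
print("mAP50-95: ", metrics.box.map)
```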

Inference time (s) on edge devices:

Method | Jetson Xavier | Jetson Nano
---|---|---
LLFlow | 1.612 | - |
CIDNet | 0.241 | 1.760 |
URetinex-Net | 0.371 | 3.332 |
PairLIE | 0.157 | 1.350 |
EnlightenGAN | 0.088 | 0.651 |
Squeeze-EnGAN (ours) | 0.061 | 0.590 |
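
For context, per-image latency on Jetson-class devices is typically measured by synchronizing the GPU around the forward pass; the sketch below uses a placeholder model and input size, and the warm-up and resolution behind the numbers above may differ.

```python
import time
import torch

@torch.no_grad()
def measure_inference_time(model, input_size=(1, 3, 256, 256), warmup=10, runs=50):
    """Average time in seconds for a single forward pass of `model`."""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = model.to(device).eval()
    x = torch.randn(*input_size, device=device)
    for _ in range(warmup):              # warm-up excludes one-time setup cost
        model(x)
    if device.type == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(runs):
        model(x)
    if device.type == "cuda":
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / runs
```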