Real-ESRGAN: A deep learning approach for general image restoration and its application to aerial images
Abstract
General image restoration is a challenging task in computer vision, especially for images with complex scenes and noise. Practical algorithms for general image restoration, such as Enhanced Super-Resolution Generative Adversarial Networks (ESRGAN), have been developed to address this problem. Real-ESRGAN is a deep learning-based image restoration model that uses a generative adversarial network (GAN) to produce high-resolution images from low-resolution inputs. In recent years, Real-ESRGAN has attracted significant attention for its impressive restoration results on various types of images, including aerial images. Aerial images pose unique challenges, such as high noise, low contrast, and blur, which degrade image quality. ESRGAN has been applied successfully to restore such images and enhance their visual quality, enabling better interpretation and analysis. In this article, we review practical algorithms for general image restoration, with a focus on Real-ESRGAN and its application to aerial images. We discuss the architecture and application strategies of Real-ESRGAN, as well as its advantages and limitations. We also present examples of how Real-ESRGAN has been used in various applications, such as alongside the Segment Anything Model (SAM) for object detection, classification, and segmentation. This study utilized the GÖKTÜRK II 2022 Istanbul Aerial Image dataset, which comprises 917,252 image chips at a resolution of 512x512 pixels in 3-channel RGB format. To enhance visual quality, the image chips were upscaled to 1024x1024 pixels using a 2x scaling factor, resulting in a fourfold increase in data size, equivalent to 2.684 TB at the same compression ratio. This project demonstrates the potential of Real-ESRGAN for handling large-scale and diverse datasets, as well as its ability to enhance the visual quality of aerial images for real-world image restoration, which is essential in fields such as agriculture, urban planning, and disaster management.
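To make the upscaling step described above concrete, the following minimal sketch shows how a single 512x512 chip could be upscaled by a factor of 2 with the RealESRGANer helper from the open-source realesrgan Python package; the weight path and chip file names are illustrative assumptions, not artifacts of this study.

# Minimal sketch (assumed file names): 2x upscaling of one aerial chip with Real-ESRGAN.
import cv2
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer

# RRDB backbone configured for the official RealESRGAN_x2plus (2x) weights.
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64,
                num_block=23, num_grow_ch=32, scale=2)
upsampler = RealESRGANer(scale=2,
                         model_path='weights/RealESRGAN_x2plus.pth',  # assumed local weight path
                         model=model,
                         tile=512,    # tiled inference keeps GPU memory bounded on large runs
                         half=False)

img = cv2.imread('chip_512.png', cv2.IMREAD_COLOR)  # one 512x512, 3-channel chip
sr, _ = upsampler.enhance(img, outscale=2)          # output is 1024x1024
cv2.imwrite('chip_1024.png', sr)

The upscaled chips can then be handed to the Segment Anything Model; a typical automatic mask-generation call through the samgeo package (exact argument names may vary between samgeo versions) looks like the following:

from samgeo import SamGeo

sam = SamGeo(model_type="vit_h")                   # SAM ViT-H backbone; checkpoint is downloaded if missing
sam.generate("chip_1024.tif", output="masks.tif")  # write segmentation masks for a georeferenced chip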
Article Details
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
References
Tsai, R. Y., & Huang, T. S. (1984). Multiframe image restoration and registration. Advances in Computer Vision and Image Processing, 1, 317-339.
Wang, Y., Fevig, R., & Schultz, R. R. (2008, October). Super-resolution mosaicking of UAV surveillance video. In 2008 15th IEEE International Conference on Image Processing, 345-348. https://doi.org/10.1109/ICIP.2008.4711762
Zhang, H., Zhang, L., & Shen, H. (2012). A super-resolution reconstruction algorithm for hyperspectral images. Signal Processing, 92(9), 2082-2096. https://doi.org/10.1016/j.sigpro.2012.01.020
Zhang, L., Dong, R., Yuan, S., Li, W., Zheng, J., & Fu, H. (2021). Making low-resolution satellite images reborn: a deep learning approach for super-resolution building extraction. Remote Sensing, 13(15), 2872. https://doi.org/10.3390/rs13152872
Robinson, M. D., Farsiu, S., Lo, J. Y., & Toth, C. A. (2008, October). Efficient restoration and enhancement of super-resolved X-ray images. In 2008 15th IEEE International Conference on Image Processing, 629-632. https://doi.org/10.1109/ICIP.2008.4711833
Fookes, C., Lin, F., Chandran, V., & Sridharan, S. (2012). Evaluation of image resolution and super-resolution on face recognition performance. Journal of Visual Communication and Image Representation, 23(1), 75-93. https://doi.org/10.1016/j.jvcir.2011.06.004
Ma, L., Zhao, D., & Gao, W. (2012). Learning-based image restoration for compressed images. Signal Processing: Image Communication, 27(1), 54-65. https://doi.org/10.1016/j.image.2011.05.004
Pickup, L., Roberts, S. J., & Zisserman, A. (2003). A sampled texture prior for image super-resolution. Advances in Neural Information Processing Systems, 16.
Wang, X., Xie, L., Dong, C., & Shan, Y. (2021). Real-ESRGAN: Training real-world blind super-resolution with pure synthetic data. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 1905-1914.
Richards, J. A. (1984). Thematic mapping from multitemporal image data using the principal components transformation. Remote Sensing of Environment, 16(1), 35-46. https://doi.org/10.1016/0034-4257(84)90025-7
Carper, W., Lillesand, T., & Kiefer, R. (1990). The use of intensity-hue-saturation transformations for merging SPOT panchromatic and multispectral image data. Photogrammetric Engineering and Remote Sensing, 56(4), 459-467.
Ranchin, T., & Wald, L. (1993). The wavelet transform for the analysis of remotely sensed images. International Journal of Remote Sensing, 14(3), 615-619. https://doi.org/10.1080/01431169308904362
Zhang, Y. (2004). Understanding image fusion. Photogrammetric Engineering & Remote Sensing, 70(6), 657-661.
Nasrollahi, K., & Moeslund, T. B. (2014). Super-resolution: a comprehensive survey. Machine Vision and Applications, 25, 1423-1468. https://doi.org/10.1007/s00138-014-0623-4
Peng, Y., Yang, F., Dai, Q., Xu, W., & Vetterli, M. (2012, March). Super-resolution from unregistered aliased images with unknown scalings and shifts. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 857-860. https://doi.org/10.1109/ICASSP.2012.6288019
Yang, M. C., Huang, D. A., Tsai, C. Y., & Wang, Y. C. F. (2012, July). Self-learning of edge-preserving single image super-resolution via contourlet transform. In 2012 IEEE International Conference on Multimedia and Expo, 574-579. https://doi.org/10.1109/ICME.2012.169
Yang, Y., & Wang, Z. (2011). A new image super-resolution method in the wavelet domain. In 2011 Sixth International Conference on Image and Graphics, 163-167. https://doi.org/10.1109/ICIG.2011.79
Liebel, L., & Körner, M. (2016). Single-image super resolution for multispectral remote sensing data using convolutional neural networks. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 41, 883-890. https://doi.org/10.5194/isprs-archives-XLI-B3-883-2016
Šroubek, F., & Flusser, J. (2006). Resolution enhancement via probabilistic deconvolution of multiple degraded images. Pattern Recognition Letters, 27(4), 287-293. https://doi.org/10.1016/j.patrec.2005.08.010
Ma, W., Pan, Z., Guo, J., & Lei, B. (2018). Super-resolution of remote sensing images based on transferred generative adversarial network. In IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium, 1148-1151. https://doi.org/10.1109/IGARSS.2018.8517442
Moustafa, M. S., & Sayed, S. A. (2021). Satellite imagery super-resolution using squeeze-and-excitation-based GAN. International Journal of Aeronautical and Space Sciences, 22(6), 1481-1492. https://doi.org/10.1007/s42405-021-00396-6
Wang, J., Gao, K., Zhang, Z., Ni, C., Hu, Z., Chen, D., & Wu, Q. (2021). Multisensor remote sensing imagery super-resolution with conditional GAN. Journal of Remote Sensing. https://doi.org/10.34133/2021/9829706
Zhang, K., Sumbul, G., & Demir, B. (2020, March). An approach to super-resolution of Sentinel-2 images based on generative adversarial networks. In 2020 Mediterranean and Middle-East Geoscience and Remote Sensing Symposium (M2GARSS), 69-72. https://doi.org/10.1109/M2GARSS47143.2020.9105165
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., ... & Bengio, Y. (2014). Generative adversarial nets. Advances in Neural Information Processing Systems, 27.
Ledig, C., Theis, L., Huszár, F., Caballero, J., Cunningham, A., Acosta, A., ... & Shi, W. (2017). Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 4681-4690.
Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., Gustafson, L., ... & Girshick, R. (2023). Segment anything. arXiv preprint arXiv:2304.02643. https://doi.org/10.48550/arXiv.2304.02643
Wu, Q., & Osco, L. P. (2023). samgeo: A Python package for segmenting geospatial data with the Segment Anything Model (SAM). Journal of Open Source Software, 8(89), 5663. https://doi.org/10.21105/joss.05663
Gazel Bulut, E. B. (2022). Super-resolution image generation from Earth observation satellites using generative adversarial networks. MS Thesis, Hacettepe University.