Real-ESRGAN: A deep learning approach for general image restoration and its application to aerial images

Şükrü Burak Çetin

Abstract

General image restoration is a challenging task in computer vision, especially for images with complex scenes and noise. Practical algorithms such as Enhanced Super-Resolution Generative Adversarial Networks (ESRGAN) have been developed to address this problem. Real-ESRGAN is a deep learning-based image restoration model that uses a generative adversarial network (GAN) to produce high-resolution images from low-resolution inputs. In recent years, Real-ESRGAN has gained significant attention for its impressive restoration results on various types of images, including aerial images. Aerial images pose unique challenges, such as high noise, low contrast, and blur, which degrade image quality. ESRGAN has been applied successfully to restore such images and enhance their visual quality, enabling better interpretation and analysis. In this article, we review practical algorithms for general image restoration, with a focus on Real-ESRGAN and its application to aerial images. We discuss the architecture and application strategies of Real-ESRGAN, as well as its advantages and limitations. We also present examples of how Real-ESRGAN has been used in downstream applications, such as with the Segment Anything Model (SAM) for object detection, classification, and segmentation. This study used the GÖKTÜRK II 2022 Istanbul Aerial Image dataset, which comprises 917,252 image chips of 512x512 pixels in 3-channel RGB format. To enhance visual quality, the image chips were upscaled to 1024x1024 pixels with a 2x scaling factor, resulting in a fourfold increase in data size, equivalent to 2.684 TB at the same compression ratio. This project shows the potential of Real-ESRGAN for handling large-scale and diverse datasets, as well as its ability to enhance the visual quality of aerial images for real-world image restoration, which is essential in fields such as agriculture, urban planning, and disaster management.
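
The 2x upscaling step described in the abstract can be reproduced with the open-source Real-ESRGAN inference code. The snippet below is a minimal sketch, assuming the realesrgan and basicsr Python packages and locally downloaded RealESRGAN_x2plus weights; the file paths and chip name are illustrative and are not taken from the GÖKTÜRK II dataset.

import cv2
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer

# RRDB backbone configured for the official 2x Real-ESRGAN model.
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64,
                num_block=23, num_grow_ch=32, scale=2)

upsampler = RealESRGANer(
    scale=2,
    model_path='weights/RealESRGAN_x2plus.pth',  # illustrative local path
    model=model,
    tile=512,     # process the image in tiles to bound GPU memory
    half=True)    # fp16 inference; set to False when running on CPU

# 'chips/000001.png' stands in for one 512x512, 3-channel image chip.
chip = cv2.imread('chips/000001.png', cv2.IMREAD_COLOR)
restored, _ = upsampler.enhance(chip, outscale=2)  # 512x512 -> 1024x1024
cv2.imwrite('chips_2x/000001.png', restored)

Looping this over all 917,252 chips yields the upscaled dataset described above; the restored chips can then be fed to downstream tools such as SAM (for example via the samgeo package) for segmentation.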

How to Cite
Çetin, Şükrü B. (2023). Real-ESRGAN: A deep learning approach for general image restoration and its application to aerial images. Advanced Remote Sensing, 3(2), 90–99. Retrieved from https://publish.mersin.edu.tr/index.php/arsej/article/view/1072

References

Tsai, R. Y., & Huang, T. S. (1984). Multiframe image restoration and registration. Advances in Computer Vision and Image Processing, 1, 317-339.

Wang, Y., Fevig, R., & Schultz, R. R. (2008, October). Super-resolution mosaicking of UAV surveillance video. In 2008 15th IEEE International Conference on Image Processing, 345-348. https://doi.org/10.1109/ICIP.2008.4711762

Zhang, H., Zhang, L., & Shen, H. (2012). A super-resolution reconstruction algorithm for hyperspectral images. Signal Processing, 92(9), 2082-2096. https://doi.org/10.1016/j.sigpro.2012.01.020

Zhang, L., Dong, R., Yuan, S., Li, W., Zheng, J., & Fu, H. (2021). Making low-resolution satellite images reborn: a deep learning approach for super-resolution building extraction. Remote Sensing, 13(15), 2872. https://doi.org/10.3390/rs13152872

Robinson, M. D., Farsiu, S., Lo, J. Y., & Toth, C. A. (2008, October). Efficient restoration and enhancement of super-resolved X-ray images. In 2008 15th IEEE International Conference on Image Processing, 629-632. https://doi.org/10.1109/ICIP.2008.4711833

Fookes, C., Lin, F., Chandran, V., & Sridharan, S. (2012). Evaluation of image resolution and super-resolution on face recognition performance. Journal of Visual Communication and Image Representation, 23(1), 75-93. https://doi.org/10.1016/j.jvcir.2011.06.004

Ma, L., Zhao, D., & Gao, W. (2012). Learning-based image restoration for compressed images. Signal Processing: Image Communication, 27(1), 54-65. https://doi.org/10.1016/j.image.2011.05.004

Pickup, L., Roberts, S. J., & Zisserman, A. (2003). A sampled texture prior for image super-resolution. Advances in Neural Information Processing Systems, 16.

Wang, X., Xie, L., Dong, C., & Shan, Y. (2021). Real-ESRGAN: Training real-world blind super-resolution with pure synthetic data. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 1905-1914.

Richards, J. A. (1984). Thematic mapping from multitemporal image data using the principal components transformation. Remote Sensing of Environment, 16(1), 35-46. https://doi.org/10.1016/0034-4257(84)90025-7

Carper, W., Lillesand, T., & Kiefer, R. (1990). The use of intensity-hue-saturation transformations for merging SPOT panchromatic and multispectral image data. Photogrammetric Engineering and Remote Sensing, 56(4), 459-467.

Ranchin, T., & Wald, L. (1993). The wavelet transform for the analysis of remotely sensed images. International Journal of Remote Sensing, 14(3), 615-619. https://doi.org/10.1080/01431169308904362

Zhang, Y. (2004). Understanding image fusion. Photogrammetric Engineering & Remote Sensing, 70(6), 657-661.

Nasrollahi, K., & Moeslund, T. B. (2014). Super-resolution: a comprehensive survey. Machine Vision and Applications, 25, 1423-1468. https://doi.org/10.1007/s00138-014-0623-4

Peng, Y., Yang, F., Dai, Q., Xu, W., & Vetterli, M. (2012, March). Super-resolution from unregistered aliased images with unknown scalings and shifts. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 857-860. https://doi.org/10.1109/ICASSP.2012.6288019

Yang, M. C., Huang, D. A., Tsai, C. Y., & Wang, Y. C. F. (2012, July). Self-learning of edge-preserving single image super-resolution via contourlet transform. In 2012 IEEE International Conference on Multimedia and Expo, 574-579. https://doi.org/10.1109/ICME.2012.169

Yang, Y., & Wang, Z. (2011). A new image super-resolution method in the wavelet domain. In 2011 Sixth International Conference on Image and Graphics, 163-167. https://doi.org/10.1109/ICIG.2011.79

Liebel, L., & Körner, M. (2016). Single-image super resolution for multispectral remote sensing data using convolutional neural networks. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 41, 883-890. https://doi.org/10.5194/isprs-archives-XLI-B3-883-2016

Šroubek, F., & Flusser, J. (2006). Resolution enhancement via probabilistic deconvolution of multiple degraded images. Pattern Recognition Letters, 27(4), 287-293. https://doi.org/10.1016/j.patrec.2005.08.010

Ma, W., Pan, Z., Guo, J., & Lei, B. (2018). Super-resolution of remote sensing images based on transferred generative adversarial network. In IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium, 1148-1151. https://doi.org/10.1109/IGARSS.2018.8517442

Moustafa, M. S., & Sayed, S. A. (2021). Satellite imagery super-resolution using squeeze-and-excitation-based GAN. International Journal of Aeronautical and Space Sciences, 22(6), 1481-1492. https://doi.org/10.1007/s42405-021-00396-6

Wang, J., Gao, K., Zhang, Z., Ni, C., Hu, Z., Chen, D., & Wu, Q. (2021). Multisensor remote sensing imagery super-resolution with conditional GAN. Journal of Remote Sensing. https://doi.org/10.34133/2021/9829706

Zhang, K., Sumbul, G., & Demir, B. (2020, March). An approach to super-resolution of Sentinel-2 images based on generative adversarial networks. In 2020 Mediterranean and Middle-East Geoscience and Remote Sensing Symposium (M2GARSS), 69-72. https://doi.org/10.1109/M2GARSS47143.2020.9105165

Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., ... & Bengio, Y. (2014). Generative adversarial nets. Advances in Neural Information Processing Systems, 27.

Ledig, C., Theis, L., Huszár, F., Caballero, J., Cunningham, A., Acosta, A., ... & Shi, W. (2017). Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 4681-4690.

Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., Gustafson, L., ... & Girshick, R. (2023). Segment anything. Computer Vision and Pattern Recognition. https://doi.org/10.48550/arXiv.2304.02643

Wu, Q., & Osco, L. P. (2023). samgeo: A Python package for segmenting geospatial data with the Segment Anything Model (SAM). Journal of Open Source Software, 8(89), 5663. https://doi.org/10.21105/joss.05663

Gazel Bulut, E. B. (2022). Super-resolution image generation from earth observation satellites using generative adversarial networks. MS Thesis, Hacettepe University.