HiT-RSNet: Enhancing Remote Sensing Effectiveness using Transformer-based Super-Resolution Network and Hierarchical Modeling Approach

Journal article


Publication Details

Author list: Sultan Naveed, Prom-on Santitham

Publisher: Institute of Electrical and Electronics Engineers

Publication year: 2025

Journal: IEEE Access (2169-3536)

Volume number: 13

Start page: 171132

End page: 171155

Number of pages: 24

ISSN: 2169-3536

eISSN: 2169-3536

URL: https://doi.org/10.1109/ACCESS.2025.3615996

Languages: English-United States (EN-US)




Abstract

The accurate interpretation of remote sensing imagery in downstream geospatial tasks, such as urban monitoring, object detection, disaster management, and infrastructure mapping, heavily relies on effective data representation and analysis. In recent years, transformer-based methods have attracted significant interest in remote sensing super-resolution applications. However, these methods often struggle to reconstruct sharp boundaries, preserve fine textures, and maintain spectral-spatial consistency, particularly in complex scenes with dense urban layouts and smooth terrain transitions. These limitations stem from insufficient global context modeling, inadequate attention to geometric structures, and shallow multi-scale feature interactions. This paper proposes HiT-RSNet, a novel hybrid transformer-convolutional architecture that addresses these challenges by jointly exploiting long-range dependencies and fine-grained local details. The model employs a dual-branch design that integrates Hierarchical Region Transformer Blocks (HRTB) for global contextual encoding with Residual Convolutional Attention Modules (RCAM) for local structure refinement. This design improves boundary sharpness and preserves fine textures. Each HRTB comprises three specialized modules: a Channel-Wise Self-Attention Block (CWSAB) for spectral selectivity, a Hierarchical Spatial Attention Block (HSAB) for structure-aware feature learning, and a Multi-Layer Feed-Forward Block (MLFFB) for efficient multi-scale information propagation. Extensive quantitative and qualitative experiments on four benchmarks (UCMerced, AID, RSSCN7, and WHU-RS19) across ×2, ×3, and ×4 scales consistently demonstrate HiT-RSNet's superior performance over state-of-the-art methods. HiT-RSNet provides an efficient and effective solution for enhancing the resolution of remote sensing data. The implementation code is available at https://github.com/CPEKMUTT/HiT-RSNet.
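To give a concrete picture of the dual-branch design described above, the sketch below is a minimal, illustrative PyTorch approximation, not the authors' implementation. The module names (HRTB, RCAM), channel widths, attention choices, and fusion wiring are assumptions made only for illustration; the official code is available at https://github.com/CPEKMUTT/HiT-RSNet.

```python
# Minimal sketch of the dual-branch idea (global transformer branch + local
# convolutional branch) for x2 super-resolution. All design details here are
# illustrative assumptions, not the published HiT-RSNet architecture.
import torch
import torch.nn as nn


class RCAM(nn.Module):
    """Hypothetical residual convolutional attention module (local branch)."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # Simple channel attention as a stand-in for the paper's attention design.
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        y = self.body(x)
        return x + y * self.attn(y)


class HRTB(nn.Module):
    """Hypothetical global branch: plain multi-head self-attention over
    flattened spatial positions, standing in for hierarchical region attention."""
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.LayerNorm(channels),
            nn.Linear(channels, channels * 2),
            nn.GELU(),
            nn.Linear(channels * 2, channels),
        )

    def forward(self, x):
        b, c, h, w = x.shape
        t = x.flatten(2).transpose(1, 2)          # (B, H*W, C) token sequence
        q = self.norm(t)
        a, _ = self.attn(q, q, q)
        t = t + a
        t = t + self.ffn(t)
        return t.transpose(1, 2).reshape(b, c, h, w)


class DualBranchSR(nn.Module):
    """Toy super-resolution network fusing the two branches before upsampling."""
    def __init__(self, channels: int = 64, scale: int = 2):
        super().__init__()
        self.head = nn.Conv2d(3, channels, 3, padding=1)
        self.global_branch = HRTB(channels)
        self.local_branch = RCAM(channels)
        self.fuse = nn.Conv2d(channels * 2, channels, 1)
        self.upsample = nn.Sequential(
            nn.Conv2d(channels, channels * scale ** 2, 3, padding=1),
            nn.PixelShuffle(scale),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, lr):
        f = self.head(lr)
        fused = self.fuse(torch.cat([self.global_branch(f), self.local_branch(f)], dim=1))
        return self.upsample(fused + f)


if __name__ == "__main__":
    sr = DualBranchSR()
    out = sr(torch.randn(1, 3, 48, 48))
    print(out.shape)  # torch.Size([1, 3, 96, 96])
```

The sketch approximates the global branch with standard multi-head self-attention over flattened feature positions and the local branch with a residual block plus simple channel attention; the paper's CWSAB, HSAB, and MLFFB submodules and its RCAM design are considerably richer than this toy example.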


Keywords

Attention mechanisms, Feature extraction, Feature refinement, Hierarchical modeling, Remote sensing, Transformers

