Authors: Xudong Zhao, Mengmeng Zhang, Ran Tao, Wei Li, Wenzhi Liao
DOI:
Keywords:
Abstract: This paper proposes a fractional Fourier image transformer (FrIT) as a backbone network for the joint classification of hyperspectral image (HSI) and light detection and ranging (LiDAR) data. In the proposed FrIT, HSI and LiDAR data are first fused at the pixel level, and both multi-source and HSI feature extractors are used to capture local contexts. A plug-and-play image transformer, FrIT, is then explored for global-context and sequential feature extraction. Unlike the attention-based representations in the classic visual image transformer (VIT), FrIT is capable of substantially speeding up transformer architectures. To reduce the information loss from shallow to deep layers, FrIT is designed to connect contextual features in multiple fractional domains. Finally, to evaluate the performance of FrIT, a new HSI and LiDAR benchmark is provided for extensive experiments, on which the proposed FrIT gains an …
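
The pixel-level fusion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the fusion is performed, so everything here is an assumption for illustration: the function name `fuse_pixel_level`, the per-band min-max scaling, and the choice of channel concatenation are hypothetical and are not taken from the paper.

```python
import numpy as np


def fuse_pixel_level(hsi, lidar):
    """Stack LiDAR channel(s) onto an HSI cube, pixel by pixel.

    Assumed (hypothetical) layout: hsi is (H, W, B), lidar is (H, W) or (H, W, C),
    and both rasters are already co-registered on the same spatial grid.
    """
    hsi = np.asarray(hsi, dtype=np.float64)
    lidar = np.asarray(lidar, dtype=np.float64)
    if lidar.ndim == 2:
        # Treat a single elevation raster as one extra channel.
        lidar = lidar[..., np.newaxis]
    if hsi.shape[:2] != lidar.shape[:2]:
        raise ValueError("HSI and LiDAR must share the same height and width")

    def minmax(x):
        # Scale each band to [0, 1] so reflectance and elevation are comparable.
        lo = x.min(axis=(0, 1), keepdims=True)
        hi = x.max(axis=(0, 1), keepdims=True)
        return (x - lo) / np.maximum(hi - lo, 1e-12)

    # Concatenate along the channel axis: every pixel keeps its own spectrum
    # plus the LiDAR value(s) at the same location.
    return np.concatenate([minmax(hsi), minmax(lidar)], axis=-1)


# Synthetic data only; shapes are arbitrary and not those of any benchmark.
rng = np.random.default_rng(0)
hsi = rng.random((8, 8, 30))
lidar = rng.random((8, 8)) * 50.0
fused = fuse_pixel_level(hsi, lidar)
print(fused.shape)
```

Concatenating along the channel axis is only one simple way to fuse at the pixel level; it keeps every pixel's spectral and elevation values side by side so that a later feature extractor can see both sources at once.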