As the resolution of Earth observation imagery advances, many of the downstream industries that rely on remotely sensed imagery can advance with it. However, the physical limitations of the optical sensors carried by satellites make further gains in the resolution of remote sensing images difficult. It is therefore becoming increasingly important to boost the resolution of Earth observation images by methods other than upgrading the physical components, such as super-resolution. Super-resolution is a computer vision task on which convolutional neural networks (CNNs) perform well, and Transformer-based models also show good performance. This paper proposes a new model, the Convolutional Transformer Generative Adversarial Network (CTGAN), which improves image super-resolution by balancing local features with global features. Results on real satellite datasets demonstrate the effectiveness of the CTGAN model.