With the rapid development of digital communication and the widespread use of the Internet of Things, multi-view image compression has attracted increasing attention as a fundamental technology for image data communication. Multi-view image compression aims to improve compression efficiency by exploiting correlations between images. However, conventional approaches require synchronization and inter-image communication at the encoder side, which poses significant challenges, especially for resource-constrained devices. In this study, we introduce a novel attention-based distributed image compression model for the setting in which side information is available only at the decoder. Our model integrates an encoder network, a quantization module, and a decoder network to achieve both high compression performance and high-quality image reconstruction. The encoder uses a deep Convolutional Neural Network (CNN) to extract high-level features from the input image; these features are then quantized and losslessly entropy coded. The decoder consists of three main components that together exploit the information within and between images on the decoder side. First, a channel-spatial attention module captures and refines information within individual image feature maps. Second, a semi-coupled convolution module extracts both the information shared between images and the information specific to each image. Finally, a cross-attention module fuses the mutual information extracted from the side information. We validate the model on several datasets, including KITTI Stereo and Cityscapes. The results show that our method achieves better compression performance than state-of-the-art techniques.
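To make the decoder-side fusion step concrete, the following is a minimal sketch and not the model's actual implementation. It assumes that the quantization module rounds each latent value to the nearest integer and that the cross-attention module is a standard scaled dot-product attention in which queries come from the primary image's features and keys and values come from the side-information features. The function names `quantize` and `cross_attention` and the toy feature vectors are hypothetical, chosen only for illustration.

```python
import math


def quantize(latent):
    """Round each latent value to the nearest integer (assumed quantizer)."""
    # floor(x + 0.5) rounds halves upward, avoiding Python's banker's rounding.
    return [math.floor(x + 0.5) for x in latent]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention (assumed form of the fusion step).

    queries: feature vectors from the primary image (decoder side)
    keys, values: feature vectors from the side-information image
    Returns one fused vector per query.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    fused = []
    for q in queries:
        # Similarity of this primary feature to every side-information feature.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Weighted sum of side-information values.
        fused.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return fused


# Toy example: two primary features attend over two side-information features.
primary = [[10.0, 0.0], [0.0, 10.0]]
side_keys = [[1.0, 0.0], [0.0, 1.0]]
side_values = [[1.0, 2.0], [3.0, 4.0]]
fused = cross_attention(primary, side_keys, side_values)
```

In this toy example each primary feature is strongly aligned with one side-information key, so each fused vector lies close to the corresponding side-information value. In a full model the queries, keys, and values would be learned projections of convolutional feature maps rather than hand-written vectors.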