Remote sensing is an efficient technology for mapping the soil organic matter (SOM) of croplands during potential bare soil periods. However, in regions where drylands and paddy fields are interleaved, agricultural practices affect the image spectra of the two land types to different degrees and at different times, so both the common multi-temporal image synthesis method and the joint modeling of drylands and paddy fields make accurate SOM prediction difficult. Therefore, this study introduced two improvements: (1) separate modeling for drylands and paddy fields, and (2) a new multi-temporal image synthesis method, termed the optimal image synthesis method. The proposed synthesis method ranked the images by the SOM prediction accuracy of each single image and then selected the top-ranked images for synthesis to improve the prediction. We collected 103 surface soil samples from Youyi Farm in the Sanjiang Plain of Northeast China, 46 from drylands and 57 from paddy fields. Multi-temporal Sentinel-2 images acquired during the potential bare soil period between April and June from 2019 to 2023 were used to build the SOM prediction models. Cross-validation results showed that, for single-date images, the optimal joint model of drylands and paddy fields reached a coefficient of determination (R2) of 0.56 and a root mean square error (RMSE) of 0.71 %, but its accuracy evaluated on paddy fields alone was low, with an R2 of 0.24 and an RMSE of 0.78 %. In contrast, separate modeling achieved higher accuracy in both drylands (R2 = 0.65 and RMSE = 0.58 %) and paddy fields (R2 = 0.43 and RMSE = 0.67 %). The common multi-temporal image synthesis method showed similar results. These results imply that joint modeling probably masks the inferior performance in paddy fields, and that the prediction accuracy for paddy fields could not be improved by modeling them jointly with drylands. 
Compared with the predictions from single-date images and from the common multi-temporal image synthesis method, the optimal image synthesis method improved the prediction accuracy for both drylands and paddy fields, achieving R2 values of 0.74 and 0.52 and RMSEs of 0.50 % and 0.62 %, respectively. Our study demonstrates the necessity and validity of separate modeling for drylands and paddy fields, and shows the potential of the proposed optimal image synthesis method for accurate SOM prediction.
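
The rank-then-select idea behind the optimal image synthesis method can be illustrated with a minimal sketch. It is not the authors' implementation. Several details are assumptions that the abstract does not specify: the regression model (ordinary least squares here), the cross-validation scheme (5-fold), the way selected images are combined (a per-band mean), and the rule for choosing how many top-ranked images to keep (the count with the highest cross-validated R2). The function names `cv_r2`, `rank_images`, `synthesize`, and `optimal_synthesis` are hypothetical.

```python
import numpy as np


def cv_r2(X, y, k=5, seed=0):
    """k-fold cross-validated R2 of an OLS model (model choice is an assumption)."""
    n = len(y)
    idx = np.random.default_rng(seed).permutation(n)
    pred = np.empty(n)
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        # Fit intercept + band coefficients on the training samples.
        A = np.column_stack([np.ones(len(train)), X[train]])
        coef, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        # Predict the held-out samples of this fold.
        pred[fold] = np.column_stack([np.ones(len(fold)), X[fold]]) @ coef
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def rank_images(images, y):
    """Rank single-date images by their own cross-validated R2, best first."""
    scores = [cv_r2(X, y) for X in images]
    order = sorted(range(len(images)), key=lambda i: scores[i], reverse=True)
    return order, scores


def synthesize(images, order, n):
    """Combine the n top-ranked images by a per-band mean (an assumption)."""
    return np.mean([images[i] for i in order[:n]], axis=0)


def optimal_synthesis(images, y):
    """Try the top 1, 2, ... images and keep the count with the highest CV R2."""
    order, _ = rank_images(images, y)
    best_n, best_r2 = 1, -np.inf
    for n in range(1, len(images) + 1):
        r2 = cv_r2(synthesize(images, order, n), y)
        if r2 > best_r2:
            best_n, best_r2 = n, r2
    return order[:best_n], best_r2
```

Here each element of `images` is an array of shape (samples, bands) holding the spectra extracted at the sampling sites for one acquisition date, and `y` holds the measured SOM values. Under the separate-modeling strategy, `optimal_synthesis` would be called once on the dryland samples and once on the paddy-field samples. Because the number of images is chosen with the same cross-validation score that is later reported, the resulting accuracy is likely somewhat optimistic in this sketch.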