Tunnel maintenance is essential for ensuring the safety and health of urban transportation systems. However, traditional inspection methods are labor-intensive, time-consuming, and prone to human error. This paper proposes YOLOv8-CM, a framework that integrates You Only Look Once v8 (YOLOv8) with the Convolutional Block Attention Module (CBAM) to automate the detection and segmentation of defects and other critical objects in tunnels. The Segment Anything Model (SAM) is used to generate accurate polygon masks for the annotated features. A U-shaped network (U-Net) is adopted to segment cracks pixel-wise in low-resolution images by processing split images detected by YOLOv8-CM. The framework successfully detects and segments common tunnel defects (cracks, water seepage, spalling, and rebar exposure) as well as essential maintenance objects (repaired spalling, seam joints, bolted connections, stains, markings, and pipelines), all of which are primary concerns for tunnel maintenance personnel. Results show that YOLOv8-CM outperforms YOLOv8, YOLOv7, and YOLOv5 in mean average precision (mAP), achieving 0.908 for detection and 0.890 for segmentation over all objects. Bolted connections achieve the highest detection and segmentation accuracies, at 0.976 and 0.990, respectively. Cracks exhibit the lowest accuracies, at 0.746 for detection and 0.747 for segmentation, which nevertheless remain noteworthy. YOLOv8-CM also achieves higher detection and segmentation accuracy than Detectron2, particularly for intricate defects such as water seepage and spalling. The proposed YOLOv8-CM framework detects and segments maintenance-critical defects and objects effectively, showing promising potential as a tool for intelligent tunnel monitoring.
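
As an illustration of the image-splitting step that precedes U-Net crack segmentation, the following minimal sketch divides a detected bounding box into overlapping tiles. It is not the authors' implementation: the function name `split_box_into_tiles`, the tile size of 256 pixels, and the overlap of 32 pixels are hypothetical choices made only for this example, since the abstract does not state how the images are split.

```python
def split_box_into_tiles(box, tile=256, overlap=32):
    """Split a bounding box (x_min, y_min, x_max, y_max) into overlapping tiles.

    Hypothetical helper: tile size and overlap are illustrative assumptions,
    not values reported for YOLOv8-CM.
    """
    if overlap >= tile:
        raise ValueError("overlap must be smaller than tile")
    x_min, y_min, x_max, y_max = box
    stride = tile - overlap

    def starts(lo, hi):
        # A span no longer than one tile needs a single (possibly smaller) tile.
        if hi - lo <= tile:
            return [lo]
        positions = list(range(lo, hi - tile + 1, stride))
        # Add a final tile flush with the far edge so the whole span is covered.
        if positions[-1] + tile < hi:
            positions.append(hi - tile)
        return positions

    tiles = []
    for y in starts(y_min, y_max):
        for x in starts(x_min, x_max):
            tiles.append((x, y, min(x + tile, x_max), min(y + tile, y_max)))
    return tiles


if __name__ == "__main__":
    # Example: a 600 x 300 pixel crack detection split into 256-pixel tiles.
    for t in split_box_into_tiles((0, 0, 600, 300)):
        print(t)
```

Each tile could then be cropped from the image and passed to a segmentation network, with the per-tile masks pasted back at the tile offsets. The overlap is there so that a thin crack crossing a tile boundary appears in both neighboring tiles.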