An object detection algorithm based on an improved YOLOv8s is proposed to address the low accuracy and low efficiency of surface defect detection for hot-rolled strip steel. First, an SPPD module is proposed that is based on secondary concatenation of feature maps and incorporates a global attention mechanism (GAM), which enhances the model’s multi-scale information fusion ability. Second, a feature extraction module, DCN-block, that integrates deformable convolution is proposed to enlarge the model’s receptive field and extract complete defect information. Finally, the C2f module in the feature fusion network is replaced with a BoT (bottleneck transformer) structure, in which the Transformer’s multi-head self-attention mechanism is fused with convolution to strengthen the model’s perception of global positional information. Experimental results show that the proposed algorithm achieves a mean average precision (mAP) of 80.5% on the NEU-DET dataset, five percentage points higher than the original YOLOv8 algorithm, while the detection speed reaches 83 frames per second, meeting the requirements of real-time detection.
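To put the reported figures in context, the short Python sketch below derives two quantities from the numbers stated above: the baseline mAP implied by a five-percentage-point gain, and the per-frame time budget implied by 83 frames per second. The function names are hypothetical and introduced only for illustration, and the derived values are approximate, since the stated gain may be rounded.

```python
def implied_baseline_map(improved_map: float, gain_points: float) -> float:
    """Baseline mAP (in %) implied by an absolute gain in percentage points."""
    return improved_map - gain_points


def relative_gain_percent(improved_map: float, gain_points: float) -> float:
    """Relative improvement over the implied baseline, in percent."""
    baseline = implied_baseline_map(improved_map, gain_points)
    return 100.0 * gain_points / baseline


def per_frame_latency_ms(fps: float) -> float:
    """Average time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


# Figures taken from the abstract: 80.5% mAP, +5 percentage points, 83 FPS.
baseline = implied_baseline_map(80.5, 5.0)
relative = relative_gain_percent(80.5, 5.0)
latency = per_frame_latency_ms(83.0)

print(f"implied baseline mAP: {baseline:.1f}%")
print(f"relative gain over baseline: {relative:.1f}%")
print(f"per-frame time at 83 FPS: {latency:.2f} ms")
```

Under these assumptions the implied baseline is about 75.5% mAP, so the absolute gain of five percentage points corresponds to a relative gain of roughly 6.6%, and each frame is processed in about 12 ms on average.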