This paper builds on the lightweight YOLOv5 algorithm and strengthens its feature representation to improve picking-point detection for citrus fruit. Three modifications are made to the original YOLOv5 network model. First, the prior boxes obtained from the K-means clustering algorithm are updated with the binary K-means+IoU algorithm for citrus fruit target detection. Second, the ECANet attention mechanism is added so that the network focuses on important features and suppresses interference from irrelevant ones. Third, WIoU-Loss is adopted as the loss function for the candidate boxes in the recognition network, which yields more precise citrus fruit target recognition. We analyze the effect of each of the three strategies (the ECANet module, the K-means+IoU algorithm, and the WIoU loss function) on the YOLOv5 algorithm. Using the citrus fruit image dataset constructed in this paper under natural environmental conditions, we evaluate the improved YOLOv5 algorithm’s detection performance when citrus fruits overlap or are occluded. The experimental results show that the proposed recognition and detection method achieves an mAP of 94.86%, a precision P of 93.49%, a recall R of 89.26%, and an F1 value of 0.88. Moreover, the positioning error of citrus fruit targets does not exceed 2 mm. These results indicate that the proposed algorithm is effective and can provide reference target points for the end-effector motion of citrus picking robots.
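The abstract does not spell out the binary K-means+IoU procedure, so the following is only a rough illustration of the underlying idea: clustering ground-truth box sizes with the distance d = 1 − IoU rather than Euclidean distance. It is a minimal sketch that assumes plain (not binary) K-means, and the function names `wh_iou` and `kmeans_iou` and the toy box sizes are hypothetical, not taken from the paper.

```python
import random


def wh_iou(box, centroid):
    """IoU of two boxes given as (width, height), aligned at a shared corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_iou(boxes, k, iters=100, seed=0):
    """Plain K-means over (w, h) tuples using d = 1 - IoU as the distance.

    Assumption: this is ordinary K-means, not the paper's binary variant.
    Returns k prior boxes (anchors) sorted by area.
    """
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)  # initialise from k distinct boxes
    for _ in range(iters):
        # Assignment step: smallest 1 - IoU is the same as largest IoU.
        clusters = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda i: wh_iou(b, centroids[i]))
            clusters[best].append(b)
        # Update step: mean width and height of each cluster.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                new_centroids.append((mean_w, mean_h))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return sorted(centroids, key=lambda c: c[0] * c[1])


# Toy (w, h) sizes in pixels: three small and three large boxes (made-up values).
boxes = [(10, 10), (12, 11), (11, 12), (50, 60), (52, 58), (48, 62)]
anchors = kmeans_iou(boxes, k=2)
print(anchors)
```

Using IoU as the similarity makes the clustering insensitive to absolute box scale, which is the usual reason for preferring it over Euclidean distance when fitting prior boxes.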