With the rapid pace of urban construction, accurate decoding of urban scenes has become increasingly important for urban infrastructure planning, intelligent transportation management, and environmental monitoring. This study proposes FastDVFN, an RGB-D urban scene segmentation algorithm based on dynamic adaptive convolution, which aims to overcome the efficiency and real-time limitations of existing fully convolutional network (FCN) methods. The method combines RGB-D point cloud feature extraction with an adaptive convolution mechanism and optimizes the parameter tuning of the convolutional layers to improve both segmentation accuracy and computational efficiency. On the Cityscapes dataset, FastDVFN achieves an mIoU of 72.9%, a 5.2% improvement in accuracy over the traditional method, and runs at 88 frames/s, outperforming comparable lightweight semantic segmentation networks. In the experiments, the parameter count of FastDVFN is 0.63 M, a significant reduction compared with other methods. Comparative experiments and ablation studies demonstrate the effectiveness of the dynamic adaptive convolution and the enhanced channel feature normalization (ECFN) module. The results show that the algorithm offers strong real-time processing capability while maintaining high accuracy, and can meet the practical application requirements of urban scenes.