Novel non-recursive accelerated cascade integrator filter optimization design based on GPU parallel computing

By: Yanhao Guan 1, Yi Lu 1, Guolin Shao 1
1 School of Software, Nanchang University, Nanchang, Jiangxi, 330031, China

Abstract

As computer technology has developed, traditional single-threaded CPU computing has struggled to meet the demands of large-scale data processing, and the GPU, with its powerful parallel computing capability, has become a key technology for optimizing filtering algorithms. In this paper, we propose a new non-recursive accelerated cascade integrator filter design method based on multi-GPU parallel computing optimization. It adopts the CUDA programming model and a synchronous batch normalization technique to make full use of the GPU parallel computing architecture and improve filter performance. The method follows a four-stage optimization process: first, the reference frame image is stored in GPU global memory; second, the adaptive correlation function matrix is calculated and stored in shared memory; third, the filter coefficients are solved in three steps (LU decomposition, forward substitution, and backward substitution); and finally, the reference frame interpolation is computed on the GPU. Performance tests show that, in planar filter computation, the proposed algorithm is accelerated by up to 119.36 times compared with the traditional CPU method and the SIRP+CU method. In X-ray dynamic micro-CT reconstruction, it achieves a speedup ratio of 107.58 over the conventional CPU method when processing 1500 frames of projection data. In radar clutter simulation, the speedup ratio reaches 183.14 at a data volume of 1 × 10⁷, and the average computation time across different data volumes is only 73.15 ms. The experiments demonstrate that the non-recursive accelerated cascade integrator filter based on GPU parallel computing significantly improves processing efficiency while preserving computational accuracy, providing an efficient solution for large-scale computation.
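The three-step coefficient solve named in the abstract (LU decomposition, forward substitution, backward substitution) can be illustrated with a minimal CPU-only sketch in plain Python. This is not the authors' CUDA implementation: the function names, the use of partial pivoting, and the small 3 × 3 example matrix standing in for a correlation matrix are all assumptions made for illustration.

```python
def lu_decompose(a):
    """Factor P*A = L*U (Doolittle form, partial pivoting).

    Returns the packed LU matrix (unit-diagonal L below the diagonal,
    U on and above it) and the row permutation.
    """
    n = len(a)
    lu = [row[:] for row in a]          # work on a copy of the input
    perm = list(range(n))               # perm[i] = original index of row i
    for k in range(n):
        # Pick the row with the largest pivot magnitude in column k.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]        # multiplier, stored in the L part
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def forward_substitute(lu, perm, b):
    """Solve L*y = P*b, where L has an implicit unit diagonal."""
    n = len(lu)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[perm[i]] - sum(lu[i][j] * y[j] for j in range(i))
    return y


def backward_substitute(lu, y):
    """Solve U*x = y, working from the last row upward."""
    n = len(lu)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(lu[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / lu[i][i]
    return x


def solve_coefficients(r, p):
    """Hypothetical helper: solve r * w = p for the coefficient vector w."""
    lu, perm = lu_decompose(r)
    y = forward_substitute(lu, perm, p)
    return backward_substitute(lu, y)


if __name__ == "__main__":
    # Made-up symmetric matrix and right-hand side (not data from the paper).
    R = [[4.0, 2.0, 1.0],
         [2.0, 5.0, 3.0],
         [1.0, 3.0, 6.0]]
    p = [11.0, 21.0, 25.0]
    print(solve_coefficients(R, p))     # approximately [1.0, 2.0, 3.0]
```

Once the factorization is available, each additional right-hand side costs only the two substitution passes, which is one reason the three-step split is a common choice when the same matrix is reused.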