Research on Computer Software Copyright Infringement Determination Method and Legal Response Mechanism Combined with Data Mining Algorithm

By: Shuang Yang¹
¹School of Political Science and Law, Sichuan University of Arts and Science, Dazhou, Sichuan, 635000, China

Abstract

With the rapid development of information technology, computer software copyright infringement has become increasingly prominent, and traditional infringement determination methods suffer from low efficiency and insufficient accuracy when handling massive data and complex code logic. Because determining infringement of computer software differs in character from determining infringement of traditional works, this paper first discusses, from the perspective of academic theory, the key rule of copyright infringement determination: substantial similarity plus contact (often called access). Guided by this rule, the source code of the copyrighted software and of the allegedly infringing software is converted into TF-IDF vectors and processed with an optimized K-Means clustering algorithm. A similarity algorithm for high-dimensional mixed-attribute data is then used to calculate the similarity between the source code vectors and obtain the final mining results. On this basis, a method for determining computer software copyright infringement based on data mining is proposed. In the infringement determination experiment, the accuracy of the method reaches 85% or higher and the highest recall is 87.60%, so the method can provide strong technical support for determining computer software copyright infringement.
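To make the vectorization and similarity steps concrete, the following is a minimal sketch in Python using only the standard library. It is not the paper's implementation. The tokenizer, the smoothed IDF formula, and the function names (`tokenize`, `tfidf_vectors`, `cosine_similarity`) are assumptions chosen for illustration. Plain cosine similarity stands in for the high-dimensional mixed-attribute similarity algorithm, and the optimized K-Means clustering step is omitted, because the abstract specifies neither.

```python
import math
import re
from collections import Counter


def tokenize(source):
    """Split source code into lowercase identifier and keyword tokens.

    Assumption: a simple regex tokenizer; the paper's preprocessing is not given.
    """
    return [t.lower() for t in re.findall(r"[A-Za-z_]\w*", source)]


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (dict: token -> weight) per document."""
    token_lists = [tokenize(doc) for doc in documents]
    n_docs = len(token_lists)

    # Document frequency: number of documents that contain each token.
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty input
        vectors.append({
            # Relative term frequency times a smoothed IDF (always positive).
            tok: (cnt / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0)
            for tok, cnt in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical inputs: a protected snippet, a renamed copy, unrelated code.
original = "def add(a, b):\n    total = a + b\n    return total\n"
suspect = "def add(x, y):\n    total = x + y\n    return total\n"
unrelated = "class Stack:\n    def push(self, item):\n        self.items.append(item)\n"

vec_original, vec_suspect, vec_unrelated = tfidf_vectors(
    [original, suspect, unrelated]
)
print("original vs suspect:  ", cosine_similarity(vec_original, vec_suspect))
print("original vs unrelated:", cosine_similarity(vec_original, vec_unrelated))
```

In this toy example the renamed copy scores higher against the original than the unrelated code does, since it shares the function name and structural tokens. A token-level score of this kind only bears on the substantial similarity prong. Contact (access) remains a matter of evidence that a similarity measure cannot establish.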