Sentencing standardization based on judicial big data using gradient boosting decision trees

Research article
DOI: https://doi.org/10.70517/ijhsa47175

Volume 47, Issue 1
Pages: 865-877
Open Access

By: ^¹, ^¹

¹School of Civil, Commercial, and Economic Law, China University of Political Science and Law

Published: 11/09/2025

Abstract

In the era of digital justice, integrating big data analytics into sentencing decisions has become a key direction for enhancing judicial transparency and fairness. This paper proposes a novel sentencing standardization framework based on judicial big data and interpretable machine learning. Focusing on online fraud adjudication documents from the Chinese judiciary, we construct a domain-specific database using a hybrid of keyword-based pattern matching and association rule analysis to extract structured features such as criminal intent, means, economic loss, and mitigating factors. These features are encoded as machine-readable vectors and fed into a LightGBM-based gradient boosting decision tree (GBDT) model that predicts sentencing outcomes. Experiments on real-world fraud cases show high predictive performance, with R² scores reaching 0.98 and minimal average deviation. A series of visual and statistical evaluations, including boxplots, Taylor diagrams, and regression fits, supports the model’s robustness and its ability to replicate human sentencing logic.

Keywords: judicial big data, sentencing prediction, LightGBM, legal AI, pattern recognition, Chinese court decisions, online fraud, standardization model
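
As a rough illustration of the pipeline summarized in the abstract, the sketch below shows keyword-based pattern matching that turns a judgment text into a fixed-order feature vector, together with the R² metric used to score predictions. It is a minimal sketch under stated assumptions, not the authors' implementation: the English patterns, the feature names (`loss_amount`, `confessed`, `compensated`, `recidivist`), and the helper functions are hypothetical, and the actual documents are Chinese court decisions with a lexicon the abstract does not specify. The association rule analysis and the LightGBM model itself are omitted.

```python
import re

# Hypothetical keyword patterns; the paper's actual lexicon is not given.
PATTERNS = {
    "loss_amount": re.compile(
        r"loss of (?:RMB|CNY)?\s*(\d[\d,]*(?:\.\d+)?)", re.IGNORECASE
    ),
    "confessed": re.compile(r"\bconfess(?:ed|ion)?\b", re.IGNORECASE),
    "compensated": re.compile(r"\bcompensat(?:ed|ion)\b", re.IGNORECASE),
    "recidivist": re.compile(r"\b(?:recidivist|repeat offender)\b", re.IGNORECASE),
}

# Fixed column order so every document maps to a vector of the same layout.
FEATURE_ORDER = ["loss_amount", "confessed", "compensated", "recidivist"]


def extract_features(text):
    """Extract a numeric loss amount and binary flags from a judgment text."""
    features = {}
    match = PATTERNS["loss_amount"].search(text)
    # Strip thousands separators before converting; default to 0.0 if absent.
    features["loss_amount"] = (
        float(match.group(1).replace(",", "")) if match else 0.0
    )
    for name in ("confessed", "compensated", "recidivist"):
        features[name] = 1.0 if PATTERNS[name].search(text) else 0.0
    return features


def encode(features):
    """Turn the feature dict into a machine-readable vector."""
    return [features[name] for name in FEATURE_ORDER]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


sample = (
    "The defendant caused a loss of RMB 120,000, later confessed, "
    "and compensated the victims."
)
vector = encode(extract_features(sample))
```

Vectors of this kind would form the rows of the training matrix passed to a gradient boosting regressor, with the sentence length as the target. `r_squared` can then compare predicted sentences with actual ones on held-out cases.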
