Machine Learning Approaches to Injury Risk Prediction in Sport

Iftikhar Khan1, Hafsa Ali2, Menahil Faheem Siddiqui3 and Gopika Mohanakumaran Nair Geetha4
1. FMH College of Medicine and Dentistry, Lahore, Pakistan
2. Sindh Medical College, Jinnah Sindh Medical University, Karachi, Pakistan
3. Karachi Medical and Dental College, Karachi, Pakistan
4. Independent Research, Sheffield, UK
Correspondence to: Iftikhar Khan, iffykhandir@gmail.com

Premier Journal of Computer Science

Additional information

  • Ethical approval: N/a
  • Consent: N/a
  • Funding: No industry funding
  • Conflicts of interest: N/a
  • Author contribution: Iftikhar Khan, Hafsa Ali, Menahil Faheem Siddiqui and Gopika Mohanakumaran Nair Geetha – Conceptualization, Writing – original draft, review and editing
  • Guarantor: Iftikhar Khan
  • Provenance and peer-review: Unsolicited and externally peer-reviewed
  • Data availability statement: N/a

Keywords: Injury risk prediction, Machine learning models, Sports analytics, Athlete monitoring, Random Forest and XGBoost.

Peer Review
Received: 12 May 2025
Last revised: 12 July 2025
Accepted: 12 July 2025
Version accepted: 5
Published: 28 August 2025

Plain Language Summary Infographic

[Infographic: how machine learning predicts sports injury risk. Random Forest and XGBoost are highlighted as leading models, with deep learning and hybrid approaches also noted. Key predictive features include training load, sleep quality, and previous injury history. The graphic also covers the strengths, limitations, and ethical considerations of ML in sports science.]

Abstract

Injury prediction has emerged as a priority in sports science and performance management as a way to limit athlete downtime and improve performance. Machine learning (ML) has rapidly transformed the field by enabling the detection of complex patterns and risk factors from diverse athlete data sources. This narrative review provides an overview of current knowledge on ML methods, key results, and the challenges encountered when applying them to injury risk prediction in various sports. It is among the first review articles to bring together and examine how ML is used to predict injuries across different sports and model types. It offers practical guidance for sports scientists and clinicians looking to incorporate interpretable ML models into their athlete monitoring and injury prevention efforts. A thorough literature search was conducted using databases such as PubMed, Scopus, and Google Scholar, with keywords including “injury prediction,” “machine learning,” “sports analytics,” and “athlete monitoring.”

Tree-based methods, especially Random Forests and XGBoost variants, consistently performed best, effectively managing non-linear and multi-factorial inputs. Deep and hybrid models remain promising, particularly for multi-modal data sets; however, poor interpretability limits their widespread use. Training load, sleep quality, and previous injury status were key predictive features. Unbalanced datasets, inconsistent injury definitions, and broad prediction windows nevertheless undermine the generalizability and clinical relevance of current models. Although some models report very high accuracy, their utility in real-world settings is limited by small sample sizes and high methodological heterogeneity. For ML to be applied in practice, standardized data protocols need to be instituted, the explainability of injury risk models must be enhanced, and ethical frameworks for data use must be established.

Highlights

  • Machine learning (ML) techniques, particularly Random Forest and XGBoost, are leading tools for predicting the risk of sports injuries.
  • Critical predictors across studies include training load, sleep quality, and a history of previous injuries.
  • Deep learning and hybrid models offer advanced capabilities but require improved interpretability for clinical adoption.
  • Future progress requires standardized methodologies, ethical data practices, and athlete-centered model validation.

Introduction

Children and young athletes participate in sports at high rates, which makes sport a leading cause of injury in this group. A cross-sectional study and a review article indicate that around 20% of school children miss a day of school due to a sports injury, and about one in three seek medical attention for a sports injury.1,2 To prevent these injuries, a four-step model known as van Mechelen’s model has been used for over 25 years. The model involves monitoring injury rates, identifying risk factors, developing prevention strategies, and assessing their effectiveness. Another key approach is to conduct Randomized Controlled Trials (RCTs) to examine the efficacy of prevention methods; case-control and cohort studies can also be employed. The most effective strategy is therefore one that can be readily adopted and that produces realistic results.1 Sports injuries have profound consequences for athletes. Injured athletes are prone to mental health issues, including disordered eating, anxiety, depression, and suicidal thoughts. Furthermore, sports injuries can end careers, provoke negative behavior towards colleagues and family members, and disturb athletes’ physical and emotional well-being.3

The traditional approach to preventing sports injuries draws on evidence from public health and medical research. This research is usually built on RCTs and on reviews of them. Two further approaches are used, namely “evidence-based practice” and “practice-based evidence.” Evidence-based practice involves making decisions in a specific situation based on available data and research. Conversely, practice-based evidence involves addressing everyday medical emergencies, rather than implementing interventions and evaluating their effectiveness.4 The traditional method has certain drawbacks, however. One is the “research-practice gap”: solutions for a specific injury are developed under ideal conditions, and, as the cited article notes, such conditions are rarely observed in practice. Another is that the practitioners responsible for preventing or managing sports injuries differ in professional experience (full-time professional coaches vs. part-time volunteer community coaches), which disrupts the implementation of research-based solutions.4

In modern times, AI is having a substantial positive impact on the prevention of sports injuries. Machine learning (ML) can draw on data about injury history, training loads and techniques, body measurements, health history, and genetic history to create training programs tailored to each individual, thereby reducing injury risk and optimizing athlete performance. Moreover, AI is evolving to forecast future data from current data, alerting athletes and those working with them to potential injuries and health emergencies that may impair performance. This goal is being met through wearable devices that provide immediate feedback for decision-making during sports. AI also supports automation in training, performance tracking, and injury prevention by using chatbots and motion sensors to guide athletes through exercises.5

In the following sections, we describe the methodology used to identify and select relevant studies on ML applications in sports injury prediction. The Methods detail our systematic search strategy, inclusion criteria, and evidence appraisal approach. The Results summarize findings across various ML models, sports, and performance metrics. Finally, the Discussion critically evaluates the models’ strengths and limitations, practical challenges, and ethical considerations, and provides future research directions.

Methodology

This review has been conducted in accordance with the PRISMA 2020 guidelines for systematic reviews.6 A comprehensive literature search was performed using multiple databases, including PubMed, Scopus, Google Scholar, and IEEE Xplore. The following Boolean search strategy was used across databases:

(“injury prediction” OR “injury risk”) AND (“machine learning” OR “artificial intelligence”) AND (“sports” OR “athletes” OR “sports analytics” OR “athlete monitoring”)
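
For reproducibility, the Boolean string above can be assembled from its three concept groups rather than typed by hand. The following is a minimal sketch in Python; the helper names (`or_group`, `build_query`) and the variable names for the term groups are illustrative assumptions, and database-specific field tags or syntax variations are not covered.

```python
def or_group(terms):
    """Quote each term and join the terms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*groups):
    """Join several OR-groups with AND to form the full search string."""
    return " AND ".join(or_group(group) for group in groups)


# Term groups copied from the search strategy; the variable names are assumptions.
outcome_terms = ["injury prediction", "injury risk"]
method_terms = ["machine learning", "artificial intelligence"]
population_terms = ["sports", "athletes", "sports analytics", "athlete monitoring"]

query = build_query(outcome_terms, method_terms, population_terms)
print(query)
```

Keeping the term groups as lists makes it easier to report exactly which terms were used and to rerun the same search in each database.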

Searches were restricted to articles published between January 2010 and March 2025, ensuring the inclusion of contemporary ML methodologies. Language filters were applied to include only studies published in English or Spanish. Inclusion criteria encompassed observational studies evaluating ML techniques for sports injury prediction. No restrictions were applied regarding the athletes’ age, gender, level of play (from amateur to professional), or type of injury. Studies were selected in a multi-step process: initial title and abstract screening was followed by full-text assessment for eligibility. Discrepancies were resolved by consensus among reviewers. Figure 1 presents a PRISMA 2020 flow diagram adapted to the study selection process. A total of 40 articles were ultimately included after screening 380 records.

Figure 1: PRISMA-style flow diagram of study selection process.
Note: This PRISMA diagram is adapted for use in a narrative review.

Evidence Appraisal

To assess the certainty of evidence in the included studies, we applied the GRADE-Narrative approach, adapted for narrative reviews.7 This method evaluates the quality of evidence across key outcomes based on study limitations, inconsistency, indirectness, imprecision, and publication bias. Each outcome was rated as high, moderate, low, or very low certainty of evidence. Studies were grouped by ML model type and sport-specific application, and quality was synthesized narratively. Table 1 summarizes the resulting certainty of evidence for the ML models applied in sports injury prediction; the factors considered include risk of bias, consistency, imprecision, and potential publication bias.

Table 1: Summary of evidence certainty by ML model (GRADE-narrative assessment).

| ML Model | Evidence Base | Risk of Bias | Consistency | Imprecision | Publication Bias | Certainty of Evidence |
|---|---|---|---|---|---|---|
| Random Forest (RF) | 5 studies | Moderate | Moderate | Serious | Suspected | Low |
| XGBoost | 3 studies | Moderate | Low | Serious | Suspected | Very Low |
| Support Vector Machine (SVM) | 4 studies | Low | Moderate | Moderate | Undetected | Moderate |
| Deep Learning (DL) | 2 studies | Serious | Low | Serious | Likely | Very Low |
| Hybrid/Ensemble | 3 studies | Moderate | Moderate | Moderate | Suspected | Low |

ML Techniques in Injury Prediction – ML Methods Overview

Supervised Learning

This section outlines the key ML paradigms and variables used in injury risk prediction models. ML models can be broadly categorized into supervised, unsupervised, and DL approaches. Supervised learning involves labeled datasets and includes algorithms such as RFs, SVMs, and logistic regression.8,9 Unsupervised learning, such as clustering and principal component analysis, identifies hidden patterns in unlabeled data.2 DL models, including neural networks and recurrent architectures, are particularly effective for processing complex, multimodal inputs like video or sensor data.10,11 Each model was selected based on its fit for the dataset characteristics, interpretability, and performance in prior literature.

DL

Another type of ML is DL, which is based on neural network models inspired by how the human brain functions. Unlike other types of ML, DL can extract features directly from raw data (such as images and videos). DL relies on backpropagation, an effective technique that computes how the prediction error depends on each of the model’s internal parameters so that gradient descent can adjust them during learning. This enables the training of deep neural networks even with massive datasets.11 In many domains, DL has enhanced performance. For instance, DL models now outperform conventional ML techniques in human activity recognition, such as assessing movement using wearable sensors or videos. These advancements have been particularly evident in fields that utilize computer vision and inertial measurement units, resulting in more accurate identification and analysis of human movement. Additionally, recurrent neural networks (RNNs), which retain information from past inputs to inform future predictions, have also demonstrated significant improvements.11
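
To make the interplay of backpropagation and gradient descent concrete, the sketch below trains a deliberately tiny network (one tanh hidden unit feeding one sigmoid output) in plain Python. It is an illustration under stated assumptions, not a model from any reviewed study: the four data points, the single input feature (imagined here as a standardized training load), the initial weights, and the learning rate are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, params):
    """Forward pass: input -> tanh hidden unit -> sigmoid output probability."""
    w1, b1, w2, b2 = params
    h = math.tanh(w1 * x + b1)
    p = sigmoid(w2 * h + b2)
    return h, p


def train(xs, ys, params, lr=0.1, epochs=500):
    """Full-batch gradient descent; gradients are obtained by backpropagation."""
    losses = []
    n = len(xs)
    for _ in range(epochs):
        grads = [0.0, 0.0, 0.0, 0.0]
        loss = 0.0
        w2 = params[2]
        for x, y in zip(xs, ys):
            h, p = forward(x, params)
            loss += -(y * math.log(p) + (1 - y) * math.log(1 - p))
            # Backward pass (chain rule), from the output back to the input layer.
            dz2 = p - y                    # d(loss)/d(output pre-activation)
            dz1 = dz2 * w2 * (1 - h * h)   # propagated through tanh
            grads[0] += dz1 * x            # d(loss)/d(w1)
            grads[1] += dz1                # d(loss)/d(b1)
            grads[2] += dz2 * h            # d(loss)/d(w2)
            grads[3] += dz2                # d(loss)/d(b2)
        losses.append(loss / n)
        # Gradient descent step: move each parameter against its mean gradient.
        params = [w - lr * g / n for w, g in zip(params, grads)]
    return params, losses


# Hypothetical toy data: one standardized feature, label 1 = injured.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]
params, losses = train(xs, ys, [0.5, 0.0, 0.5, 0.0])
print(f"loss before: {losses[0]:.3f}, loss after: {losses[-1]:.3f}")
```

Deep networks repeat the same backward step across many layers and millions of parameters, which is why they need far more data than this toy setting suggests.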

Each algorithm applied in the literature was selected based on the nature of the available data, the dimensionality of the features, and the need for interpretability. For instance, RF and XGBoost were often favored for their capacity to handle non-linear data and imbalanced datasets, whereas SVMs excelled in smaller-sample contexts due to their margin maximization strategy. DL, despite offering the best performance on multimodal inputs such as video or GPS data, often lacks transparency and requires large datasets to prevent overfitting.
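
The idea behind RF-type models, averaging many simple trees that are each fitted to a bootstrap resample with a randomly chosen feature, can be sketched in a few lines. The example below is a simplified illustration using depth-one trees (decision stumps) in plain Python, not a full Random Forest and not the implementation used in any reviewed study; the eight athletes, their weekly load and sleep values, and the injury labels are hypothetical.

```python
import random


def fit_stump(rows, labels, feature):
    """Find the single threshold on one feature that minimizes squared error."""
    values = sorted(set(row[feature] for row in rows))
    mean = sum(labels) / len(labels)
    best_err, best = float("inf"), (None, mean, mean)
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [y for row, y in zip(rows, labels) if row[feature] <= thr]
        right = [y for row, y in zip(rows, labels) if row[feature] > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if err < best_err:
            best_err, best = err, (thr, lm, rm)
    return (feature,) + best


def predict_stump(stump, row):
    feature, thr, left_mean, right_mean = stump
    if thr is None or row[feature] <= thr:
        return left_mean
    return right_mean


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Bagging: each stump sees a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # sample with replacement
        feature = rng.randrange(n_features)          # random feature choice
        forest.append(
            fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature)
        )
    return forest


def predict_forest(forest, row):
    """Average the stump outputs to obtain an injury-risk estimate in [0, 1]."""
    return sum(predict_stump(s, row) for s in forest) / len(forest)


# Hypothetical athletes: (weekly training load, mean sleep hours), label 1 = injured.
rows = [(300, 8.0), (350, 7.5), (400, 7.8), (420, 7.2),
        (600, 6.0), (650, 5.5), (700, 6.2), (750, 5.0)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

forest = fit_forest(rows, labels)
p_low = predict_forest(forest, (250, 8.5))    # light load, long sleep
p_high = predict_forest(forest, (800, 4.5))   # heavy load, short sleep
print(p_low, p_high)
```

Because each stump only compares a feature against a threshold, no assumption about the feature's distribution or scale is needed, which is one reason tree ensembles cope well with heterogeneous monitoring data.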

Results

This review aimed to identify ML strategies for predicting the risk of injury in multiple sports. Most studies have focused on sports with a high risk of injury, such as football, soccer, rugby, and basketball. We identified six studies reporting the use of ML in predicting football (soccer) injuries. The studies (Table 2) by Anne Hecksteden et al., Nikki Rommers et al., Diogo Nuno Freitas et al., Iñaki Ruiz-Pérez et al., Jon L. Oliver et al., and Reza Saberisani et al. included 88, 734, 34, 206, 355, and 25 football players, respectively. Four studies were prospective, and one had a longitudinal design.1–6,8 Two studies on basketball players, by Juri Taborri et al. and Susanne Jauhiainen et al., were analyzed; they reported on 39 basketball players and 314 basketball and floorball players, respectively.9,10 Furthermore, one retrospective study by Arie-Willem de Leeuw et al. included 14 volleyball players.11 The remaining two studies, by Maria Henriquez et al. and Susanne Jauhiainen et al., involved 122 and 880 general athletes, respectively.11,12

Table 2: Summary of all the ML studies with details about sample size, ML model used, and performance metric.

| Author | Sport | Sample Size | Study Design | ML Models | Performance Metrics | Key Findings |
|---|---|---|---|---|---|---|
| Anne Hecksteden et al. | Football | 88 | Not specified | Gradient Boosting | ROC AUC | Gradient boosting was used; performance was evaluated via the ROC AUC. |
| Nikki Rommers et al. | Football | 734 | Prospective | XGBoost | F1-score | High-performing XGBoost model; used F1-score. |
| Diogo Nuno Freitas et al. | Football | 34 | Not specified | SVMs, FNNs, AdaBoost | ROC AUC | Reported 74.22% overall accuracy, 71.43% sensitivity, 74.19% specificity. |
| Iñaki Ruiz-Pérez et al. | Football | 206 | Longitudinal | Decision Tree, AD Tree, SVMs | ROC AUC, F-score | A combination of tree-based and SVM models was used. |
| Jon L. Oliver et al. | Football | 355 | Not specified | Decision Tree | ROC AUC | Applied decision trees to predict injury. |
| Reza Saberisani et al. | Football | 25 | Not specified | Decision Tree | ROC AUC | Small sample study using decision trees. |
| Juri Taborri et al. | Basketball | 39 | Not specified | Landing Error Score System | F1-score | Injury risk is classified using biomechanics. |
| Susanne Jauhiainen et al. | Basketball & Floorball | 314 | Not specified | RF, Logistic Regression | ROC AUC (0.98) | Very high performance noted (AUC 0.98). |
| Arie-Willem de Leeuw et al. | Volleyball | 14 | Retrospective | Subgroup Discovery | Not specified | Small sample, subgroup-based pattern mining. |
| Maria Henriquez et al. | Mixed Athletes | 122 | Not specified | RF | ROC AUC (0.689) | Used ROC AUC; mainly false positives reported. |
| Susanne Jauhiainen et al. | Mixed Athletes | 880 | Not specified | RF, SVM | ROC AUC | Applied multiple ML models in a large cohort. |

Among the assessed models, strong performance was consistently observed for the RF and XGBoost models. RF models can handle features with a range of distributions, as they make no formal distributional assumptions. They can also manage multimodal data, allowing meaningful relationships between features and outcome variables to be interpreted. Maria Henriquez et al. used the area under the receiver operating characteristic curve (ROC AUC) to evaluate the performance of their RF models.11 The final ROC AUC was 0.689, with errors primarily resulting from false positives rather than false negatives.11 Susanne Jauhiainen et al. also used RF, and the training ROC AUC values were high (AUC 0.98). An ROC AUC of 1 indicates perfect discrimination between injured and uninjured athletes, while a value of 0.5 implies a purely random prediction.10 The reported AUC for RF models ranged from 0.78 to 0.98, suggesting good discrimination.10,11,13 The SVM also demonstrated high performance, with an AUC ranging from 0.85 to 0.96.3,9,10 XGBoost was also regarded as a high-performing model, with a reported precision of 84% for injury prediction.2 One study reported that SVM, Feedforward Neural Networks (FNNs), and Adaptive Boosting (AdaBoost) showed good accuracy in detecting injuries, with a sensitivity of 71.43%, a specificity of 74.19%, and an overall accuracy of 74.22%.3
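
The metrics reported above can be computed directly from predicted risk scores and observed outcomes. The sketch below, written in plain Python, computes the ROC AUC as the proportion of injured/uninjured pairs in which the injured athlete receives the higher score, and derives sensitivity, specificity, and accuracy at a fixed decision threshold. The eight labels and scores are hypothetical, and the 0.5 threshold is an assumption chosen only for illustration.

```python
def roc_auc(y_true, scores):
    """Share of (injured, uninjured) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(y_true, scores, threshold=0.5):
    """Sensitivity, specificity, and accuracy when flagging scores >= threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(y_true)
    return sensitivity, specificity, accuracy


# Hypothetical outcomes (1 = injured) and model scores for eight athletes.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

auc = roc_auc(y_true, scores)
sens, spec, acc = threshold_metrics(y_true, scores)
print(f"AUC={auc:.3f} sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

In this toy case 14 of the 15 injured/uninjured pairs are ordered correctly, so the AUC is about 0.93, while the threshold-based metrics depend on where the cut-off is placed. This is why an AUC and an accuracy figure from different studies are not directly comparable.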

Discussion

The current review aims to analyze the integration of ML in sports such as football, volleyball, basketball, and floorball to predict injuries that players may experience.

Model Performance: A comparative summary of the strengths, limitations, and optimal applications of various ML models is provided in Table 3. These models estimate the likelihood of injury, enabling coaches to adjust training intensity and techniques to prevent potential injuries. Gradient boosting and AdaBoost are also ML algorithms that are transparent and simpler to implement.14

Table 3: Comparative appraisal of ML models in sports injury prediction.

| ML Model | Strengths | Limitations | Best Use Case/Sport | Performance Metrics | Interpretability |
|---|---|---|---|---|---|
| RF | Robust to noise; handles non-linear, multi-factorial data; works with unbalanced datasets. | Less interpretable (“black-box”); performance can drop with high-dimensional irrelevant variables. | Football, multi-sport datasets | AUC: 0.78–0.98 | Low |
| XGBoost | High efficiency; excels with imbalanced data; regularization helps prevent overfitting. | Requires extensive parameter tuning; computationally intensive on large datasets. | Youth football, elite training groups | Precision: ~84% | Medium |
| SVM | Excellent for small, high-dimensional datasets; strong generalization with kernel tricks. | Low scalability; complex kernel configurations reduce transparency. | Basketball, volleyball | AUC: 0.85–0.96 | Medium |
| DL | Powerful with multimodal inputs (video, GPS, sensor); learns from raw data automatically. | Requires large labeled datasets, high training cost, and poor interpretability. | Wearables, elite teams with rich data | Varies (often high but dataset-dependent) | Low |
| Hybrid/Ensemble Models | Combines the strengths of multiple algorithms; resilient to noise and overfitting. | Difficult to interpret which component drives predictions; higher computational demands. | Research settings: diverse sports | Accuracy: ~74.2%, Sensitivity: ~71.4%, Specificity: ~74.2% | Low–Medium |

Challenges

Several challenges remain. For instance, some studies had very small sample sizes, ranging from 14 to 122.1,3,8,11 Furthermore, interpreting and explaining model outputs is a significant barrier; coaches and doctors may be discouraged from adopting tools they do not understand. Technical experts would be required to decipher the results, which could increase costs. However, emerging tools such as SHapley Additive exPlanations (SHAP) provide actionable, interpretable output that is easier to understand.2 SHAP is an explainability framework that can be integrated with ML models to offer insight into the model’s decision process.15 Furthermore, measures such as AUC may not be the best option for assessing ML performance, as they only consider binary outcomes, that is, whether an individual is injured or not. By contrast, measures such as the Brier score and the logarithmic loss assess the accuracy of the predicted probability of injury itself.16 At present, many ML models are also regarded as ‘black boxes’, which makes them less transparent and impedes independent evaluation of their performance, uses, and comprehensibility.15
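
The difference between ranking-based and probability-based measures can be shown with a small worked example. In the sketch below, two hypothetical models rank four athletes identically, so their AUC is the same, but one produces confident probabilities and the other hovers near 0.5; the Brier score (mean squared difference between predicted probability and outcome) and the logarithmic loss separate them. All values are invented for illustration.

```python
import math


def roc_auc(y_true, probs):
    """Share of (injured, uninjured) pairs ranked correctly; ties count as half."""
    pos = [p for y, p in zip(y_true, probs) if y == 1]
    neg = [p for y, p in zip(y_true, probs) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, probs):
    """Mean squared error of the predicted probabilities (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


def log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-likelihood; eps keeps log() away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, probs):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical outcomes (1 = injured) and two models with identical rankings.
y_true = [1, 1, 0, 0]
model_a = [0.90, 0.80, 0.20, 0.10]   # confident probabilities
model_b = [0.60, 0.55, 0.45, 0.40]   # same ordering, barely informative

for name, probs in (("A", model_a), ("B", model_b)):
    print(name, roc_auc(y_true, probs), brier_score(y_true, probs), log_loss(y_true, probs))
```

Both models reach an AUC of 1.0, yet the Brier score is 0.025 for model A and about 0.181 for model B, which matters when a practitioner needs to act on the probability itself rather than on a ranking.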

Many studies have employed cost-sensitive models for detecting injuries, an approach adopted because of the imbalance between injury and non-injury data.3 However, some studies did not use this approach.2,8,11 Other studies considered only the first injury rather than the multiple injuries sustained throughout a season.2 Moreover, from a practical perspective, the growing number of confounding variables limits the ability of ML models to capture the actual risk of injury. For example, an athlete might have a very high risk of injury but not get injured due to a lack of playing time, while players with a low risk of injury might be harmed due to confounding factors.11 GPS-based models can be inaccessible for many practitioners in applied sports settings due to their high cost (250 euros per unit).4 Data availability also poses a significant challenge for ML, which depends on the nature and features of the data to work efficiently and needs data to be continuously available; securing this availability can be costly.15 Newer studies have suggested that wearable devices and mobile applications can replace older laboratory-based motion data collection methods.3,10
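
One common way to make a model cost-sensitive is to weight each class inversely to its frequency, so that a missed injury costs more than a false alarm. The sketch below uses the widely applied "balanced" heuristic, weight = n_samples / (n_classes × class count), together with a weighted logarithmic loss. The reviewed studies do not state which weighting scheme they used, so this heuristic, the 90:10 class split, and the constant prediction of 0.1 are assumptions for illustration only.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


def weighted_log_loss(labels, probs, weights):
    """Log loss in which each sample contributes in proportion to its class weight."""
    total_weight = sum(weights[y] for y in labels)
    loss = 0.0
    for y, p in zip(labels, probs):
        loss += -weights[y] * (math.log(p) if y == 1 else math.log(1 - p))
    return loss / total_weight


# Hypothetical cohort: 90 uninjured athletes (0) and 10 injured athletes (1).
labels = [0] * 90 + [1] * 10
# A lazy model that predicts a 10% injury risk for everyone.
probs = [0.1] * 100

weights = balanced_class_weights(labels)
unweighted = weighted_log_loss(labels, probs, {0: 1.0, 1: 1.0})
weighted = weighted_log_loss(labels, probs, weights)
print(weights, round(unweighted, 3), round(weighted, 3))
```

With these weights each injured athlete counts nine times as much as an uninjured one, so the uninformative model is penalized far more heavily under the weighted loss than under the unweighted one.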

Comparative Appraisal of ML Models

RF: RF models are robust to noise and missing values, making them suitable for real-world sports data. They consistently rank high in AUC scores (0.78–0.98), particularly in football and multi-sport datasets. However, their “black-box” nature hinders clinical acceptance due to low interpretability. Moreover, their performance may degrade in high-dimensional datasets with many irrelevant variables unless feature selection is optimized.

XGBoost: XGBoost, known for its computational efficiency and superior handling of imbalanced datasets, achieved precision scores of up to 84%. It is particularly effective in football and elite youth cohorts. However, it demands extensive parameter tuning and may overfit on smaller datasets.

SVMs: SVMs demonstrated strong performance, with AUCs ranging from 0.85 to 0.96, in small to mid-sized datasets. Their strength lies in handling high-dimensional spaces; however, the kernel trick complicates both interpretability and scalability. They are best suited for controlled settings, such as basketball or volleyball.

DL: DL models, such as RNNs and FNNs, have demonstrated high accuracy, particularly when processing video or wearable sensor data. However, their need for large annotated datasets and poor explainability limits their broader use. These models are more promising in elite teams with access to rich, continuous monitoring.

Hybrid Models and Ensembles: Techniques combining SVM, RF, and AdaBoost showed balanced performance (e.g., 74.2% overall accuracy), benefiting from ensemble robustness. However, computational load and difficulty in identifying which component drives prediction hinder real-time application. Table 2 summarizes the performance metrics of these ML models in prior studies and contextualizes their real-world usability in sports injury prediction.

Ethical Considerations

Ethical considerations are also essential when designing, governing, and implementing ML-based models, including factors such as honesty, truthfulness, transparency, privacy, and safety. AI-related ethical issues must be disclosed to health professionals and athletes beforehand to ensure compliance with ethical standards.15 The introduction of bias is another direct ethical problem that needs to be taken into consideration, as biased models can lead to incorrect decisions. Algorithms can discriminate based on race, which is a crucial factor in healthcare and its delivery. For example, under a heart-related mortality algorithm from the American Heart Association, if two patients present with similar symptoms but one is White and the other is Black, the prediction indicates that the White patient is at higher risk, encouraging doctors to allocate more resources to the White patient. Such issues are even more serious if they go undetected by medical professionals, who would then be unable to stop algorithms from learning or integrating such bias.17,18 Security is a further major challenge; multiple recent studies have acknowledged the susceptibility of ML systems to adversarial ML attacks, and such attacks have been observed on medical systems that employ ML.19

There are also concerns regarding the dehumanization of clinical decision-making due to healthcare professionals’ over-reliance on algorithms. Physicians might neglect patients’ values and past experiences when deciding on an intervention and rely solely on algorithms, which might also limit intervention choices.18 Transparent reporting standards, such as TRIPOD-AI and PROBAST-AI, are being developed to enhance ethical compliance in model development and validation.20

Research Gaps and Implications for Practice

Despite promising results, current research exhibits several gaps. Most studies rely on single-center datasets with small sample sizes, which limits their external validity and reproducibility. There is also a lack of standardized injury definitions and uniform data collection protocols, which hinders meta-analysis and model comparison. Additionally, few studies account for recurrent injuries or dynamic player conditions across a season. This review offers novelty by consolidating and critically examining ML approaches across a diverse range of team sports, including underrepresented fields like floorball. We also uniquely highlight model interpretability, real-world adoption barriers, and ethical dimensions that are often overlooked in prior reviews. Our integration of emerging concepts such as SHAP, adversarial ML, and federated learning adds forward-looking value to the current literature.

Using the GRADE-Narrative method, we observed that most studies offer very low to low certainty of evidence, mainly due to small sample sizes, heterogeneity in ML models, lack of standardized injury definitions, and high risk of publication bias. For example, while RF and XGBoost models showed high predictive metrics, the absence of external validation in most studies reduced overall confidence. Studies applying SVMs and hybrid models demonstrated slightly higher consistency in reporting; however, the limited interpretability of DL models and inconsistent follow-up protocols in the included research further downgraded certainty ratings.

Future Directions

To enhance practical relevance, future research should prioritize multicenter, longitudinal studies that represent diverse athletic populations. Techniques like federated learning can be employed to aggregate data from multiple sources without compromising privacy.21 Transfer learning may also allow knowledge sharing across different sports or populations with limited data.22 Explainable AI methods, such as SHAP and LIME, should be integrated to improve understanding and facilitate real-world adoption by coaches and sports medicine professionals.2
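
SHAP builds on Shapley values, which attribute a single prediction to its input features by averaging each feature's marginal contribution over all orders in which features could be added. The sketch below computes exact Shapley values by enumerating those orders for a three-feature model in plain Python. It does not use the SHAP library, which relies on approximations and a background dataset; the risk formula, its coefficients, the feature names, and the all-zero baseline are hypothetical.

```python
from itertools import permutations


def risk_model(f):
    """Hypothetical risk score with a load x prior-injury interaction term."""
    return (0.05
            + 0.10 * f["load"]
            + 0.08 * f["sleep_deficit"]
            + 0.15 * f["prior_injury"]
            + 0.10 * f["load"] * f["prior_injury"])


def shapley_values(model, x, baseline):
    """Exact Shapley values: average marginal contribution over all feature orders."""
    names = list(x)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)          # start from the reference athlete
        previous = model(current)
        for name in order:
            current[name] = x[name]       # switch this feature to the athlete's value
            value = model(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


# Hypothetical athlete (standardized features) and an all-zero reference athlete.
athlete = {"load": 1.0, "sleep_deficit": 1.0, "prior_injury": 1.0}
baseline = {"load": 0.0, "sleep_deficit": 0.0, "prior_injury": 0.0}

phi = shapley_values(risk_model, athlete, baseline)
print(phi)
```

In this toy case the interaction term is split equally between load and prior injury, giving contributions of about 0.15, 0.08, and 0.20 that sum to the difference between the athlete's score (0.48) and the baseline score (0.05). Output of this kind lets a coach see which factor drives an individual risk estimate.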

Practitioners should note that ML tools are most effective when used in conjunction with expert knowledge and experience.23–25 For maximum benefit, interdisciplinary collaboration is crucial, and end-users must be trained to interpret model outputs effectively. The translation of model predictions into actionable training and rehabilitation strategies will determine the actual impact of ML on athlete health.26,27 Injury prediction in sports presents unique challenges that necessitate the development of tailored ML models. No single model universally outperforms others; rather, model selection should be context-dependent, balancing accuracy with interpretability and resource availability. Future studies should focus on comparative validations across different sports using unified datasets to establish model benchmarks.28–30

Conclusion

In this review, we showed that ML is proving to be a transformative force in sports injury prediction, offering promising solutions for early detection, athlete monitoring, and performance optimization. This review emphasized that tree-based models such as RF and XGBoost currently lead the field due to their adaptability, ability to handle nonlinear data, and overall robustness. DL models, while powerful for multimodal data such as videos and sensor outputs, remain constrained by interpretability issues that limit their clinical applicability. Despite promising accuracy metrics, real-world implementation remains challenging. Common barriers include small sample sizes, heterogeneous methodologies, inconsistent injury definitions, and limited generalizability.31

Furthermore, ethical considerations, particularly data privacy, transparency, and informed consent, must be embedded within future ML applications in sports. Cost constraints, lack of standard data protocols, and a reliance on ‘black-box’ algorithms further reduce model trust and adoption by practitioners and coaches. To move forward, future research should focus on developing explainable models using frameworks such as SHAP, standardizing injury definitions, and building multicenter datasets for improved model validation and generalization. Collaboration between sports scientists, medical professionals, data analysts, and ethicists will be critical in transforming ML tools into practical, athlete-centered solutions. Finally, integrating ML into routine athlete care must prioritize accuracy, interpretability, fairness, and ethical use, ensuring that technological innovation genuinely serves to protect and enhance the well-being and performance of athletes.

References
  1. Emery CA, Pasanen K. Current trends in sport injury prevention. Best Pract Res Clin Rheumatol. 2019;33(1):3–15. https://doi.org/10.1016/j.berh.2019.02.009
  2. Emery CA, Tyreman H. Sport participation, sport injury, risk factors and sport safety practices in Calgary and area junior high schools. Paediatr Child Health. 2009;14(7):439–44.
  3. Tranaeus U, Gledhill A, Johnson U, Podlog L, Wadey R, Wiese Bjornstal D, et al. 50 years of research on the psychology of sport injury: a consensus statement. Sports Med. 2024;54(7):1733–48. https://doi.org/10.1007/s40279-024-02045-w
  4. Tee JC, McLaren SJ, Jones B. Sports injury prevention is complex: we need to invest in better processes, not singular solutions. Sports Med. 2020;50(4):689–702. https://doi.org/10.1007/s40279-019-01232-4
  5. Reis FJ, Alaiti RK, Vallio CS, Hespanhol L. Artificial intelligence and machine-learning approaches in sports: concepts, applications, challenges, and future perspectives. Brazil J Phys Ther. 2024;28:101083. https://doi.org/10.1016/j.bjpt.2024.101083
  6. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Int J Surg. 2021;88:105906.
  7. Murad MH, Mustafa RA, Schünemann HJ, Sultan S, Santesso N. Rating the certainty in evidence in the absence of a single estimate of effect. Evid Based Med. 2017;22(3):85–7. https://doi.org/10.1136/ebmed-2017-110668
  8. Huang X. Predictive models: regression, decision trees, and clustering. Appl Comput Eng. 2024;79:124–33. https://doi.org/10.54254/2755-2721/79/20241551
  9. Vargas M, Biggs D, Larraín T, Alvear A, Pedemonte JC, de Anestesiología R. Inteligencia artificial en medicina: Métodos de modelamiento (Parte I). Rev Chil Anest. 2022;51(5):527–34. https://doi.org/10.25237/revchilanestv5129061230
  10. Cust EE, Sweeting AJ, Ball K, Robertson S. Machine and deep learning for sport-specific movement recognition: a systematic review of model development and performance. J Sports Sci. 2019;37(5):568–600. https://doi.org/10.1080/02640414.2018.1521769
  11. Grossberg S. Recurrent neural networks. Scholarpedia. 2013;8(2):1888. https://doi.org/10.4249/scholarpedia.1888
  12. Hecksteden A, Schmartz GP, Egyptien Y, Aus der Fünten K, Keller A, Meyer T. Forecasting football injuries by combining screening, monitoring and machine learning. Sci Med Football. 2023;7(3):214–28. https://doi.org/10.1080/24733938.2022.2095006
  13. Rommers N, Rössler R, Verhagen E, Vandecasteele F, Verstockt S, Vaeyens R, et al. A machine learning approach to assess injury risk in elite youth football players. Med Sci Sports Exerc. 2020;52(8):1745–51. https://doi.org/10.1249/MSS.0000000000002305
  14. Freitas DN, Mostafa SS, Caldeira R, Santos F, Fermé E, Gouveia ÉR, et al. Predicting noncontact injuries of professional football players using machine learning. PLoS One. 2025;20(1):e0315481.
  15. Ruiz-Pérez I, López-Valenciano A, Hernández-Sánchez S, Puerta-Callejón JM, De Ste Croix M, Sainz de Baranda P, et al. A field-based approach to determine soft tissue injury risk in elite futsal using novel machine learning techniques. Front Psychol. 2021;12:610210. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7892460/
  16. Oliver JL, Ayala F, De Ste Croix MBA, Lloyd RS, Myer GD, Read PJ. Using machine learning to improve our understanding of injury risk and prediction in elite male youth football players. J Sci Med Sport. 2020;23(11):1044–8.
  17. Char DS, Shah NH, Magnus D. Implementing machine learning in health care – addressing ethical challenges. N Engl J Med. 2018;378(11):981–3. https://doi.org/10.1056/NEJMp1714229
  18. O’Reilly-Shah VN, Gentry KR, Walters AM, Zivot J, Anderson CT, Tighe PJ. Bias and ethical considerations in machine learning and the automation of perioperative risk assessment. Br J Anaesth. 2020;125(6):843–6. https://doi.org/10.1016/j.bja.2020.07.040
  19. Rasheed K, Qayyum A, Ghaly M, Al-Fuqaha A, Razi A, Qadir J. Explainable, trustworthy, and ethical machine learning for healthcare: a survey. Comput Biol Med. 2022;149:106043. https://doi.org/10.1016/j.compbiomed.2022.106043
  20. Collins GS, Dhiman P, Andaur Navarro CL, Ma J, Hooft L, Reitsma JB, et al. Protocol for development of a reporting guideline (TRIPOD-AI) and risk of bias tool (PROBAST-AI) for diagnostic and prognostic prediction model studies based on artificial intelligence. BMJ Open. 2021;11:e048008. https://doi.org/10.1136/bmjopen-2020-048008
  21. Ng D, Lan X, Yao MM, Chan WP, Feng M. Federated learning: a collaborative effort to achieve better medical imaging models for individual sites that have small labelled datasets. Quant Imaging Med Surg. 2021;11(2):852–7. https://doi.org/10.21037/qims-20-595
  22. Hosna A, Merry E, Gyalmo J, Alom Z, Aung Z, Azim MA. Transfer learning: a friendly introduction. J Big Data. 2022;9(1):102. https://doi.org/10.1186/s40537-022-00652-w
  23. Saberisani R, Barati AH, Zarei M, Santos P, Gorouhi A, Ardigò LP, et al. Prediction of football injuries using GPS-based data in Iranian professional football players: a machine learning approach. Front Sports Act Living. 2025;7:1425180. https://doi.org/10.3389/fspor.2025.1425180
  24. Taborri J, Molinaro L, Santospagnuolo A, Vetrano M, Vulpiani MC, Rossi S. A machine-learning approach to measure the anterior cruciate ligament injury risk in female basketball players. Sensors. 2021;21(9):3141. https://doi.org/10.3390/s21093141
  25. Jauhiainen S, Kauppi JP, Krosshaug T, Bahr R, Bartsch J, Äyrämö S. Predicting ACL injury using machine learning on data from an extensive screening test battery of 880 female elite athletes. Am J Sports Med. 2022;50(11):2917–24. https://doi.org/10.1177/03635465221112095
  26. de Leeuw AW, van der Zwaard S, van Baar R, Knobbe A. Personalized machine learning approach to injury monitoring in elite volleyball players. Eur J Sport Sci. 2022;22(4):511–20. https://doi.org/10.1080/17461391.2021.1887369
  27. Henriquez M, Sumner J, Faherty M, Sell T, Bent B. Machine learning to predict lower extremity musculoskeletal injury risk in student athletes. Front Sports Act Living. 2020;2:576655. https://doi.org/10.3389/fspor.2020.576655
  28. Jauhiainen S, Kauppi JP, Leppänen M, Pasanen K, Parkkari J, Vasankari T, et al. New machine learning approach for detection of injury risk factors in young team sport athletes. Int J Sports Med. 2020;42(2):175–82. https://doi.org/10.1055/a-1231-5304
  29. Rossi A, Pappalardo L, Cintia P, Iaia FM, Fernàndez J, Medina D. Effective injury forecasting in soccer with GPS training data and machine learning. PLoS One. 2018;13(7):e0201264. https://doi.org/10.1371/journal.pone.0201264
  30. Amendolara A, Pfister D, Settelmayer M, Shah M, Wu V, Donnelly S, et al. An overview of machine learning applications in sports injury prediction. Cureus. 2023;15:e46170. Available from: https://assets.cureus.com/uploads/review_article/pdf/177498/20231029-9676-1s7wljl.pdf
  31. Van Eetvelde H, Mendonça LD, Ley C, Seil R, Tischer T. Machine learning methods in sport injury prediction and prevention: a systematic review. J Exp Orthop. 2021;8(1):27. https://doi.org/10.1186/s40634-021-00346-x

