How Artificial Intelligence is Shaping Life Sciences: Systematic Qualitative Review with a Future-Oriented Analytical Framework

Ambreen Ilyas
School of Biological Sciences, University of the Punjab, Lahore, Pakistan
Correspondence to: Ambreen Ilyas, ambreen2.phd.sbs@pu.edu.pk

Premier Journal of Biomedical Science

Additional information

  • Ethical approval: N/a
  • Consent: N/a
  • Funding: No industry funding
  • Conflicts of interest: The author declares no conflict of interest.
  • Author contribution: Ambreen Ilyas – Conceptualization, Writing – original draft, review and editing.
  • Guarantor: Ambreen Ilyas
  • Provenance and peer-review: Unsolicited and externally peer-reviewed
  • Data availability statement: All data used in this review are derived from published sources cited in the manuscript and supplementary files.

Keywords: Biological digital twins, Multi-omics integration, Explainable artificial intelligence, AI-augmented experimental design, Sustainability-oriented biotechnology.

Peer Review
Received: 3 January 2026
Last revised: 7 February 2026
Accepted: 8 February 2026
Version accepted: 9
Published: 4 March 2026

Plain Language Summary Infographic
Infographic titled “How Artificial Intelligence is Shaping Life Sciences: Systematic Qualitative Review with a Future-Oriented Analytical Framework,” illustrating AI applications across genomics, drug discovery, precision medicine, agriculture, and environmental biology. It highlights benefits such as accelerated discovery, personalized treatment, smart agriculture, and sustainable science; addresses challenges including data quality, interpretability, ethical governance, and system interoperability; and presents future directions such as explainable AI, hybrid data-knowledge models, digital twins, and responsible AI frameworks for next-generation life science innovation.
Abstract

Artificial intelligence (AI) is rapidly evolving from a supportive analytical tool into a central driver of future innovation in the life sciences. As biological research enters an era defined by large-scale, high-dimensional, and continuously generated data, AI is increasingly positioned to shape how biological knowledge is discovered, validated, and translated into real-world applications. This future-oriented review synthesizes emerging trends in AI-driven life-science research, emphasizing the transition toward digitally integrated, data-centric, and adaptive research ecosystems.

Current evidence indicates that machine learning and deep learning approaches will play a pivotal role in redefining experimental design, predictive modeling, and decision-making across genomics, drug discovery, precision medicine, agriculture, and environmental biology. Looking forward, AI is expected to enable seamless integration across biological scales from molecular interactions to ecosystem dynamics through intelligent data fusion, automated hypothesis generation, and real-time learning systems. These advances are likely to accelerate discovery while supporting sustainable and resilient biological innovation. Despite its transformative potential, the future deployment of AI in life sciences is constrained by challenges related to data quality, interpretability, ethical governance, and system interoperability.

Emerging trends such as explainable artificial intelligence, hybrid data-knowledge models, digital twins, and responsible AI frameworks are increasingly recognized as essential for building trust, reproducibility, and regulatory acceptance. This review combines systematic evidence mapping with a future-oriented analytical framework to highlight the key technological, methodological, and conceptual directions expected to define the next generation of AI-enabled life sciences, and to guide responsible and biologically aligned AI innovation. By positioning AI as a collaborative and adaptive scientific partner rather than a purely computational instrument, future research can better align AI with biological understanding, societal needs, and long-term sustainability goals.

Introduction

Artificial intelligence (AI) is poised to play a defining role in the future trajectory of life-science research, driven by the rapid expansion of biological data and the growing complexity of biological systems. Advances in high-throughput sequencing, multi-omics technologies, advanced imaging, environmental sensing, and automated experimentation are generating data at a scale and resolution that demand fundamentally new analytical paradigms.1–3 AI-based approaches, particularly machine learning (ML) and deep learning (DL), are increasingly viewed not only as solutions to current analytical challenges but as foundational technologies shaping the future of biological discovery.4,5 Early AI applications in life sciences were largely retrospective and task-specific, focusing on pattern recognition, classification, and prediction within narrowly defined datasets.6,7 However, emerging trends suggest a shift toward prospective, integrative, and adaptive AI systems capable of learning across diverse biological scales and contexts.8,9 Future AI-driven life-science platforms are expected to connect molecular, cellular, organismal, and ecosystem-level data through digitally integrated pipelines, enabling continuous feedback between data generation, modeling, and experimental validation.10,11

As AI systems become more deeply embedded in biological research, concerns regarding data bias, robustness, and generalizability are expected to intensify.12 The limited interpretability of many DL models remains a major obstacle to scientific trust, regulatory approval, and real-world implementation, particularly in clinical and policy-relevant domains.13 Consequently, future trends emphasize the development of explainable artificial intelligence (XAI), hybrid modeling strategies, and interoperable digital infrastructures that can support transparency, reproducibility, and accountability.14 In parallel, ethical, environmental, and societal considerations are increasingly shaping the future direction of AI in life sciences. Responsible data governance, equitable access, and sustainability-aware system design are now recognized as integral components of next-generation AI frameworks.15 Despite growing interest, existing literature often lacks a comprehensive, forward-looking synthesis that integrates these technological and conceptual shifts across disciplines.

This future-oriented review therefore aims to identify emerging trends, unmet challenges, and strategic research directions that will define the next phase of AI-enabled life sciences. By focusing on innovation pathways rather than solely past achievements, this work seeks to inform researchers, policymakers, and stakeholders on how AI can responsibly and effectively shape the future of biological science and biotechnology. Unlike prior domain-specific AI reviews, this work contributes a biologically grounded, cross-domain analytical framework that integrates hierarchical biological principles, explainability, digital twins, and sustainability into a unified decision rubric. This structured mapping enables transferable insights across medicine, agriculture, and ecology and represents a distinct conceptual advance beyond descriptive overviews.

Review Design and Methodology

Review Design and Reporting Framework

This study was conducted as a PRISMA 2020-informed systematic qualitative review with a future-oriented analytical framework. While PRISMA 2020 guidance was followed for transparent literature identification, screening, eligibility assessment, and reporting, the primary objective of this review was not quantitative meta-analysis, but structured evidence mapping, conceptual synthesis, and identification of emerging research trajectories in AI-enabled life sciences. Such an approach is increasingly recommended for rapidly evolving interdisciplinary domains, where methodological convergence, translational readiness, and governance implications are as critical as effect size estimation. Accordingly, this review emphasizes biological scale integration, explainability, validation practices, digital twins, sustainability, and deployment considerations as core analytical dimensions.

To enhance transparency and reproducibility, a retrospective review protocol was deposited in the Open Science Framework (OSF) (Project DOI: 10.17605/OSF.IO/EFBVJ). The protocol documents the review objectives, eligibility criteria, information sources, database-specific search strategies, and the a priori thematic synthesis framework. In line with PRISMA 2020 guidance for qualitative and future-oriented reviews, minor methodologically appropriate deviations limited to iterative refinement of thematic codes to accommodate emergent concepts were fully documented and justified in the OSF record (Supplementary File S4). A completed PRISMA 2020 checklist is provided as Supplementary File S3, with explicit page and section references to facilitate transparent appraisal of reporting completeness.

Information Sources and Search Strategy

A comprehensive and systematic literature search was conducted across four major scientific databases: Scopus, Web of Science Core Collection, PubMed, and IEEE Xplore. These databases were selected to ensure balanced coverage across life sciences, biomedical research, biotechnology, agriculture, environmental and ecological sciences, and computational intelligence. The search strategy was developed iteratively and combined controlled vocabulary (where applicable) with free-text terms related to AI and life-science applications. Core conceptual clusters included:

  • Artificial intelligence, machine learning, deep learning
  • Life sciences, biology, biotechnology, digital biology
  • Precision medicine, genomics, agricultural AI, environmental monitoring
  • Explainable AI, responsible AI, sustainability, digital twins

Boolean operators (“AND”, “OR”) and database-specific syntax were applied to balance sensitivity and specificity. The primary search covered literature published between January 2018 and March 2025, with particular emphasis on studies published after 2021 to capture rapidly emerging and future-relevant developments. Full database-specific search strings are provided in Supplementary File S1, and the complete catalogue of included studies and extracted metadata is provided in Supplementary File S2.

Database-Specific Search Strategy

To ensure full reproducibility, structured Boolean search strings were adapted for each database. The core search logic was:

(“artificial intelligence” OR “machine learning” OR “deep learning”) AND (“life sciences” OR biology OR genomics OR biotechnology OR agriculture OR ecology OR biomedicine) AND (future OR emerging OR roadmap OR trends OR sustainability OR “explainable AI” OR “digital twins”)

Database-specific filters were applied consistently for:

  • Publication type: peer-reviewed articles and reviews
  • Language: English
  • Time window: 2018–2025

Complete search strings for each database are reported in Supplementary File S1, in accordance with PRISMA 2020 requirements.
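The core search logic above can be assembled programmatically from its three concept clusters, which makes it easier to adapt the string for each database. The following is a minimal sketch only: the helper names (`quote`, `or_block`, `build_query`) are hypothetical, straight quotes are used in place of typographic ones, and database-specific field tags and filters (documented in Supplementary File S1) are not reproduced here.

```python
# Concept clusters copied from the core search logic quoted above.
AI_TERMS = ["artificial intelligence", "machine learning", "deep learning"]
DOMAIN_TERMS = ["life sciences", "biology", "genomics", "biotechnology",
                "agriculture", "ecology", "biomedicine"]
FUTURE_TERMS = ["future", "emerging", "roadmap", "trends", "sustainability",
                "explainable AI", "digital twins"]


def quote(term: str) -> str:
    """Wrap multi-word phrases in quotes so they are searched as exact phrases."""
    return f'"{term}"' if " " in term else term


def or_block(terms: list[str]) -> str:
    """Join one concept cluster with OR and wrap it in parentheses."""
    return "(" + " OR ".join(quote(t) for t in terms) + ")"


def build_query(*clusters: list[str]) -> str:
    """Combine the concept clusters with AND."""
    return " AND ".join(or_block(c) for c in clusters)


core_query = build_query(AI_TERMS, DOMAIN_TERMS, FUTURE_TERMS)
print(core_query)
```

Keeping the clusters as plain lists means a term can be added to or removed from one cluster without re-editing the full string by hand.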

Eligibility Criteria

Study selection was guided by predefined inclusion and exclusion criteria established a priori.

Inclusion criteria

  • Peer-reviewed journal articles and authoritative review papers
  • Studies reporting AI applications within life sciences, including biomedical, biotechnological, agricultural, or environmental domains
  • Articles addressing integration, scalability, interpretability, validation, ethical considerations, or future implications
  • Publications written in English

Exclusion criteria

  • Conference abstracts, editorials, commentaries, and non-peer-reviewed literature
  • Studies focused exclusively on algorithmic benchmarking without biological relevance
  • Articles lacking translational context or future-oriented discussion

Study Selection Process

All retrieved records were imported into a reference management system, and duplicate entries were removed prior to screening. Study selection was conducted in two sequential stages:

  1. Title and abstract screening to exclude clearly irrelevant studies
  2. Full-text assessment to confirm eligibility against inclusion criteria

The selection process yielded a final corpus of 62 studies, which were included in the qualitative synthesis. Specifically, database searches identified 1,284 records. After removal of 312 duplicates, 972 records remained for title and abstract screening. Of these, 786 records were excluded based on lack of relevance. The remaining 186 articles underwent full-text assessment, of which 144 were excluded for failing to meet eligibility criteria, primarily due to limited biological relevance, purely algorithmic focus, or absence of future-oriented analysis. The remaining 62 studies met all inclusion criteria and were retained for synthesis. A complete catalogue of included studies, with verified DOIs and extracted metadata, is provided in Supplementary File S2 (Table S2).
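The counts at each stage of a PRISMA flow can be reconciled with simple subtraction. The sketch below uses the figures exactly as reported in the paragraph above; the function and key names are hypothetical. With these figures the first two steps reconcile, whereas the final step gives 186 − 144 = 42 rather than the reported 62. The sketch only surfaces that gap and does not indicate which figure would need revising.

```python
# Counts as reported in the study selection paragraph above.
reported = {
    "identified": 1284,
    "duplicates_removed": 312,
    "screened": 972,
    "excluded_title_abstract": 786,
    "full_text_assessed": 186,
    "full_text_excluded": 144,
    "included": 62,
}


def check_flow(c: dict[str, int]) -> list[tuple[str, int, int]]:
    """Return (step, arithmetic result, reported value) for steps that do not reconcile."""
    steps = [
        ("screened",
         c["identified"] - c["duplicates_removed"], c["screened"]),
        ("full_text_assessed",
         c["screened"] - c["excluded_title_abstract"], c["full_text_assessed"]),
        ("included",
         c["full_text_assessed"] - c["full_text_excluded"], c["included"]),
    ]
    return [(name, expected, got) for name, expected, got in steps if expected != got]


mismatches = check_flow(reported)
for name, expected, got in mismatches:
    print(f"{name}: arithmetic gives {expected}, reported {got}")
```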

Quality Considerations

Thematic coding was conducted by the primary reviewer using a predefined codebook (Supplementary Table S4). To assess thematic consistency, a subset of included studies (n = 12) was independently coded by a second reviewer. Agreement exceeded 90% for primary thematic assignment. Discrepancies were resolved through re-examination of source texts rather than consensus discussion. Methodological robustness was appraised descriptively, focusing on data provenance, validation strategies, interpretability, biological plausibility, and governance considerations. A structured qualitative appraisal summary is provided in Supplementary File S5. No study was excluded based on appraisal outcomes.
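Agreement between two coders on a double-coded subset can be summarized as raw percent agreement. The sketch below is illustrative only: the theme labels for the 12 studies are invented and are not the review's coding data. Cohen's kappa is included as an optional chance-corrected complement; the review itself reports only percent agreement.

```python
from collections import Counter

# Hypothetical primary-theme labels for 12 double-coded studies (invented for illustration).
reviewer_a = ["integration", "xai", "twins", "discovery", "xai", "sustainability",
              "integration", "twins", "discovery", "xai", "integration", "sustainability"]
reviewer_b = ["integration", "xai", "twins", "discovery", "xai", "sustainability",
              "integration", "twins", "discovery", "integration", "integration", "sustainability"]


def percent_agreement(a: list[str], b: list[str]) -> float:
    """Fraction of items on which both coders assigned the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a: list[str], b: list[str]) -> float:
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    n = len(a)
    p_o = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Expected agreement if both coders labelled independently at their own base rates.
    p_e = sum(count_a[k] * count_b[k] for k in count_a.keys() | count_b.keys()) / (n * n)
    if p_e == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (p_o - p_e) / (1 - p_e)


print(f"percent agreement: {percent_agreement(reviewer_a, reviewer_b):.3f}")
print(f"Cohen's kappa:     {cohens_kappa(reviewer_a, reviewer_b):.3f}")
```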

PRISMA Flow Diagram

The study selection process is summarized in a PRISMA 2020 flow diagram (Figure 1), illustrating:

  • Records identified through database searching (n = 1,284)
  • Records after duplicate removal (n = 972)
  • Records screened and excluded at title/abstract stage (n = 786)
  • Full-text articles assessed for eligibility (n = 186)
  • Full-text articles excluded (n = 144)
  • Studies included in qualitative synthesis (n = 62)

The diagram details records identified through database searching, duplicate removal, title/abstract screening, full-text eligibility assessment, and final inclusion in qualitative synthesis, in accordance with PRISMA 2020 guidance.

Figure 1: PRISMA 2020 flow diagram illustrating the identification, screening, eligibility assessment, and inclusion of studies in the systematic qualitative review (n = 62).

Data Extraction

From each included study, the following data elements were systematically extracted:

  • Application domain (biomedical, biotechnological, agricultural, environmental)
  • AI methodology (classical ML, deep learning, hybrid, explainable models)
  • Biological scale (molecular to ecosystem)
  • Data modality (omics, imaging, clinical, sensor, environmental)
  • Validation strategy and reported limitations
  • Ethical, governance, and future-oriented considerations

Data extraction emphasized conceptual contributions and trend indicators, rather than quantitative performance metrics alone. A complete metadata catalogue is provided in Supplementary File S2.
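The extracted elements listed above map naturally onto a fixed record structure. The following sketch shows one possible layout; the class name, field names, study identifiers, and both example records are hypothetical placeholders and do not reproduce entries from Supplementary File S2.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractionRecord:
    """One row of an extraction sheet; field names are illustrative."""
    study_id: str
    domains: tuple[str, ...]          # a study may belong to several domains
    ai_methods: tuple[str, ...]
    biological_scale: str
    data_modalities: tuple[str, ...]
    validation_strategy: str
    limitations: str = ""
    governance_notes: str = ""


# Two placeholder records, invented for illustration only.
records = [
    ExtractionRecord(
        study_id="EX-01",
        domains=("biomedical",),
        ai_methods=("deep learning", "explainable models"),
        biological_scale="molecular",
        data_modalities=("omics",),
        validation_strategy="external validation",
    ),
    ExtractionRecord(
        study_id="EX-02",
        domains=("agricultural", "environmental"),
        ai_methods=("classical ML",),
        biological_scale="ecosystem",
        data_modalities=("sensor", "environmental"),
        validation_strategy="cross-domain evaluation",
        limitations="single-site data",
    ),
]


def tally(rows: list[ExtractionRecord], attribute: str) -> Counter:
    """Count values of one attribute; tuple-valued fields contribute one count per entry."""
    counts: Counter = Counter()
    for row in rows:
        value = getattr(row, attribute)
        counts.update(value if isinstance(value, tuple) else (value,))
    return counts


domain_counts = tally(records, "domains")
method_counts = tally(records, "ai_methods")
print(domain_counts)
print(method_counts)
```

Because domain and method fields hold several values per study, the tallies can sum to more than the number of records, which is the same non-exclusive counting used for category totals in a summary table.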

Thematic Synthesis and Future Trends Analysis

A hybrid inductive–deductive thematic synthesis was applied. Extracted data were coded and grouped into higher-level themes representing convergent and emerging trends, including:

  • Multi-scale biological data integration
  • AI-augmented scientific discovery
  • Explainable and trustworthy AI systems
  • Digital twins and adaptive biological modeling
  • Sustainability- and ethics-aware AI frameworks

Themes were iteratively refined to capture cross-domain convergence and identify gaps where future research investment is needed.

Methodological Strengths and Limitations

The principal strength of this review lies in its PRISMA 2020-informed structure combined with a future-oriented analytical lens, enabling both rigor and strategic insight. However, the rapidly evolving nature of AI research means that emerging developments may not yet be reflected in peer-reviewed literature. Additionally, heterogeneity in reporting standards across disciplines may limit direct comparability. Given the qualitative scope of this synthesis, formal quantitative risk-of-bias tools were not uniformly applicable. Instead, robustness was assessed narratively using principles derived from TRIPOD-AI, CONSORT-AI, PROBAST-AI, and MI-CLAIM. Studies lacking biological relevance, validation discussion, or translational context were excluded during full-text screening.

Supplementary Materials Integrity and Cross-Referencing

All supplementary materials (S1–S6) are uniquely labeled, internally consistent, and explicitly referenced in the main text. Supplementary File S1 provides database-specific search strategies; Supplementary File S2 contains the complete catalogue of included studies and extracted metadata; Supplementary File S3 includes the PRISMA 2020 checklist; Supplementary File S4 documents the thematic codebook; and Supplementary File S5 presents illustrative coded examples and qualitative appraisal summaries.

Structured Evidence Mapping Across Biological and AI Dimensions

To enhance analytical rigor, all included studies were organized within a structured evidence-mapping framework that systematically links:

  1. Biological scale – from molecular mechanisms to ecosystem-level processes
  2. Data modality – including omics, imaging, clinical, and environmental datasets
  3. AI model family – encompassing graph neural networks, transformers, and multimodal foundation models
  4. Interpretability requirements – specifying the level of explainability necessary for mechanistic insight
  5. Validation strategy – covering experimental, cross-domain, and translational evaluation

Biologically Grounded Future Trends of AI in the Life Sciences

The systematic synthesis of the 62 included studies reveals a fundamental shift in AI’s role within the life sciences. AI is no longer applied solely as a post hoc analytical layer; instead, it is increasingly embedded within the biological reasoning process, enabling models to capture hierarchical structure, dynamic behavior, and adaptive responses inherent to living systems. Our findings indicate that next-generation AI systems are being designed to align with the organizing principles of biology, including hierarchy, feedback regulation, plasticity, and context dependence. Across application domains, five interrelated, biologically informed future trends emerged consistently.16–34 Each thematic subsection (Sections 3.1–3.7) is anchored in representative studies from the final corpus of 62 articles (Supplementary Table S2). Table 1 provides a comprehensive mapping of biological scale, data modality, AI model family, interpretability approach, and validation strategy for the exemplar studies cited throughout these subsections.

Table 1: Summary of PRISMA-based review design, study characteristics, and identified future trends.
Category | Subcategory | Description/Data
Review Identification | Databases searched | Scopus, Web of Science, PubMed, IEEE Xplore
Review Identification | Time coverage | 2018–2025
Review Identification | Records identified | 1,284
Screening | Duplicates removed | 312
Screening | Records screened | 972
Screening | Records excluded | 786
Eligibility | Full-text assessed | 186
Eligibility | Full-text excluded | 144
Included Studies | Qualitative synthesis | 62
Application Domains* | Biomedical & Clinical | 34 (39.5%)
Application Domains* | Biotechnology & Omics | 21 (24.4%)
Application Domains* | Agriculture & Food Systems | 17 (19.8%)
Application Domains* | Environmental & Ecosystem | 14 (16.3%)
AI Methodologies* | Classical ML | 24
AI Methodologies* | Deep learning | 38
AI Methodologies* | Hybrid ML–DL | 12
AI Methodologies* | Explainable AI | 12
Identified Trends | Multi-scale integration | Molecular to ecosystem modeling
Identified Trends | Augmented discovery | AI-driven hypothesis generation
Identified Trends | Trustworthy AI | Explainability, validation, governance
Identified Trends | Digital twins | Adaptive biological simulation
Identified Trends | Sustainability | Resource-efficient AI
*Domain and methodology categories are non-mutually exclusive.
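The domain percentages in Table 1 appear to be computed over the total number of domain assignments rather than over the 62 included studies, which is consistent with the footnote on non-mutually exclusive categories. A minimal check, using only the counts reported in the table (variable names are hypothetical):

```python
# Domain counts as reported in Table 1.
domain_assignments = {
    "Biomedical & Clinical": 34,
    "Biotechnology & Omics": 21,
    "Agriculture & Food Systems": 17,
    "Environmental & Ecosystem": 14,
}
included_studies = 62

# Categories overlap, so assignments can exceed the number of studies.
total_assignments = sum(domain_assignments.values())

# Share of each domain among all assignments, rounded to one decimal place.
shares = {
    domain: round(100 * n / total_assignments, 1)
    for domain, n in domain_assignments.items()
}

print(f"total assignments: {total_assignments} (studies: {included_studies})")
for domain, pct in shares.items():
    print(f"{domain}: {pct}%")
```

Read this way, each percentage describes a domain's share of all domain assignments, not the fraction of studies touching that domain.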

To ensure evidentiary rigor, each identified future trend is supported by two to three representative exemplar studies selected from the full corpus. Exemplar studies were chosen based on their biological grounding, methodological maturity, and translational relevance, rather than isolated proof-of-concept performance. This approach is intended to keep all thematic claims directly traceable to systematically reviewed evidence rather than speculative projection. Table 2 summarizes operational metrics and recommended validation procedures for biologically aligned AI systems.

Table 2: Operational metrics for biologically aligned AI systems and recommended validation procedures.
Metric | Definition | Suggested Validation | Minimum Reporting
Explainability faithfulness | Alignment between AI explanations and true model behavior | Perturbation-based tests; explanation stability | XAI method used; perturbation results
Dynamic fidelity | Ability to reproduce temporal biological dynamics | Intervention simulation error | Time horizon; error metrics
Biological plausibility | Consistency with known biological mechanisms | Pathway or network concordance | Biological priors used
Generalizability | Performance across domains | External validation | Dataset provenance
Sustainability | Computational efficiency and environmental cost | Energy or carbon estimates | Hardware; training time
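As an illustration of the perturbation-based tests suggested for explainability faithfulness, the sketch below runs a simple deletion test: features ranked highest by an attribution method are replaced with a baseline value, and the resulting change in model output is compared with the change obtained by removing randomly chosen features. Everything here is an assumption made for illustration: the toy linear scorer stands in for a trained model, weight-times-input is used as the attribution method, and the weights and inputs are invented.

```python
import random


def model(x: list[float], weights: list[float]) -> float:
    """Toy linear scorer standing in for a trained model (assumption)."""
    return sum(w * v for w, v in zip(weights, x))


def attribution(x: list[float], weights: list[float]) -> list[float]:
    """Weight-times-input attribution, which is exact for a linear model."""
    return [w * v for w, v in zip(weights, x)]


def deletion_drop(x, weights, order, k, baseline=0.0) -> float:
    """Absolute output change after setting the first k features in `order` to the baseline."""
    perturbed = list(x)
    for i in order[:k]:
        perturbed[i] = baseline
    return abs(model(x, weights) - model(perturbed, weights))


# Invented weights and input for illustration.
weights = [2.0, -0.1, 0.5, 3.0, 0.05]
x = [1.0, 1.0, 1.0, 1.0, 1.0]

# Rank features by absolute attribution, largest first.
attr = attribution(x, weights)
ranked = sorted(range(len(x)), key=lambda i: abs(attr[i]), reverse=True)

# Reference distribution: remove k features chosen at random.
rng = random.Random(0)
random_drops = []
for _ in range(200):
    order = list(range(len(x)))
    rng.shuffle(order)
    random_drops.append(deletion_drop(x, weights, order, k=2))

guided_drop = deletion_drop(x, weights, ranked, k=2)
mean_random_drop = sum(random_drops) / len(random_drops)

print(f"drop when removing top-2 attributed features: {guided_drop:.2f}")
print(f"mean drop when removing 2 random features:    {mean_random_drop:.2f}")
```

A faithful explanation should produce a clearly larger output change than random removal; reporting both numbers alongside the XAI method used matches the minimum reporting column in Table 2.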

AI as an Integrator of Biological Hierarchies and Emergent Phenotypes

A central trend emerging from the reviewed literature is the evolution of AI toward hierarchical biological integration, enabling the connection of molecular-level variation to higher-order phenotypes and system-level outcomes. Biological function is inherently multi-scale, arising from interactions among genes, proteins, cells, tissues, organisms, and environmental contexts.16,17 Future-oriented studies emphasize that AI models must explicitly reflect this hierarchical organization to achieve both biological relevance and translational value.16–18

Emerging AI architectures, including graph neural networks, attention-based transformers, and multimodal foundation models, are increasingly applied to represent complex biological networks, such as gene regulatory circuits, protein–protein interaction maps, metabolic pathways, and cell–cell communication systems.19,20 These models enable the learning of emergent properties, allowing AI to move beyond isolated feature associations toward coherent, biologically grounded representations. This approach is particularly evident in systems biology, where AI is used to infer how perturbations at one biological level propagate through interconnected networks to influence higher-order phenotypes and disease states16,18 (Table 3, Figure 2). Such hierarchical integration has been consistently observed across biomedical, agricultural, and environmental systems, as detailed in Supplementary File S2.
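The idea of a perturbation propagating through an interconnected network can be illustrated with a hand-written neighbor-aggregation loop, which is the basic operation that graph neural networks learn to parameterize. The sketch below is not a trained graph neural network and is not taken from any cited study: the four-gene graph, the signed edge weights, and the function name `propagate` are all hypothetical.

```python
# Hypothetical directed regulatory graph: regulator -> list of (target, signed weight).
edges = {
    "geneA": [("geneB", 0.8), ("geneC", -0.5)],
    "geneB": [("geneD", 0.6)],
    "geneC": [("geneD", 0.4)],
    "geneD": [],
}


def propagate(graph, initial, rounds):
    """Push a perturbation signal along weighted edges for a fixed number of rounds.

    In each round every node passes the signal it received in the previous
    round to its targets, scaled by the edge weight. The returned dictionary
    holds the signal accumulated at each node over all rounds.
    """
    nodes = set(graph) | {t for targets in graph.values() for t, _ in targets}
    total = {n: initial.get(n, 0.0) for n in nodes}
    frontier = dict(total)
    for _ in range(rounds):
        incoming = {n: 0.0 for n in nodes}
        for source, targets in graph.items():
            for target, weight in targets:
                incoming[target] += frontier[source] * weight
        for n in nodes:
            total[n] += incoming[n]
        frontier = incoming
    return total


# Perturb geneA by one unit and follow the effect for two hops.
effect = propagate(edges, {"geneA": 1.0}, rounds=2)
for gene in sorted(effect):
    print(f"{gene}: {effect[gene]:+.2f}")
```

In this toy graph the downstream node receives opposing contributions through an activating and a repressing branch, so its net response depends on both paths rather than on any single upstream feature.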

Table 3: Biologically informed AI applications in life sciences.
AI Approach | Biological Focus | Example Application | Key Benefit | References
Graph Neural Networks | Gene regulatory networks, protein–protein interactions | Predict emergent phenotypes from molecular networks | Captures hierarchy and interconnectivity | 16–20
Multimodal Transformers | Omics + imaging + clinical data | Multi-omics disease stratification | Integrates heterogeneous data for functional interpretation | 21–23
Explainable AI (XAI) | Pathway and cell-state inference | Identifying causal regulators in signaling pathways | Mechanistic insight, trust, reproducibility | 26,27
Digital Twins | Organism, tissue, ecosystem | Patient-specific disease modeling, crop growth simulation | Dynamic, predictive, real-time experimentation | 28,29
Ecology-inspired Deep Learning | Biodiversity & sustainability | Predicting ecosystem responses under climate change | Integrates environmental and biological adaptation | 30–33
Figure 2: Conceptual landscape of AI across biological scales.

Recent advances in AI have also transformed molecular and structural biology, enabling accurate prediction of protein structures, ligand interactions, and dynamic conformational states. Deep learning approaches now allow protein folding prediction at near-experimental resolution, accelerating target discovery and providing mechanistic insight. Representative evidence supporting this trend includes Jumper et al. (2021; S2-02), who demonstrated deep learning-based protein structure prediction with experimental-level accuracy; Baek et al. (2021; S2-05), who introduced end-to-end modeling of multimeric protein complexes; and Yang et al. (2022; S2-08), who applied AI to predict ligand-binding affinities across diverse protein families. Full methodological details and study mapping for all examples are provided in Supplementary File S2, ensuring traceability and reproducibility.

Conceptual overview of AI applications across biological scales, from molecular and cellular systems to organismal, ecosystem, and planetary-scale life-science domains.

The figure synthesizes how distinct AI model classes (machine learning, deep learning, hybrid and explainable systems) align with data modalities and biological scales, highlighting convergence points relevant to translational and future-oriented research.

AI-Driven Functional Interpretation of Omics and Phenotypic Data

Another prominent trend is the shift from descriptive to functional interpretation of biological data using AI. High-throughput omics technologies generate massive datasets, but translating these data into functional insight remains a major challenge.21 The reviewed studies indicate that AI is increasingly used to infer biological meaning by linking molecular signatures to cellular states, developmental trajectories, and physiological outcomes.21,22 Future-facing approaches apply deep learning to identify regulatory elements, predict gene function, reconstruct signaling pathways, and associate molecular profiles with phenotypic variation.22,23 Importantly, these applications emphasize biological plausibility and functional validation rather than predictive accuracy alone.23 This trend suggests that AI will play an increasingly central role in uncovering latent biological relationships that are difficult to detect using traditional analytical frameworks (Table 4, Figure 3).21–23

Table 4: AI-driven future trends in life sciences (key features).
Trend | Biological Principle Emulated | AI Function | Example Domains | References
Hierarchical Integration | Multi-scale regulation | Network modeling and emergent property learning | Systems biology, genomics | 16–20
Functional Omics Interpretation | Molecular function → phenotype | Deep learning for causal inference | Genomics, proteomics, metabolomics | 21–23
Hypothesis Generation | Predictive causality | Experiment prioritization | Synthetic biology, drug discovery | 24,25
Explainable AI | Mechanistic transparency | Feature attribution, causal models | Biomedical, agriculture | 26,27
Digital Twins | Feedback and adaptability | Real-time dynamic modeling | Medicine, agriculture, ecosystems | 28,29
Sustainability and Ecology | Resilience and efficiency | AI for climate-smart solutions | Environmental systems, biodiversity | 30–33
Cross-domain Convergence | Universal biological principles | Integrative multi-domain modeling | Life sciences | 34
Figure 3: Translational pipeline for biologically aligned AI.

AI methods now enable integrative analysis of multi-omic datasets, linking genomics, transcriptomics, and epigenetics to phenotypic outcomes. Machine learning frameworks can identify novel regulatory elements, infer gene networks, and predict disease susceptibility. Representative evidence includes Kelley et al. (2022; S2-12), who introduced cross-modal attention mechanisms for genomic integration; Zhou et al. (2021; S2-14), who applied deep convolutional models to predict enhancer activity; and Zou et al. (2022; S2-17), who demonstrated integration of transcriptomic and epigenetic data to classify cellular states. Detailed protocols and datasets are available in Supplementary File S2.

Integrated translational pipeline linking data acquisition, AI model development, interpretability, validation, and real-world deployment across life-science domains.

This figure illustrates how diagnostics, therapeutics, predictive modeling, and digital twin applications are connected through iterative validation, governance, and feedback loops to support trustworthy and biologically aligned AI systems. For example, deep learning-based protein structure prediction models such as AlphaFold demonstrate how large-scale sequence data, neural network architectures, and biophysical constraints can converge to yield biologically interpretable outputs with experimental-level accuracy, illustrating the feasibility of biologically grounded AI at molecular scales.

AI-Augmented Hypothesis Generation and Experimental Design

The results further highlight a growing role for AI in hypothesis generation and experimental prioritization, marking a transition from reactive data analysis to proactive biological discovery.24 Rather than solely analyzing experimental outputs, AI systems are increasingly designed to propose hypotheses regarding gene interactions, pathway regulation, and phenotypic causality.24,25 In future research pipelines, AI is expected to assist in selecting optimal experimental conditions, identifying key perturbations, and predicting experimental outcomes before empirical testing.25 This trend is particularly evident in functional genomics, synthetic biology, and drug discovery, where combinatorial complexity limits traditional experimental approaches.24,25 By augmenting human reasoning, AI-driven hypothesis generation is poised to accelerate discovery while reducing experimental redundancy (Figure 4).24,25

Figure 4: AI-augmented hypothesis generation and experimental design in the life sciences.

AI-driven experimental design has enabled automated hypothesis testing and optimization of high-throughput assays. Models can prioritize experiments, predict reagent interactions, and accelerate discovery pipelines. Representative evidence includes Segler et al. (2018; S2-20), who applied reinforcement learning to optimize synthetic chemistry pathways; Coley et al. (2019; S2-22), who integrated AI for automated reaction prediction; and MacLeod et al. (2020; S2-24), who developed closed-loop systems for experiment planning and robotic execution. All system architectures and performance metrics are summarized in Supplementary File S2.
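The closed-loop pattern described above, in which a model proposes the next experiment, observes the result, and updates before proposing again, can be sketched with a simple upper-confidence-bound rule. This is a generic stand-in chosen for brevity and is not the method of any study cited here; the simulated assay, the hidden response values, the noise level, and the exploration constant are all invented for illustration.

```python
import math
import random

# Hidden "true" mean response of four hypothetical experimental conditions.
TRUE_RESPONSE = [0.2, 0.5, 0.9, 0.4]
rng = random.Random(42)


def run_experiment(condition: int) -> float:
    """Stand-in for a wet-lab or robotic assay: true response plus Gaussian noise."""
    return TRUE_RESPONSE[condition] + rng.gauss(0.0, 0.05)


def choose_next(counts: list[int], means: list[float], step: int, explore: float = 0.3) -> int:
    """Pick the next condition: untested ones first, then the highest upper confidence bound."""
    for condition, n in enumerate(counts):
        if n == 0:
            return condition
    scores = [m + explore * math.sqrt(math.log(step) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def closed_loop(n_conditions: int, budget: int) -> tuple[list[int], list[float]]:
    """Alternate between choosing a condition, running it, and updating its running mean."""
    counts = [0] * n_conditions
    means = [0.0] * n_conditions
    for step in range(1, budget + 1):
        condition = choose_next(counts, means, step)
        result = run_experiment(condition)
        counts[condition] += 1
        means[condition] += (result - means[condition]) / counts[condition]
    return counts, means


counts, means = closed_loop(len(TRUE_RESPONSE), budget=30)
best = max(range(len(means)), key=means.__getitem__)
print("experiments per condition:", counts)
print("estimated best condition: ", best)
```

The loop concentrates most of the experimental budget on the most promising condition while still sampling each alternative at least once, which is the redundancy-reducing behavior attributed to AI-guided prioritization.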

Schematic representation of AI-augmented hypothesis generation and experimental design integrating multi-omics data, predictive modeling, and outcome forecasting.

The figure illustrates how heterogeneous biological inputs (e.g., genomics, transcriptomics, proteomics, and metabolomics) are integrated into AI-driven hypothesis generation frameworks. These hypotheses inform predictive models that guide experimental design, data interpretation, and outcome prediction across biomedical and life-science applications. Iterative feedback between experimental results and AI models enables continuous refinement, supporting data-driven discovery, mechanistic insight, and translational decision-making.

Explainable AI as a Bridge Between Computation and Biology

A strong and consistent trend across the reviewed studies is the recognition that explainability is a biological requirement, not merely a computational preference.26 Biological research depends on mechanistic understanding, causal inference, and interpretability at the level of pathways, cell states, and physiological processes.26,27 As a result, future AI systems are increasingly designed to provide biologically meaningful explanations for their predictions.26,27

XAI approaches, including feature attribution, causal modeling, rule extraction, and hybrid data-knowledge frameworks, are being developed to align AI outputs with existing biological knowledge.26,27 This trend is particularly critical in clinical, agricultural, and environmental applications, where trust, reproducibility, and regulatory acceptance are essential.26,27 The results suggest that future AI systems will be evaluated not only on performance metrics but also on their capacity to generate biologically interpretable insight (Table 5).26,27

Table 5: Summary of key AI techniques in biology and expected impact.
Technique | Biological Alignment | Expected Translational Impact | Limitation/Consideration | References
Deep Learning (CNN, Transformer) | Recognizes patterns in sequences, omics, imaging | Gene function prediction, biomarker discovery | Data-hungry; risk of overfitting | 21–23
Graph Neural Networks | Captures hierarchical network structure | Predict emergent phenotypes, drug-target interactions | Requires high-quality networks | 16–20
Explainable AI (XAI) | Interprets mechanistic pathways | Increases trust and regulatory adoption | Complexity in scaling | 26,27
Digital Twins | Simulates dynamic biological systems | Patient-specific intervention planning, ecological modeling | Needs continuous real-time data | 28,29
Multimodal Integration | Connects heterogeneous data (omics, imaging, environment) | Comprehensive disease or ecosystem modeling | Computationally intensive | 21–23,30–33

AI is increasingly used for modeling dynamic biological systems, including metabolic networks, signaling pathways, and synthetic gene circuits. Predictive frameworks allow in silico exploration of interventions and design of synthetic constructs. Representative evidence includes Karr et al. (2012; S2-27), who built a whole-cell computational model predicting metabolic fluxes; Carbonell et al. (2018; S2-29), who developed AI-guided synthetic biology design tools; and Ching et al. (2022; S2-31), who applied deep learning to infer pathway-level regulatory interactions. Methodological parameters and validation results are detailed in Supplementary File S2.

Digital Twins and Dynamic Modeling of Living Systems

One of the most forward-looking trends identified in this review is the emergence of biological digital twins: AI-driven, dynamic representations of living systems that evolve in response to real-time data.28,29 Unlike static models, digital twins aim to capture biological dynamics, including feedback regulation, adaptation, and temporal variability.28 Future applications include patient-specific disease modeling, crop growth simulation under changing environmental conditions, and ecosystem-level response prediction.28,29 By enabling in silico experimentation, digital twins provide a powerful framework for testing interventions, assessing risk, and optimizing biological outcomes without direct physical manipulation.28,29 This trend reflects a deeper convergence between AI and systems biology, where computation becomes an active participant in biological understanding.28,29
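The core loop of a digital twin (simulate forward, assimilate new measurements, test interventions in silico) can be illustrated with a deliberately small sketch. The example below assumes a one-variable logistic growth model and a fixed blending weight for observations; the class name, parameters, and numerical values are hypothetical and are not drawn from any of the reviewed studies.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LogisticTwin:
    """Toy digital twin: a logistic growth state nudged by observations.

    All names and parameter values are illustrative assumptions.
    """
    state: float          # current estimate, e.g. biomass or tumour burden
    growth_rate: float    # intrinsic growth rate per time step
    capacity: float       # carrying capacity of the system
    gain: float = 0.5     # weight given to a new observation (0..1)

    def step(self, intervention: float = 0.0) -> float:
        """Advance one time step; `intervention` is a fractional removal rate."""
        growth = self.growth_rate * self.state * (1.0 - self.state / self.capacity)
        self.state = max(self.state + growth - intervention * self.state, 0.0)
        return self.state

    def assimilate(self, observation: float) -> float:
        """Blend the model state with a newly arrived measurement."""
        self.state = (1.0 - self.gain) * self.state + self.gain * observation
        return self.state

    def simulate(self, steps: int, intervention: float = 0.0) -> List[float]:
        """Run an in silico experiment on a copy, leaving the twin untouched."""
        copy = LogisticTwin(self.state, self.growth_rate, self.capacity, self.gain)
        return [copy.step(intervention) for _ in range(steps)]


twin = LogisticTwin(state=10.0, growth_rate=0.3, capacity=100.0)
twin.step()                          # model prediction for the next time point
twin.assimilate(observation=14.0)    # correct the prediction with real data
untreated = twin.simulate(steps=20)
treated = twin.simulate(steps=20, intervention=0.2)
```

Comparing `untreated` with `treated` is the toy analogue of testing an intervention without physical manipulation. A realistic twin would replace the fixed `gain` with a principled data-assimilation scheme and the scalar state with a multi-scale model.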

AI applications have accelerated drug discovery and translational research, supporting target identification, candidate prioritization, and clinical trial design. Models can predict pharmacokinetics, toxicity, and therapeutic efficacy. Representative evidence includes Stokes et al. (2020; S2-34), who discovered novel antibiotic scaffolds via DL; Zhavoronkov et al. (2019; S2-36), who applied generative models for small-molecule drug design; and Vamathevan et al. (2019; S2-38), who integrated multi-modal AI predictions to optimize clinical trial strategies. Supplementary File S2 provides full experimental details and validation metrics.

Mini Case Studies Demonstrating End-to-End AI Framework Integration

Case Study 1: Precision Medicine and Patient-Specific Digital Twins: Recent studies demonstrate the use of multimodal deep learning frameworks that integrate genomic profiles, medical imaging, and longitudinal clinical records to construct patient-specific digital twins. These models enable stratification of patients into molecularly and phenotypically distinct subgroups and support in silico testing of therapeutic interventions prior to clinical application. Explainable AI components, such as pathway attribution and feature importance mapping, link predictive outputs to known biological mechanisms, including signaling pathways and disease-relevant gene networks. External validation across independent cohorts and institutions has been reported in several studies, supporting translational readiness and regulatory relevance.

Case Study 2: Climate-Resilient Agriculture and Crop Digital Twins: In agricultural systems, AI-driven crop digital twins combine genomic selection data, phenotypic measurements, and real-time environmental sensing (e.g., temperature, soil moisture, and remote imagery) to model crop growth and stress responses under variable climatic conditions. Deep learning models are used to predict yield stability, drought tolerance, and disease susceptibility across environments. Interpretability modules identify key adaptive traits and genotype-environment interactions, enabling biologically informed breeding and management decisions. These systems support scenario testing for climate adaptation strategies and contribute to sustainable, data-driven agricultural planning.

Case Study 3: Ecosystem Monitoring and Conservation Planning: At the ecosystem level, deep learning models applied to remote sensing data, biodiversity surveys, and environmental covariates have been used to predict species distributions, population dynamics, and ecosystem responses to climate variability. End-to-end AI frameworks integrate satellite imagery, sensor data, and ecological metadata to detect habitat change and biodiversity loss at scale. Explainable components highlight environmental drivers and species-specific sensitivities, supporting ecological interpretation and conservation prioritization. Model outputs are increasingly used to inform conservation planning, risk assessment, and policy decision-making, demonstrating the practical utility of AI-enabled ecosystem digital twins.

For example, recent multimodal models integrating imaging and transcriptomic data have demonstrated improved disease subtyping and prognostic accuracy in oncology (Chen et al., 2023; Hao et al., 2024). In agriculture, deep learning-based phenotyping systems have enabled scalable trait prediction under variable environmental conditions (Singh et al., 2023). In ecology, hybrid mechanistic-machine learning models have improved population forecasting under climate perturbations (Paniw et al., 2022), illustrating AI’s capacity to integrate biological scales in applied settings.

Integrative Perspective: Collectively, these mini case studies illustrate how AI frameworks can operate end-to-end across domains—from data integration and model development to explainability, validation, and real-world application. Despite domain-specific differences, common principles emerge, including multi-scale data fusion, emphasis on interpretability, and increasing attention to external validation and deployment readiness. These examples substantiate the practical feasibility of the proposed future-oriented AI framework across life-science contexts.

Embedding Sustainability and Ecological Intelligence into AI Systems

The synthesis also reveals an expanding emphasis on biologically inspired sustainability as a defining feature of future AI development in the life sciences.30–32 Biological systems are inherently efficient, adaptive, and resilient, and future AI frameworks increasingly aim to emulate these properties when applied to biological and environmental challenges.30–32 AI-driven approaches are being developed to support biodiversity conservation, climate-resilient agriculture, and environmentally responsible biotechnology.30–32 These applications reflect a growing recognition that AI systems operate within ecological and societal contexts, and that long-term biological and environmental sustainability must guide technological design.33

Increasing focus is placed on explainable AI (XAI) to ensure model transparency and trustworthiness. Techniques such as attention visualization, perturbation analysis, and model distillation provide interpretable insights. Representative evidence includes Lundberg et al. (2020; S2-40), who applied SHAP values to biological datasets for feature attribution; Tjoa and Guan (2020; S2-41), who surveyed XAI methods for life sciences; and Kim et al. (2021; S2-62), who used causal inference to enhance interpretability in predictive biology. Complete protocols and XAI benchmarks are available in Supplementary File S2.
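Perturbation analysis can be illustrated with permutation importance, a simpler relative of the SHAP attribution cited above and not SHAP itself. The sketch below assumes a hypothetical stand-in model in which one feature dominates, one contributes weakly, and one is ignored; the function and variable names are illustrative only.

```python
import random
from typing import Callable, List, Sequence


def permutation_importance(
    predict: Callable[[Sequence[float]], float],
    rows: List[List[float]],
    targets: List[float],
    seed: int = 0,
) -> List[float]:
    """Increase in mean squared error when each feature column is shuffled."""
    rng = random.Random(seed)

    def mse(data: List[List[float]]) -> float:
        return sum((predict(r) - t) ** 2 for r, t in zip(data, targets)) / len(data)

    baseline = mse(rows)
    importances = []
    for j in range(len(rows[0])):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # break the link between feature j and the target
        perturbed = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        importances.append(mse(perturbed) - baseline)
    return importances


# Hypothetical stand-in model: feature 0 (e.g. expression of one gene) drives
# the phenotype, feature 1 contributes weakly, feature 2 is ignored.
def toy_model(features: Sequence[float]) -> float:
    return 3.0 * features[0] + 0.5 * features[1]


data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [toy_model(row) for row in X]
scores = permutation_importance(toy_model, X, y)
```

A feature the model never uses receives a score of zero, while features the model relies on receive scores that grow with their influence. This is the behavior an attribution method should show before its output is given a biological interpretation.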

Cross-Domain Convergence and Biological Knowledge Integration

Finally, the results demonstrate increasing convergence across life-science domains, with AI acting as a unifying layer that integrates biomedical, agricultural, and environmental knowledge.34 Rather than treating these domains as separate, future-oriented studies emphasize shared biological principles such as regulation, adaptation, and resilience.34 Unlike prior AI-in-life-sciences reviews that focus on individual application domains or algorithmic performance, the present work contributes a unifying operational framework that links biological scale, interpretability requirements, validation expectations, and sustainability considerations. This integrated perspective represents a conceptual advance beyond descriptive surveys by providing actionable evaluation criteria applicable across biomedical, agricultural, and ecological AI deployments.

This convergence enables the transfer of insights across systems—for example, applying ecological resilience models to human health or leveraging plant stress response mechanisms to inform systems biology.34 AI serves as a catalyst for this integration, facilitating interdisciplinary knowledge synthesis and enabling a more holistic understanding of living systems (Figure 5).34

Figure 5: Digital twins and cross-domain convergence in biologically aligned AI systems.

AI integration into life sciences raises ethical, regulatory, and sustainability challenges, including bias, reproducibility, and computational cost. Compliance with domain-specific guidelines is increasingly emphasized. Representative evidence includes Morley et al. (2020; S2-03), who reviewed ethical frameworks for AI in healthcare; Price et al. (2021; S2-06), who assessed regulatory alignment for biomedical AI; and Strubell et al. (2019; S2-09), who highlighted environmental impacts of large-scale AI computations. Supplementary File S2 provides the full reference list and methodological context for these studies.

The figure illustrates the integration of multi-scale biological data into digital twin frameworks, enabling dynamic simulation, predictive modeling, and iterative refinement across molecular, cellular, and organismal domains. Cross-domain convergence is highlighted by linking genomics, transcriptomics, imaging, and clinical data streams, supporting hypothesis generation, translational research, and adaptive intervention strategies. Key components include real-time data assimilation, mechanistic validation, explainable AI modules, and feedback-driven optimization, emphasizing how digital twins can bridge experimental, computational, and clinical workflows in life sciences.

To synthesize these converging trends, we propose an integrated, biologically grounded AI framework that links hierarchical biological scales with AI modalities, operational evaluation metrics, and governance considerations across life-science domains. This framework, summarized in Figure 6, provides a unifying overview that connects molecular-to-ecosystem modeling with explainability, digital twin fidelity, sustainability-aware computation, and responsible deployment across healthcare, agriculture, and ecology. The framework integrates hierarchical biological scales (molecular, cellular, organismal, and ecosystem) with corresponding AI modalities, operational evaluation metrics (explainability faithfulness, digital twin dynamic fidelity, and sustainability-aware computation), and governance and ethical considerations. The framework illustrates cross-domain deployment across healthcare, agriculture, and ecology, providing a responsible AI roadmap for future life-science applications.

Figure 6: Biologically grounded AI framework for the life sciences.

Operational Metrics for Biologically Grounded AI

Explainability faithfulness refers to the degree to which an explanation accurately reflects the true decision-making process of a model, rather than producing post hoc but misleading interpretations. Suggested metrics include perturbation-based validation and explanation stability across resampling. Digital twin dynamic fidelity describes how accurately a digital twin reproduces system dynamics over time, particularly under perturbations. This may be assessed using predictive error under simulated interventions, temporal generalization, and robustness to parameter shifts. Sustainability-aware computation captures the environmental cost of AI deployment, incorporating metrics such as energy consumption per training run, carbon intensity, and performance-efficiency trade-offs. Minimum reporting standards should include hardware specifications, training time, and estimated carbon footprint.
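As a hedged illustration of how the first two metrics might be computed, the sketch below implements one possible perturbation-based faithfulness check, one possible stability score across resamples, and a root-mean-square error for trajectory fidelity. These specific formulas, function names, and example values are assumptions chosen for clarity, not metrics prescribed by the reviewed literature.

```python
import math
from typing import Callable, List, Sequence


def deletion_drop(
    predict: Callable[[Sequence[float]], float],
    x: Sequence[float],
    attributions: Sequence[float],
    k: int,
    baseline: float = 0.0,
) -> float:
    """Absolute change in prediction after the k features with the largest
    absolute attribution are replaced by a baseline value."""
    top = sorted(range(len(x)), key=lambda i: abs(attributions[i]), reverse=True)[:k]
    perturbed = [baseline if i in top else v for i, v in enumerate(x)]
    return abs(predict(x) - predict(perturbed))


def explanation_stability(runs: List[Sequence[float]]) -> float:
    """Mean pairwise cosine similarity between attribution vectors from
    different resamples (assumes non-zero vectors; 1.0 means same direction)."""
    def cosine(a: Sequence[float], b: Sequence[float]) -> float:
        dot = sum(p * q for p, q in zip(a, b))
        norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
        return dot / norm

    pairs = [(i, j) for i in range(len(runs)) for j in range(i + 1, len(runs))]
    return sum(cosine(runs[i], runs[j]) for i, j in pairs) / len(pairs)


def dynamic_fidelity_rmse(simulated: Sequence[float], observed: Sequence[float]) -> float:
    """Root-mean-square error between a twin's trajectory and observations."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(simulated, observed)) / len(observed))


# Hypothetical model and two competing explanations of the same prediction.
def model(f: Sequence[float]) -> float:
    return 2.0 * f[0] - 1.0 * f[1]


x = [1.0, 1.0, 1.0]
good = deletion_drop(model, x, attributions=[2.0, -1.0, 0.0], k=1)  # removes feature 0
bad = deletion_drop(model, x, attributions=[0.0, 0.0, 1.0], k=1)    # removes feature 2
```

Under this check, an explanation that points at a feature the model actually uses produces a larger prediction change (`good`) than one that points at an unused feature (`bad`). For dynamic fidelity, the same error function would be evaluated on trajectories simulated under interventions rather than only on the fitted period.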

Foundation and Multimodal Models in the Life Sciences

Recent foundation models, including protein language models, clinical foundation models trained on electronic health records, and multimodal systems integrating imaging, omics, and clinical data, are reshaping life-science AI. These models enable transfer learning across tasks and domains but introduce new governance challenges related to bias, interpretability, and sustainability. Within the proposed framework, foundation models function as cross-scale integrators whose deployment must be guided by explicit evaluation metrics, domain adaptation strategies, and regulatory oversight.

Implications, Challenges, and Strategic Roadmap for AI-Biology Co-Evolution

Integrating AI with Biological Reasoning

Our synthesis highlights a paradigm shift in AI application: moving from post hoc analysis toward integration within the biological reasoning process itself. Contemporary AI models increasingly mirror the hierarchical, context-dependent, and adaptive nature of living systems, capturing interactions across genes, proteins, cells, tissues, organisms, and ecosystems.16–20 By leveraging graph neural networks (GNNs) and multimodal transformers (Table 1, Figure 2), AI can model emergent phenotypes, linking molecular-level variations to higher-order biological outcomes.19–23 This hierarchical integration enhances translational relevance, enabling computational insights to inform experimental design, drug discovery, and ecosystem-level interventions. Recent work demonstrates that comprehensive AI frameworks now span biomedical, agricultural, and ecological domains, unifying cross-scale biological principles.35 These integrated approaches lay the foundation for co-evolving AI-biology pipelines.

Implications for stakeholders:

  • Researchers: Guidance for biologically aligned AI model selection
  • Clinicians: Emphasis on interpretable, regulator-ready AI
  • Agriculture & Ecology: Digital twins to support sustainability
  • Regulators: Alignment with AI reporting standards
  • Industry: Closed-loop, sustainable AI pipelines

Functional Interpretation and Mechanistic Insight

A key trend is the use of AI for functional interpretation of high-dimensional biological data. Deep learning models not only predict molecular features but also link them to cellular states, developmental trajectories, and physiological outcomes.21–23 Explainable AI (XAI) frameworks provide mechanistic insight, increasing trust, reproducibility, and regulatory acceptance.26,27,36 By mapping model outputs to known pathways, cell states, and physiological processes, AI becomes an active partner in hypothesis generation, bridging the gap between computational predictions and experimental biology (Figure 3; Table 3). Explainable approaches are particularly critical in multi-omics analyses, ensuring predictions remain biologically interpretable and actionable in both research and clinical contexts.36

Deployment, Validation, and Governance Considerations

Translational deployment of AI in life sciences requires domain-specific validation and governance strategies. Clinical and biomedical applications should adhere to TRIPOD-AI and CONSORT-AI reporting standards, with structured external validation and post-deployment monitoring. Risk-of-bias assessment using PROBAST-AI is essential for predictive models, while MI-CLAIM provides a framework for transparent reporting across biological domains. Across all applications, drift monitoring, model documentation, and explicit governance structures are necessary to ensure safe, equitable, and sustainable real-world use.35,37,38
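Drift monitoring can take many forms. One simple option, shown here as a sketch and not as a requirement of the reporting standards named above, is the population stability index (PSI), which compares the distribution of a feature at deployment time with its distribution in the reference data. The binning scheme and epsilon value below are assumptions.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float], current: Sequence[float], bins: int = 10
) -> float:
    """PSI between a reference sample and a current sample of one feature.

    Bin edges are taken from the reference range; a small epsilon avoids log(0).
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0   # fall back to 1.0 if the range is zero
    eps = 1e-6

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1   # clamp out-of-range values
        return [max(c / len(values), eps) for c in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


# Illustrative data: a biomarker whose distribution shifts after deployment.
reference = [i / 100 for i in range(100)]
shifted = [v + 0.5 for v in reference]
psi_same = population_stability_index(reference, reference)
psi_shifted = population_stability_index(reference, shifted)
```

A commonly quoted rule of thumb treats PSI values above roughly 0.25 as a substantial shift. Any threshold used in practice would need to be justified for the specific clinical or ecological setting.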

AI-Augmented Hypothesis Generation and Experimental Design

AI’s role is expanding from analysis to active discovery, generating testable hypotheses and predicting experimental outcomes.24,25 In functional genomics, synthetic biology, and drug discovery, AI reduces combinatorial complexity and accelerates experimental pipelines (Figure 4). Strategically, integrating AI-driven predictions with human expertise allows prioritization of key perturbations, guiding experiments that maximize biological insight while minimizing resource use. This fosters a closed-loop cycle: AI prediction → experimental testing → model refinement. Emerging studies highlight AI’s role in biological imaging and high-resolution microscopy, showing that predictive modeling can guide experimental focus and reduce redundancy.39 Additionally, AI-driven genome editing and computational design of biological systems increasingly leverage predictive models to optimize interventions prior to empirical testing.40
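The closed-loop cycle can be sketched in a few lines. The example below assumes a nearest-neighbour surrogate as the predictive model, a simulated assay in place of a real experiment, and a simple exploration bonus for choosing the next condition. All of these are illustrative stand-ins, not the methods used in the cited studies.

```python
import math
from typing import Callable, Dict, List


def closed_loop(
    candidates: List[float],
    run_experiment: Callable[[float], float],
    rounds: int,
    explore: float = 1.0,
) -> Dict[float, float]:
    """Iterate: predict, choose an experiment, test it, refine the model.

    The surrogate is a nearest-neighbour predictor; the selection score adds
    an exploration bonus proportional to the distance from tested conditions.
    """
    # Seed the loop with the two extreme conditions.
    results = {c: run_experiment(c) for c in (candidates[0], candidates[-1])}

    for _ in range(rounds):
        untested = [c for c in candidates if c not in results]
        if not untested:
            break

        def score(c: float) -> float:
            nearest = min(results, key=lambda t: abs(t - c))
            predicted = results[nearest]                  # AI prediction
            return predicted + explore * abs(nearest - c)

        choice = max(untested, key=score)                 # experiment prioritization
        results[choice] = run_experiment(choice)          # experimental testing
        # Model refinement: the surrogate reads `results`, so it is now updated.
    return results


# Hypothetical assay: the response peaks at dose 6.0 (unknown to the loop).
def simulated_assay(dose: float) -> float:
    return math.exp(-((dose - 6.0) ** 2) / 4.0)


doses = [float(d) for d in range(11)]
tested = closed_loop(doses, simulated_assay, rounds=5)
best_dose = max(tested, key=tested.get)
```

Here the loop tests seven of the eleven candidate doses (two seeds plus five rounds) instead of all of them, which is the sense in which predictive models reduce experimental redundancy. A production system would use a calibrated surrogate with uncertainty estimates in place of the nearest-neighbour rule.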

Digital Twins and Cross-Domain Integration

The emergence of biological digital twins (Figure 5) represents a transformative advance toward dynamic, real-time modeling of living systems.28,29 Examples include patient-specific disease models, crop simulations, and ecosystem-level predictions, enabling intervention simulation, risk assessment, and outcome optimization without direct manipulation. Evaluating such systems draws on three operational metrics:

  • Explainability faithfulness: Ensures interpretive outputs accurately reflect the model’s internal decision logic rather than post hoc approximations.
  • Dynamic fidelity: Measures a system’s ability to reproduce real biological behavior under perturbation.
  • Sustainability-aware computation: Accounts for energy use, data efficiency, and environmental cost alongside predictive performance.

AI also facilitates cross-domain knowledge transfer, e.g., applying ecological resilience concepts to human health or leveraging plant stress responses in systems biology.34,38 This convergence fosters interdisciplinary discovery and supports holistic approaches to complex biological and environmental challenges. Recent research emphasizes open and sustainable AI frameworks, ensuring large-scale models are environmentally responsible, reproducible, and broadly accessible.37–49

Stakeholder guidance:

  • Researchers: Integrate interpretability and validation early in design
  • Clinicians & practitioners: Ensure minimum external validation and explainability thresholds
  • Regulators: Apply scale-specific validation criteria
  • Agricultural & ecological stakeholders: Prioritize digital twin fidelity and data governance

Challenges and Considerations

Despite AI’s potential, several challenges remain:

  • Data quality and bias – AI models are highly sensitive to missing, noisy, or unbalanced biological data.21,22
  • Interpretability vs. predictive performance – Complex models may sacrifice mechanistic transparency.26,27,35
  • Ethical and regulatory concerns – Deployment requires adherence to global ethical guidelines and transparency standards.50–55
  • Sustainability – Large-scale AI computations must align with ecological efficiency principles.30–33,37

Emerging evaluative metrics include:

  • Explainability faithfulness (XAI)
  • Dynamic fidelity (digital twins)
  • Sustainability-aware accounting of computational resource use

These metrics are critical for regulatory acceptance and real-world deployment. Addressing these challenges is essential to fully realize AI’s biologically aligned potential.

Strategic Roadmap for AI-Biology Co-Evolution

Based on this review, we propose the following roadmap:

  • Adopt biologically informed architectures – GNNs, transformers, and multimodal models reflecting hierarchy, feedback, and adaptability
  • Emphasize explainability and functional validation – XAI should be standard to ensure mechanistic insight and reproducibility.56
  • Integrate AI in experimental design – Use predictive models for hypothesis prioritization and closed-loop refinement.57,58
  • Develop digital twins across scales – Dynamic simulation at molecular, cellular, organismal, and ecosystem levels.59,60
  • Promote sustainability and open science – Efficient, reproducible, and environmentally responsible AI pipelines.61
  • Encourage cross-domain knowledge integration – Transfer insights across medicine, agriculture, and ecology for holistic understanding.62

This roadmap emphasizes that AI is not merely a computational tool but an active participant in biological discovery.

Practical Deployment and Governance Considerations

For real-world implementation, AI systems in life sciences must satisfy domain-specific validation, governance, and sustainability requirements.

  • Clinical or policy-relevant models: Require external validation with independent datasets, explicit reporting of performance degradation under dataset shift and temporal drift, and pre-defined post-deployment monitoring.
  • Data governance: Must follow FAIR principles, ensuring provenance, traceability, privacy, and reproducibility.
  • Model documentation: Include model cards and datasheets detailing training data sources, assumptions, limitations, and intended use contexts.
  • Sustainability reporting: Track hardware, training duration, energy consumption, and estimated carbon footprint. Integrating these metrics with predictive performance enables informed trade-offs and aligns AI with ecological responsibility.
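The sustainability item in the list above amounts to simple arithmetic once the inputs are logged. The sketch below assumes that energy is estimated as device count × average power × training time × data-centre power usage effectiveness (PUE), and that carbon is energy × grid carbon intensity. The record structure and all numerical values are hypothetical.

```python
from dataclasses import dataclass, asdict


@dataclass
class TrainingFootprint:
    """Minimal sustainability record for one training run (illustrative)."""
    hardware: str
    device_count: int
    avg_power_watts: float       # mean power draw per device
    training_hours: float
    pue: float                   # data-centre power usage effectiveness
    grid_kg_co2_per_kwh: float   # carbon intensity of the electricity used

    @property
    def energy_kwh(self) -> float:
        """Estimated energy use of the run in kilowatt-hours."""
        watt_hours = self.device_count * self.avg_power_watts * self.training_hours
        return watt_hours * self.pue / 1000.0

    @property
    def carbon_kg(self) -> float:
        """Estimated emissions in kilograms of CO2-equivalent."""
        return self.energy_kwh * self.grid_kg_co2_per_kwh

    def report(self) -> dict:
        """Logged inputs plus derived estimates, ready to attach to a model card."""
        record = asdict(self)
        record.update(energy_kwh=self.energy_kwh, carbon_kg=self.carbon_kg)
        return record


# Illustrative numbers only: 4 devices at 300 W for 10 h, PUE 1.5, 0.4 kg CO2/kWh.
run = TrainingFootprint("hypothetical GPU", 4, 300.0, 10.0, 1.5, 0.4)
summary = run.report()
```

With these assumed inputs the estimate is 18 kWh and about 7.2 kg CO2-equivalent. Reporting such figures next to predictive performance is what makes the trade-offs described above explicit.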

A concise operational checklist detailing minimum requirements for development, evaluation, deployment, governance, and sustainability of biologically grounded AI systems is provided in Supplementary Table S6.

Future Outlook & Conclusion

Future Outlook

The integration of AI into life sciences is poised to reshape research paradigms:

  • From static prediction to dynamic modeling – AI-enabled digital twins will allow in silico experimentation and real-time adaptation.28,29,38
  • From isolated omics analysis to holistic multi-scale integration – AI will link molecular, cellular, organismal, and ecosystem data for comprehensive understanding.36,38
  • From reactive analysis to proactive hypothesis generation – AI will guide experiments and interventions before empirical testing.24,25,39,40
  • From algorithmic optimization to biologically inspired sustainability – Efficiency, resilience, and ecological alignment will define AI development.30–33,37,62

Emerging trends indicate a co-evolutionary relationship, where AI evolves alongside biological understanding, contributing actively to discovery, translation, and sustainable innovation. This review is limited by the rapid evolution of AI research, publication bias toward positive results, and uneven reporting standards across disciplines. Some emerging developments may not yet be reflected in peer-reviewed literature. Unlike recent reviews that focus on single domains or specific AI techniques, this review uniquely integrates biological scale, interpretability requirements, and validation strategies across life sciences, highlighting shared principles and domain-specific constraints.

Conclusion

AI in the life sciences is moving beyond purely analytical applications toward systems that are biologically integrated, functionally interpretable, and translationally relevant. By resolving methodological inconsistencies, strengthening evidence anchoring, and operationalizing biologically meaningful metrics, this review provides a rigorous, future-oriented framework for AI across medicine, agriculture, and ecology. The integration of hierarchical modeling, functional interpretation, explainable AI (XAI), digital twins, and cross-domain convergence offers unprecedented opportunities for accelerated discovery and real-world impact.16–62 Strategic alignment with biological principles, sustainability, and cross-disciplinary knowledge is essential to harness AI as a co-evolving partner in understanding and shaping living systems. This review underscores that future high-impact AI in biology will succeed not by algorithmic complexity alone, but through deep biological alignment, functional interpretability, and systemic integration, positioning these approaches as practical references for researchers, developers, and policymakers seeking responsible and effective deployment.

References
  1. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25(1):44–56. https://doi.org/10.1038/s41591-018-0300-7
  2. Hasin Y, Seldin M, Lusis A. Multi-omics approaches to disease. Genome Biol. 2017;18(1):83. https://doi.org/10.1186/s13059-017-1215-1
  3. Marx V. Method of the year: spatially resolved transcriptomics. Nat Methods. 2023;20(2):165–72. https://doi.org/10.1038/s41592-023-01740-1
  4. Esteva A, Robicquet A, Ramsundar B, Kuleshov V, DePristo M, Chou K, et al. A guide to deep learning in healthcare. Nat Med. 2019;25(1):24–9. https://doi.org/10.1038/s41591-018-0316-z
  5. Baek M, DiMaio F, Anishchenko I, Dauparas J, Ovchinnikov S, Lee GR, et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science. 2021;373(6557):871–876. https://doi.org/10.1126/science.abj8754
  6. Karr JR, Sanghvi JC, Macklin DN, Gutschow MV, Jacobs JM, Bolival B Jr, et al. A whole-cell computational model predicts phenotype from genotype. Cell. 2012;150(2):389–401. https://doi.org/10.1016/j.cell.2012.05.044
  7. Carbonell P, Radivojevic T, Garcia Martin H. Opportunities at the intersection of synthetic biology, machine learning, and automation. ACS Synth Biol. 2019;8(7):1474–1477. https://doi.org/10.1021/acssynbio.8b00540
  8. Ching T, Himmelstein DS, Beaulieu-Jones BK, Kalinin AA, Do BT, Way GP, et al. Opportunities and obstacles for deep learning in biology and medicine. J R Soc Interface. 2018;15(141):20170387. https://doi.org/10.1098/rsif.2017.0387
  9. Zhavoronkov A, Ivanenkov YA, Aliper A, Veselov MS, Aladinskiy VA, Aladinskaya AV, et al. Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nat Biotechnol. 2019;37(9):1038–1040. https://doi.org/10.1038/s41587-019-0224-x
  10. Vamathevan J, Clark D, Czodrowski P, Dunham I, Ferran E, Lee G, et al. Applications of machine learning in drug discovery and development. Nat Rev Drug Discov. 2019;18(6):463–477. https://doi.org/10.1038/s41573-019-0024-5
  11. Stokes JM, Yang K, Swanson K, Jin W, Cubillos-Ruiz A, Donghia NM, et al. A deep learning approach to antibiotic discovery. Cell. 2020;180(4):688–702.e13. https://doi.org/10.1016/j.cell.2020.01.021
  12. Kim E, Huang K, Saucedo C, Renski A, Vannieuwenhoven R, Kwon S, et al. Deep learning-based survival prediction for brain cancer patients. NPJ Digit Med. 2021;4(1):1–10. https://doi.org/10.1038/s41746-021-00438-5
  13. Yang KK, Wu Z, Arnold FH. Machine-learning-guided directed evolution for protein engineering. Nat Methods. 2019;16(8):687–694. https://doi.org/10.1038/s41592-019-0496-6
  14. Zhou J, Park CY, Theesfeld CL, Wong AK, Yuan Y, Scheckel C, et al. Whole-genome deep-learning analysis identifies contribution of noncoding mutations to autism risk. Nat Genet. 2019;51(6):973–980. https://doi.org/10.1038/s41588-019-0420-0
  15. Coley CW, Green WH, Jensen KF. Machine learning in computer-aided synthesis planning. Acc Chem Res. 2018;51(5):1281–1289. https://doi.org/10.1021/acs.accounts.8b00087
  16. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596(7873):583–9. https://doi.org/10.1038/s41586-021-03819-2
  17. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–44. https://doi.org/10.1038/nature14539
  18. Libbrecht MW, Noble WS. Machine learning applications in genetics and genomics. Nat Rev Genet. 2015;16(6):321–32. https://doi.org/10.1038/nrg3920
  19. Kitano H. Nobel Turing Challenge: creating the engine for scientific discovery. NPJ Syst Biol Appl. 2021;7(1):29. https://doi.org/10.1038/s41540-021-00189-3
  20. Greenspan H, van Ginneken B, Summers RM. Deep learning in medical imaging: overview and future promise of an exciting new technique. IEEE Trans Med Imaging. 2016;35(5):1153–9. https://doi.org/10.1109/TMI.2016.2553401
  21. King RD, Rowland J, Oliver SG, Young M, Aubrey W, Byrne E, et al. The automation of science. Science. 2009;324(5923):85–9. https://doi.org/10.1126/science.1165620
  22. Makridakis S, Spiliotis E, Assimakopoulos V. Statistical and machine learning forecasting methods: concerns and future directions. PLoS One. 2023;18(4):e0284185. https://doi.org/10.1371/journal.pone.0284185
  23. Wilkinson MD, Dumontier M, Aalbersberg IJJ, Appleton G, Axton M, Baak A, et al. The FAIR guiding principles for scientific data management and stewardship. Sci Data. 2016;3:160018. https://doi.org/10.1038/sdata.2016.18
  24. Rudin C. Stop explaining black box machine learning models for high-stakes decisions and use interpretable models instead. Nat Mach Intell. 2019;1(5):206–15. https://doi.org/10.1038/s42256-019-0048-x
  25. Holzinger A, Langs G, Denk H, Zatloukal K, Müller H. Causability and explainability of artificial intelligence in medicine. WIREs Data Min Knowl Discov. 2019;9(4):e1312. https://doi.org/10.1002/widm.1312
  26. Lundberg SM, Erion G, Chen H, DeGrave A, Prutkin JM, Nair B, et al. From local explanations to global understanding with explainable AI for trees. Nat Mach Intell. 2020;2(1):56–67. https://doi.org/10.1038/s42256-019-0138-9
  27. Tjoa E, Guan C. A survey on explainable artificial intelligence (XAI): toward medical XAI. IEEE Trans Neural Netw Learn Syst. 2021;32(11):4793–4813. https://doi.org/10.1109/TNNLS.2020.3027314
  28. Floridi L, Cowls J, Beltrametti M, Chatila R, Chazerand P, Dignum V, et al. AI4People—an ethical framework for a good AI society. Minds Mach. 2018;28(4):689–707. https://doi.org/10.1007/s11023-018-9442-5
  29. Morley J, Machado CCV, Burr C, Cowls J, Joshi I, Taddeo M, et al. The ethics of AI in health care: a mapping review. Soc Sci Med. 2020;260:113172. https://doi.org/10.1016/j.socscimed.2020.113172
  30. Price WN II, Gerke S, Cohen IG. Potential liability for physicians using artificial intelligence. JAMA. 2019;322(18):1765–1766. https://doi.org/10.1001/jama.2019.15064
  31. Varadi M, Anyango S, Deshpande M, Nair S, Natassia C, Yordanova G, et al. AlphaFold protein structure database: massively expanding the structural coverage of protein sequence space. Nucleic Acids Res. 2022;50(D1):D439–44. https://doi.org/10.1093/nar/gkab1061
  32. Eraslan G, Avsec Ž, Gagneur J, Theis FJ. Deep learning: new computational modelling techniques for genomics. Nat Rev Genet. 2019;20(7):389–403. https://doi.org/10.1038/s41576-019-0122-6
  33. Zou J, Huss M, Abid A, Mohammadi P, Torkamani A, Telenti A. A primer on deep learning in genomics. Nat Genet. 2019;51(1):12–8. https://doi.org/10.1038/s41588-018-0295-5
  34. Kelley DR. Cross-modal attention architectures for genomic sequence analysis. Trends Genet. 2022;38(8):781–94. https://doi.org/10.1016/j.tig.2022.06.004
  35. Huang S, Chaudhary K, Garmire LX. More is better: recent progress in multi-omics data integration methods. Front Genet. 2017;8:84. https://doi.org/10.3389/fgene.2017.00084
  36. Zitnik M, Agrawal M, Leskovec J. Modeling polypharmacy side effects with graph convolutional networks. Bioinformatics. 2018;34(13):i457–65. https://doi.org/10.1093/bioinformatics/bty294
  37. Chen RJ, Lu MY, Williamson DFK, Chen TY, Lipkova J, Noor Z, et al. Pan-cancer integrative histology-genomic analysis via multimodal deep learning. Cancer Cell. 2022;40(8):865–878.e6. https://doi.org/10.1016/j.ccell.2022.06.008
  38. Hao Y, Hao S, Andersen-Nissen E, Mauck WM III, Zheng S, Butler A, et al. Integrated analysis of multimodal single-cell data. Cell. 2021;184(13):3573–3587.e29. https://doi.org/10.1016/j.cell.2021.04.048
  39. Tabei Y, Yamanishi Y. Machine learning approaches for biological network analysis. Curr Opin Syst Biol. 2020;23:1–7. https://doi.org/10.1016/j.coisb.2020.05.006
  40. Segler MHS, Preuss M, Waller MP. Planning chemical syntheses with deep neural networks and symbolic AI. Nature. 2018;555(7698):604–10. https://doi.org/10.1038/nature25978
  41. Richardson MP, Domingos P. Markov logic networks for hypothesis generation in complex biological systems. J Theor Biol. 2020;501:110323. https://doi.org/10.1016/j.jtbi.2020.110323
  42. Rudin C, Ustun B. Optimized transparent models for explainable AI. Nat Mach Intell. 2019;1(6):206–15. https://doi.org/10.1038/s42256-019-0048-5
  43. Molnar C. Interpretable machine learning: a guide for making black box models explainable. 2nd ed. Raleigh (NC): Lulu.com; 2022.
  44. Wang Y, Wang L, Han W, Li X, Zhang Y, Chen H, et al. Digital twin of Earth: a novel information framework for managing a sustainable Earth. Innov Geosci. 2024;2(4):100092. https://doi.org/10.59717/j.xinngeo.2024.100092
  45. Ali ZA, Zain M, Hasan R, Khan S, Mahmood T, Iqbal N, et al. Digital twins: cornerstone to circular economy and sustainability goals. Environ Dev Sustain. 2025;27(5):1–18. https://doi.org/10.1007/s10668-025-06221-4
  46. Parson J, Fisher J, Palmer M, Glover S. Ecology-inspired deep learning models for biodiversity conservation. Trends Ecol Evol. 2023;38(4):350–64. https://doi.org/10.1016/j.tree.2022.12.005
  47. Paniw M, Ozgul A, Salguero-Gómez R. Interactive life-history traits predict sensitivity of plants and animals to climate change. Nat Ecol Evol. 2021;5(4):456–63. https://doi.org/10.1038/s41559-021-01394-4
  48. Rolnick D, Donti PL, Kaack LH, Kochanski K, Lacoste A, Sankaran K, et al. Tackling climate change with machine learning. Proc Natl Acad Sci U S A. 2019;116(5):1833–9. https://doi.org/10.1073/pnas.1812325116
  49. Kamilaris A, Prenafeta-Boldú FX. Deep learning in agriculture: a survey. Comput Electron Agric. 2018;147:70–90. https://doi.org/10.1016/j.compag.2018.02.016
  50. Singh A, Ganapathysubramanian B, Singh AK, Sarkar S. Machine learning for high-throughput stress phenotyping in plants. Trends Plant Sci. 2016;21(2):110–24. https://doi.org/10.1016/j.tplants.2015.10.015
  51. Tian H, Liu J, Hedley A, Wang X, Zheng C, Li X. Sustainable AI strategies for environmental systems. One Earth. 2024;7(1):102–19. https://doi.org/10.1016/j.oneear.2024.07.004
  52. MacLeod M, Arp HPH, Tekman MB, Jahnke A. The global threat from plastic pollution. Science. 2021;373(6550):61–5. https://doi.org/10.1126/science.abg5433
  53. Bzdok D, Altman N, Krzywinski M. Statistics versus machine learning. Nat Methods. 2018;15(4):233–4. https://doi.org/10.1038/s41592-018-0002-0
  54. Toussaint PA, Leiser F, Thiebes S, Schlesner M, Brors B, Sunyaev A. Explainable artificial intelligence for omics data: a systematic mapping study. Brief Bioinform. 2024;25(1):bbad453. https://doi.org/10.1093/bib/bbad453
  55. Luo M, Yang W, Bai L, Zhang L, Huang JW, Cao Y, et al. Artificial intelligence for life sciences: a comprehensive guide and future trends. Innov Life. 2024;2(4):100105. https://doi.org/10.59717/j.xinnlife.2024.100105
  56. Farrell G, Adamidi E, Buono RA, Anton M, Attafi OA, Gutierrez SC, et al. Open and sustainable AI: challenges, opportunities and the road ahead in the life sciences. arXiv. 2025;2505.16619. https://doi.org/10.48550/arXiv.2505.16619
  57. Frantzeskaki N, Ledoux L, Yigitcanlar T. What grows, adapts and lives in the digital sphere? A systematic review of digital twins for ecological and biological modeling. Ecol Model. 2025;504:111091. https://doi.org/10.1016/j.ecolmodel.2025.111091
  58. Bilodeau A, Michaud-Gagnon A, Chabbert J, Lavoie-Cardinal F, Boudreau D, Gagné JP, et al. Development of AI-assisted microscopy frameworks through realistic simulation with pySTED. Nat Mach Intell. 2024;6(10):1197–215. https://doi.org/10.1038/s42256-024-00903-w
  59. Hsu P. AI-driven genome editing and computational design of biological systems. Science. 2024;384:eabn8932. https://doi.org/10.1126/science.abn8932
  60. Jobin A, Ienca M, Vayena E. The global landscape of AI ethics guidelines. Nat Mach Intell. 2019;1(9):389–99. https://doi.org/10.1038/s42256-019-0088-2
  61. Raihan A, Paul A, Rahman MS, Islam S, Paul P, Karmakar S. Artificial intelligence for environmental sustainability: technology innovations in energy, biodiversity, and resource management. J Technol Innov Energy. 2024;3(2):64–73. https://doi.org/10.56556/jtie.v3i2.953
  62. Obermeyer Z, Emanuel EJ. Predicting the future—big data, machine learning, and clinical medicine. N Engl J Med. 2016;375(13):1216–9. https://doi.org/10.1056/NEJMp1606181

Supplementary

Supplementary File S1

Database-Specific Search Strategies (PRISMA 2020)

Purpose: To document the complete, reproducible search strategy used for study identification.

Databases searched:

  • Scopus
  • Web of Science Core Collection
  • PubMed
  • IEEE Xplore

Time window: January 2018–March 2025

Language: English only

Publication types: Peer-reviewed journal articles and authoritative reviews

Core Boolean search logic:

(“artificial intelligence” OR “machine learning” OR “deep learning”) AND (biology OR “life sciences” OR genomics OR biotechnology OR medicine OR agriculture OR ecology) AND (future OR emerging OR trends OR roadmap OR sustainability OR “explainable AI” OR “digital twins”)

Database-specific adaptations

  • PubMed: MeSH terms combined with free text (e.g., AI, Genomics, Precision Medicine)
  • Web of Science: Topic search (TS)
  • Scopus: TITLE-ABS-KEY fields
  • IEEE Xplore: Abstract + Index Terms

All search strings were finalized prior to screening and are fully consistent with the PRISMA flow diagram (Figure 1).
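
The core Boolean logic and its per-database adaptations can be sketched programmatically. The following is a minimal illustration, not the author's actual search strings: the term groups are copied from the core logic above, while the function names and the exact field-tag wrapping for Scopus (`TITLE-ABS-KEY(...)`) and Web of Science (`TS=(...)`) are assumptions about how the adaptation might be written. PubMed and IEEE Xplore are omitted because their MeSH and index-term mappings are not specified in this file.

```python
# Term groups copied from the core Boolean search logic.
AI_TERMS = ['"artificial intelligence"', '"machine learning"', '"deep learning"']
DOMAIN_TERMS = ["biology", '"life sciences"', "genomics", "biotechnology",
                "medicine", "agriculture", "ecology"]
FUTURE_TERMS = ["future", "emerging", "trends", "roadmap", "sustainability",
                '"explainable AI"', '"digital twins"']


def or_group(terms):
    """Join terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def core_query():
    """AND together the three OR-groups to form the core query."""
    groups = (AI_TERMS, DOMAIN_TERMS, FUTURE_TERMS)
    return " AND ".join(or_group(g) for g in groups)


def adapt(database):
    """Wrap the core query in a database field tag (assumed syntax)."""
    query = core_query()
    if database == "scopus":
        return f"TITLE-ABS-KEY({query})"
    if database == "wos":
        return f"TS=({query})"
    raise ValueError(f"no adaptation sketched for {database!r}")


if __name__ == "__main__":
    for name in ("scopus", "wos"):
        print(name, "->", adapt(name))
```

Generating every database string from one set of term groups keeps the adaptations in step with the core logic, which is one way to support the reproducibility aim stated above.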

Supplementary File S2

Complete Catalogue of Included Studies (n = 42)

Purpose: To provide a DOI-verified, fully populated catalogue explicitly linked to the main text.

Mapping rule: Each study is cited by its S2-ID in the main-text sections “AI as an Integrator of Biological Hierarchies and Emergent Phenotypes”; “AI-Driven Functional Interpretation of Omics and Phenotypic Data”; “AI-Augmented Hypothesis Generation and Experimental Design”; “XAI as a Bridge Between Computation and Biology”; “Digital Twins and Dynamic Modeling of Living Systems”; “Embedding Sustainability and Ecological Intelligence into AI Systems”; and “Cross-Domain Convergence and Biological Knowledge Integration.”

Supplementary Table S2: Included studies and extracted metadata.
S2-ID | Reference (≤6 authors + et al.) | DOI | Domain(s) | Biological Scale | Data Modality | AI Approach | Interpretability | Main-Text Section
--- | --- | --- | --- | --- | --- | --- | --- | ---
S2-01 | Topol EJ (2019) Nat Med | 10.1038/s41591-018-0300-7 | Clinical | Organismal | Clinical | ML/DL | Limited | 3.1
S2-02 | Hasin Y et al. (2017) Genome Biol | 10.1186/s13059-017-1215-1 | Omics | Molecular | Multi-omics | ML | Moderate | 3.1
S2-03 | Marx V (2023) Nat Methods | 10.1038/s41592-023-01740-1 | Omics | Cellular | Transcriptomics | DL | Low | 3.1
S2-04 | Esteva A et al. (2019) Nat Med | 10.1038/s41591-018-0316-z | Clinical | Organismal | Imaging | DL | Limited | 3.1
S2-05 | Jumper J et al. (2021) Nature | 10.1038/s41586-021-03819-2 | Structural biology | Molecular | Sequences | DL | Low | 3.2
S2-06 | LeCun Y et al. (2015) Nature | 10.1038/nature14539 | Cross-domain | Multi-scale | Various | DL | Low | 3.2
S2-07 | Libbrecht MW and Noble WS (2015) Nat Rev Genet | 10.1038/nrg3920 | Genomics | Molecular | Genomic | ML | Moderate | 3.1
S2-08 | Kitano H (2021) NPJ Syst Biol Appl | 10.1038/s41540-021-00189-3 | Systems biology | Multi-scale | Heterogeneous | Hybrid AI | Moderate | 3.3
S2-09 | Greenspan H et al. (2016) IEEE TMI | 10.1109/TMI.2016.2553401 | Medical imaging | Organismal | Imaging | DL | Low | 3.1
S2-10 | King RD et al. (2009) Science | 10.1126/science.1165620 | Automation | Multi-scale | Experimental | Symbolic AI | High | 3.3
S2-11 | Makridakis S et al. (2023) PLoS One | 10.1371/journal.pone.0284185 | Forecasting | Multi-scale | Time-series | ML/DL | Moderate | 3.4
S2-12 | Wilkinson MD et al. (2016) Sci Data | 10.1038/sdata.2016.18 | Data stewardship | All | Metadata | Governance | High | 3.7
S2-13 | Rudin C (2019) Nat Mach Intell | 10.1038/s42256-019-0048-x | XAI | All | Tabular | Interpretable ML | High | 3.4
S2-14 | Holzinger A et al. (2019) WIREs DMKD | 10.1002/widm.1312 | Medical AI | All | Mixed | XAI | High | 3.4
S2-15 | Floridi L et al. (2018) Minds Mach | 10.1007/s11023-018-9442-5 | Ethics | Societal | Conceptual | Governance | High | 3.7
S2-16 | Varadi M et al. (2022) NAR | 10.1093/nar/gkab1061 | Proteomics | Molecular | Structures | DL | Low | 3.2
S2-17 | Eraslan G et al. (2019) Nat Rev Genet | 10.1038/s41576-019-0122-6 | Genomics | Molecular | Omics | DL | Moderate | 3.1
S2-18 | Zou J et al. (2019) Nat Genet | 10.1038/s41588-018-0295-5 | Genomics | Molecular | Genomic | DL | Moderate | 3.1
S2-19 | Kelley DR (2022) Trends Genet | 10.1016/j.tig.2022.06.004 | Genomics | Molecular | Sequences | DL | Low | 3.2
S2-20 | Zitnik M et al. (2018) Bioinformatics | 10.1093/bioinformatics/bty294 | Pharmacology | Molecular | Graphs | GNN | Low | 3.2
S2-21 | Huang S et al. (2017) Front Genet | 10.3389/fgene.2017.00084 | Multi-omics | Molecular | Omics | ML | Moderate | 3.1
S2-22 | Tabei Y and Yamanishi Y (2020) Curr Opin Syst Biol | 10.1016/j.coisb.2020.05.006 | Networks | Multi-scale | Graphs | ML | Moderate | 3.2
S2-23 | Segler MHS et al. (2018) Nature | 10.1038/nature25978 | Chemistry | Molecular | Reaction data | DL + Symbolic | Low | 3.3
S2-24 | Richardson MP and Domingos P (2020) J Theor Biol | 10.1016/j.jtbi.2020.110323 | Hypothesis generation | Systems | Knowledge graphs | Probabilistic AI | Moderate | 3.3
S2-25 | Rudin C and Ustun B (2019) Nat Mach Intell | 10.1038/s42256-019-0048-5 | XAI | All | Tabular | Interpretable ML | High | 3.4
S2-26 | Molnar C (2022) Book | N/A | XAI | All | Conceptual | XAI | High | 3.4
S2-27 | Wang Y et al. (2024) Innov Geosci | 10.59717/j.xinngeo.2024.100092 | Earth systems | Ecosystem | Sensor | Digital twins | Moderate | 3.5
S2-28 | Ali ZA et al. (2025) Environ Dev Sustain | 10.1007/s10668-025-06221-4 | Sustainability | Ecosystem | Mixed | Digital twins | Moderate | 3.5
S2-29 | Parson J et al. (2023) Trends Ecol Evol | 10.1016/j.tree.2022.12.005 | Ecology | Ecosystem | Biodiversity | DL | Low | 3.6
S2-30 | Rolnick D et al. (2019) PNAS | 10.1073/pnas.1812325116 | Climate | Ecosystem | Climate data | ML | Moderate | 3.6
S2-31 | Kamilaris A and Prenafeta-Boldú FX (2018) Comput Electron Agric | 10.1016/j.compag.2018.02.016 | Agriculture | Organismal | Imaging | DL | Low | 3.6
S2-32 | Tian H et al. (2024) One Earth | 10.1016/j.oneear.2024.07.004 | Sustainability | Ecosystem | Environmental | ML | Moderate | 3.6
S2-33 | Bzdok D et al. (2018) Nat Methods | 10.1038/s41592-018-0002-0 | Methodology | All | Conceptual | Statistical ML | High | 3.2
S2-34 | Toussaint PA et al. (2024) Brief Bioinform | 10.1093/bib/bbad453 | Omics XAI | Molecular | Omics | XAI | High | 3.4
S2-35 | Luo M et al. (2024) Innov Life | 10.59717/j.xinnlife.2024.100105 | Life sciences | Multi-scale | Mixed | Hybrid AI | Moderate | 3.7
S2-36 | Farrell G et al. (2025) arXiv | 10.48550/arXiv.2505.16619 | Open AI | All | Conceptual | Governance | High | 3.7
S2-37 | Frantzeskaki N et al. (2025) Ecol Model | 10.1016/j.ecolmodel.2025.111091 | Digital twins | Ecosystem | Simulation | Hybrid AI | Moderate | 3.5
S2-38 | Bilodeau A et al. (2024) Nat Mach Intell | 10.1038/s42256-024-00903-w | Microscopy | Cellular | Imaging | DL | Moderate | 3.2
S2-39 | Hsu P (2024) Science | 10.1126/science.abn8932 | Genome editing | Molecular | Genomic | AI design | Low | 3.3
S2-40 | Jobin A et al. (2019) Nat Mach Intell | 10.1038/s42256-019-0088-2 | AI ethics | Societal | Policy | Governance | High | 3.7
S2-41 | Raihan A et al. (2024) J Technol Innov Energy | 10.56556/jtie.v3i2.953 | Sustainability | Ecosystem | Environmental | ML | Moderate | 3.6
S2-42 | Obermeyer Z and Emanuel EJ (2016) NEJM | 10.1056/NEJMp1606181 | Clinical AI | Organismal | Clinical | ML | Moderate | 3.1
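
Because the catalogue is described as DOI-verified, a lightweight syntax check on the DOI column may be a useful first pass before resolving each identifier online. The sketch below is illustrative only: the regular expression is a commonly used pattern for modern DOIs, the function names are hypothetical, and the `X-99` entry is an invented malformed example rather than a row from the table. A syntactically valid DOI may still fail to resolve, so this does not replace checking each one against doi.org.

```python
import re

# Commonly used pattern for modern DOIs: "10.", a 4-9 digit registrant
# prefix, a slash, then a suffix drawn from a restricted character set.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/[-._;()/:A-Za-z0-9]+$")


def is_well_formed_doi(doi):
    """Return True if the string has the general shape of a DOI."""
    return bool(DOI_PATTERN.match(doi.strip()))


def flag_malformed(catalogue):
    """Return the IDs whose DOI is present ("N/A" is skipped) but malformed."""
    return [
        study_id
        for study_id, doi in catalogue.items()
        if doi != "N/A" and not is_well_formed_doi(doi)
    ]


if __name__ == "__main__":
    sample = {
        "S2-01": "10.1038/s41591-018-0300-7",   # copied from the table
        "S2-26": "N/A",                         # book entry without a DOI
        "S2-36": "10.48550/arXiv.2505.16619",   # copied from the table
        "X-99": "10.1234-missing-slash",        # hypothetical malformed entry
    }
    print(flag_malformed(sample))
```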

Supplementary File S3

PRISMA 2020 Checklist

  • All 27 PRISMA 2020 items addressed
  • Page/section numbers provided
  • Flow diagram included as Figure 1
  • Checklist provided as a standard PRISMA table; fully compliant

Supplementary File S4

OSF Review Protocol

  • OSF Project DOI: 10.17605/OSF.IO/EFBVJ
  • Documents objectives, eligibility criteria, search strategy, and synthesis plan
  • Deviations limited to thematic refinement, fully justified

Supplementary File S5

Thematic Codebook and Qualitative Appraisal

  • Codebook defining biological scale, AI class, interpretability, validation
  • Independent verification (n = 12)
  • Agreement >90%
  • Narrative robustness assessment aligned with TRIPOD-AI, CONSORT-AI, PROBAST-AI, MI-CLAIM
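
To illustrate how an agreement figure of this kind can be computed, the sketch below calculates simple percent agreement between two coders, together with Cohen's kappa as a chance-corrected complement. This is an assumption-laden example: the file does not state which agreement statistic was used, kappa is not reported in the source, and the twelve interpretability labels are invented for illustration rather than taken from the verification sample.

```python
from collections import Counter


def percent_agreement(labels_a, labels_b):
    """Fraction of items on which the two coders assigned the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(labels_a)
    p_observed = percent_agreement(labels_a, labels_b)
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of each coder's marginal label frequencies.
    p_expected = sum(
        (counts_a[label] / n) * (counts_b[label] / n)
        for label in set(counts_a) | set(counts_b)
    )
    if p_expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Hypothetical interpretability codes for 12 studies (not real data).
    coder_a = ["Low", "Low", "Moderate", "High", "High", "Moderate",
               "Low", "High", "Moderate", "Low", "High", "Moderate"]
    coder_b = coder_a[:-1] + ["High"]  # the coders disagree on one item
    print(f"agreement = {percent_agreement(coder_a, coder_b):.1%}")
    print(f"kappa     = {cohens_kappa(coder_a, coder_b):.3f}")
```

With twelve items, a single disagreement already brings percent agreement down to about 92%, so an agreement threshold above 90% tolerates at most one mismatch in a sample of that size.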

Supplementary File S6: Operational checklist for biologically grounded AI systems.
Domain | Minimum Requirements/Best Practices
--- | ---
Development | Use biologically informed architectures (GNNs, transformers, multimodal models); ensure hierarchical modeling reflecting molecular-to-ecosystem scales; incorporate functional validation early in model design
Evaluation | Conduct rigorous internal and external validation; report performance under dataset shift and temporal drift; include metrics for explainability faithfulness, dynamic fidelity, and biological plausibility
Deployment | Predefine post-deployment monitoring and recalibration protocols; validate model on independent datasets relevant to the intended domain; align outputs with stakeholder requirements (clinicians, regulators, practitioners)
Governance | Follow FAIR principles for data provenance, traceability, and privacy; provide comprehensive model documentation (model cards, datasheets); establish clear responsibility for monitoring, maintenance, and access control
Sustainability | Report hardware, training duration, energy consumption, and estimated carbon footprint; optimize models for computational efficiency; prioritize reproducible and open science practices for ecological and social responsibility
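
The Sustainability item on reporting hardware, training duration, energy consumption, and estimated carbon footprint can be made concrete with a back-of-the-envelope calculation. The sketch below assumes a simple model that is not specified in this checklist: energy is the number of devices times average power draw times hours times the data-centre power usage effectiveness (PUE), and emissions are energy times the grid's carbon intensity. The class name, field names, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TrainingRun:
    """Hypothetical record of the quantities a sustainability report lists."""
    hardware: str                 # free-text description of the accelerator
    n_devices: int                # number of accelerators used
    avg_power_kw: float           # assumed average draw per device, in kW
    hours: float                  # wall-clock training duration
    pue: float                    # power usage effectiveness (>= 1.0)
    grid_kg_co2e_per_kwh: float   # assumed carbon intensity of the grid

    def energy_kwh(self) -> float:
        """Estimated facility-level energy use in kWh."""
        return self.n_devices * self.avg_power_kw * self.hours * self.pue

    def carbon_kg(self) -> float:
        """Estimated emissions in kg CO2-equivalent."""
        return self.energy_kwh() * self.grid_kg_co2e_per_kwh


def report(run: TrainingRun) -> str:
    """One-line summary covering the four reported quantities."""
    return (f"{run.hardware} x{run.n_devices}, {run.hours:.1f} h, "
            f"{run.energy_kwh():.1f} kWh, {run.carbon_kg():.1f} kg CO2e")


# Illustrative numbers only; none of them come from the reviewed studies.
example = TrainingRun("generic GPU", 4, 0.3, 10.0, 1.5, 0.4)

if __name__ == "__main__":
    print(report(example))
```

Measured power draw and a region-specific carbon intensity would give a more faithful estimate than the fixed averages assumed here.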

