Impact of large language model (ChatGPT) in healthcare: an umbrella review and evidence synthesis

Abstract

Background

The emergence of Artificial Intelligence (AI), particularly Chat Generative Pre-Trained Transformer (ChatGPT), a Large Language Model (LLM), in healthcare promises to reshape patient care, clinical decision-making, and medical education. This review aims to synthesise research findings to consolidate the implications of ChatGPT integration in healthcare and identify research gaps.

Main body

The umbrella review was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The Cochrane Library, PubMed, Scopus, Web of Science, and Google Scholar were searched from inception until February 2024. Owing to the heterogeneity of the included studies, no quantitative analysis was performed; instead, information was extracted, summarised, synthesised, and presented in narrative form. Two reviewers independently undertook title, abstract, and full-text screening. The methodological quality and overall rating of the included reviews were assessed using the A MeaSurement Tool to Assess systematic Reviews (AMSTAR-2) checklist. The review examined 17 studies, comprising 15 systematic reviews and 2 meta-analyses, on ChatGPT in healthcare, revealing diverse focuses. The AMSTAR-2 assessment rated 5 reviews as moderate and 12 as low quality, with recurrent deficiencies such as missing justification of the selected study designs and unreported funding sources. The most frequently reported theme was ChatGPT's use in disease diagnosis or clinical decision-making. While 82.4% of studies focused on its general usage, 17.6% explored distinct topics such as its role in medical examinations and in conducting systematic reviews. Among these, 52.9% targeted general healthcare, while 41.2% focused on specific domains such as radiology, neurosurgery, gastroenterology, public health dentistry, and ophthalmology. ChatGPT's use for manuscript review or writing was mentioned in 17.6% of reviews. Promising applications include enhancing patient care and clinical decision-making, though ethical, legal, and accuracy concerns require cautious integration.

Conclusion

We summarise the areas identified across reviews regarding ChatGPT's transformative impact on healthcare, highlighting patient care, decision-making, and medical education. Emphasising the importance of ethical regulations and the involvement of policymakers, we urge further investigation to ensure the reliability of ChatGPT and to promote trust in healthcare and research.

Background

Recent advancements in Artificial Intelligence (AI) have brought transformative changes across various industries, including healthcare [16]. AI-powered tools and technologies offer the potential to revolutionise healthcare delivery, improve patient outcomes, and enhance clinical decision-making processes. Among these technologies, ChatGPT (Chat Generative Pre-trained Transformer), a Large Language Model (LLM) developed by OpenAI, has received significant attention within the healthcare sector [12]. As a state-of-the-art Natural Language Processing (NLP) model trained on a vast corpus of text data, ChatGPT can generate human-like responses to text inputs. Operating on deep learning principles and employing a transformer architecture, ChatGPT surpasses traditional rule-based chatbots by not relying on predefined rules or templates for generating responses [23]. Instead, it leverages its extensive pre-trained knowledge to understand and respond to queries in a contextually relevant manner.

ChatGPT is a promising tool for a wide range of diverse and multifaceted applications in consumer health [23]. One of the primary areas of promise is healthcare education. With its ability to generate informative and educational content, ChatGPT can serve as a valuable resource for medical students, healthcare professionals, and educators [25]. It can assist in the creation of educational materials, answer clinical queries, and facilitate interactive learning experiences. Additionally, ChatGPT can aid medical research by generating human-like text, offering fundamental guidance, and elucidating complex concepts [1]. In clinical practice, ChatGPT has the potential to streamline clinical documentation, patient communication, and decision-support tasks, thereby improving workflow efficiency. By automating routine administrative tasks and providing real-time assistance, ChatGPT can help reduce the burden on healthcare professionals and enhance the quality of patient care [14]. Moreover, ChatGPT holds promise in diagnostic assistance and decision support. Its ability to process and analyse medical data, including patient histories, symptoms, and diagnostic tests, enables it to provide valuable insights and recommendations to healthcare providers. In fields such as radiology and pathology, ChatGPT can assist in image interpretation, differential diagnosis, and treatment planning, potentially improving diagnostic accuracy and patient outcomes [21]. Furthermore, it can offer second opinions on dermatological treatments, which may become increasingly reliable as the model continues to improve [33].

Despite its potential benefits, integrating ChatGPT into healthcare practice poses risks, challenges and limitations. Addressing risks associated with ethical concerns regarding patient privacy, data security, and algorithmic bias is crucial for ensuring its safe and responsible use. Moreover, verifying the accuracy, reliability, and trustworthiness of ChatGPT-generated content requires further investigation [34].

Translating promises into reality is always a significant step. The potential uses and benefits of ChatGPT for consumer health have emerged but are not yet fully realised. Further work is necessary to understand for what, and how, ChatGPT is being used. Similarly, understanding the risks, challenges, and limitations of ChatGPT in consumer health can help ensure its appropriate and effective use. Given these considerations, conducting an umbrella review of systematic reviews on ChatGPT in consumer health is imperative. This study aims to provide a comprehensive overview through synthesis and evaluation, including evidence gap synthesis, to discern implications for integration, and to highlight areas for further research and development.

Methods

An umbrella review synthesises existing systematic reviews and meta-analyses on a topic, offering a comprehensive overview of evidence from multiple studies. It provides a broader perspective, enhancing research depth and reliability, and is thus ideal for exploring ChatGPT's applications and impacts in healthcare. This umbrella review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [26]. The checklist recommended by Choi et al. [8] was followed for conducting and reporting this umbrella review. The protocol was registered with the International Prospective Register of Systematic Reviews (PROSPERO) under registration number CRD42024510926.

Search strategy and eligibility criteria

The search for included reviews was conducted across five electronic databases: (i) Cochrane Library (the Cochrane Database of Systematic Reviews); (ii) PubMed; (iii) Web of Science (all databases); (iv) Scopus; and (v) Google Scholar. The search covered records from inception until February 3, 2024, without restrictions on language or publication year.

To capture a wide-ranging collection of reviews covering ChatGPT's role in consumer health, we performed a comprehensive search across these databases using combinations of keywords and Boolean operators. For Google Scholar, specific filters for 'review articles' and sorting by relevance were applied, followed by a targeted search query to refine the results further. The detailed search strategy can be found in Additional file 1.
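For illustration, the PubMed arm of such a search can be reproduced programmatically. The sketch below uses Biopython's Entrez interface; the query string and contact e-mail are hypothetical stand-ins, since the actual search strings used in this review are those given in Additional file 1.

```python
# Minimal sketch: querying PubMed for systematic reviews mentioning ChatGPT
# via Biopython's Entrez utilities. The query is an illustrative stand-in,
# not this review's actual search strategy (see Additional file 1).
from Bio import Entrez

Entrez.email = "reviewer@example.org"  # NCBI requires a contact address

# "systematic[sb]" restricts results to PubMed's systematic-review subset,
# mirroring the "Systematic Reviews" filter used in the web interface.
query = '"ChatGPT" AND systematic[sb]'

handle = Entrez.esearch(db="pubmed", term=query, retmax=100)
record = Entrez.read(handle)
handle.close()

print(f"Records found: {record['Count']}")
print("First PMIDs:", record["IdList"][:10])
```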

The inclusion criteria were: (i) peer-reviewed Systematic Reviews (SR) and Systematic Reviews with Meta-Analysis (SRMA); (ii) a focus on ChatGPT within the scope of consumer health, specifically highlighting areas such as patient education, health information seeking, digital health interventions, health literacy, and various forms of electronic health services. While SRs and SRMAs were prioritised, the scope extended to any study following PRISMA guidelines, thereby ensuring a broad yet rigorous collection of literature on ChatGPT's impact on consumer health informatics. Studies were excluded if they: (i) were not SRs or SRMAs; (ii) lacked relevance to ChatGPT in consumer health; (iii) were not entirely in English; or (iv) were only available as abstracts without full texts.

For study selection, two reviewers (AT and ARR) independently performed the literature search across the selected databases, then screened titles and abstracts to eliminate duplicates, and read the full texts of the remaining papers to identify relevant systematic reviews. Any disagreements were resolved by consensus with a third reviewer (UI).
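Screening of this kind typically begins with automated duplicate removal before manual review. The sketch below shows one minimal way exported records might be deduplicated; the record schema with "doi" and "title" keys is a hypothetical illustration, not the tooling the reviewers actually used.

```python
# Minimal sketch of title/DOI-based duplicate removal across database exports,
# assuming each record is a dict with "title" and (optionally) "doi" keys.
import re

def normalise(title: str) -> str:
    """Lowercase and strip punctuation/whitespace so near-identical titles match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def deduplicate(records: list[dict]) -> list[dict]:
    seen, unique = set(), []
    for rec in records:
        # Prefer the DOI as the identity key when present; fall back to the title.
        key = rec.get("doi") or normalise(rec["title"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique
```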

Assessment of methodological quality

The methodological quality of the included reviews was appraised using the A MeaSurement Tool to Assess systematic Reviews (AMSTAR-2) guideline, a comprehensive framework for determining the thoroughness and reliability of reviews [28]. Quality levels were categorised as high, moderate, low, or critically low, based on the presence of critical flaws and non-critical weaknesses. The AMSTAR-2 appraisal was initially performed by one reviewer (ARR) and then verified by another (AT), with both reviewers reaching full agreement on the evaluation outcomes.
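The overall AMSTAR-2 rating follows a simple decision rule over critical and non-critical checklist items. Below is a minimal sketch of that rule as described by Shea et al. [28]; the set of critical items follows the published checklist, and the function is an illustration rather than a substitute for the appraisal instrument itself.

```python
# Sketch of the AMSTAR-2 overall confidence rating (Shea et al. [28]).
# Critical domains per the published checklist: protocol registration (2),
# search adequacy (4), exclusion justification (7), risk-of-bias assessment (9),
# meta-analytic methods (11), RoB in interpretation (13), publication bias (15).
CRITICAL_ITEMS = {2, 4, 7, 9, 11, 13, 15}

def amstar2_rating(failed_items: set[int]) -> str:
    """Return the overall rating given the set of checklist items a review failed."""
    critical_flaws = len(failed_items & CRITICAL_ITEMS)
    non_critical = len(failed_items - CRITICAL_ITEMS)
    if critical_flaws == 0:
        return "high" if non_critical <= 1 else "moderate"
    return "low" if critical_flaws == 1 else "critically low"

# e.g. a review failing items 3, 10, 13 and 14 (a common pattern in Table 1):
print(amstar2_rating({3, 10, 13, 14}))  # -> "low" (item 13 is critical)
```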

Data extraction and evidence synthesis

The selected articles were manually reviewed, and pertinent information was extracted, synthesised, and summarised in tabular format. We did not perform quantitative analysis in this review, given the heterogeneity of the included articles and because meta-analysis had not been performed in most of them. The findings were synthesised into main themes and sub-themes, analysing the most common outcomes and the methodological quality of the systematic reviews.

Results

A preliminary search with the key term “ChatGPT” yielded no results in the Cochrane Library. A search on PubMed using the keyword “ChatGPT” with the filter set to “Systematic Reviews”, without any date restriction, yielded 21 records. In Google Scholar, after applying the “review articles” filter and sorting by relevance, we ran a predefined search query, producing 433 results. Because Google Scholar returns very broad results, we followed the methodology described by Haddaway et al. [11] and took the first 50 records forward for screening; of these, 15 titles were deemed relevant to the objectives of this umbrella review. Searches in Scopus and Web of Science, following predefined criteria, yielded 23 and 5 records, respectively. After removing duplicates, 74 unique titles were screened by title and abstract, leading to the exclusion of 54 studies for being irrelevant to the study objective (40) or not being systematic reviews (14), leaving 20 studies eligible for full-text analysis. Three articles were further excluded during full-text review for not focusing solely on ChatGPT [19], being a non-peer-reviewed preprint [9], or primarily emphasising practical query interactions with ChatGPT rather than providing a comprehensive analysis of systematic review results [7]. Therefore, 17 articles were included in the final set for analysis and synthesis, as illustrated in the PRISMA flow diagram (Fig. 1). Of these, 15 were systematic reviews and 2 were meta-analyses. The distribution of publication years underscores the topic's emerging relevance: 13 articles were published in 2023 and 4 in 2024 up to February, indicating a notable surge in ChatGPT-related research during this period.

Fig. 1 PRISMA flow diagram representing the inclusion of systematic reviews and meta-analyses
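As a quick sanity check, the flow counts reported above are internally consistent; the snippet below is illustrative arithmetic only, with Fig. 1 remaining the authoritative diagram.

```python
# Sanity-checking the PRISMA flow counts reported in the text.
identified = {"Cochrane": 0, "PubMed": 21, "Google Scholar": 50,
              "Scopus": 23, "Web of Science": 5}
retrieved = sum(identified.values())      # 99 records taken forward
unique = 74                               # after duplicate removal
excluded_screening = 40 + 14              # irrelevant + not systematic reviews
full_text = unique - excluded_screening   # 20 eligible for full-text review
included = full_text - 3                  # 3 excluded at full text
assert (retrieved, full_text, included) == (99, 20, 17)
```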

The methodological quality assessment using AMSTAR-2, summarised in Table 1, indicated that 5 of the 17 reviews (29.4%) were of moderate quality, while the remaining 12 (70.6%) were of low quality. Notably, the most frequently lacking quality indicator was an explanation or justification of the selection of study designs for inclusion in the review (item 3). Deficiencies were also observed in reporting the sources of funding (item 10), accounting for risk of bias when discussing results (item 13), and explaining or discussing observed heterogeneity (item 14), indicating areas needing improvement in future reviews.

Table 1 Risk of Bias analysis of included systematic reviews using AMSTAR-2 [28]

Tables 2 and 3 present a summary of the methodologies, aims, and key findings of the included articles, offering a comprehensive overview of how ChatGPT is being integrated and evaluated within healthcare settings. These reviews exhibited considerable heterogeneity in their fields and objectives. The majority (82.35%, or 14 out of 17) of the reviews focused on elucidating the usage, advantages, and limitations of ChatGPT across various domains within healthcare. The remaining three reviews explored unique topics, including the role of ChatGPT in multiple-choice question-based medical examinations [17], medical research [24], and performance in systematic review tasks [27]. While a majority (52.94%, or 9 out of 17) of the reviews concentrated on general healthcare practices, seven reviews delved into specific domains such as radiology [4, 31], neurosurgery [3, 22], gastroenterology [15], public health dentistry [32] and ophthalmology [35]. Quantitative analysis of pooled findings was conducted in two studies. Notably, 3 of the 17 systematic reviews (17.65%) incorporated ChatGPT to aid in reviewing [10, 27] or writing manuscripts [20], acknowledging its contribution in their acknowledgements sections.

Table 2 Summary of methodologies of included systematic reviews and systematic reviews with meta-analysis (n = 17)
Table 3 Objectives and key findings of the included systematic reviews and systematic reviews with meta-analysis (n = 17)

Table 4 outlines the references of articles reporting each theme and sub-theme of ChatGPT within the context of healthcare. According to the included articles, the role of ChatGPT in healthcare from both the patient and caregiver perspectives emerged as the most frequently studied theme (studied in 16 out of 17 articles) [2,3,4,5, 15, 18, 20, 22, 24, 25, 27, 31, 32, 35, 36]. The education of patients, in terms of general information gathering about disease, was explored in 11 studies [2,3,4,5, 18, 20, 22, 24, 25, 35, 36].

Table 4 Major themes and sub themes derived from systematic reviews for the umbrella review evidence synthesis

Additional File 2, Table S1 compares the versions of ChatGPT used in the included articles. Eight of the 17 studies discussed how different ChatGPT versions affect the tasks the model can perform effectively [2, 4, 15, 17, 18, 27, 31, 35]. ChatGPT 3.5 was found to be less precise and to require human verification, with its accuracy depending on the quality of its training data; it integrates easily into clinical workflows and is a promising educational tool. ChatGPT 4.0 was able to handle complex tasks such as those in radiology [31] but was less reliable on simpler tasks [4]. However, neither version performed satisfactorily on gastroenterology self-assessment [15].

Discussion

This umbrella review synthesised 17 existing systematic reviews and meta-analyses investigating the applications, strengths, limitations, and future directions of using ChatGPT in healthcare. The evidence suggests that ChatGPT has diverse applications, including enhancing patient care [10], conducting and reporting systematic reviews [27], advancing healthcare education [25], augmenting clinical decision-making [31], and providing preparatory materials for medical examinations [17]. Several studies suggested that ChatGPT can be employed as a valuable tool in clinical practice, assisting clinicians with patient inquiries, writing medical notes and discharge summaries, and making informed decisions about treatment plans. Additionally, it has the potential to serve as a personalised learning tool, encouraging critical thinking and problem-based learning among medical professionals [18, 36].

ChatGPT has demonstrated remarkable capabilities in generating human-like text and conducting natural language processing for text organisation and summarisation. It can expedite processes such as collecting questionnaire responses or conducting interviews, enhancing the effectiveness and efficiency of epidemiological research. Furthermore, ChatGPT supports researchers in locating essential information, developing hypotheses, and analysing data [24]. In healthcare education, ChatGPT serves as a preparatory tool for medical examinations, where it correctly answers most multiple-choice questions, suggesting its potential utility in evaluating medical knowledge [17]. Specialised applications, such as aiding in surgical planning, image recognition, diagnosis, and patient care in neurosurgery [3, 33] and supporting dentistry practices [32], further highlight its transformative potential. However, it is important to note that ChatGPT cannot replace the holistic care provided by a dentist, as decision-making in dentistry is multidisciplinary and involves patient care beyond diagnosis [32].

Administrative efficiency is another domain where ChatGPT shows promise. Its robust linguistic capabilities make it highly suitable for handling intricate administrative tasks, which can significantly aid in busy healthcare settings. Tasks such as managing medical records, generating discharge summaries, formatting examination reports and drafting referral letters are efficiently managed by AI through initial information structuring and organisation. Subsequent review and confirmation by healthcare professionals facilitate the rapid organisation of clinical data, alleviating both time and manpower burdens. This contributes to improving the healthcare environment and the quality of patient care [3, 10, 25, 27, 36].

While AI holds the potential to assume numerous responsibilities currently undertaken by human physicians, such as diagnosis and medication prescription, several limitations must be considered. Studies have raised concerns regarding ChatGPT’s potential for bias, plagiarism, lack of originality, and ethical and legal dilemmas [2, 22, 25]. It frequently produces erroneous or inconsistent content, including inaccurate citations and fabrications, which constrain its reliability in clinical and academic contexts [5, 15]. Furthermore, ChatGPT has difficulty interpreting specialised medical terminology, integrating into clinical workflows and addressing complex medical inquiries [4, 31, 32]. These limitations can lead to a loss of human critical thinking and involvement, as excessive reliance on AI could reduce the exercise of essential cognitive skills, potentially hindering professional growth and societal advancement [30, 36].

In terms of scientific writing, ChatGPT's linguistic capabilities can assist authors in generating ideas, summarising text, editing language, and proofreading documents. However, it is important to note that under the current International Committee of Medical Journal Editors (ICMJE)/Committee on Publication Ethics (COPE) guidelines, ChatGPT is not eligible for authorship in scientific publications unless these guidelines are updated [25]. Moreover, several ethical concerns, including copyright issues, transparency, and the risk of spreading misinformation, have been raised regarding its use in scientific writing [2, 25]. Given these concerns, it is essential to investigate the research domain from the viewpoints of editors, reviewers and journals to develop appropriate policies. Further research is also needed on educational policy formulation and the integration of ChatGPT into teaching methods and curriculum development [37]. Exploring the intersecting realms of research and education offers another avenue for exploration.

While acknowledging the potential significance of ChatGPT in healthcare, the reviewed studies highlight several challenges. ChatGPT’s integration into healthcare systems requires collaboration between AI developers, healthcare professionals and policymakers to maximise its transformative impact. Since the effectiveness of ChatGPT’s outputs depends on the quality and diversity of its training data [6], it is crucial to ensure that it incorporates a broad range of clinical information that accurately reflects the target patient population [31]. This may involve developing specialised ChatGPT models tailored to specific patient groups or healthcare domains to ensure the relevance and efficacy of its outputs.

To ensure responsible deployment, robust validation mechanisms, including expert review and clinical testing are necessary to address issues like AI hallucination, misinformation and bias. In addition, clear privacy regulations and transparent data usage policies are essential to protect user data and build trust in AI-generated responses. Establishing ethical frameworks, certification standards, and promoting digital literacy through educational initiatives will empower users to understand ChatGPT’s limitations and use it responsibly [13, 38].

With patients increasingly gaining access to ChatGPT, concerns may arise regarding self-diagnosis and the potential for cyberchondria [29]. While empowering patients with information can enhance autonomy and engagement in their healthcare, it also raises concerns about the accuracy and interpretation of medical data. Self-diagnosis based solely on ChatGPT’s outputs could lead to misinterpretation or oversight of critical details, potentially compromising patient safety. Therefore, it is crucial to establish guidelines and educational resources to support patients in using ChatGPT as a supplementary tool rather than a substitute for professional medical advice and diagnosis.

This umbrella review has both strengths and limitations. We relied on existing systematic reviews and meta-analyses, ensuring methodological rigour through adherence to PRISMA guidelines and the use of the AMSTAR-2 tool for quality assessment. Our stringent inclusion criteria aimed to analyse high-quality, relevant research, while meticulous search strategies and transparent selection criteria minimised biases. Despite efforts to standardise methodologies and terminologies, integrating and reconciling inconsistencies across studies posed challenges. While our review provides a comprehensive overview, it lacks detailed insights into specific healthcare contexts, emphasising the need for further primary research.

Moreover, generative AI is a dynamic field that undergoes regular updates, making comparisons between different versions of ChatGPT important for future researchers. Newer versions generally demonstrate enhanced natural language processing capabilities, which can significantly benefit healthcare applications. However, concerns regarding the reliability of newer versions, such as ChatGPT 4.0, which performed poorly on simpler queries compared with its predecessor, highlight the need for further investigation into these advancements (Additional File 2). Addressing these limitations will enhance the robustness and applicability of our findings for evidence-based decision-making in healthcare practice. Longitudinal studies are also necessary to examine the broader, long-term impact of ChatGPT on healthcare systems, patient outcomes, workflow efficiency, and provider-patient dynamics. Combining these approaches will ensure a holistic understanding of ChatGPT's role in advancing healthcare while addressing its limitations.

Conclusions

ChatGPT's integration into healthcare as a reliable educational, research, and clinical augmentation tool shows immense promise; however, its success relies on the establishment of robust regulations and control mechanisms to ensure ethical deployment.

ChatGPT version 3.5 was found to be more reliable in certain circumstances, while complex tasks were handled well by version 4.0. Prioritising ethical considerations is essential to harness AI's potential while preserving trust and integrity in healthcare and research practices. Acknowledging and addressing challenges such as ethical concerns, bias, and the potential for overreliance is crucial.

Through collaborative efforts among stakeholders, ChatGPT can significantly enhance healthcare delivery, research innovation and patient outcomes, marking a step forward in ethically responsible use of AI in the healthcare field.

Availability of data and materials

Not applicable.

Abbreviations

AMSTAR-2: A MeaSurement Tool to Assess systematic Reviews
ChatGPT: Chat Generative Pre-trained Transformer
SR: Systematic Reviews
SRMA: Systematic Reviews with Meta-Analysis
ICMJE: International Committee of Medical Journal Editors
COPE: Committee on Publication Ethics
PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses

References

1. Ashraf H, Ashfaq H. The role of ChatGPT in medical research: progress and limitations. Ann Biomed Eng. 2023. https://doi.org/10.1007/s10439-023-03311-0.
2. Bagde H, Dhopte A, Alam MK, Basri R. A systematic review and meta-analysis on ChatGPT and its utilization in medical and dental research. Heliyon. 2023;9(12):e23050. https://doi.org/10.1016/j.heliyon.2023.e23050.
3. Bečulić H, Begagić E, Skomorac R, Mašović A, Selimović E, Pojskić M. ChatGPT's contributions to the evolution of neurosurgical practice and education: a systematic review of benefits, concerns and limitations. Medicinski Glasnik Ljekarske Komore Zenicko-Dobojskog Kantona. 2024. https://doi.org/10.17392/1661-23.
4. Bera K, O'Connor G, Jiang S, Tirumani SH, Ramaiya N. Analysis of ChatGPT publications in radiology: literature so far. Curr Probl Diagn Radiol. 2024;53(2):215–25. https://doi.org/10.1067/j.cpradiol.2023.10.013.
5. Bugaj M, Kliestik T, Lăzăroiu G. Generative artificial intelligence-based diagnostic algorithms in disease risk detection, in personalized and targeted healthcare procedures, and in patient care safety and quality. Contemp Read Law Soc Justice. 2023;15(1):9–26.
6. Cao Y, Li S, Liu Y, Yan Z, Dai Y, Yu PS, Sun L. A comprehensive survey of AI-generated content (AIGC): a history of generative AI from GAN to ChatGPT. arXiv. 2023. http://arxiv.org/abs/2303.04226.
7. Cazzato G, Capuzzolo M, Parente P, Arezzo F, Loizzi V, Macorano E, Marzullo A, Cormio G, Ingravallo G. ChatGPT in diagnostic human pathology: will it be useful to pathologists? A preliminary review with 'query session' and future perspectives. AI. 2023;4(4):1010–22. https://doi.org/10.3390/ai4040051.
8. Choi GJ, Kang H. Introduction to umbrella reviews as a useful evidence-based practice. J Lipid Atheroscler. 2023;12(1):3.
9. Gabashvili I. ChatGPT in dermatology: a comprehensive systematic review. medRxiv. 2023.
10. Garg RK, Urs VL, Agarwal AA, Chaudhary SK, Paliwal V, Kar SK. Exploring the role of ChatGPT in patient care (diagnosis and treatment) and medical research: a systematic review. Health Promot Perspect. 2023;13(3):183.
11. Haddaway NR, Collins AM, Coughlin D, Kirk S. The role of Google Scholar in evidence reviews and its applicability to grey literature searching. PLoS ONE. 2015;10(9):e0138237.
12. Harry A. The future of medicine: harnessing the power of AI for revolutionizing healthcare. Int J Multidis Sci Arts. 2023;2(1):36–47.
13. Hastings J. Preventing harm from non-conscious bias in medical generative AI. Lancet Digit Health. 2024;6(1):e2–3. https://doi.org/10.1016/S2589-7500(23)00246-7.
14. Javaid M, Haleem A, Singh RP. ChatGPT for healthcare services: an emerging stage for an innovative perspective. BenchCouncil Trans Benchmarks Standards Evaluat. 2023;3(1):100105.
15. Klang E, Sourosh A, Nadkarni GN, Sharif K, Lahat A. Evaluating the role of ChatGPT in gastroenterology: a comprehensive systematic review of applications, benefits, and limitations. Ther Adv Gastroenterol. 2023;16:17562848231218618. https://doi.org/10.1177/17562848231218618.
16. Lee D, Yoon SN. Application of artificial intelligence-based technologies in the healthcare industry: opportunities and challenges. Int J Environ Res Public Health. 2021;18(1):271.
17. Levin G, Horesh N, Brezinov Y, Meyer R. Performance of ChatGPT in medical examinations: a systematic review and a meta-analysis. Int J Obstet Gynaecol. 2024;131(3):378–80. https://doi.org/10.1111/1471-0528.17641.
18. Li J, Dada A, Puladi B, Kleesiek J, Egger J. ChatGPT in healthcare: a taxonomy and systematic review. Comput Methods Programs Biomed. 2024;245:108013. https://doi.org/10.1016/j.cmpb.2024.108013.
19. Mitsea E, Drigas A, Skianis C. Digitally assisted mindfulness in training self-regulation skills for sustainable mental health: a systematic review. Behav Sci. 2023;13(12):1008.
20. Muftić F, Kadunić M, Mušinbegović A, Abd Almisreb A. Exploring medical breakthroughs: a systematic review of ChatGPT applications in healthcare. Southeast Eur J Soft Comput. 2023;12(1):13–41.
21. Rao A, Kim J, Kamineni M, Pang M, Lie W, Succi MD. Evaluating ChatGPT as an adjunct for radiologic decision-making. medRxiv. 2023.
22. Roman A, Al-Sharif L, Al Gharyani M. The expanding role of ChatGPT (Chat Generative Pre-trained Transformer) in neurosurgery: a systematic review of literature and conceptual framework. Cureus. 2023. https://doi.org/10.7759/cureus.43502.
23. Roumeliotis KI, Tselikas ND. ChatGPT and Open-AI models: a preliminary review. Future Internet. 2023;15(6):192.
24. Ruksakulpiwat S, Kumar A, Ajibade A. Using ChatGPT in medical research: current status and future directions. J Multidiscip Healthc. 2023;16:1513–20. https://doi.org/10.2147/JMDH.S413470.
25. Sallam M. ChatGPT utility in healthcare education, research, and practice: systematic review on the promising perspectives and valid concerns. Healthcare. 2023;11(6):887. https://doi.org/10.3390/healthcare11060887.
26. Sarkis-Onofre R, Catalá-López F, Aromataris E, Lockwood C. How to properly use the PRISMA statement. Syst Rev. 2021;10(1):117. https://doi.org/10.1186/s13643-021-01671-z.
27. Schopow N, Osterhoff G, Baur D. Applications of the natural language processing tool ChatGPT in clinical practice: comparative study and augmented systematic review. JMIR Med Inform. 2023;11:e48933.
28. Shea BJ, Reeves BC, Wells G, Thuku M, Hamel C, Moran J, Moher D, Tugwell P, Welch V, Kristjansson E. AMSTAR 2: a critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both. BMJ. 2017. https://doi.org/10.1136/bmj.j4008.
29. Starcevic V, Berle D, Arnáez S. Recent insights into cyberchondria. Curr Psychiatry Rep. 2020;22(11):56. https://doi.org/10.1007/s11920-020-01179-8.
30. Swaminathan A, Rathnasabapathy M. Role of creativity in problem solving: a review. Rev Int Geograph Educ Online. 2021;11(8):2.
31. Temperley HC, O'Sullivan NJ, Mac Curtain BM, Corr A, Meaney JF, Kelly ME, Brennan I. Current applications and future potential of ChatGPT in radiology: a systematic review. J Med Imaging Radiat Oncol. 2024. https://doi.org/10.1111/1754-9485.13621.
32. Tiwari A, Kumar A, Jain S, Dhull KS, Sajjanar A, Puthenkandathil R, Paiwal K, Singh R. Implications of ChatGPT in public health dentistry: a systematic review. Cureus. 2023. https://doi.org/10.7759/cureus.40367.
33. Iqbal U, Lee LT, Rahmanti AR, Celi LA, Li YJ. Can large language models provide secondary reliable opinion on treatment options for dermatological diseases? J Am Med Inform Assoc. 2024;31(6):1341–7. https://doi.org/10.1093/jamia/ocae067.
34. Wang C, Liu S, Yang H, Guo J, Wu Y, Liu J. Ethical considerations of using ChatGPT in health care. J Med Internet Res. 2023;25:e48009.
35. Wong M, Lim ZW, Pushpanathan K, Cheung CY, Wang YX, Chen D, Tham YC. Review of emerging trends and projection of future developments in large language models research in ophthalmology. Br J Ophthalmol. 2023. https://doi.org/10.1136/bjo-2023-324734.
36. Younis HA, Eisa TAE, Nasser M, Sahib TM, Noor AA, Alyasiri OM, Salisu S, Hayder IM, Younis HA. A systematic review and meta-analysis of artificial intelligence tools in medicine and healthcare: applications, considerations, limitations, motivation and challenges. Diagnostics. 2024;14(1):109. https://doi.org/10.3390/diagnostics14010109.
37. Yu H. The application and challenges of ChatGPT in educational transformation: new demands for teachers' roles. Heliyon. 2024;10(2):e24289. https://doi.org/10.1016/j.heliyon.2024.e24289.
38. Zack T, Lehman E, Suzgun M, et al. Assessing the potential of GPT-4 to perpetuate racial and gender biases in health care: a model evaluation study. Lancet Digit Health. 2024;6(1):e12–22. https://doi.org/10.1016/S2589-7500(23)00225-X.

Acknowledgements

During the preparation of this work, the authors used ChatGPT for language editing. After using this tool, the authors reviewed and edited the content as needed and took full responsibility for the content of the publication.

Funding

No funding has been obtained for this study.

Author information

Contributions

UI and YCJL designed the study concepts. AT and ARR were responsible for conducting the article search, selection, summarization, and synthesis of the eligible articles suitable for inclusion. UI, AT, and ARR prepared the first draft of the article. UI, LTJL, DG, and YCJL contributed significantly to revising and providing a critical review of the manuscript. All authors read and approved the final version of the manuscript.

Corresponding author

Correspondence to Yu-Chuan Jack Li.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Iqbal, U., Tanweer, A., Rahmanti, A.R. et al. Impact of large language model (ChatGPT) in healthcare: an umbrella review and evidence synthesis. J Biomed Sci 32, 45 (2025). https://doi.org/10.1186/s12929-025-01131-z
