Large language models have demonstrated impressive capabilities across many domains. However, they sometimes generate irrelevant or nonsensical text, or produce outputs that deviate from the provided input, a phenomenon commonly referred to as hallucination. To mitigate this issue, we introduce a novel decoding method that incorporates both factual and hallucination prompts (DFHP). It applies contrastive decoding to highlight the disparity in output probabilities between factual prompts and hallucination prompts. Experiments on both multiple-choice and text-generation tasks show that our approach significantly improves the factual accuracy of large language models without additional training. On the TruthfulQA dataset, DFHP improves the factual accuracy of the LLaMA model by an average of 6.4% across the 7B, 13B, 30B, and 65B versions. Its high factual accuracy makes it well suited to high-reliability tasks such as medical diagnosis and legal casework.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11548250 | PMC |
| http://dx.doi.org/10.3390/s24217097 | DOI Listing |
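The DFHP abstract above describes contrastive decoding between a factual prompt and a hallucination prompt. Below is a minimal sketch of that idea using Hugging Face transformers; the prompt wording, the weighting factor `alpha`, the plausibility cutoff `beta`, and the small stand-in model are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch of contrastive decoding between a "factual" and a "hallucination"
# prompt, in the spirit of the DFHP abstract above. Prompt wording, alpha,
# beta, and the stand-in model are assumptions for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; the paper evaluates LLaMA 7B-65B
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

factual_prefix = "Answer truthfully and factually: "        # assumed wording
halluc_prefix = "Answer with made-up, unfounded claims: "   # assumed wording
question = "What happens if you crack your knuckles a lot?"
alpha = 0.5   # weight on the hallucination distribution (assumed value)
beta = 0.1    # plausibility cutoff relative to the best factual token (assumed)

def next_token_logprobs(prompt: str) -> torch.Tensor:
    """Log-probabilities of the next token given `prompt`."""
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]
    return torch.log_softmax(logits, dim=-1)

generated = ""
for _ in range(30):
    lp_fact = next_token_logprobs(factual_prefix + question + generated)
    lp_hall = next_token_logprobs(halluc_prefix + question + generated)
    # Contrastive score: reward tokens the factual prompt prefers and the
    # hallucination prompt does not.
    score = lp_fact - alpha * lp_hall
    # Keep only tokens that remain plausible under the factual prompt.
    plausible = lp_fact >= lp_fact.max() + torch.log(torch.tensor(beta))
    score = score.masked_fill(~plausible, float("-inf"))
    next_id = int(score.argmax())
    generated += tok.decode([next_id])
    if next_id == tok.eos_token_id:
        break

print(generated)
```

At each step this greedily picks the token that the factual prompt favours and the hallucination prompt disfavours; the published method may combine and constrain the two distributions differently.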
Cureus
August 2025
Translational Medicine, Baptist Health South Florida, Miami, USA.
In this report, we present the case of a 39-year-old immunocompetent female with acute meningitis/encephalitis secondary to human herpesvirus 6 (HHV-6). Her initial symptoms included fever, hallucinations, and tremors, which prompted a broad diagnostic workup for infectious and autoimmune causes of encephalopathy. Her cerebrospinal fluid (CSF) initially tested negative for viral pathogens.
PLOS Digit Health
September 2025
Department of Dermatology, Stanford University, Stanford, California, United States of America.
Large Language Models (LLMs) are increasingly deployed in clinical settings for tasks ranging from patient communication to decision support. While these models exhibit race-based and binary-gender biases, anti-LGBTQIA+ bias remains understudied despite documented healthcare disparities affecting these populations. In this work, we evaluated the potential of LLMs to propagate anti-LGBTQIA+ medical bias and misinformation.
Front Artif Intell
August 2025
Department of Computer Science, Lahore Leads University, Lahore, Pakistan.
Introduction: Large language models and their applications have gained significant attention due to their strengths in natural language processing.
Methods: In this study, ChatGPT and DeepSeek are utilized as AI models to assist in diagnosis based on the responses generated to clinical questions. Furthermore, ChatGPT, Claude, and DeepSeek are used to analyze images to assess their potential diagnostic capabilities, applying the various sensitivity analyses described.
Front Neurosci
August 2025
Department of Neurology, National Neuroscience Institute, King Fahad Medical City, Riyadh, Saudi Arabia.
Background: Brucellosis, a zoonotic disease caused by Brucella species, remains endemic in regions such as Saudi Arabia. While neurobrucellosis is a serious complication, its presentation with parkinsonian features and psychiatric manifestations is exceedingly rare, with only five such cases reported in the literature. These case reports add to the limited data on atypical presentations of neurobrucellosis.
Liver Int
October 2025
Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany.
Background And Aims: Large language models (LLMs) can potentially support clinicians in their daily routine by providing easy access to information. Yet they are prone to stating incorrect facts and hallucinating when queried. Increasing the context by providing external databases while prompting LLMs may decrease the risk of misinformation.
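The abstract above proposes supplying external database content as additional context when prompting an LLM. A minimal sketch of that retrieval-augmented prompting pattern follows; the toy guideline snippets, the word-overlap retriever, and the prompt template are illustrative assumptions, not the study's actual setup.

```python
# Sketch of retrieval-augmented prompting: pull passages from an external
# knowledge base and prepend them as context before querying the LLM.
# The snippets, retriever, and prompt template are illustrative assumptions.

GUIDELINE_DB = [  # stand-in for an external clinical knowledge base
    "HCC surveillance in cirrhosis: ultrasound every 6 months, with or without AFP.",
    "LI-RADS 5 observations are considered definite hepatocellular carcinoma.",
    "Contrast-enhanced MRI is preferred for characterising indeterminate liver lesions.",
]

def retrieve(query: str, db: list[str], k: int = 2) -> list[str]:
    """Rank passages by naive word overlap with the query and keep the top k."""
    q_words = set(query.lower().split())
    ranked = sorted(db, key=lambda p: len(q_words & set(p.lower().split())), reverse=True)
    return ranked[:k]

def build_prompt(question: str, passages: list[str]) -> str:
    """Instruct the model to answer only from the retrieved context."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below. If the context is insufficient, say so.\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

question = "How often should patients with cirrhosis be screened for liver cancer?"
prompt = build_prompt(question, retrieve(question, GUIDELINE_DB))
print(prompt)  # this prompt would then be sent to the LLM under evaluation
```

In practice the retriever would query a real knowledge base, for example an embedding index over clinical guidelines, and the assembled prompt would be passed to the LLM being studied.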