Category Ranking: 98%
Total Visits: 921
Avg Visit Duration: 2 minutes
Citations: 20

Article Abstract

Large language models have demonstrated impressive capabilities across many domains. However, they sometimes generate irrelevant or nonsensical text, or produce outputs that deviate from the provided input, a phenomenon commonly referred to as hallucination. To mitigate this issue, we introduce a novel decoding method that incorporates both factual and hallucination prompts (DFHP). It applies contrastive decoding to highlight the disparity in output probabilities between factual prompts and hallucination prompts. Experiments on both multiple-choice and text-generation tasks show that our approach significantly improves the factual accuracy of large language models without additional training. On the TruthfulQA dataset, DFHP improves the factual accuracy of LLaMA models by an average of 6.4% across the 7B, 13B, 30B, and 65B versions. Its high factual accuracy makes it well suited to high-reliability tasks such as medical diagnosis and legal analysis.
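
The abstract describes contrastive decoding over a factual prompt and a hallucination prompt. The following is a minimal sketch of that idea using Hugging Face Transformers; the prefix wordings, the checkpoint name, and the plain log-probability difference weighted by `alpha` are illustrative assumptions, not the paper's exact scoring rule.

```python
# Sketch of contrastive decoding with a factual and a hallucination prompt,
# in the spirit of DFHP. Prompts, checkpoint, and the scoring rule below are
# assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "huggyllama/llama-7b"  # assumed checkpoint; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

FACTUAL_PREFIX = "Answer the question truthfully and accurately.\n"      # assumed wording
HALLUC_PREFIX = "Answer the question with made-up, unreliable facts.\n"  # assumed wording


@torch.no_grad()
def dfhp_generate(question: str, max_new_tokens: int = 64, alpha: float = 1.0) -> str:
    """Greedy decoding that rewards tokens preferred under the factual context
    and penalizes tokens preferred under the hallucination context."""
    fact_ids = tokenizer(FACTUAL_PREFIX + question, return_tensors="pt").input_ids
    hall_ids = tokenizer(HALLUC_PREFIX + question, return_tensors="pt").input_ids
    generated = []
    for _ in range(max_new_tokens):
        fact_logits = model(fact_ids).logits[:, -1, :]
        hall_logits = model(hall_ids).logits[:, -1, :]
        # Contrastive score: factual log-probs minus scaled hallucination log-probs.
        score = (torch.log_softmax(fact_logits, dim=-1)
                 - alpha * torch.log_softmax(hall_logits, dim=-1))
        next_id = score.argmax(dim=-1, keepdim=True)
        if next_id.item() == tokenizer.eos_token_id:
            break
        generated.append(next_id.item())
        # Append the chosen token to both contexts before the next step.
        fact_ids = torch.cat([fact_ids, next_id], dim=-1)
        hall_ids = torch.cat([hall_ids, next_id], dim=-1)
    return tokenizer.decode(generated, skip_special_tokens=True)


print(dfhp_generate("What happens if you crack your knuckles a lot?"))
```

Note that each decoding step runs two forward passes (one per context), so this style of decoding roughly doubles inference cost relative to ordinary greedy decoding.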


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11548250
DOI: http://dx.doi.org/10.3390/s24217097

Publication Analysis

Top Keywords

hallucination prompts: 12
contrastive decoding: 8
factual hallucination: 8
large language: 8
language models: 8
improves factual: 8
factual accuracy: 8
factual: 5
improving factuality: 4
factuality contrastive: 4

Similar Publications

In this report, we present the case of a 39-year-old immunocompetent female with acute meningitis/encephalitis secondary to human herpesvirus 6 (HHV-6). Her initial symptoms included fever, hallucinations, and tremors, which prompted a broad diagnostic workup for infectious and autoimmune causes of encephalopathy. Her cerebrospinal fluid (CSF) initially tested negative for viral pathogens.


Evaluating anti-LGBTQIA+ medical bias in large language models.

PLOS Digit Health

September 2025

Department of Dermatology, Stanford University, Stanford, California, United States of America.

Large Language Models (LLMs) are increasingly deployed in clinical settings for tasks ranging from patient communication to decision support. While these models demonstrate race-based and binary gender biases, anti-LGBTQIA+ bias remains understudied despite documented healthcare disparities affecting these populations. In this work, we evaluated the potential of LLMs to propagate anti-LGBTQIA+ medical bias and misinformation.


Introduction: Large language models and their applications have gained significant attention due to their strengths in natural language processing.

Methods: In this study, ChatGPT and DeepSeek are utilized as AI models to assist in diagnosis based on the responses generated to clinical questions. Furthermore, ChatGPT, Claude, and DeepSeek are used to analyze images to assess their potential diagnostic capabilities, applying the various sensitivity analyses described.


Background: Brucellosis, a zoonotic disease caused by Brucella species, remains endemic in regions such as Saudi Arabia. While neurobrucellosis is a serious complication, its presentation with parkinsonian features and psychiatric manifestations is exceedingly rare, with only five such cases reported in the literature. These case reports add to the limited data on atypical presentations of neurobrucellosis.


Background And Aims: Large language models (LLMs) can potentially support clinicians in their daily routine by providing easy access to information. Yet, they are plagued by stating incorrect facts and hallucinating when queried. Increasing the context by providing external databases while prompting LLMs may decrease the risk of misinformation.
