Article Abstract

Little attention has been paid to the development of human language technology for truly low-resource languages, i.e., languages with limited amounts of digitally available text data, such as Indigenous languages. However, it has been shown that pretrained multilingual models are able to perform cross-lingual transfer in a zero-shot setting even for low-resource languages that are unseen during pretraining. Yet, prior work evaluating performance on unseen languages has largely been limited to shallow token-level tasks. It remains unclear whether zero-shot learning of deeper semantic tasks is possible for unseen languages. To explore this question, we present AmericasNLI, a natural language inference dataset covering 10 Indigenous languages of the Americas. We conduct experiments with pretrained models, exploring zero-shot learning in combination with model adaptation. Furthermore, as AmericasNLI is a multiway parallel dataset, we use it to benchmark the performance of different machine translation models for those languages. Finally, using a standard transformer model, we explore translation-based approaches for natural language inference. We find that the zero-shot performance of pretrained models without adaptation is poor for all languages in AmericasNLI, but model adaptation via continued pretraining results in improvements. All machine translation models are rather weak, but, surprisingly, translation-based approaches to natural language inference outperform all other models on that task.
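The translation-based ("translate-test") approach the abstract describes can be sketched as a two-step pipeline: translate the premise and hypothesis into a high-resource language, then classify the translated pair with an existing NLI model. The sketch below is illustrative only; `translate` and `classify_nli` are hypothetical stand-ins (an identity function and a trivial heuristic), not the paper's MT systems or classifiers.

```python
# Minimal sketch of a translate-test NLI pipeline.
# Assumptions: `translate` and `classify_nli` are hypothetical placeholders
# for a real MT model and a fine-tuned NLI classifier.

NLI_LABELS = ["entailment", "neutral", "contradiction"]

def translate(text: str, src_lang: str, tgt_lang: str = "high-resource") -> str:
    """Hypothetical MT step: a real system would translate `text` from the
    low-resource source language into a high-resource pivot language.
    Here it is an identity stand-in."""
    return text

def classify_nli(premise: str, hypothesis: str) -> str:
    """Hypothetical NLI step: a real system would run a classifier trained
    on high-resource NLI data. Here: a trivial placeholder heuristic."""
    return "entailment" if premise == hypothesis else "neutral"

def translate_test_nli(premise: str, hypothesis: str, src_lang: str) -> str:
    # 1. Translate both sentences into the pivot language.
    p = translate(premise, src_lang)
    h = translate(hypothesis, src_lang)
    # 2. Classify the translated pair with the high-resource NLI model.
    return classify_nli(p, h)

print(translate_test_nli("sentence A", "sentence A", "xx"))  # entailment
print(translate_test_nli("sentence A", "sentence B", "xx"))  # neutral
```

The design point is that translation quality becomes the bottleneck: even with weak MT models, the abstract reports that this pipeline outperformed direct zero-shot transfer on AmericasNLI.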

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9755662
DOI: http://dx.doi.org/10.3389/frai.2022.995667
