Sentence-level Relation Semantics Learning via Contrastive Sentences.

IEEE Trans Pattern Anal Mach Intell

Published: September 2025


Category Ranking: 98%
Total Visits: 921
Avg Visit Duration: 2 minutes
Citations: 20

Article Abstract

Sentence-level semantics plays a key role in language understanding. Subtle relations and dependencies exist among sentence-level samples, but they have yet to be exploited. For example, in relational triple extraction (RTE), existing models overemphasize extraction modules and ignore sentence-level semantics and relation information, which causes two problems: (1) the semantics fed to extraction modules are relation-unaware; (2) each sample is trained individually without considering inter-sample dependencies. To address these issues, we first propose the model-agnostic multi-relation detection task, which incorporates relation information into text encoding to generate relation-aware semantics. We then propose model-agnostic multi-relation supervised contrastive learning, which leverages relation-derived inter-sample dependencies as a supervision signal to learn discriminative semantics by drawing together or pushing apart sentence-level representations according to whether they share the same or similar relations. In addition, we design reverse label frequency weighting and hierarchical label embedding mechanisms to alleviate label imbalance and integrate the relation hierarchy. Our method can be applied to any RTE model; we conduct extensive experiments on five backbones by augmenting them with our method. Experimental results on four public benchmarks show that our method brings significant and consistent improvements to various backbones, and model analyses further verify its effectiveness.
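To make the contrastive objective concrete, below is a minimal PyTorch sketch of what multi-relation supervised contrastive learning with reverse label frequency weighting might look like. The positive-pair definition (two sentences sharing at least one relation) and the weighting scheme are our reading of the abstract, not the authors' released implementation; `multi_relation_supcon` and its arguments are hypothetical names.

```python
import torch
import torch.nn.functional as F

def multi_relation_supcon(z, y, freq, tau=0.1):
    """Sketch of a multi-relation supervised contrastive loss.

    z:    (B, d) float tensor of sentence-level embeddings
    y:    (B, R) multi-hot float tensor of relation labels
    freq: (R,)   float tensor of relation frequencies in the training set
    Samples sharing at least one relation are treated as positives, and
    pairs are weighted by reverse label frequency so that rare relations
    contribute more to the loss.
    """
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / tau                                 # pairwise cosine / temperature
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(eye, float('-inf'))             # exclude self-contrast

    w = 1.0 / freq.clamp(min=1.0)                         # reverse label frequency weights
    overlap = (y * w) @ y.t()                             # weighted count of shared relations
    pos = (overlap > 0) & ~eye                            # positive-pair mask

    log_prob = sim - sim.logsumexp(dim=1, keepdim=True)   # log-softmax over each row
    log_prob = log_prob.masked_fill(~pos, 0.0)            # keep positive pairs only
    denom = (overlap * pos).sum(dim=1).clamp(min=1e-9)
    return (-(overlap * log_prob).sum(dim=1) / denom).mean()
```

In this reading, a pair's contribution grows with the number of relations the two sentences share, scaled down for frequent relations, which matches the abstract's goal of drawing together samples with the same or similar relations while alleviating label imbalance.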

Source: http://dx.doi.org/10.1109/TPAMI.2025.3607794

Publication Analysis

Top Keywords

sentence-level semantics: 12
relation semantics: 8
extraction modules: 8
propose model-agnostic: 8
model-agnostic multi-relation: 8
semantics: 7
sentence-level: 5
sentence-level relation: 4
semantics learning: 4
learning contrastive: 4

Similar Publications

Given oral language's role in writing proficiency, and to address measurement issues in oral and written language, we trialed complementary scoring metrics from language sample analysis (LSA) with the sentence-level Picture Word Writing Curriculum-Based Measure (CBM-W). Using the Picture Word CBM-W samples of 123 students with writing difficulties, we investigated (1a) alternate-form reliability, (1b) criterion-related validity with existing Picture Word CBM-W metrics, (2) criterion-related validity with a standardized written expression measure, and (3) sensitivity to growth from fall to spring for LSA and Picture Word CBM-W scoring mechanisms. Pearson product-moment correlations, Spearman's correlations, and Bonferroni-corrected paired-samples t-tests revealed two promising LSA metrics with evidence of technical quality and sensitivity to growth as complementary scoring mechanisms for Picture Word CBM-W: mean length of T-unit in morphemes (MLTU-M) using the mean of two forms in the fall, and number of different words (NDW) using the mean of two forms in fall and spring.
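For readers unfamiliar with these two LSA metrics, a minimal Python sketch is shown below. Real LSA requires hand-coded morpheme counts and T-unit segmentation; the regex word splitting and the caller-supplied `morphemes_per_word` function here are simplifying assumptions, not the study's scoring procedure.

```python
import re

def ndw(sample: str) -> int:
    """Number of different words (NDW): count of unique word types."""
    return len(set(re.findall(r"[a-z']+", sample.lower())))

def mltu_m(t_units, morphemes_per_word) -> float:
    """Mean length of T-unit in morphemes (MLTU-M).

    t_units: the sample pre-segmented into T-units (a main clause plus
    any attached subordinate clauses). morphemes_per_word is supplied
    by the caller, since morpheme counting needs hand coding or a
    morphological analyser.
    """
    counts = [sum(morphemes_per_word(w)
                  for w in re.findall(r"[a-z']+", unit.lower()))
              for unit in t_units]
    return sum(counts) / len(counts) if counts else 0.0
```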

Infusing clinical knowledge into language models by subword optimisation and embedding initialisation.

Comput Biol Med

September 2025

University College London, Institute of Health Informatics, 222 Euston Rd., London, NW1 2DA, UK; School of Health and Wellbeing, University of Glasgow, UK.

Objective: This study introduces a novel tokenisation methodology, K-Tokeniser, to infuse clinical knowledge into language models for clinical text processing.

Methods: At the initialisation stage, K-Tokeniser populates global representations of tokens based on the semantic types of domain concepts (such as drugs or diseases), drawn either from a domain ontology such as the Unified Medical Language System or from the training data of the task-related corpus. At the training or inference stage, sentence-level localised context is used to choose the optimal global token representation, realising semantic-based tokenisation.
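A toy sketch of this selection step is given below. The cosine-similarity scoring, the mean-pooled candidate embedding, and the `choose_tokenisation` interface are our assumptions about what "choosing the optimal global token representation" from sentence-level context could look like; they are not the published K-Tokeniser code.

```python
import numpy as np

def choose_tokenisation(candidates, context_vec, embed):
    """Pick the candidate subword segmentation whose mean global-token
    embedding is most similar to the sentence-level context vector
    (a toy stand-in for K-Tokeniser's semantic-based tokenisation).

    candidates:  alternative segmentations, e.g.
                 [["nephro", "pathy"], ["ne", "ph", "ropathy"]]
    context_vec: (d,) embedding of the surrounding sentence
    embed:       function mapping a token to its (d,) global vector
    """
    def score(tokens):
        mean = np.stack([embed(t) for t in tokens]).mean(axis=0)
        return float(mean @ context_vec /
                     (np.linalg.norm(mean) * np.linalg.norm(context_vec) + 1e-9))
    return max(candidates, key=score)
```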

Objectives: This paper aims to accurately measure the government's attention to the health industry, which is crucial for understanding policy directions and resource allocation strategies.

Methods: To address the limitations of traditional word-frequency methods, such as restricted word segmentation and ambiguous terms, BERTopic (Bidirectional Encoder Representations from Transformers Topic Modeling) is applied to measure government attention at the sentence level. Rule matching against an ambiguity dictionary, expanded using the word2vec model, is then used to accurately identify unclassified topics.
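As a rough illustration of the dictionary expansion and rule matching, the sketch below grows a seed dictionary with gensim word2vec nearest neighbours and flags sentences containing any dictionary term. The seed terms, similarity threshold, and "contains any term" rule are illustrative assumptions rather than the paper's exact procedure.

```python
from gensim.models import Word2Vec

def expand_dictionary(model: Word2Vec, seeds, topn=10, min_sim=0.6):
    """Expand a seed term dictionary with word2vec nearest neighbours."""
    expanded = set(seeds)
    for term in seeds:
        if term in model.wv:                     # skip out-of-vocabulary seeds
            for word, sim in model.wv.most_similar(term, topn=topn):
                if sim >= min_sim:
                    expanded.add(word)
    return expanded

def mentions_health_industry(sentence_tokens, dictionary):
    """Rule matching: attribute a sentence to the health-industry topic
    if it contains any term from the (expanded) dictionary."""
    return any(tok in dictionary for tok in sentence_tokens)
```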

Patients have distinct information needs about their hospitalization that can be addressed using clinical evidence from electronic health records (EHRs). While artificial intelligence (AI) systems show promise in meeting these needs, robust datasets are needed to evaluate the factual accuracy and relevance of AI-generated responses. To our knowledge, no existing dataset captures patient information needs in the context of their EHRs.
