A PHP Error was encountered

Severity: Warning

Message: file_get_contents(https://...@gmail.com&api_key=61f08fa0b96a73de8c900d749fcb997acc09&a=1): Failed to open stream: HTTP request failed! HTTP/1.1 429 Too Many Requests

Filename: helpers/my_audit_helper.php

Line Number: 197

Backtrace:

File: /var/www/html/application/helpers/my_audit_helper.php
Line: 197
Function: file_get_contents

File: /var/www/html/application/helpers/my_audit_helper.php
Line: 271
Function: simplexml_load_file_from_url

File: /var/www/html/application/helpers/my_audit_helper.php
Line: 3165
Function: getPubMedXML

File: /var/www/html/application/controllers/Detail.php
Line: 597
Function: pubMedSearch_Global

File: /var/www/html/application/controllers/Detail.php
Line: 511
Function: pubMedGetRelatedKeyword

File: /var/www/html/index.php
Line: 317
Function: require_once

Graphic association learning: Multimodal feature extraction and fusion of image and text using artificial intelligence techniques. | LitMetric

Category Ranking

98%

Total Visits

921

Avg Visit Duration

2 minutes

Citations

20

Article Abstract

With the advancement of technology in recent years, the application of artificial intelligence in real life has become more extensive. Graphic recognition is a hot spot in the current research of related technologies. It involves machines extracting key information from pictures and combining it with natural language processing for in-depth understanding. Existing methods still have obvious deficiencies in fine-grained recognition and deep understanding of contextual context. Addressing these issues to achieve high-quality image-text recognition is crucial for various application scenarios, such as accessibility technologies, content creation, and virtual assistants. To tackle this challenge, a novel approach is proposed that combines the Mask R-CNN, DCGAN, and ALBERT models. Specifically, the Mask R-CNN specializes in high-precision image recognition and segmentation, the DCGAN captures and generates nuanced features from images, and the ALBERT model is responsible for deep natural language processing and semantic understanding of this visual information. Experimental results clearly validate the superiority of this method. Compared to traditional image-text recognition techniques, the recognition accuracy is improved from 85.3% to 92.5%, and performance in contextual and situational understanding is enhanced. The advancement of this technology has far-reaching implications for research in machine vision and natural language processing and open new possibilities for practical applications.

Download full-text PDF

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11417159PMC
http://dx.doi.org/10.1016/j.heliyon.2024.e37167DOI Listing

Publication Analysis

Top Keywords

natural language
12
language processing
12
artificial intelligence
8
advancement technology
8
image-text recognition
8
mask r-cnn
8
recognition
6
graphic association
4
association learning
4
learning multimodal
4

Similar Publications