Scene text recognition (STR) methods have struggled to attain both high accuracy and fast inference speed. Auto-Regressive (AR)-based models perform recognition character by character, which yields superior accuracy but slow inference. Alternatively, Parallel Decoding (PD)-based models infer all characters in a single decoding pass, offering faster inference but generally lower accuracy. To realize the dual goals of "AR-level accuracy and PD-level speed", we propose a Context Perception Parallel Decoder (CPPD) that perceives the related context and predicts the character sequence in a PD pass. CPPD devises a character counting module to infer the occurrence count of each character, and a character ordering module to deduce the content-free reading order and positions. Meanwhile, the character prediction task associates the positions with characters. Together they build a comprehensive recognition context, which helps the decoder attend accurately to characters through the attention mechanism, thereby improving recognition accuracy. We construct a series of CPPD models and also plug the proposed modules into existing STR decoders. Experiments on both English and Chinese benchmarks demonstrate that the CPPD models achieve highly competitive accuracy while running much faster than existing leading models. Moreover, the plugged models achieve significant accuracy improvements.
DOI: http://dx.doi.org/10.1109/TPAMI.2025.3545453
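The abstract names three supervision signals (per-character occurrence counts, a content-free reading order, and the character prediction itself) but does not say how they are encoded. The following is only an illustrative sketch of what such targets might look like when derived from a ground-truth label. The function names, the padding value, and the binary position mask are assumptions made for this example, not the authors' implementation.

```python
from collections import Counter


def counting_target(label: str, charset: str) -> list[int]:
    """Occurrence count of each charset character in the label (assumed encoding)."""
    counts = Counter(label)
    return [counts.get(ch, 0) for ch in charset]


def ordering_target(label: str, max_len: int) -> list[int]:
    """Content-free position mask: 1 where a character exists, 0 for padding.

    It records only which reading positions are occupied, not which
    character sits there (an assumed reading of "content-free").
    """
    if len(label) > max_len:
        raise ValueError("label is longer than max_len")
    return [1] * len(label) + [0] * (max_len - len(label))


def prediction_target(label: str, charset: str, max_len: int, pad: int = -1) -> list[int]:
    """Character index at each position, padded to max_len (pad value is hypothetical)."""
    if len(label) > max_len:
        raise ValueError("label is longer than max_len")
    index = {ch: i for i, ch in enumerate(charset)}
    ids = [index[ch] for ch in label]
    return ids + [pad] * (max_len - len(ids))


if __name__ == "__main__":
    charset = "ehlo"  # toy alphabet for the example
    print(counting_target("hello", charset))       # counts for e, h, l, o
    print(ordering_target("hello", 8))             # occupied positions
    print(prediction_target("hello", charset, 8))  # character ids per position
```

In this toy encoding, the counting target says what characters appear and how often, the ordering target says where characters sit, and the prediction target ties the two together. This mirrors the abstract's description of associating positions with characters.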
Front Plant Sci
August 2025
College of Mathematics and Computer Science, Yan'an University, Yan'an, Shaanxi, China.
To address the challenge of real-time kiwifruit detection in trellised orchards, this paper proposes YOLOv10-Kiwi, a lightweight detection model optimized for resource-constrained devices. First, a more compact network is developed by adjusting the scaling factors of the YOLOv10n architecture. Second, to further reduce model complexity, a novel C2fDualHet module is proposed by integrating two consecutive Heterogeneous Kernel Convolution (HetConv) layers as a replacement for the traditional Bottleneck structure.
J Biopharm Stat
September 2025
Biostatistics and Research Decision Sciences, Merck & Co. Inc., North Wales, Pennsylvania, USA.
A randomized clinical trial with multiple experimental groups and one common control group is often used to speed up development to select the best experimental regimen or to increase the chance of success of clinical trials. Most of the time, multiple dose levels of an experimental drug or multiple combinations of one experimental drug with other drugs comprise multiple experimental groups. Because the experimental drug appears in multiple comparisons with a shared control group, multiple testing adjustments to control the family-wise type I error rate are needed.
Front Plant Sci
September 2025
College of Big Data, Yunnan Agricultural University, Kunming, China.
Introduction: Accurate identification of cherry maturity and precise detection of harvestable cherry contours are essential for the development of cherry-picking robots. However, occlusion, lighting variation, and blurriness in natural orchard environments present significant challenges for real-time semantic segmentation.
Methods: To address these issues, we propose a machine vision approach based on the PIDNet real-time semantic segmentation framework.
Int J Comput Assist Radiol Surg
September 2025
School of Life and Environmental Sciences, Guilin University of Electronic Technology, Guilin, China.
Objective: Cataract surgery is among the most frequently performed procedures worldwide. Accurate, real-time segmentation of the cornea and surgical instruments is vital for intraoperative guidance and surgical education. However, most existing deep learning-based segmentation methods depend on pixel-level annotations, which are time-consuming and limit practical deployment.
Front Med (Lausanne)
August 2025
Department of Orthopaedics, The First Affiliated Hospital of Soochow University, Suzhou, China.
Introduction: CT-based classification of distal ulnar-radius fractures requires precise detection of subtle features for surgical planning, yet existing methods struggle to balance accuracy with clinical efficiency. This study aims to develop a lightweight architecture that achieves accurate AO (Arbeitsgemeinschaft für Osteosynthesefragen) typing while maintaining real-time performance. AO typing is an internationally recognized fracture classification system based on fracture location, degree of joint surface involvement, and comminution, divided into three major categories: A (extra-articular), B (partially intra-articular), and C (completely intra-articular). In this task, the major challenges are capturing complex fracture morphologies without compromising detection speed and ensuring precise identification of small articular fragments critical for surgical decision-making.