With the growth of field-programmable gate array (FPGA) hardware resources, streaming DCNN accelerators leverage interconvolutional-layer parallelism to enhance throughput. In existing streaming accelerators, convolution nodes typically adopt layer- or column-based tiling methods, where the tiled input feature map (Ifmap) encompasses all input channels. This approach facilitates the comprehensive calculation of the output feature map (Ofmap) and maximizes interlayer parallelism. The computational granularity, defined in this study as the calculated rows or columns of Ofmap based on each tiled Ifmap data, significantly influences on-chip Ifmap storage and off-chip weight bandwidth (BW). The uniform application of computational granularity across all nodes inevitably impacts the memory-BW tradeoff. This article introduces a novel streaming accelerator with a hybrid computational granularity (HCG) scheme. Each node employs an independently optimized computational granularity, enabling a more flexible memory-BW tradeoff and more effective utilization of FPGA resources. However, this hybrid scheme can introduce pipeline bubbles and increase system pipeline complexity and control logic. To address these challenges, this article theoretically analyzes the impact of computational granularity on individual computing nodes and the overall system, aiming to establish a seamless system pipeline without pipeline bubbles and simplify system design. Furthermore, the article develops a hardware overhead model and employs a heuristic algorithm to optimize computational granularity for each computing node, achieving optimal memory-BW tradeoff and higher throughput. Finally, the effectiveness of the proposed design and optimization methodology is validated through the implementation of a 3-TOPS ResNet-18 accelerator on the Alveo U250 development board under BW constraints of 25, 20, and 15 GB/s. 
Additionally, accelerators for 4-TOPS VGG-16, 4-TOPS ResNet-34, 5-TOPS ResNet-50, 3-TOPS MobileNetV1, 4-TOPS ConvNeXt-T, and 4-TOPS ResNeXt-50 are implemented, surpassing the performance of most existing works.
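The memory-BW tradeoff described in the abstract can be illustrated with a toy model. The sketch below is not the article's hardware overhead model or its heuristic. It assumes 1 byte per element, an Ifmap buffer that holds exactly the input rows needed for one tile, and a full weight re-fetch from off-chip memory for every tile. The names `ConvNode` and `choose_granularity`, and the greedy rule itself, are hypothetical and serve only to show why a per-node granularity can be worth tuning.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class ConvNode:
    """Hypothetical description of one convolution node (not from the article)."""
    h_out: int       # output feature-map rows
    w_in: int        # input feature-map width
    c_in: int        # input channels
    c_out: int       # output channels
    k: int           # square kernel size
    stride: int = 1

    def ifmap_buffer_bytes(self, g: int) -> int:
        # Input rows needed to produce g output rows, over all input channels.
        rows = (g - 1) * self.stride + self.k
        return rows * self.w_in * self.c_in

    def weight_traffic_bytes(self, g: int) -> int:
        # Assumption: the whole weight set is re-fetched once per tile of g rows.
        weights = self.k * self.k * self.c_in * self.c_out
        return ceil(self.h_out / g) * weights


def choose_granularity(nodes, traffic_budget, mem_budget):
    """Greedy sketch: repeatedly grow the granularity of the node that saves
    the most weight traffic per extra buffer byte, within the memory budget."""
    g = [1] * len(nodes)

    def mem():
        return sum(n.ifmap_buffer_bytes(gi) for n, gi in zip(nodes, g))

    def traffic():
        return sum(n.weight_traffic_bytes(gi) for n, gi in zip(nodes, g))

    while traffic() > traffic_budget:
        best = None
        for i, n in enumerate(nodes):
            if g[i] >= n.h_out:
                continue  # this node already computes the whole Ofmap per tile
            extra = n.ifmap_buffer_bytes(g[i] + 1) - n.ifmap_buffer_bytes(g[i])
            if mem() + extra > mem_budget:
                continue  # growing this node would exceed on-chip memory
            saved = n.weight_traffic_bytes(g[i]) - n.weight_traffic_bytes(g[i] + 1)
            score = saved / extra
            if best is None or score > best[0]:
                best = (score, i)
        if best is None:
            break  # no node can grow further
        g[best[1]] += 1
    return g, mem(), traffic()


if __name__ == "__main__":
    layers = [
        ConvNode(h_out=56, w_in=56, c_in=64, c_out=64, k=3),
        ConvNode(h_out=7, w_in=7, c_in=512, c_out=512, k=3),
    ]
    print(choose_granularity(layers, traffic_budget=20_000_000, mem_budget=200_000))
```

In this toy model, a larger granularity increases the Ifmap buffer linearly while reducing the number of weight re-fetches, so early layers (large feature maps, few weights) and late layers (small feature maps, many weights) tend to prefer different settings.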
DOI: http://dx.doi.org/10.1109/TNNLS.2025.3587694
IEEE Trans Pattern Anal Mach Intell
September 2025
Camouflaged Object Segmentation (COS) faces significant challenges due to the scarcity of annotated data, where meticulous pixel-level annotation is both labor-intensive and costly, primarily due to the intricate object-background boundaries. Addressing the core question, "Can COS be effectively achieved in a zero-shot manner without manual annotations for any camouflaged object?", we propose an affirmative solution. We analyze the learned attention patterns for camouflaged objects and introduce a robust zero-shot COS framework.
Health Inf Sci Syst
December 2025
School of Information Science and Automation, Northeastern University, Shenyang, 110819 China.
Accurate prediction of drug-target interactions (DTIs) is crucial for improving the efficiency and success rate of drug development. Despite recent advancements, existing methods often fail to leverage interaction features at multiple granular levels, resulting in suboptimal data utilization and limited predictive performance. To address these challenges, we propose CF-DTI, a coarse-to-fine drug-target interaction model that integrates both coarse-grained and fine-grained features to enhance predictive accuracy.
Phys Life Rev
September 2025
Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy.
We present a novel computational model employing hierarchical active inference to simulate reading and eye movements. The model characterizes linguistic processing as inference over a hierarchical generative model, facilitating predictions and inferences at various levels of granularity, from syllables to sentences. Our approach combines the strengths of large language models for realistic textual predictions and active inference for guiding eye movements to informative textual information, enabling the testing of predictions.
IEEE J Biomed Health Inform
September 2025
The personalization of cancer treatment through drug combinations is critical for improving healthcare outcomes, increasing effectiveness, and reducing side effects. Computational methods have become increasingly important to prioritize synergistic drug pairs because of the vast search space of possible chemicals. However, existing approaches typically rely solely on global molecular structures, neglecting information exchange between different modality representations and interactions between molecular and fine-grained fragments, leading to limited understanding of drug synergy mechanisms for personalized treatment.
Cogn Neurodyn
December 2025
Department of Molecular Medicine, University of Rome Sapienza, Piazzale Aldo Moro 5, Rome, 00185 Lazio region Italy.
Person identification based on electroencephalogram (EEG) signals, so-called brainprint recognition, is a novel way to distinguish identities that offers high security. However, existing methods neglect the distribution difference between training and test data, and the large distance between projected features in the latent space degrades model performance on unseen-domain data. In this paper, we propose a channel-aggregation-based generalized contrastive learning framework, which combines multiple modules to overcome this challenge.