Article Abstract

Inspired by the perceived saturation of the human visual system, this paper proposes a two-stream hybrid network that simulates binocular vision for salient object detection (SOD). Each stream combines an unsupervised and a supervised method in a two-branch module, modeling the interaction between human intuition and memory. The two-branch module processes visual information in parallel with bottom-up and top-down SOD methods and outputs two initial saliency maps. A polyharmonic neural network with random weights (PNNRW) then fuses the perception of the two branches and refines the salient objects by learning online from multi-source cues. Based on visual perceptual saturation, we select the optimal superpixel parameter for the unsupervised branch, locate sampling regions for the PNNRW, and construct a positive feedback loop that drives perception toward saturation after the perception fusion. By comparing the binary outputs of the two streams, the pixel annotations of predicted objects with a high degree of saturation can be taken as new training samples. The presented method thus constitutes a semi-supervised learning framework: the supervised branches only need to be pre-trained once initially, after which the system collects high-confidence training samples and trains new models by itself. Extensive experiments show that the new framework improves the performance of existing SOD methods and exceeds state-of-the-art methods on six popular benchmarks.
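The sample-selection step described in the abstract (comparing the binary outputs of the two streams and keeping only confident predictions) can be illustrated with a minimal sketch. The abstract does not define how the saturation degree is computed, so the code below substitutes intersection-over-union between the two binary masks as a stand-in agreement score. The function names, the 0.5 binarization threshold, and the 0.9 acceptance threshold are all hypothetical choices made for illustration, not the authors' method.

```python
from typing import List, Optional

Mask = List[List[int]]  # binary saliency map: 1 = salient, 0 = background


def binarize(saliency: List[List[float]], threshold: float = 0.5) -> Mask:
    """Threshold a saliency map with values in [0, 1] into a binary mask."""
    return [[1 if v >= threshold else 0 for v in row] for row in saliency]


def agreement_iou(a: Mask, b: Mask) -> float:
    """Intersection-over-union of two binary masks.

    Assumption: used here as a proxy for the paper's saturation degree,
    which the abstract does not specify.
    """
    inter = sum(x & y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    union = sum(x | y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return inter / union if union else 0.0


def select_pseudo_label(
    left: List[List[float]],
    right: List[List[float]],
    min_agreement: float = 0.9,  # hypothetical acceptance threshold
) -> Optional[Mask]:
    """Return a pseudo-label mask if the two streams agree, else None.

    The returned mask keeps only pixels that both streams mark as salient.
    """
    a, b = binarize(left), binarize(right)
    if agreement_iou(a, b) < min_agreement:
        return None
    return [[x & y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Two streams that agree on the top row being salient.
stream_left = [[0.9, 0.8], [0.1, 0.2]]
stream_right = [[0.95, 0.7], [0.05, 0.3]]
label = select_pseudo_label(stream_left, stream_right)

# A second stream that disagrees on one pixel is rejected.
stream_other = [[0.9, 0.1], [0.1, 0.1]]
rejected = select_pseudo_label(stream_left, stream_other)
```

In a self-training loop of the kind the abstract outlines, accepted masks would be added to the training set for the supervised branches, while rejected images would simply be skipped.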

Source
http://dx.doi.org/10.1109/TIP.2021.3074796

Publication Analysis

Top Keywords

salient object: 8
object detection: 8
visual perceptual: 8
perceptual saturation: 8
two-stream hybrid: 8
hybrid networks: 8
two-branch module: 8
training samples: 8
detection based: 4
visual: 4

Similar Publications

Camouflaged Object Segmentation (COS) faces significant challenges due to the scarcity of annotated data, where meticulous pixel-level annotation is both labor-intensive and costly, primarily due to the intricate object-background boundaries. Addressing the core question, "Can COS be effectively achieved in a zero-shot manner without manual annotations for any camouflaged object?", we propose an affirmative solution. We analyze the learned attention patterns for camouflaged objects and introduce a robust zero-shot COS framework.

The impact of scene inversion on early scene-selective activity.

Biol Psychol

September 2025

Department of Psychology, Wright State University, Dayton, OH.

Category-selectivity is a ubiquitous property of high-level visual cortex manifested in distinct cortical responses to faces, objects, and scenes. These signatures emerge early during visual processing, with each category sensitive to specific types of visual information at different time points. However, it is still not clear what information is extracted during early scene-selective processing, as scenes are rich, complex, and multidimensional stimuli.

Computational saliency map models have facilitated quantitative investigations into how bottom-up visual salience influences attention. Two primary approaches to modeling salience computation exist: one focuses on functional approximation, while the other explores neurobiological implementation. The former provides sufficient performance for applying saliency map models to eye-movement data analysis, whereas the latter offers hypotheses on how neuronal abnormalities affect visual salience.

When planning reach-to-grasp movements, individuals frequently face a tradeoff between biomechanical comfort (i.e., avoiding effortful actions) and "socio-emotional comfort" (i.

In blind individuals, language processing activates not only classic language networks, but also the "visual" cortex. What is represented in visual areas when blind individuals process language? Here, we show that area V5/MT in blind individuals, but not other visual areas, responds differently to spoken nouns and verbs. We further show that this effect is present for concrete nouns and verbs, but not abstract or pseudo nouns and verbs.
