Article Abstract

Under the framework of spectral clustering, the key to subspace clustering is building a similarity graph, which describes the neighborhood relations among data points. Some recent works build the graph using sparse, low-rank, and ℓ2-norm-based representation, and have achieved state-of-the-art performance. However, these methods suffer from two limitations. First, their time complexities are at least proportional to the cube of the data size, which makes them inefficient for large-scale problems. Second, they cannot cope with out-of-sample data that are not used to construct the similarity graph. To cluster each out-of-sample datum, the methods have to recalculate the similarity graph and the cluster membership of the whole data set. In this paper, we propose a unified framework that makes representation-based subspace clustering algorithms feasible for clustering both out-of-sample and large-scale data. Under our framework, the large-scale problem is tackled by converting it into an out-of-sample problem through sampling, clustering, coding, and classifying. Furthermore, we give an estimate of the error bounds by treating each subspace as a point in a hyperspace. Extensive experimental results on various benchmark data sets show that our methods outperform several recently proposed scalable methods in clustering large-scale data sets.
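
The sampling, clustering, coding, and classifying steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes an ℓ2-norm (ridge) representation for both graph construction and coding, a minimum-residual rule for classification, and a simple deterministic k-means inside the spectral step. The function names (`ridge_codes`, `spectral_labels`, `cluster_large_scale`) and the parameters `lam` and `n_samples` are hypothetical and chosen only for illustration.

```python
import numpy as np

def ridge_codes(D, Y, lam=1e-2):
    """Code the columns of Y over dictionary D with an l2-norm (ridge) penalty."""
    G = D.T @ D
    return np.linalg.solve(G + lam * np.eye(G.shape[0]), D.T @ Y)

def spectral_labels(W, k, n_iter=20):
    """Normalized spectral clustering; k-means uses a farthest-first init."""
    d = np.maximum(W.sum(axis=1), 1e-12)
    s = 1.0 / np.sqrt(d)
    L = np.eye(len(W)) - s[:, None] * W * s[None, :]
    _, vecs = np.linalg.eigh(L)          # eigenvalues come back in ascending order
    E = vecs[:, :k]
    E = E / np.maximum(np.linalg.norm(E, axis=1, keepdims=True), 1e-12)
    centers = [E[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((E - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(E[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = np.argmin(((E[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = E[labels == j].mean(axis=0)
    return labels

def cluster_large_scale(X, k, n_samples, lam=1e-2, seed=0):
    """X is (dim, n), one data point per column. Returns one label per point."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    # 1) Sampling: pick a small in-sample subset uniformly at random.
    idx = rng.choice(n, size=n_samples, replace=False)
    rest = np.setdiff1d(np.arange(n), idx)
    D = X[:, idx]
    # 2) Clustering: similarity graph from ridge self-representation
    #    (assumption: diagonal zeroed after solving), then spectral clustering.
    C = ridge_codes(D, D, lam)
    np.fill_diagonal(C, 0.0)
    W = np.abs(C) + np.abs(C).T
    in_labels = spectral_labels(W, k)
    # 3) Coding: represent each out-of-sample point over the in-sample data.
    A = ridge_codes(D, X[:, rest], lam)
    # 4) Classifying: assign to the cluster with the smallest reconstruction residual.
    residuals = np.stack([
        np.linalg.norm(X[:, rest] - D[:, in_labels == j] @ A[in_labels == j], axis=0)
        for j in range(k)
    ])
    labels = np.empty(n, dtype=int)
    labels[idx] = in_labels
    labels[rest] = residuals.argmin(axis=0)
    return labels

# Toy data: two orthogonal 2-D subspaces in R^4, 60 points each.
rng = np.random.default_rng(1)
basis = np.eye(4)
X = np.hstack([basis[:, :2] @ rng.standard_normal((2, 60)),
               basis[:, 2:] @ rng.standard_normal((2, 60))])
labels = cluster_large_scale(X, k=2, n_samples=30)
```

In this sketch only the `n_samples` × `n_samples` system is ever solved or eigendecomposed, so the cubic cost applies to the sample size rather than to the full data set; each remaining point needs only one coding step and one residual comparison.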

Source: http://dx.doi.org/10.1109/TNNLS.2015.2490080

Similar Publications

Cluster synchronization via graph Laplacian eigenvectors.

Chaos

September 2025

Department of Mathematics and Statistics, University of Vermont, Burlington, Vermont 05405, USA.

Almost equitable partitions (AEPs) have been linked to cluster synchronization in oscillatory systems, highlighting the importance of structure in collective network dynamics. We provide a general spectral framework that formalizes this connection, showing how eigenvectors associated with AEPs span a subspace of the Laplacian spectrum that governs partition-induced synchronization behavior. This offers a principled reduction of network dynamics, allowing clustered states to be understood in terms of quotient graph projections.


Single-cell RNA sequencing (scRNA-seq) offers significant opportunities to reveal cellular heterogeneity and diversity. Accurate cell type identification is critical for downstream analyses and understanding the mechanisms of heterogeneity. However, challenges arise from the high dimensionality, sparsity, and noise of scRNA-seq data.


The bipartite graph structure has shown promise in facilitating subspace clustering and spectral clustering algorithms for large-scale datasets. To avoid post-processing via k-means during bipartite graph partitioning, the constrained Laplacian rank (CLR) is often utilized for constraining the number of connected components (i.e.


The set of local modes and density ridge lines are important summary characteristics of the data-generating distribution. In this work, we focus on estimating local modes and density ridges from point cloud data in a product space combining two or more Euclidean and/or directional metric spaces. Specifically, our approach extends the (subspace constrained) mean shift algorithm to such product spaces, addressing potential challenges in the generalization process.


The identification of cell types by clustering single-cell RNA sequencing (scRNA-seq) data is a fundamental step in the downstream analysis of single-cell data. However, great challenges remain owing to the inherent characteristics of scRNA-seq data, including high dimensionality, high noise, and high sparsity. In this study, we propose a proximity enhanced graph convolutional sparse subspace clustering method, scPEGSSC, for scRNA-seq data.
