ResearchHub | Open Science Community

Xiaochun Cao

Author with expertise in Visual Object Tracking and Person Re-identification

Achievements

Cited Author

Key Stats

Upvotes received:

Publications:

(35% Open Access)

Cited by: 7,074

h-index:

i10-index:

256

Reputation

Biology

< 1%

Chemistry

< 1%

Economics

< 1%


Publications

Joint Optic Disc and Cup Segmentation Based on Multi-Label Deep Network and Polar Transformation

Huazhu Fu et al., Jan 9, 2018

Glaucoma is a chronic eye disease that leads to irreversible vision loss. The cup-to-disc ratio (CDR) plays an important role in the screening and diagnosis of glaucoma, so accurate, automatic segmentation of the optic disc (OD) and optic cup (OC) from fundus images is a fundamental task. Most existing methods segment them separately and rely on hand-crafted visual features from fundus images. In this paper, we propose a deep learning architecture, named M-Net, which solves OD and OC segmentation jointly in a one-stage multi-label system. M-Net mainly consists of a multi-scale input layer, a U-shape convolutional network, a side-output layer, and a multi-label loss function. The multi-scale input layer constructs an image pyramid to achieve receptive fields at multiple levels. The U-shape convolutional network serves as the main body of the network and learns a rich hierarchical representation, while the side-output layer acts as an early classifier that produces a companion local prediction map for each scale layer. Finally, a multi-label loss function is proposed to generate the final segmentation map. To further improve segmentation performance, we also introduce the polar transformation, which provides a representation of the original image in the polar coordinate system. Experiments show that M-Net achieves state-of-the-art OD and OC segmentation results on the ORIGA dataset. The proposed method also obtains satisfactory glaucoma screening performance with the calculated CDR value on both the ORIGA and SCES datasets.
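As a rough illustration of how a CDR value might be derived from OD and OC segmentation masks, here is a minimal Python sketch. It assumes the common vertical definition (vertical cup diameter divided by vertical disc diameter); the abstract does not say which definition the authors use, and the function names `vertical_extent` and `vertical_cdr` are hypothetical.

```python
import numpy as np

def vertical_extent(mask: np.ndarray) -> int:
    """Number of rows spanned by the foreground of a binary mask."""
    rows = np.flatnonzero(mask.any(axis=1))
    return 0 if rows.size == 0 else int(rows[-1] - rows[0] + 1)

def vertical_cdr(disc_mask: np.ndarray, cup_mask: np.ndarray) -> float:
    """Vertical cup-to-disc ratio (assumed definition, see lead-in)."""
    disc_height = vertical_extent(disc_mask)
    if disc_height == 0:
        raise ValueError("disc mask is empty")
    return vertical_extent(cup_mask) / disc_height

# Toy masks: the disc spans 8 rows, the cup spans 4 rows.
disc = np.zeros((10, 10), dtype=bool)
disc[1:9, 2:8] = True
cup = np.zeros((10, 10), dtype=bool)
cup[3:7, 4:6] = True
cdr = vertical_cdr(disc, cup)
```

In this toy case the ratio is 4 / 8; a screening pipeline would then compare the value against a clinically chosen threshold, which is outside the scope of this sketch.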

Artificial Intelligence

Ophthalmology

Paper


781


Diversity-induced Multi-view Subspace Clustering

Xiaochun Cao et al., Jun 1, 2015

In this paper, we focus on how to boost multi-view clustering by exploring the complementary information among multi-view features. A multi-view clustering framework, called Diversity-induced Multi-view Subspace Clustering (DiMSC), is proposed for this task. In our method, we extend existing subspace clustering into the multi-view domain and use the Hilbert-Schmidt Independence Criterion (HSIC) as a diversity term to explore the complementarity of multi-view representations; the resulting problem can be solved efficiently by alternating minimization. Compared to other multi-view clustering methods, the enhanced complementarity reduces the redundancy between the multi-view representations and improves the accuracy of the clustering results. Experiments on both image and video face clustering demonstrate that the proposed method outperforms the state-of-the-art methods.
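To make the HSIC diversity term more concrete, the sketch below computes the standard empirical HSIC estimate, tr(K H L H) / (n - 1)^2, from two kernel matrices. The linear (inner-product) kernel and the function names are assumptions chosen for illustration; the abstract does not state which kernel or normalization DiMSC uses.

```python
import numpy as np

def linear_kernel(X: np.ndarray) -> np.ndarray:
    """Inner-product kernel; rows of X are samples (assumed layout)."""
    return X @ X.T

def hsic(K: np.ndarray, L: np.ndarray) -> float:
    """Empirical HSIC between two n x n kernel matrices."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2

# Two toy one-dimensional "views" of four samples.
x = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([[1.0], [-1.0], [-1.0], [1.0]])  # uncorrelated with x after centering

hsic_xx = hsic(linear_kernel(x), linear_kernel(x))  # strong dependence
hsic_xy = hsic(linear_kernel(x), linear_kernel(y))  # no linear dependence
```

A smaller HSIC value indicates less dependence between two representations, which is why penalizing it can push the view-specific representations toward complementary information.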

Genetics

Artificial Intelligence

Paper


595


Generalized Latent Multi-View Subspace Clustering

Changqing Zhang et al., Oct 24, 2018

Subspace clustering is an effective method that has been successfully used in many applications. Here, we propose a novel subspace clustering model for multi-view data based on a latent representation, termed Latent Multi-View Subspace Clustering (LMSC). Unlike most existing single-view subspace clustering methods, which directly reconstruct data points using original features, our method explores underlying complementary information from multiple views and simultaneously seeks the underlying latent representation. Using the complementarity of multiple views, the latent representation depicts data more comprehensively than each individual view, making the subspace representation more accurate and robust. We propose two LMSC formulations: linear LMSC (lLMSC), based on linear correlations between the latent representation and each view, and generalized LMSC (gLMSC), based on neural networks to handle general relationships. The proposed method can be efficiently optimized under the Augmented Lagrangian Multiplier with Alternating Direction Minimization (ALM-ADM) framework. Extensive experiments on diverse datasets demonstrate the effectiveness of the proposed method.

Artificial Intelligence

Law

Paper


534


Low-Rank Tensor Constrained Multiview Subspace Clustering

Changqing Zhang et al., Dec 1, 2015

In this paper, we explore the problem of multiview subspace clustering. We introduce a low-rank tensor constraint to explore the complementary information from multiple views and, accordingly, establish a novel method called Low-rank Tensor constrained Multiview Subspace Clustering (LT-MSC). Our method regards the subspace representation matrices of different views as a tensor, which captures the high-order correlations underlying multiview data. The tensor is then equipped with a low-rank constraint, which models the cross information among different views, reduces the redundancy of the learned subspace representations, and improves clustering accuracy. The inference of the affinity matrix for clustering is formulated as a tensor nuclear norm minimization problem, constrained with an additional L2,1-norm regularizer and some linear equalities. The minimization problem is convex and thus can be solved efficiently by an Augmented Lagrangian Alternating Direction Minimization (AL-ADM) method. Extensive experimental results on four benchmark datasets show the effectiveness of the proposed LT-MSC method.
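The two regularizers named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the L2,1 norm is taken column-wise (sum of column Euclidean norms), and the tensor nuclear norm is taken as a weighted sum of the nuclear norms of the mode unfoldings, which is one common definition; the abstract does not specify either convention, and all function names here are hypothetical.

```python
import numpy as np

def l21_norm(E: np.ndarray) -> float:
    """Sum of the Euclidean norms of the columns of E (assumed convention)."""
    return float(np.linalg.norm(E, axis=0).sum())

def unfold(T: np.ndarray, mode: int) -> np.ndarray:
    """Mode-m unfolding: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def tensor_nuclear_norm(T: np.ndarray, weights=None) -> float:
    """Weighted sum of nuclear norms of all mode unfoldings (assumed definition)."""
    if weights is None:
        weights = [1.0 / T.ndim] * T.ndim
    return float(sum(w * np.linalg.norm(unfold(T, m), ord="nuc")
                     for m, w in enumerate(weights)))

# Toy error matrix: one nonzero column with norm 5.
E = np.array([[3.0, 0.0], [4.0, 0.0]])
e_norm = l21_norm(E)

# Stack two identical rank-1 "subspace representation" matrices into a tensor.
Z = np.ones((3, 3))
T = np.stack([Z, Z], axis=2)  # shape (3, 3, 2)
t_norm = tensor_nuclear_norm(T)
```

Because both views share the same rank-1 representation here, every unfolding is rank 1 and the tensor norm stays small; views with unrelated representations would raise it.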

Artificial Intelligence

Computer Vision And Pattern Recognition

Paper


455


Latent Multi-view Subspace Clustering

Changqing Zhang et al., Jul 1, 2017

In this paper, we propose a novel Latent Multi-view Subspace Clustering (LMSC) method, which clusters data points using a latent representation and simultaneously explores underlying complementary information from multiple views. Unlike most existing single-view subspace clustering methods, which reconstruct data points using original features, our method seeks the underlying latent representation and simultaneously performs data reconstruction based on the learned latent representation. With the complementarity of multiple views, the latent representation can depict the data more comprehensively than any single view, making the subspace representation more accurate and robust. The proposed method is intuitive and can be optimized efficiently using the Augmented Lagrangian Multiplier with Alternating Direction Minimization (ALM-ADM) algorithm. Extensive experiments on benchmark datasets validate the effectiveness of the proposed method.

Genetics

Artificial Intelligence

Paper


433


Cluster-Based Co-Saliency Detection

Huazhu Fu et al., Apr 25, 2013

Co-saliency is used to discover the common saliency across multiple images and is a relatively underexplored area. In this paper, we introduce a new cluster-based algorithm for co-saliency detection. Global correspondence between the multiple images is implicitly learned during the clustering process. Three visual attention cues (contrast, spatial, and corresponding) are devised to effectively measure the cluster saliency. The final co-saliency maps are generated by fusing the single-image saliency and multi-image saliency. The advantage of our method is that it is mostly bottom-up, without heavy learning, and is simple, general, efficient, and effective. Quantitative and qualitative experimental results on a variety of benchmark datasets demonstrate the advantages of the proposed method over competing co-saliency methods. On single images, our method also outperforms most of the state-of-the-art saliency detection methods. Furthermore, we apply the co-saliency method to four vision applications (co-segmentation, robust image distance, weakly supervised learning, and video foreground detection), which demonstrates the potential uses of the co-saliency map.

Philosophy

Artificial Intelligence

Paper


401


Low-Light Image Enhancement via a Deep Hybrid Network

Wenqi Ren et al., Apr 16, 2019

Camera sensors often fail to capture clear images or videos in poorly lit environments. In this paper, we propose a trainable hybrid network to enhance the visibility of such degraded images. The proposed network consists of two distinct streams that simultaneously learn the global content and the salient structures of the clear image in a unified network. More specifically, the content stream estimates the global content of the low-light input through an encoder-decoder network. However, the encoder in the content stream tends to lose some structural details. To remedy this, we propose a novel spatially variant recurrent neural network (RNN) as an edge stream to model edge details, with the guidance of another auto-encoder. The experimental results show that the proposed network performs favorably against state-of-the-art low-light image enhancement algorithms.

Artificial Intelligence

Computer Vision And Pattern Recognition

Paper


391


Deep People Counting in Extremely Dense Crowds

Chuan Wang et al., Oct 13, 2015

People counting in extremely dense crowds is an important step for video surveillance and anomaly warning. The problem is especially challenging due to the lack of training samples, severe occlusions, cluttered scenes, and variation in perspective. Existing methods either resort to auxiliary human and face detectors or use crowd density estimation as a surrogate. Most of them rely on hand-crafted features, such as SIFT and HOG, and thus are prone to fail when the density grows or training samples are scarce. In this paper, we propose an end-to-end deep convolutional neural network (CNN) regression model for counting people in images of extremely dense crowds. Our method has the following characteristics. First, it is a deep model built on a CNN that automatically learns effective features for counting. Second, to weaken the influence of background such as buildings and trees, we purposely enrich the training data with expanded negative samples whose ground-truth count is set to zero. These negative samples enhance robustness. Extensive experimental results show that our method outperforms the state-of-the-art methods in terms of the mean and variance of the absolute difference.

Artificial Intelligence

Biochemistry

Paper


370


High Capacity Reversible Data Hiding in Encrypted Images by Patch-Level Sparse Representation

Xiaochun Cao et al., Apr 30, 2015

Reversible data hiding in encrypted images has attracted considerable attention from the privacy, security, and protection communities. The success of previous methods in this area has shown that superior performance can be achieved by exploiting the redundancy within the image. Specifically, because the pixels in local structures (such as patches or regions) are strongly similar, they can be heavily compressed, which results in a large hiding room. In this paper, to better exploit the correlation between neighboring pixels, we propose to consider the patch-level sparse representation when hiding the secret data. The widely used sparse coding technique has demonstrated that a patch can be linearly represented by some atoms in an over-complete dictionary. Because sparse coding yields an approximate solution, the leading residual errors are encoded and self-embedded within the cover image. Furthermore, the learned dictionary is also embedded into the encrypted image. Thanks to the powerful representation of sparse coding, a large vacated room can be achieved, and thus the data hider can embed more secret messages in the encrypted image. Extensive experiments demonstrate that the proposed method significantly outperforms the state-of-the-art methods in terms of the embedding rate and the image quality.

Artificial Intelligence

Theoretical Computer Science

Paper


342


Image Deblurring via Extreme Channels Prior

Yanyang Yan et al., Jul 1, 2017

Camera motion introduces motion blur, affecting many computer vision tasks. The Dark Channel Prior (DCP) helps blind deblurring on scenes including natural, face, text, and low-illumination images. However, it has limitations and is less likely to support kernel estimation when bright pixels dominate the input image. We observe that the bright pixels in clear images are not likely to remain bright after the blur process. Based on this observation, we first illustrate this phenomenon mathematically and define it as the Bright Channel Prior (BCP). Then, we propose a technique for deblurring such images that improves the performance of existing motion deblurring algorithms. The proposed method takes advantage of both the Bright and Dark Channel Priors. This joint prior, named the extreme channels prior, is crucial for achieving efficient restoration by leveraging both the bright and dark information. Extensive experimental results demonstrate that the proposed method is more robust and performs favorably against state-of-the-art image deblurring methods on both synthesized and natural images.
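For readers unfamiliar with the two channels, the sketch below computes a dark channel (minimum over color channels and over a local patch) and a bright channel (the corresponding maximum) with NumPy. It assumes the usual patch-based definitions, an odd patch size, and an H x W x 3 float image; the function names are hypothetical, and this is not the authors' deblurring method, only the quantities the priors are stated on.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def _patch_reduce(plane: np.ndarray, patch: int, reduce_fn) -> np.ndarray:
    """Apply min or max over a (patch x patch) window centered on every pixel."""
    r = patch // 2
    # Edge padding repeats border pixels, so border results are unchanged.
    padded = np.pad(plane, r, mode="edge")
    windows = sliding_window_view(padded, (patch, patch))
    return reduce_fn(windows, axis=(-2, -1))

def dark_channel(img: np.ndarray, patch: int = 3) -> np.ndarray:
    """Minimum over color channels, then minimum over the local patch."""
    return _patch_reduce(img.min(axis=2), patch, np.min)

def bright_channel(img: np.ndarray, patch: int = 3) -> np.ndarray:
    """Maximum over color channels, then maximum over the local patch."""
    return _patch_reduce(img.max(axis=2), patch, np.max)

# Toy image: uniform gray except one pixel with a dark and a bright channel value.
img = np.full((4, 4, 3), 0.5)
img[0, 0] = [0.1, 0.9, 0.5]
dark = dark_channel(img)
bright = bright_channel(img)
```

Blurring averages neighboring pixels, which tends to raise the dark channel and lower the bright channel; that is the intuition behind using both as a joint prior.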

Artificial Intelligence

Media Technology

Paper


318
