ResearchHub | Open Science Community

Mahantesh Halappanavar

Author with expertise in Role of Microglia in Neurological Disorders

Achievements

Open Access Advocate

Cited Author

Key Stats

Upvotes received: 0

Publications: 3 (100% Open Access)

Cited by: 16

h-index: 21 / i10-index: 50

Reputation

Biology: < 1%

Chemistry: < 1%

Economics: < 1%

Publications

FastPG: Fast clustering of millions of single cells

Tom Bodenheimer et al., Jun 20, 2020

Abstract: Current single-cell experiments can produce datasets with millions of cells. Unsupervised clustering can be used to identify cell populations in single-cell analysis but often leads to interminable computation time at this scale. This problem has previously been mitigated by subsampling cells, which greatly reduces accuracy. We built on the graph-based algorithm PhenoGraph and developed FastPG, which has the same cell assignment accuracy but is on average 27x faster in our tests. FastPG also has higher cell assignment accuracy than two other fast clustering methods, FlowSOM and PARC. Availability: FastPG is available at https://github.com/sararselitsky/FastPG

Artificial Intelligence

16

Paper

AGS-GNN: Attribute-guided Sampling for Graph Neural Networks

Siddhartha Das et al., Aug 24, 2024

We propose AGS-GNN, a novel attribute-guided sampling algorithm for Graph Neural Networks (GNNs). AGS-GNN exploits the node features and the connectivity structure of a graph while simultaneously adapting for both homophily and heterophily in graphs. In homophilic graphs, vertices of the same class are more likely to be adjacent, but vertices of different classes tend to be adjacent in heterophilic graphs. GNNs have been successfully applied to homophilic graphs, but applying them to heterophilic graphs remains challenging. The state-of-the-art GNNs for heterophilic graphs use the full neighborhood of a node instead of sampling it, and hence do not scale to large graphs and are not inductive. We develop dual-channel sampling techniques based on feature-similarity and feature-diversity to select subsets of neighbors for a node that capture adaptive information from homophilic and heterophilic neighborhoods. Currently, AGS-GNN is the only algorithm that explicitly controls homophily in the sampled subgraph through similar and diverse neighborhood samples. For diverse neighborhood sampling, we employ submodularity, a novel contribution in this context. We pre-compute the sampling distribution in parallel, achieving the desired scalability. Using an extensive dataset consisting of 35 small (< 100K nodes) and large (≥ 100K nodes) homophilic and heterophilic graphs, we demonstrate the superiority of AGS-GNN compared to the state-of-the-art approaches. AGS-GNN achieves test accuracy comparable to the best-performing heterophilic GNNs, even outperforming methods that use the entire graph for node classification. AGS-GNN converges faster than methods that sample neighborhoods randomly, and can be incorporated into existing GNN models that employ node or graph sampling.
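
The feature-similarity channel described in the abstract can be illustrated with a small sketch. This is not the AGS-GNN implementation: the function names `cosine` and `sample_similar_neighbors`, the use of cosine similarity, and the weighting scheme are all assumptions made for illustration. It only shows the general idea of drawing a subset of a node's neighbors with probability weighted by how similar their features are to the node's own.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sample_similar_neighbors(node, adj, feats, k, rng):
    """Pick up to k distinct neighbors of `node`, favoring feature-similar ones.

    Hypothetical helper: weights are cosine similarity shifted from [-1, 1]
    to a strictly positive range so every neighbor keeps a nonzero chance.
    """
    candidates = list(adj[node])  # copy, so the adjacency list is not mutated
    weights = [cosine(feats[node], feats[n]) + 1.0 + 1e-9 for n in candidates]
    chosen = []
    # Weighted sampling without replacement: draw one, remove it, repeat.
    while candidates and len(chosen) < k:
        i = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
        chosen.append(candidates.pop(i))
        weights.pop(i)
    return chosen
```

A diversity-oriented channel would replace the weighting step with a selection rule that rewards dissimilar neighbors; the abstract states that the authors use submodularity for that, which this sketch does not attempt.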

Artificial Intelligence

Theoretical Computer Science

0

Paper

FuseIM: Fusing Probabilistic Traversals for Influence Maximization on Exascale Systems

Reece Neff et al., May 30, 2024

Probabilistic breadth-first traversals (BPTs) are used in many network science and graph machine learning applications. In this paper, we are motivated by the application of BPTs in stochastic diffusion-based graph problems such as influence maximization. These applications heavily rely on BPTs to implement a Monte-Carlo sampling step for their approximations. Given the large sampling complexity, stochasticity of the diffusion process, and the inherent irregularity in real-world graph topologies, efficiently parallelizing these BPTs remains significantly challenging. In this paper, we present a new algorithm to fuse a massive number of concurrently executing BPTs with random starts on the input graph. Our algorithm is designed to fuse BPTs by combining separate probabilistic traversals into a unified frontier. To show the general applicability of the fused BPT technique, we have incorporated it into two state-of-the-art influence maximization parallel implementations (gIM and Ripples). Our experiments on up to 4K nodes of the OLCF Frontier supercomputer (32,768 GPUs and 196K CPU cores) show strong scaling behavior, and that fused BPTs can improve the performance of these implementations up to 182.13× (avg. 75.15×) and 359.86× (avg. 135.17×) for gIM and Ripples, respectively.
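
The idea of combining separate probabilistic traversals into a unified frontier can be sketched sequentially as follows. This is a minimal illustration under stated assumptions, not the FuseIM algorithm or anything taken from gIM or Ripples: the function name `fused_bpt`, the per-vertex bitmask encoding, and the single uniform edge probability `p` are all hypothetical choices made here.

```python
import random


def fused_bpt(adj, starts, p, rng):
    """Run len(starts) probabilistic BFS traversals over one shared frontier.

    Bit i of visited[v] is set once traversal i has reached vertex v.
    Each traversal crosses an edge independently with probability p
    (an assumed uniform probability, for illustration only).
    """
    visited = {v: 0 for v in adj}
    frontier = {}  # vertex -> bitmask of traversals that newly reached it
    for i, s in enumerate(starts):
        visited[s] |= 1 << i
        frontier[s] = frontier.get(s, 0) | (1 << i)

    while frontier:
        next_frontier = {}
        for u, mask in frontier.items():
            for v in adj[u]:
                # Only traversals that have not yet reached v try this edge.
                pending = mask & ~visited[v]
                passed = 0
                bit = 0
                while pending:
                    if pending & 1 and rng.random() < p:
                        passed |= 1 << bit
                    pending >>= 1
                    bit += 1
                if passed:
                    visited[v] |= passed
                    next_frontier[v] = next_frontier.get(v, 0) | passed
        frontier = next_frontier
    return visited
```

Because every traversal that is active at a vertex is handled in the same pass over that vertex's edges, the adjacency list is read once per frontier vertex instead of once per traversal. The sketch shows only that sharing of work; it says nothing about the GPU or distributed-memory design reported in the paper.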

Artificial Intelligence

Signal Processing

0

Paper