Research Interests
My research focuses on developing and analyzing novel methods for finding structure in noisy, high-dimensional data using tools from probability, random matrix theory, graph theory, linear algebra, harmonic analysis, and machine learning. Modern data sets often have an enormous number of features, with each observation taking values in a high-dimensional Euclidean space, and yet the data contains an underlying structure that is low-dimensional. This low-dimensional structure may arise because all of the sample points lie on a low-dimensional subspace or manifold, or because the data is well separated into distinct clusters under some metric. My research involves representing this low-dimensional structure with an appropriate data model and then constructing algorithms that correctly extract it with high probability. This process requires careful analysis of noise, sampling, and the effects of the data dimension in order to quantify the regimes in which the structure can be successfully extracted. To improve the state of the art in data analysis, these algorithms must be computationally efficient as well as accurate. Thus an important component of my research is developing fast numerical implementations that minimize the dependence on the ambient dimension and are log-linear in the sample size. Although I am a mathematician by training, I also pursue interdisciplinary collaborations in which I apply tools from machine learning to domain-specific areas, including cybersecurity and molecular biology.
Publications
Bispectrum Unbiasing for Dilation Invariant Multi-Reference Alignment
IEEE Transactions on Signal Processing, Vol. 72, pp. 3761-3775, 2024. Pre-print on arxiv.
Fermat Distances: Metric Approximation, Spectral Convergence, and Clustering Algorithms
N García Trillos, A Little, D McKenzie, J Murphy. Journal of Machine Learning Research (JMLR), Vol. 25, No. 176, pp. 1-65, 2024. Pre-print on arxiv.
Linear Distance Metric Learning with Noisy Labels
M Alishahi, A Little, J Phillips. Journal of Machine Learning Research (JMLR), Vol. 25, No. 121, 2024. Pre-print on arxiv.
Clustering and Visualization of Single-Cell RNA-seq Data Using Path Metrics
A Manousidaki, A Little, Y Xie. PLOS Computational Biology, Vol. 20, No. 5, pp. e1012014, 2024. Pre-print on biorxiv.
On Generalizations of the Nonwindowed Scattering Transform
A Chua, M Hirn, and A Little. Applied and Computational Harmonic Analysis, Vol. 68, 2024. Pre-print on arxiv.
Largest Angle Path Distance for Multi-Manifold Clustering
H Chen, A Little, A Narayan. 2023 International Conference on Sampling Theory and Applications (SampTA), Yale University, 2023. IEEE link.
Power Spectrum Unbiasing for Dilation-Invariant Multi-reference Alignment
M Hirn, A Little. Journal of Fourier Analysis and Applications, Vol. 29, No. 4, 2023. ShareIt link.
An Analysis of Classical Multidimensional Scaling with Applications to Clustering
A Little, Y Xie, Q Sun. Information and Inference: A Journal of the IMA, Vol. 12, Issue 1, 2023. Pre-print on arxiv.
Taxonomy of Benchmarks in Graph Representation Learning
R Liu, S Cantürk, F Wenkel, D Sandfelder, D Kreuzer, A Little, S McGuire, L O'Bray, M Perlmutter, B Rieck, M Hirn, G Wolf, L Rampášek. Proceedings of the First Learning on Graphs Conference, PMLR, Vol. 198, 2022.
Balancing Geometry and Density: Path Distances on High-Dimensional Data
A Little, D McKenzie, J Murphy. SIAM Journal on Mathematics of Data Science (SIMODS), Vol. 4, No. 1, 2022. Pre-print on arxiv.
Wavelet invariants for statistically robust multi-reference alignment
M Hirn, A Little. Information and Inference: A Journal of the IMA, Vol. 10, Issue 4, 2021. Pre-print on arxiv.
Path-Based Spectral Clustering: Guarantees, Robustness to Outliers, and Fast Algorithms
A Little, M Maggioni, J Murphy. Journal of Machine Learning Research, Vol. 21, No. 6, 2020.
Feature Design for Protein Interface Hotspots using KFC2 and Rosetta
F Seeger, A Little, Y Chen, T Woolf, H Cheng and J Mitchell. Research in Data Science, pp. 177-197, Springer, 2019.
Translating Evidence into Practice: Interpreting Measures of Risk
L Hart, A Little. The Nurse Practitioner, Vol. 42, No. 2, 2017.
S-STEM: Mathematics, Engineering, and Physics Scholars
LA Clements, H Wang, A Little, WB Lane, and H Duong. American Society for Engineering Education (ASEE) Annual Conference & Exposition, 2017.
Multiscale geometric methods for data sets I: Multiscale SVD, noise and curvature
A Little, M Maggioni, L Rosasco. Applied and Computational Harmonic Analysis (ACHA), Vol. 43, Issue 3, 2017.
Spectral Clustering Technique for Classifying Network Attacks
A Little, X Mountrouidou, D Moseley. IEEE International Conference on Intelligent Data and Security (IDS), New York City, April 2016.
A Multiscale Spectral Method for Learning Number of Clusters
A Little, A Byrd. 14th IEEE International Conference on Machine Learning and Applications (ICMLA), Miami, Dec. 2015.
Multi-Resolution Geometric Analysis for Data in High Dimensions
G. Chen, A.V. Little, M. Maggioni. Excursions in Harmonic Analysis, Vol. 1, Editors T.D. Andrews et al., Birkhäuser, 2013.
Multiscale Geometric Methods for Estimating Intrinsic Dimension
A Little, M Maggioni, L Rosasco. 9th International Conference on Sampling Theory and Applications (SampTA), Singapore, May 2011.
Some Recent Advances in Multiscale Geometric Analysis of Point Clouds
G Chen, A Little, M Maggioni, L Rosasco. Wavelets and Multiscale Analysis: Theory and Applications, Editors J. Cohen and A. Zayed, Birkhäuser, 2011.
Multiscale Estimation of Intrinsic Dimensionality of Data Sets
A Little, Y Jung, M Maggioni. Association for the Advancement of Artificial Intelligence (AAAI) Fall Symposium (FS-09-04), 2009.
Estimation of Intrinsic Dimensionality of Samples from Noisy Low-dimensional Manifolds in High Dimensions with Multiscale SVD
J Lee, A Little, Y Jung, M Maggioni. 15th IEEE Workshop on Statistical Signal Processing (SSP), Cardiff, 2009.
Positive Solutions to a Diffusive Logistic Equation with Constant Yield Harvesting
T Ladner, A Little, K Marks, A Russell. Rose-Hulman Undergraduate Math Journal, Vol. 6, Issue 1, 2005.