Alexander J Root

Abstract

The rapid growth in the size of deep learning models strains the capabilities of dense computation paradigms. Leveraging sparse computation has become increasingly popular for training and deploying large-scale models, but existing deep learning frameworks lack extensive support for sparse operations. Moreover, existing approaches to optimizing sparse computations either require manual scheduling expertise or rely on exhaustive search that takes hours to days, both of which are incompatible with the interactive development essential to machine learning research. We present three algorithmic contributions that enable fast, automatic optimization of sparse tensor computations. First, we develop a heuristic-based loop ordering algorithm that avoids asymptotic performance cliffs while compiling in milliseconds rather than hours. Second, we introduce a tiling algorithm specialized for mixed sparse-dense computations that achieves cache performance comparable to hand-optimized kernels. Third, we present a format inference algorithm that automatically selects appropriate sparse tensor formats for intermediate and output tensors based on operation semantics. These algorithms are grounded in the mathematical properties of sparse tensor algebra, making them predictable and robust across diverse workloads. We implement these techniques in Scorch, a prototype sparse tensor compiler for PyTorch that demonstrates their practical effectiveness. With only minimal code changes, our approach achieves 1.05-5.80x speedups over PyTorch Sparse on end-to-end tasks including graph neural networks, sparse autoencoders, and sparse transformers, while maintaining the interactive compilation speeds essential for ML development.

Article

Code

Code is publicly available here.

BibTeX


@article{yan2026,
  title={Fast Autoscheduling for Sparse ML Frameworks},
  author={Bobby Yan and Alexander J Root and Trevor Gale and David Broman and Fredrik Kjolstad},
  journal={Proceedings of the 24th ACM/IEEE International Symposium on Code Generation and Optimization (CGO)},
  year={2026},
  month={February}
}