Compiler Support for Sparse Tensor Convolutions
Peiming Liu, Google Research
Alexander J. Root, Stanford University
Anlun Xu, Google Cloud
Yinying Li, Google Research
Fredrik Kjolstad, Stanford University
Aart J.C. Bik, Google Research
Abstract
This paper extends prior work on sparse tensor algebra compilers to generate asymptotically efficient code for tensor expressions with affine subscript expressions. Our technique enables compiler support for a wide range of sparse computations, including sparse convolutions and pooling that are widely used in ML and graphics applications. We propose an approach that gradually rewrites compound subscript expressions to simple subscript expressions with loops that exploit the sparsity pattern of the input sparse tensors. As a result, the time complexity of the generated kernels is bounded by the number of stored elements and not by the shape of the tensors. Our approach seamlessly integrates into existing frameworks and is compatible with recent advances in compilers for sparse computations, including the flexibility to efficiently handle arbitrary combinations of different sparse tensor formats. The implementation of our algorithm is open source and upstreamed to the MLIR sparse compiler. Experimental results show that our method achieves a 19.5x speedup over the state-of-the-art compiler-based method at 99.9% sparsity. The generated sparse kernels start to outperform dense convolution implementations at about 80% sparsity.
Article
ACM Digital Library
Code
Code is publicly available here.
BibTeX
@inproceedings{liu2024spconv,
author = {Liu, Peiming and Root, Alexander J and Xu, Anlun and Li, Yinying and Kjolstad, Fredrik and Bik, Aart J.C.},
title = {Compiler Support for Sparse Tensor Convolutions},
year = {2024},
issue_date = {October 2024},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {8},
number = {OOPSLA2},
url = {https://doi.org/10.1145/3689721},
doi = {10.1145/3689721},
journal = {Proc. ACM Program. Lang.},
month = oct,
articleno = {281},
numpages = {29},
keywords = {code generation, convolution, iteration graphs, merge lattices, performance, sparse data structures, sparse tensor algebra, sparse tensors}
}