IMPACT 2023
13th International Workshop on Polyhedral Compilation Techniques
January 16th, 2023 | Toulouse, France
In conjunction with HiPEAC 2023, January 16-18, 2023
Polyhedral compilation techniques receive more attention than ever from both researchers and practitioners. Thanks to a unified formalism for parallelization and memory optimization, polyhedral techniques play a central role in a wide range of domain-specific languages and optimizing compilers, with high impact in competitive areas such as machine learning, scientific computing and media processing.
IMPACT is a unique event focusing on polyhedral compilation. The workshop brings together researchers and practitioners for a high-quality one-day event including a keynote, paper presentations, and work-in-progress discussions. The workshop has a well-established tradition of high-quality, extensive peer reviews, with a program committee balancing core polyhedral expertise with applications and broader computing systems research.
Program
Pierre Baudis Convention Centre
Room: Foyer St Exupéry
- Opening [Slides]
- Keynote: Compilation and Optimization of Static Control Flow Programs: Polyhedral Compilation at Large.
Louis-Noël Pouchet
[Abstract]The process of efficiently mapping an algorithm to a particular hardware is a constantly evolving challenge, not least due to the increasing complexity of both the target hardware and the software stack to be used in modern computing. To cope with these challenges, a broad range of optimizing compilation techniques are developed, with the aim to simplify programming for the user, while automating its optimization for performance. These developments have dramatically increased the role and expectations of optimizing compilers: we expect from them high performance of the application for a wide variety of execution contexts, which is critical for embedded and mainstream computing as well as the push to exascale computing at the high end. A key to success is the ability to capture the semantics of the original program, and execution behavior of the machine, as accurately as possible at compile-time. Static control flow programs, where every branch taken at runtime can be exactly modeled at compile-time, model a rich class of numerical algorithms, and have been the target of polyhedral compilation. In this talk, I will attempt to pave a way between programs, compilers and machines where polyhedral analysis and optimizations have delivered successful approaches to automatic optimization.[Bio]Louis-Noël Pouchet is a tenured Associate Professor of Computer Science at Colorado State University, with a Joint Appointment in the Electrical and Computer Engineering department at CSU. He received his Ph.D. in Computer Science in 2010 from University of Paris-Sud 11 and INRIA, France, was a postdoctoral fellow (2010-2012) and then Research Assistant Professor (2014-2016) in the Computer Science and Engineering department at Ohio State University before joining CSU in 2016, and was a Visiting Assistant Professor at University of California Los Angeles (2012-2014), as a member of the NSF Center for Domain-Specific Computing. He works on a variety of topics for high-performance computing, including optimizing compilers for CPUs, GPUs and FPGAs, domain-specific languages and optimizations, high-level synthesis, and distributed systems for large-scale scientific computing. Pouchet is a leading expert in polyhedral compilation, with contributions in modeling, scheduling and reconstructing static control flow programs, and proving their equivalence. His research is funded by the U.S. National Science Foundation, the U.S. Department of Energy, and Intel. He has co-authored 100+ publications; he is the recipient of an NSF CAREER grant, the William J. McCalla IEEE best paper award at ICCAD’15, and has received several best paper awards at top conferences. He is the main author of the PoCC polyhedral compiler, of the ROSE/PolyOpt software toolset (a complete polyhedral compilation framework for the LLNL ROSE compiler), and of the popular PolyBench/C benchmarking suite, which has been used in 400+ publications so far.[Slides]
- Maximal Atomic irRedundant Sets: a Usage-based Dataflow Partitioning Algorithm.
Corentin Ferry, Sanjay Rajopadhye and Steven Derrien.
[Abstract]Programs admitting a polyhedral representation can be transformed in many ways for locality and parallelism, notably loop tiling. Data flow analysis can then compute dependence relations between iterations and between tiles. When tiling is applied, certain iteration-wise dependences cross tile boundaries, creating the need for inter-tile data communication. Previous work computes it as the flow-in and flow-out sets of iteration tiles. In this paper, we propose a partitioning of the flow-out of a tile into the maximal sets of iterations that are entirely consumed and incur no redundant storage or transfer. The computation is described as an algorithm and performed on a selection of polyhedral programs. We then suggest possible applications of this decomposition in compression and memory allocation.[Slides] [Paper] [BibTeX]@inproceedings{ferry23-irredundant, title = {{Maximal Atomic irRedundant Sets: a Usage-based Dataflow Partitioning Algorithm}}, author = {Ferry, Corentin and Rajopadhye, Sanjay and Derrien, Steven}, booktitle = {13th International Workshop on Polyhedral Compilation Techniques (IMPACT 2023, in conjunction with HiPEAC 2023)}, year = 2023, url = {https://impact-workshop.org/impact2023/#ferry23-irredundant}, }
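For readers less familiar with the notion of flow-out, a minimal, hypothetical C sketch (not the partitioning algorithm of the paper) may help: a 1-D recurrence is tiled along i, and only the last value produced in each tile is consumed by a later tile, so that single iteration forms the tile's flow-out while every other value stays tile-local.

    /* Illustrative sketch only (not the paper's algorithm): a 1-D recurrence
     * tiled along i.  Within a tile, every value except the last is both
     * produced and consumed inside the tile; only A[hi-1] is read by the next
     * tile.  That single iteration is the tile's flow-out, i.e. the data that
     * inter-tile communication or storage must preserve. */
    #include <stdio.h>

    #define N  64
    #define TS 16          /* tile size (hypothetical) */

    int main(void) {
        double A[N];
        for (int i = 0; i < N; i++) A[i] = 1.0;

        for (int t = 0; t < N / TS; t++) {        /* tile loop */
            int lo = t * TS, hi = lo + TS;
            for (int i = lo; i < hi; i++)         /* point loop inside the tile */
                if (i > 0) A[i] = A[i] + A[i - 1];
            /* Flow-out of tile t: A[hi-1], consumed by iteration hi of tile
             * t+1; all other values produced here are tile-local. */
            printf("tile %d flow-out: A[%d] = %.1f\n", t, hi - 1, A[hi - 1]);
        }
        return 0;
    }

The paper goes further and partitions the flow-out of multi-dimensional tiles into maximal atomic sets according to which later tiles consume them; the sketch above only illustrates the baseline flow-out concept.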
- Algebraic Tiling.
Clément Rossetti and Philippe Clauss.
[Abstract]In this paper, we present ongoing work whose aim is to propose a new loop tiling technique where tiles are characterized by their volumes - the number of embedded iterations - instead of their sizes - the lengths of their edges. Tiles of quasi-equal volumes are dynamically generated while the tiled loops are running, whatever the original loop bounds are, which may be constant or depend linearly on surrounding loop iterators. The adopted strategy is to successively and hierarchically slice the iteration domain into parts of quasi-equal volumes, from the outermost to the innermost loop dimensions. Since the number of such slices can be exactly chosen, quasi-perfect load balancing is reached by choosing, for each parallel loop, the number of slices as equal to the number of parallel threads, or to a multiple of this number. Moreover, the approach avoids partial tiles by construction, thus yielding a perfect covering of the iteration domain minimizing the loop control cost. Finally, algebraic tiling makes dynamic scheduling of the parallel threads fairly purposeless for the handled parallel tiled loops.[Slides] [Paper] [BibTeX]@inproceedings{rossetti23-algebraic, title = {{Algebraic Tiling}}, author = {Rossetti, Clément and Clauss, Philippe}, booktitle = {13th International Workshop on Polyhedral Compilation Techniques (IMPACT 2023, in conjunction with HiPEAC 2023)}, year = 2023, url = {https://impact-workshop.org/impact2023/#rossetti23-algebraic}, }
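As a rough illustration of tiles characterized by volume rather than size (a sketch under simplified assumptions, not the authors' formulas), the outer dimension of the triangular domain { (i,j) : 0 <= j <= i < N } can be sliced into P parts of quasi-equal volume by inverting the cumulative iteration count i*(i+1)/2; N and P below are hypothetical parameters.

    /* Minimal sketch of the idea: slice the outer dimension of the triangular
     * domain { (i,j) : 0 <= j <= i < N } into P parts of quasi-equal volume.
     * The cumulative iteration count over rows 0..hi-1 is hi*(hi+1)/2, so
     * slice k ends at the smallest hi whose cumulative count reaches
     * (k+1)/P of the total. */
    #include <math.h>
    #include <stdio.h>

    #define N 1000
    #define P 4            /* number of slices, e.g. number of threads */

    int main(void) {
        double total = (double)N * (N + 1) / 2.0;
        int lo = 0;
        for (int k = 0; k < P; k++) {
            /* Solve hi*(hi+1)/2 >= (k+1)*total/P for the slice's upper bound. */
            double target = (k + 1) * total / P;
            int hi = (int)ceil((-1.0 + sqrt(1.0 + 8.0 * target)) / 2.0);
            if (hi > N) hi = N;
            long vol = 0;
            for (int i = lo; i < hi; i++)
                for (int j = 0; j <= i; j++)   /* the tiled loop body would run here */
                    vol++;
            printf("slice %d: rows [%d, %d), volume %ld\n", k, lo, hi, vol);
            lo = hi;
        }
        return 0;
    }

The paper's technique computes such bounds for general parametric polyhedral iteration domains and applies the slicing hierarchically across loop dimensions; the closed-form inversion above is only a numeric illustration for one fixed triangular domain.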
- Automatic Algorithm-Based Fault Tolerance (AABFT) of Stencil Computations.
Louis Narmour, Steven Derrien and Sanjay Rajopadhye.
[Abstract]In this work, we study fault tolerance of transient errors, such as those occurring due to cosmic radiation or hardware component aging and degradation, using Algorithm-Based Fault Tolerance (ABFT). ABFT methods typically work by adding some additional computation in the form of invariant checksums which, by definition, should not change as the program executes. By computing and monitoring checksums, it is possible to detect errors by observing differences in the checksum values. However, this is challenging for two key reasons: (1) it requires careful manual analysis of the input program, and (2) care must be taken to subsequently carry out the checksum computations efficiently enough for it to be worth it. Prior work has shown how to apply ABFT schemes with low overhead for a variety of input programs. Here, we focus on a subclass of programs called stencil applications, an important class of computations found widely in various scientific computing domains. We propose a new compilation scheme to automatically analyze and generate the checksum computations. To the best of our knowledge, this is the first work to do such a thing in a compiler. We show that low overhead code can be easily generated and provide a preliminary evaluation of the tradeoff between performance and effectiveness.[Slides] [Paper] [BibTeX]@inproceedings{narmour23-aabft, title = {{Automatic Algorithm-Based Fault Tolerance (AABFT) of Stencil Computations}}, author = {Narmour, Louis and Derrien, Steven and Rajopadhye, Sanjay}, booktitle = {13th International Workshop on Polyhedral Compilation Techniques (IMPACT 2023, in conjunction with HiPEAC 2023)}, year = 2023, url = {https://impact-workshop.org/impact2023/#narmour23-aabft}, }
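The flavor of an ABFT checksum for a linear stencil can be sketched as follows (an illustrative example only, not the compilation scheme proposed in the paper): because the stencil is linear, the sum of its outputs equals a fixed linear combination of sums of its inputs, which can be computed up front and compared against the observed output sum; a mismatch beyond rounding noise signals a transient error. The coefficients and tolerance below are hypothetical.

    /* Illustrative ABFT-style check for a linear 3-point stencil
     * B[i] = c0*A[i-1] + c1*A[i] + c2*A[i+1]. */
    #include <math.h>
    #include <stdio.h>

    #define N 1024

    int main(void) {
        double A[N], B[N];
        const double c0 = 0.25, c1 = 0.5, c2 = 0.25;
        for (int i = 0; i < N; i++) A[i] = sin((double)i);

        /* Predicted checksum, derived from the inputs before the stencil runs. */
        double sa0 = 0, sa1 = 0, sa2 = 0;
        for (int i = 1; i < N - 1; i++) {
            sa0 += A[i - 1];
            sa1 += A[i];
            sa2 += A[i + 1];
        }
        double predicted = c0 * sa0 + c1 * sa1 + c2 * sa2;

        /* The stencil itself. */
        for (int i = 1; i < N - 1; i++)
            B[i] = c0 * A[i - 1] + c1 * A[i] + c2 * A[i + 1];

        /* Observed checksum and error detection. */
        double observed = 0;
        for (int i = 1; i < N - 1; i++) observed += B[i];
        printf("checksum mismatch: %g (%s)\n", fabs(observed - predicted),
               fabs(observed - predicted) < 1e-9 * N ? "ok" : "possible fault");
        return 0;
    }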
Lunch (13:00 — 14:00)
Session 2: Performance optimization (14:00 — 14:50)
- Kernel Merging for Throughput-Oriented Accelerator Generation. [25min + 5min Q&A]
Nicolas Derumigny, Louis-Noël Pouchet and Fabrice Rastello.
[Abstract]Progress in High-Level Synthesis has enabled programmers to quickly explore design spaces for high-performance accelerators (e.g. to explore trade-offs between coarse-grain and fine-grain hardware parallelism). However, resource sharing opportunities are often under-exploited by HLS tools, especially in coarse-grain pipelined designs. In this work, we target the issue of generating multi-purpose yet efficient pipelined accelerators, demonstrating our approach on sequences of dense linear algebra computations. We develop polyhedral program analysis to generate the accelerator structure, as well as their profitability criteria. In particular, we leverage cross-loop compute unit reuse to create coarse-grain pipelined designs suited for batched execution of sequences of operations.[Slides] [Paper] [BibTeX]@inproceedings{derumigny23-merging, title = {{Kernel Merging for Throughput-Oriented Accelerator Generation}}, author = {Derumigny, Nicolas and Pouchet, Louis-Noël and Rastello, Fabrice}, booktitle = {13th International Workshop on Polyhedral Compilation Techniques (IMPACT 2023, in conjunction with HiPEAC 2023)}, year = 2023, url = {https://impact-workshop.org/impact2023/#derumigny23-merging}, }
- Superloop Scheduling: Loop Optimization via Direct Statement Instance Reordering. [15min + 5min Q&A]
Cedric Bastoul, Alain Ketterlin and Vincent Loechner.
[Abstract]Loop optimization in the polyhedral model is supported by the expressiveness of affine scheduling functions to model statement iteration ordering. Discovering the best scheduling remains a grand challenge reinforced by the need to build affine functions. Automatic techniques based on solving systems of affine constraints and/or composing affine scheduling primitives are limited either by the affine modeling or by their primitive set. In this paper we propose a radically different approach to affine scheduling construction called superloop scheduling. We leverage basic-block-level instruction reordering techniques and the capacity to agglomerate statement instances into "super" loops offered by modern trace analysis tools. This approach enables the deepest possible reasoning about instruction ordering and a global approach to loop optimization, e.g., where tiling and intra-tile optimization are considered along with other transformations.[Slides] [Paper] [BibTeX]@inproceedings{bastoul23-superloop, title = {{Superloop Scheduling: Loop Optimization via Direct Statement Instance Reordering}}, author = {Bastoul, Cedric and Ketterlin, Alain and Loechner, Vincent}, booktitle = {13th International Workshop on Polyhedral Compilation Techniques (IMPACT 2023, in conjunction with HiPEAC 2023)}, year = 2023, url = {https://impact-workshop.org/impact2023/#bastoul23-superloop}, }
Session 3: DSLs and MLIR (14:50 — 15:30) [15min + 5min Q&A]
- PolyLingual: a Programmable Polyhedral Scheduler.
Tom Hammer and Vincent Loechner.
[Abstract]In this short paper, we present PolyLingual, a DSL and compiler to C source code, for easy generation and fast prototyping of new polyhedral schedulers.[Slides] [Paper] [BibTeX]@inproceedings{hammer23-polylingual, title = {{PolyLingual: a Programmable Polyhedral Scheduler}}, author = {Hammer, Tom and Loechner, Vincent}, booktitle = {13th International Workshop on Polyhedral Compilation Techniques (IMPACT 2023, in conjunction with HiPEAC 2023)}, year = 2023, url = {https://impact-workshop.org/impact2023/#hammer23-polylingual}, }
- Building a Static HLS Pass with FPL.
Kunwar Shaanjeet Singh Grover, Arjun Pitchanathan, Julian Oppermann, Mike Urbach and Tobias Grosser.
[Abstract]Compiler infrastructure like LLVM/MLIR provides modular and extensible building blocks for reuse in compiler development. The previously existing polyhedral building blocks depended on external solvers. Due to the perceived cost of depending on such external tools, these building blocks were used only in fully polyhedral pipelines. For hybrid or one-off use cases, selective reimplementation has been preferred over taking the dependency; several reimplementations of sub-polyhedral solvers now exist in the ecosystem. With the introduction of a fast Presburger library (FPL) in MLIR, many core polyhedral building blocks are now available in-tree. As a result, it is now possible to use polyhedral techniques within the MLIR-based CIRCT project. Using the development of a static high-level synthesis (HLS) pass in CIRCT as a case study, we show the impact of FPL’s availability upstream.[Slides] [Paper] [BibTeX]@inproceedings{grover23-hlspass, title = {{Building a Static HLS Pass with FPL}}, author = {Grover, Kunwar Shaanjeet Singh and Pitchanathan, Arjun and Oppermann, Julian and Urbach, Mike and Grosser, Tobias}, booktitle = {13th International Workshop on Polyhedral Compilation Techniques (IMPACT 2023, in conjunction with HiPEAC 2023)}, year = 2023, url = {https://impact-workshop.org/impact2023/#grover23-hlspass}, }
Break (15:30 — 16:00)
Session 4: Benchmarks and Internals (16:00 — 16:40) [15min + 5min Q&A]
- GeMS: Towards Generating Millions of SCoPs.
Venkatakeerthy S, Nilesh Shah, Anilava Kundu, Shikhar Jain and Ramakrishna Upadrasta.
[Abstract]We propose an automatic polyhedral loop generator for Generating Millions of SCoPs/loops (GeMS). In this presentation, we first give a motivation and an outline of our generator. Using GeMS, we generate a labeled set of compute- and memory-bound kernels by imposing constraints on cache locality. Our initial loop generator is able to generate 24K kernels for a given configuration. We then illustrate the characteristics of the generated kernels by studying their IPC statistics. Our experimental evaluation involves using various input parameter sizes on two machines that have different cache configurations. We conclude with a listing of possible applications where GeMS could be useful. We also suggest ways to extend GeMS using various core polyhedral compilation libraries and data structures.[Slides] [Presentation-only]
- Which is the Best Farkas Multipliers Elimination Method?
Nassim Tchoulak, Adilla Susungi, Gianpietro Consolaro, Harenome Razanajato, Nelson Lossing, Zhen Zhang, Cedric Bastoul, Corinne Ancourt and Renwei Zhang.
[Abstract]The projection method of polyhedrons and the Fourier-Motzkin elimination method are existing approaches to eliminate Farkas multipliers before passing a constraint system to an ILP solver. This paper focuses on experimenting with variations of such an elimination process in order to identify which could be the best method. We assess the impact of such variations over requirements such as the effects on the number of constraints generated or the scheduling time when coupled with a variety of solvers.[Slides] [Presentation-only]
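For context, the Farkas multipliers in question come from the affine form of Farkas' lemma, the standard device for turning "this affine function is nonnegative over a dependence polyhedron" into linear constraints (standard formulation, not specific to this paper). For a nonempty polyhedron D = { x | A x + b >= 0 }:

    % Affine form of Farkas' lemma (standard statement)
    \varphi(x) \ge 0 \quad \forall x \in D
    \quad\Longleftrightarrow\quad
    \exists\, \lambda_0 \ge 0,\ \lambda \ge 0 \ \text{such that}\
    \varphi(x) \equiv \lambda_0 + \lambda^{\top} (A x + b).

In affine scheduling, \varphi is typically a dependence satisfaction expression such as \theta_T(t) - \theta_S(s) - 1 over the dependence polyhedron, and \lambda_0, \lambda are the extra unknowns (the Farkas multipliers) that the compared methods eliminate, by polyhedral projection or Fourier-Motzkin elimination, before the remaining system is handed to an ILP solver.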
Session 5: Modelling and Sparse matrices (16:40 — 17:30)
- Modelling linear algebra kernels as polyhedral volume operations. [25min + 5min Q&A]
Karl F. A. Friebel, Asif Ali Khan, Lorenzo Chelini and Jeronimo Castrillon.
[Abstract]Linear algebra de facto dominates an entire branch of current hard- and software design efforts focused on bringing faster and more efficient machine learning and signal processing kernels. For many years, compilers have leveraged the well-defined semantics of linear algebra operations for optimization, with lots of recent research around machine learning. Due to the structure of the operations, the polyhedral model is an ideal fit for reasoning about linear algebra programs. In this paper, we present a related model in which such programs are represented as sequences of operations on indexed volumes characterized by their element-wise dependencies. We show how this model opens a way towards efficiently implementing compound kernels by partitioning and specializing over index domains. We demonstrate how memory layout, partitioning (e.g., tiling) and compile-time sparsity are exploited using this model.[Slides] [Paper] [BibTeX]@inproceedings{friebel23-volume, title = {{Modelling linear algebra kernels as polyhedral volume operations}}, author = {Friebel, Karl F. A. and Khan, Asif Ali and Chelini, Lorenzo and Castrillon, Jeronimo}, booktitle = {13th International Workshop on Polyhedral Compilation Techniques (IMPACT 2023, in conjunction with HiPEAC 2023)}, year = 2023, url = {https://impact-workshop.org/impact2023/#friebel23-volume}, }
- Sparse Tetris: Reconstructing Sparse Matrices with Polyhedra. [15min + 5min Q&A]
Gabriel Rodriguez and Louis-Noel Pouchet.
[Abstract]Sparse data structures are ubiquitous in modern computing to represent sets of nonzero-valued regions within a dense coordinate system, typically when nonzero values are rare in the structure. This avoids storing useless zero values, and bypasses useless computations (e.g., multiplication by zero). Sparse tensors and matrices are widely used in physics simulation, graph analytics, or may occur after sparsification of a neural network weight matrix. A sparse representation from the set of nonzero coordinates in a data structure can be built by inspection and compression into multiple arrays, as exemplified with the Compressed Sparse Row (CSR) format. Such formats are not amenable to fast insertion/deletion of a nonzero element in the structure: resizing one or more arrays and shifting all of their data is typically required. There exist numerous scenarios where the sparse structure is mostly immutable over time (e.g., a road network), and only the values associated with the nonzero coordinates, that is their associated field(s), change during the computation.[Slides] [Presentation-only]
Community news and closing notes (17:30 — 17:40)
Call For Papers and Presentations
We welcome both theoretical and experimental papers and presentations on all aspects of polyhedral compilation. We also welcome submissions describing preliminary results, crazy new ideas, position papers, experience reports, education material, and available tools, with an aim to stimulate discussions, collaborations, and advances in the field. The following illustrate potential IMPACT papers:
- Thorough theoretical discussion of a preliminary idea with an attempt to place it in context but no experimental results.
- Experimental results comparing two or more existing ideas, followed by a detailed analysis.
- Presentation of an existing idea in a different way, including illustrations of how the idea applies to new use cases, code, architectures, etc. Attribution should be as clear as possible.
Topics of interest include, but are not limited to:
- program optimization (automatic parallelization, tiling, etc.);
- code generation;
- data/communication management on GPUs, accelerators and distributed systems;
- hardware/high-level synthesis for affine programs;
- static analysis;
- program verification;
- model checking;
- theoretical foundations of the polyhedral model;
- extensions of the polyhedral model;
- scalability and robustness of polyhedral compilation techniques.
Important Dates
Paper and Presentation-only Submissions
- Paper deadline: Thursday, November 24, 2022 (AoE); extended: Sunday, November 27, 2022 (AoE)
- Author notification: Monday, December 19, 2022
- Final version due: Thursday, January 12, 2023 (AoE)
- Workshop: Monday, January 16, 2023 (in conjunction with HiPEAC)
Submission
Paper Submissions
Paper submissions should not exceed 10 pages (recommended 8 pages), excluding references, formatted as per ACM SIGPLAN proceedings format. Short paper submissions, even with only 2 pages, are welcome as well.
Use version 1.54 or above of the following templates to prepare your manuscript:
https://www.acm.org/publications/proceedings-template
Make sure to use the "sigplan" subformat. Visit
http://sigplan.org/Resources/Author/
for further information on SIGPLAN manuscript formatting. NOTE: older versions of the article template use smaller fonts for submission; please double-check that you are using a recent style file (in particular, various LaTeX distributions ship older versions of the acmart style file, so please download the most recent one from ACM). Submissions should use PDF format and be printable on US Letter or A4 paper. Please submit your manuscripts through EasyChair, selecting "paper" as the topic:
https://easychair.org/conferences/?conf=impact2023
Proceedings will be posted online. If the final version of an accepted paper does not sufficiently address the comments of the reviewers, then it may be accompanied by a note from the program committee. Publication at IMPACT will not prevent later publication in conferences or journals of the presented work. However, simultaneous submission to IMPACT and another workshop, conference, or journal is often prohibited by the policies of those venues. For instance, a manuscript overlapping significantly with an IMPACT submission cannot be submitted to PLDI 2023 or any other overlapping SIGPLAN event. Please make sure that at least one of the authors can attend the workshop to give a 25-minute presentation if your work is accepted.
Presentation-only Submissions
Submit a summary of your presentation through EasyChair:
https://easychair.org/conferences/?conf=impact2023
Submissions have no formatting requirements but should use PDF format. Choose the "presentation-only" topic when submitting. The presentation should take no longer than 25 minutes. Please make sure that at least one of the authors can attend the workshop if your work is accepted.
Workshop Chairs
Michael Kruse, Argonne National Laboratory, USA
Ramakrishna Upadrasta, IIT Hyderabad, India
Program Committee
Lorenzo Chelini, Intel, Switzerland
Albert Cohen, Google, France
Alexandre Eichenberger, IBM, USA
Renato Golin, Intel, UK
Tobias Grosser, University of Edinburgh, UK
Vinod Grover, NVIDIA, USA
Guillaume Iooss, INRIA, France
Andreas Kloeckner, UIUC, USA
Martin Kong, Ohio State University, USA
Antoine Miné, Sorbonne Université, France
Diana Picus, AMD, Sweden
Arjun Pitchanathan, University of Edinburgh, UK
Louis-Noël Pouchet, Colorado State University, USA
Benoit Pradelle, Silexica, Germany
Sanjay Rajopadhye, Colorado State University, USA
P. Sadayappan, University of Utah, USA
Daniele Giuseppe Spampinato, Huawei, Switzerland
Sven Verdoolaege, Cerebras Systems, Belgium
David G. Wonnacott, Haverford College, USA
Jie Zhao, State Key Laboratory Zhengzhou, China
Contact Us
Please send any questions related to the IMPACT'23 workshop to Michael Kruse or Ramakrishna Upadrasta.