With multicore processors and deep memory hierarchies remaining the main source of performance gains, and emerging areas such as energy-aware computation requiring research breakthroughs, polyhedral compilation techniques receive more attention than ever from both researchers and practitioners. Thanks to a unified formalism for parallelization and memory optimization, these techniques now power domain-specific language compilers that yield the best performance in highly competitive areas such as machine learning and numerical simulation.
IMPACT is a unique event focusing exclusively on polyhedral compilation. It brings together researchers and practitioners for a high-quality one-day program including a keynote, selected paper presentations, and posters for work-in-progress discussions.
Opening and Keynote (10:00 — 11:00)
-
Welcome notes.
Benoit Meister (Qualcomm Technologies) and Riyadh Baghdadi (New York University Abu Dhabi).
-
Keynote: Automatic operator generation for deep learning frameworks in the all-scenario context: MindSpore/AKG architecture, features and challenges.
Cédric Bastoul (Huawei's Paris Research Center)
[Abstract]
Artificial intelligence frameworks such as MindSpore, TensorFlow or PyTorch aim at providing high-level abstractions to enable rapid development of new operators and at supporting various architectures to enable computation on the desired target platform. Translating programs written by AI scientists into efficient implementations on complex parallel hardware is an extreme compilation challenge. To simplify the problem, existing frameworks make strong use of optimized libraries which provide high-performance implementations of well-known operators. However, those libraries may miss global optimization opportunities, and scientists may design new operators not covered by existing libraries. In such scenarios, a more general compiler solution is required. This solution must generate high-performance code that exploits the target architecture and also be fast enough to allow researchers to quickly evaluate their new designs.
Polyhedral compilation was soon identified as a promising solution to that challenge because many artificial intelligence and machine learning codes fall within its application domain. This compilation approach raises the code to algebraic abstractions where powerful analyses and transformations are possible, enabling automatic parallelization and optimization. Unfortunately, current polyhedral compilation techniques are not yet satisfactory: they miss optimization opportunities on important target architectures and they do not scale up to codes implementing modern deep neural networks. In this talk, I will present AKG (Auto Kernel Generator), the automatic AI/DL polyhedral fused operator generator of the open-source framework MindSpore. I will discuss its architecture, its features and its main challenges in the context of an "all-scenario" framework.
[Bio]
Cédric Bastoul is Chief Scientist of Distributed and Parallel Software and head of the Central Software Institute at Huawei's Paris Research Center. He is on leave from his professor position at Strasbourg University, associated with the ICPS team at the ICube laboratory and the CAMUS team at Inria. He received his PhD in computer science from Pierre & Marie Curie University (now Sorbonne University), France, in 2004. His research interests are in compiler techniques for optimization and parallelization, with a strong emphasis on code restructuring using the polyhedral model. He created or maintains several well-known tools in the high-level compilation community, such as the code generator CLooG and the parametric integer programming solver PIP.
[Slides]
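For readers less familiar with what a fused operator generator produces, the toy Python sketch below (not taken from AKG or MindSpore; the operator chain and values are made up for illustration) contrasts an unfused elementwise chain, which materializes intermediate arrays, with a fused loop that keeps intermediates local. Automating this kind of fusion across complex operator graphs is the job of the generator discussed in the keynote.
```python
# Toy illustration of operator fusion (hypothetical example, not AKG/MindSpore code).
# Unfused: each operator is its own loop and materializes an intermediate array.
# Fused: one loop computes the whole chain per element, keeping intermediates local.

def unfused(x):
    t1 = [v * 2.0 for v in x]          # scale
    t2 = [v + 1.0 for v in t1]         # bias
    return [max(v, 0.0) for v in t2]   # ReLU

def fused(x):
    out = []
    for v in x:                        # single traversal, no intermediate arrays
        out.append(max(v * 2.0 + 1.0, 0.0))
    return out

if __name__ == "__main__":
    data = [-1.5, 0.0, 2.5]
    assert unfused(data) == fused(data)
```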
Break and Poster Session (11:00 — 11:30)
-
Bartók corridor, next to the Bartók room, on the catering area side.
Session 1: Full Paper Session (11:30 — 13:00) [25 min + 5 min Q&A]
-
Higher-Level Synthesis: experimenting with MLIR polyhedral representations for accelerator design.
S. Curzel, S. Jovic, M. Fiorito, A. Tumeo and F. Ferrandi.
[Abstract]
High-Level Synthesis (HLS) tools simplify the design of hardware accelerators by automatically generating Verilog/VHDL code starting from a general-purpose software programming language. They include a wide range of optimization techniques in the process, most of them performed on a low-level intermediate representation of the code. We believe that introducing optimizations at a higher level of abstraction could significantly improve the results of the automated design process; in particular, polyhedral techniques for the manipulation of loops could have a significant impact on the generated accelerators. This paper focuses on loop pipelining, which we use as a case study to explore the introduction of compiler-based transformations on top of an existing HLS process. We leverage the Multi-Level Intermediate Representation (MLIR) and evaluate the impact of the proposed transformations on an open-source HLS tool that would not otherwise support loop pipelining. Preliminary results confirm that our implementation increased the performance of the generated accelerators, without any modification to the underlying HLS tool.
[Paper]
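As a rough first-order intuition for why loop pipelining pays off (a back-of-the-envelope model, not the analysis used in the paper or by any particular HLS tool), a pipelined loop issues a new iteration every II cycles instead of waiting for the previous iteration to finish:
```python
# Rough first-order latency model for loop pipelining (illustrative only,
# not the cost model of the paper or of any specific HLS tool).

def unpipelined_cycles(n_iters: int, depth: int) -> int:
    # Each iteration runs to completion before the next one starts.
    return n_iters * depth

def pipelined_cycles(n_iters: int, depth: int, ii: int) -> int:
    # A new iteration is issued every `ii` cycles; the last iteration
    # still needs `depth` cycles to drain the pipeline.
    return depth + (n_iters - 1) * ii

if __name__ == "__main__":
    n, depth, ii = 1000, 8, 1
    print(unpipelined_cycles(n, depth))    # 8000 cycles
    print(pipelined_cycles(n, depth, ii))  # 1007 cycles
```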
-
Polyhedral Binary Decision Diagrams for Representing Non-Convex Polyhedra.
S. Kulkarni and M. Kruse.
[Abstract]
Vector sets arising in application domains such as polyhedral model optimizations are not necessarily convex, thereby ruling out representation by a single polyhedral set. In such scenarios, a commonly employed representation is using multiple polyhedral sets whose union is the vector set to be represented. While this suffices to represent all relevant vector sets for most applications, it is not necessarily efficient. Despite being simple to construct, these representations may require an exponential number of polyhedral sets to entirely cover non-convex vector sets.
In this work we introduce the Polyhedral Binary Decision Diagram (PBDD), inspired by binary decision diagrams (BDDs) and quasi-affine selection trees, as an alternative representation of non-convex vector sets. We implement a proof of concept in Python and show that, when supported by simple structural simplification operations from the BDD literature, the PBDD avoids the exponential blow-up arising from compiling a simple program in Polly—in fact, the representation of a set's complement can be computed in constant time. Several case studies compare the scaling behavior of isl's union-of-convex-polyhedra implementation and our PBDD implementation with and without simplifications.
[Paper]
[Slides]
[Video]
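For intuition only (a minimal sketch, not the authors' Python prototype), a PBDD-style structure can be pictured as a decision diagram whose internal nodes test affine constraints and whose leaves are membership verdicts; complementation then reduces to flipping the verdicts, which can be done lazily in constant time via a negation flag:
```python
# Minimal decision-diagram-style sketch for non-convex integer sets.
# Hypothetical illustration only; the PBDD described in the paper differs in detail.

class Node:
    """Internal node: test an affine constraint coeffs·x + const >= 0 and branch."""
    def __init__(self, coeffs, const, if_true, if_false):
        self.coeffs, self.const = coeffs, const
        self.if_true, self.if_false = if_true, if_false

    def contains(self, point):
        holds = sum(c * p for c, p in zip(self.coeffs, point)) + self.const >= 0
        child = self.if_true if holds else self.if_false
        return child if isinstance(child, bool) else child.contains(point)

class Diagram:
    """Wrapper with a negation flag so complementation is O(1)."""
    def __init__(self, root, negated=False):
        self.root, self.negated = root, negated

    def contains(self, point):
        return self.root.contains(point) != self.negated

    def complement(self):
        return Diagram(self.root, not self.negated)  # constant time

# {x | x >= 0} union {x | x <= -10}: a non-convex set in a single diagram.
s = Diagram(Node((1,), 0, True, Node((-1,), -10, True, False)))
assert s.contains((3,)) and s.contains((-12,)) and not s.contains((-5,))
assert s.complement().contains((-5,))
```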
-
Polyhedral Scheduling and Relaxation of Synchronous Reactive Systems.
G. Iooss, A. Cohen, D. Potop-Butucaru, M. Pouzet, V. Bregeon, J. Souyris and Ph. Baufreton.
[Abstract]
The design and implementation of reactive, hard real-time systems involves modeling and generating efficient code for the integration of harmonic multi-periodic tasks. The simple principles of synchronous reactive programming met great scientific and engineering success in the area. A synchronous program orchestrates concurrent computations. It does so while maintaining composability, modularity, functional determinism and real-time execution guarantees. In the case of hard real-time systems, a reactive control program is composed of multi-periodic tasks related through integral ratios.
This paper presents a language and optimizing compiler to implement large reactive control systems composed of multi-periodic, synchronous reactive tasks. The same language is used to program a complete control system, from the finest-grained computations to task-level integration, while generating efficient code satisfying real-time constraints. It is evaluated on two real-world applications.
[Paper]
[Slides]
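A small illustration of what "harmonic multi-periodic tasks related through integral ratios" means (the task names and periods below are made up, not from the paper): every period divides a common hyperperiod, so a static schedule only has to cover that hyperperiod.
```python
# Harmonic multi-periodic task set: each period divides the next,
# so a static schedule only has to cover one hyperperiod.
# Task names and periods are invented for illustration.
from math import lcm

periods_ms = {"fast_control": 5, "guidance": 20, "monitoring": 100}

hyperperiod = lcm(*periods_ms.values())                    # 100 ms
instances = {t: hyperperiod // p for t, p in periods_ms.items()}

print(hyperperiod)   # 100
print(instances)     # {'fast_control': 20, 'guidance': 5, 'monitoring': 1}
```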
Lunch (13:00 — 14:00)
Session 2: Short Paper Session - Mapping Techniques (14:00 — 14:45) [10 min + 5 min Q&A]
-
Affine Multibanking for High-Level Synthesis.
I. Lasfar, Ch. Alias, M. Moy, R. Neveu, A. Carré.
[Abstract]
In the last decade, FPGAs appeared as a credible alternative for big data and high-performance computing applications. However, programming an FPGA is tedious: given a function to implement, the circuit must be designed from scratch by the developer. In this short paper, we address the compilation of data placement under parallelism and resource constraints. We propose an HLS algorithm able to partition the data across memory banks, so that parallel accesses will target distinct banks to avoid data transfer serialization. Our algorithm is able to reduce the number of banks and the maximal bank size. Preliminary evaluation shows promising results.
[Paper]
[Slides]
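To give a feel for the problem (a generic illustration, not the paper's algorithm), an affine bank mapping assigns each array cell to a bank via an affine-modular function; a good mapping sends the cells accessed in the same parallel step to pairwise distinct banks.
```python
# Generic affine multibanking illustration (not the algorithm from the paper).
# bank(i, j) = (i + j) mod NB: consecutive cells of a row (or column) land in
# distinct banks, so NB elements read in parallel avoid bank conflicts.

NB = 4  # number of memory banks (arbitrary choice for the example)

def bank(i: int, j: int) -> int:
    return (i + j) % NB

def conflict_free(accesses) -> bool:
    banks = [bank(i, j) for i, j in accesses]
    return len(banks) == len(set(banks))

# NB consecutive cells of one row, accessed in the same parallel step:
row_accesses = [(2, j) for j in range(NB)]
assert conflict_free(row_accesses)
```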
-
Progress Report: A Deep Learning Guided Exploration of Affine Unimodular Loop Transformations.
M. Merouani, Kh. Boudaoud, I. Aouadj, N. Tchoulak, F. Benbouzid-Sitayeb, K. Benatchba, H. Leather and R. Baghdadi.
[Abstract]
In this paper, we present a work in progress about a deep learning based approach for automatic code optimization in polyhedral compilers. The proposed technique explores combinations of affine and non-affine loop transformations to find the sequence of transformations that minimizes the execution time of a given program. This exploration is guided by a deep learning based cost model that evaluates the speedup that each sequence of transformations would yield. Preliminary results show that the proposed techniques achieve a 2.35x geometric mean speedup over state-of-the-art polyhedral compilers (Pluto).
[Paper]
[Slides]
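The general shape of such an exploration, sketched below schematically (the transformation names, beam width and cost model are placeholders and do not reflect the system described in the paper), is a search over transformation sequences in which a learned cost model, rather than actual execution, ranks candidates.
```python
# Schematic cost-model-guided search over loop-transformation sequences.
# Transformation names, beam width and cost model are placeholders only.
import random

TRANSFORMS = ["interchange", "skew", "tile", "unroll", "parallelize"]

def predicted_speedup(sequence):
    # Stand-in for a learned cost model: returns an estimated speedup.
    random.seed(hash(tuple(sequence)) & 0xFFFF)
    return random.uniform(0.5, 4.0)

def beam_search(depth=3, beam_width=4):
    beam = [[]]  # start from the empty transformation sequence
    for _ in range(depth):
        candidates = [seq + [t] for seq in beam for t in TRANSFORMS]
        candidates.sort(key=predicted_speedup, reverse=True)
        beam = candidates[:beam_width]
    return beam[0], predicted_speedup(beam[0])

best, score = beam_search()
print(best, round(score, 2))
```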
-
A pipeline pattern detection technique in Polly.
D. Talaashrafi, J. Doerfert and M. Moreno-Maza
[Abstract]
The polyhedral model has repeatedly shown how it facilitates various loop transformations, including loop parallelization, loop tiling, and software pipelining. However, parallelism is almost exclusively exploited on a per-loop basis without much work on detecting cross-loop parallelization opportunities. While many problems can be scheduled such that loop dimensions are dependence-free, the resulting loop parallelism does not necessarily maximize concurrent execution, especially not for unbalanced problems.
In this work, we introduce a polyhedral-model-based analysis and scheduling algorithm that exposes and utilizes cross-loop parallelization through tasking. This work exploits pipeline patterns between iterations in different loop nests, and it is well suited to handle imbalanced iterations.
Our LLVM/Polly-based prototype performs schedule modifications and code generation targeting a minimal, language-agnostic tasking layer. We present results using an implementation of this API with the OpenMP task construct. For different computation patterns, we achieved speed-ups of up to 3.5× on a quad-core processor while LLVM/Polly alone fails to exploit the parallelism.
[Paper]
[Slides]
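To illustrate the kind of cross-loop pipelining targeted here (a standard-library Python sketch with thread-based tasks, not the LLVM/Polly + OpenMP implementation from the paper), each consumer iteration below depends only on the matching producer iteration, so the two loop nests can overlap instead of running back to back.
```python
# Cross-loop pipelining sketch using thread-based tasks.
# Illustration only; the paper's prototype uses LLVM/Polly with OpenMP tasks.
from concurrent.futures import ThreadPoolExecutor

def produce(i):            # iteration i of the first loop nest
    return i * i

def consume(i, value):     # iteration i of the second loop nest,
    return value + i       # depends only on produce(i)

N = 8
with ThreadPoolExecutor() as pool:
    produced = [pool.submit(produce, i) for i in range(N)]
    # Each consumer task is launched as soon as *its* producer has finished,
    # without waiting for the whole first loop to complete.
    consumed = [pool.submit(consume, i, f.result()) for i, f in enumerate(produced)]
    results = [f.result() for f in consumed]

print(results)  # [0, 2, 6, 12, 20, 30, 42, 56]
```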
Session 3: Short Paper Session - Polyhedral Infrastructure (14:45 — 15:30) [10 min + 5 min Q&A]
-
Rephrasing polyhedral optimizations with trace analysis.
H. Thievenaz, K. Kimura and Ch. Alias.
[Paper]
[Slides]
-
MOM: Matrix Operations in MLIR.
L. Chelini, H. Barthels, P. Bientinesi, M. Copik, T. Grosser and D. Spampinato.
[Abstract]
Modern research in code generators for dense linear algebra computations has shown the ability to produce optimized code whose performance matches and often exceeds that of state-of-the-art implementations by domain experts. However, the underlying infrastructure is often developed in isolation, making the interconnection of logically combinable systems complicated, if not impossible. In this paper, we propose to leverage MLIR as a unifying compiler infrastructure for the optimization of dense linear algebra operations. We propose a new MLIR dialect for expressing linear algebraic computations, including matrix properties, to enable high-level algorithmic transformations. The integration of this new dialect in MLIR enables end-to-end compilation of matrix computations via conversion to existing lower-level dialects already provided by the framework.
[Paper]
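One example of the kind of high-level algorithmic transformation such a dialect can enable (a NumPy toy, not MLIR code and not taken from the paper): when a matrix chain ends in a vector, re-associating the products avoids any matrix-matrix multiplication.
```python
# Toy high-level linear-algebra transformation (NumPy, not MLIR):
# for A @ B @ x with x a vector, evaluating A @ (B @ x) needs only two
# matrix-vector products instead of an O(n^3) matrix-matrix product.
import numpy as np

n = 500
rng = np.random.default_rng(0)
A, B = rng.standard_normal((n, n)), rng.standard_normal((n, n))
x = rng.standard_normal(n)

naive = (A @ B) @ x        # O(n^3) + O(n^2)
optimized = A @ (B @ x)    # O(n^2) + O(n^2)

assert np.allclose(naive, optimized)
```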
-
Bringing Presburger Arithmetic to MLIR with FPL.
A. Pitchanathan, K. Grover, M. Weber and T. Grosser.
[Abstract]
Presburger arithmetic provides the mathematical core for the polyhedral compilation techniques that drive analytical cache models, loop optimization for ML and HPC, formal verification, and hardware design. Despite many efforts to use Presburger arithmetic in production compilers, the most powerful polyhedral compilation techniques have thus far been confined to optional extensions of the compilation flow, as traditional Presburger libraries were developed as standalone projects that were not part of the core compiler toolchain. After developing FPL, a fast new Presburger library, we worked with the LLVM community to make Presburger arithmetic an integral part of the production-focused MLIR compiler framework. We show how developers can use FPL in MLIR and report on our experience in working with the LLVM/MLIR community on integrating FPL into MLIR. With FPL now a native, performant, and approachable component of the LLVM ecosystem, compiler developers can co-develop the Presburger library and polyhedral transformations with ease. As a result, we expect that scientists and compiler engineers will now be able to jointly develop even more IMPACTful polyhedral optimizations.
[Paper]
Break (15:30 — 16:00)
IMPACT'22 Chair Challenge (16:00 — 17:15)
[Slides]
Community news and closing notes (17:15 — 17:30)
Important Dates
Full Papers | |
Abstract deadline (optional): | October 24, 2021 (AoE) |
Paper deadline (extended): | October 31, 2021 (AoE), extended to November 7, 2021 (AoE) |
Author notification: | November 30, 2021 |
Final version due: | December 12, 2021 |
Short Papers | |
Short paper deadline (extended): | December 15, 2021, extended to March 27, 2022 (AoE) |
Author notification: | May 3, 2022 |
Final version due: | June 5, 2022 |
Workshop: | June 20, 2022 (in conjunction with HiPEAC) |
Call for Papers Information
We welcome both theoretical and experimental papers on all aspects of
polyhedral compilation and optimization. We also welcome submissions
describing preliminary results, crazy new ideas, position papers,
experience reports, and available tools, with an aim to stimulate
discussions, collaborations, and advances in the field. The following
illustrate potential IMPACT papers:
- Thorough theoretical discussion of a preliminary idea with an attempt to
place it in context but no experimental results.
- Experimental results comparing two or more existing ideas, followed by a
detailed analysis.
- Presentation of an existing idea in a different way, including
  illustrations of how the idea applies to new use cases, code,
  architectures, etc. Attribution should be as clear as possible.
Topics of interest include, but are not limited to:
- program optimization (automatic parallelization, tiling, etc.);
- code generation;
- data/communication management on GPUs, accelerators and distributed systems;
- hardware/high-level synthesis for affine programs;
- static analysis;
- program verification;
- model checking;
- theoretical foundations of the polyhedral model;
- extensions of the polyhedral model;
- scalability and robustness of polyhedral compilation techniques.
Submissions
Short Papers
Short papers are a good venue for projects that are still work-in-progress, tool demonstrations, etc.
Submissions should not exceed 2 pages, excluding references.
Full Papers
Submissions should not exceed 10 pages (recommended 8 pages), excluding references.
Proceedings
Proceedings will be posted online. If the final version of an accepted paper does not sufficiently address the comments of the reviewers, it may be accompanied by a note from the program committee. Publication at IMPACT will not prevent later publication of the presented work in conferences or journals. However, simultaneous submission to IMPACT and another workshop, conference, or journal is often prohibited by the policies of those venues. For instance, a manuscript overlapping significantly with an IMPACT submission cannot be submitted to PLDI 2022 or any other overlapping SIGPLAN event.
Format
Both full and short papers should be formatted according to the ACM SIGPLAN proceedings format. Please use version 1.54 or above of the following template to prepare your manuscript: ACM Proceedings Template. Make sure to use the sigplan subformat. Visit the SIGPLAN Author Resources page for further information on SIGPLAN manuscript formatting.
Note: older versions of the article template use smaller fonts for submission. Please double-check that you are using a recent style file; in particular, various LaTeX distributions ship older versions of the acmart style file, so please download the most recent one from ACM.
Submissions
Submissions should use PDF format and be printable on US Letter or A4 paper. Please submit your manuscripts through EasyChair:
https://easychair.org/conferences/?conf=impact22.
The same site will be used for both short and full papers. Currently, only the short paper track is open.
Posters
Authors of both full and short papers have the opportunity to present their work in the HiPEAC poster session. The poster format is A0 portrait.
If possible, posters related to IMPACT will be gathered close to each other at the poster session; IMPACT posters will be in the first block of poster boards in the Bartók corridor, closest to the catering area.
Organizers
| Benoit Meister | Qualcomm Technologies, Inc., USA |
| Riyadh Baghdadi | New York University Abu Dhabi, UAE |
Program Committee
| Christophe Alias | INRIA & Ecole Normale Superieure de Lyon, France |
| Cédric Bastoul | Huawei, France |
| Protonu Basu | Facebook, USA |
| Somashekaracharya Bhaskaracharya | NVIDIA, USA |
| Jeronimo Castrillon | TU Dresden, Germany |
| Philippe Clauss | INRIA & U. of Strasbourg, France |
| Albert Cohen | Google, France |
| Paul Feautrier | Ecole Normale Superieure de Lyon, France |
| Tobias Grosser | U. of Edinburgh, UK |
| Armin Größlinger | U. of Passau, Germany |
| Mary Hall | U. of Utah, USA |
| Francois Irigoin | Mines ParisTech, France |
| Paul H J Kelly | Imperial College London, UK |
| Martin Kong | U. of Oklahoma, USA |
| Sriram Krishnamoorthy | Google, USA |
| Michael Kruse | Argonne National Laboratory, USA |
| Louis-Noel Pouchet | Colorado State University, USA |
| Benoit Pradelle | Xilinx, Germany |
| Sanjay Rajopadhye | Colorado State University, USA |
| Ponnuswamy Sadayappan | U. of Utah, USA |
| Adilla Susungi | Huawei, France |
| Sanket Tavarageri | Microsoft, India |
| Ramakrishna Upadrasta | IIT-Hyderabad, India |
| Sven Verdoolaege | Cerebras Systems, Belgium |
| Jie Zhao | State Key Laboratory of Mathematical Engineering and Advanced Computing, China |
| Oleksandr Zinenko | Google, France |
Contact us
Please send any questions to
Benoit Meister and
Riyadh Baghdadi
at
impact-chairs@reservoir.com.