With multicore processors and deep memory hierarchies remaining the main source of performance gains, and emerging areas such as energy-aware computation requiring research breakthroughs, polyhedral compilation techniques receive more attention than ever from both researchers and practitioners. Thanks to a unified formalism for parallelization and memory optimization, these techniques now power domain-specific language compilers that yield the best performance in highly competitive areas such as machine learning and numerical simulation.
IMPACT is a unique event focusing exclusively on polyhedral compilation. It brings together researchers and practitioners for a high-quality one-day program including a keynote, selected paper presentations, and posters for work-in-progress discussions.
Opening and Keynote (10:00 — 11:00)
-
Welcome notes.
Benoit Meister (Qualcomm Technologies) and Riyadh Baghdadi (New York University Abu Dhabi).
-
Keynote: Automatic operator generation for deep learning frameworks in the all-scenario context: MindSpore/AKG architecture, features and challenges.
Cédric Bastoul (Huawei's Paris Research Center)
[Abstract]
Artificial intelligence frameworks such as MindSpore, TensorFlow or PyTorch aim at providing high-level abstractions to enable rapid development of new operators and at supporting various architectures to enable computation on the desired target platform. Translating programs written by AI scientists into efficient implementations on complex parallel hardware is an extreme compilation challenge. To simplify the problem, existing frameworks make strong use of optimized libraries which provide high-performance implementations of well-known operators. However, those libraries may miss global optimization opportunities, and scientists may design new operators not covered by existing libraries. In such scenarios, a more general compiler solution is required. This solution must generate high-performance code that exploits the target architecture and also be fast enough to allow researchers to quickly evaluate their new designs.
Polyhedral compilation was soon identified as a promising solution to that challenge because many artificial intelligence and machine learning codes fall within its application domain. This compilation approach raises the code to algebraic abstractions where powerful analyses and transformations are possible, enabling automatic parallelization and optimization. Unfortunately, current polyhedral compilation techniques are not yet satisfactory: they miss optimization opportunities on important target architectures and they do not scale up to codes implementing modern deep neural networks. In this talk, I will present AKG (Auto Kernel Generator), the automatic AI/DL polyhedral fused operator generator of the open-source framework MindSpore. I will discuss its architecture, its features and its main challenges in the context of an "all-scenario" framework.
[Bio]
Cédric Bastoul is Chief Scientist of Distributed and Parallel Software and head of the Central Software Institute at Huawei's Paris Research Center. He is on leave from his professor position at Strasbourg University, associated with the ICPS team at the ICube laboratory and the CAMUS team at Inria. He received his PhD in computer science from Pierre & Marie Curie University (now Sorbonne University), France, in 2004. His research interests are in compiler techniques for optimization and parallelization, with a strong emphasis on code restructuring using the polyhedral model. He created or maintains several well-known tools in the high-level compilation community, such as the code generator CLooG and the parametric integer programming solver PIP.
[Slides]
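For readers less familiar with what a fused operator generator produces, the toy Python sketch below (not taken from AKG or MindSpore; the operator chain and values are made up for illustration) contrasts an unfused elementwise chain, which materializes intermediate arrays, with a fused loop that keeps intermediates local. Automating this kind of fusion across complex operator graphs is the job of the generator discussed in the keynote.
```python
# Toy illustration of operator fusion (hypothetical example, not AKG/MindSpore code).
# Unfused: each operator is its own loop and materializes an intermediate array.
# Fused: one loop computes the whole chain per element, keeping intermediates local.

def unfused(x):
    t1 = [v * 2.0 for v in x]          # scale
    t2 = [v + 1.0 for v in t1]         # bias
    return [max(v, 0.0) for v in t2]   # ReLU

def fused(x):
    out = []
    for v in x:                        # single traversal, no intermediate arrays
        out.append(max(v * 2.0 + 1.0, 0.0))
    return out

if __name__ == "__main__":
    data = [-1.5, 0.0, 2.5]
    assert unfused(data) == fused(data)
```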
Break and Poster Session (11:00 — 11:30)
-
Bartók corridor, next to the Bartók room, on the catering area side.
Session 1: Full Paper Session (11:30 — 13:00) [25 min + 5 min Q&A]
-
Higher-Level Synthesis: experimenting with MLIR polyhedral representations for accelerator design.
S. Curzel, S. Jovic, M. Fiorito, A. Tumeo and F. Ferrandi.
[Abstract]
High-Level Synthesis (HLS) tools simplify the design of hardware accelerators by automatically generating Verilog/VHDL code starting from a general-purpose software programming language. They include a wide range of optimization techniques in the process, most of them performed on a low-level intermediate representation of the code. We believe that introducing optimizations at a higher level of abstraction could significantly improve the results of the automated design process; in particular, polyhedral techniques for the manipulation of loops could have a significant impact on the generated accelerators. This paper focuses on loop pipelining, which we use as a case study to explore the introduction of compiler-based transformations on top of an existing HLS process. We leverage the Multi-Level Intermediate Representation (MLIR) and evaluate the impact of the proposed transformations on an open-source HLS tool that would not otherwise support loop pipelining. Preliminary results confirm that our implementation increased the performance of the generated accelerators, without any modification to the underlying HLS tool.
[Paper]
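As a rough first-order intuition for why loop pipelining pays off (a back-of-the-envelope model, not the analysis used in the paper or by any particular HLS tool), a pipelined loop issues a new iteration every II cycles instead of waiting for the previous iteration to finish:
```python
# Rough first-order latency model for loop pipelining (illustrative only,
# not the cost model of the paper or of any specific HLS tool).

def unpipelined_cycles(n_iters: int, depth: int) -> int:
    # Each iteration runs to completion before the next one starts.
    return n_iters * depth

def pipelined_cycles(n_iters: int, depth: int, ii: int) -> int:
    # A new iteration is issued every `ii` cycles; the last iteration
    # still needs `depth` cycles to drain the pipeline.
    return depth + (n_iters - 1) * ii

if __name__ == "__main__":
    n, depth, ii = 1000, 8, 1
    print(unpipelined_cycles(n, depth))    # 8000 cycles
    print(pipelined_cycles(n, depth, ii))  # 1007 cycles
```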
-
Polyhedral Binary Decision Diagrams for Representing Non-Convex Polyhedra.
S. Kulkarni and M. Kruse.
[Abstract]
Vector sets arising in application domains such as polyhedral model optimizations are not necessarily convex, thereby ruling out representation by a single polyhedral set. In such scenarios, a commonly employed representation is using multiple polyhedral sets whose union is the vector set to be represented. While this suffices to represent all relevant vector sets for most applications, it is not necessarily efficient. Despite being simple to construct, these representations may require an exponential number of polyhedral sets to entirely cover non-convex vector sets.
In this work we introduce the Polyhedral Binary Decision Diagram (PBDD), inspired by binary decision diagrams (BDDs) and quasi-affine selection trees, as an alternative representation of non-convex vector sets. We implement a proof of concept in Python and show that, when supported by simple structural simplification operations from the BDD literature, the PBDD avoids the exponential blow-up arising from compiling a simple program in Polly—in fact, the representation of a set's complement can be computed in constant time. Several case studies compare the scaling behavior of isl's union-of-convex-polyhedra implementation and our PBDD implementation with and without simplifications.
[Paper]
[Slides]
[Video]
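For intuition only (a minimal sketch, not the authors' Python prototype), a PBDD-style structure can be pictured as a decision diagram whose internal nodes test affine constraints and whose leaves are membership verdicts; complementation then reduces to flipping the verdicts, which can be done lazily in constant time via a negation flag:
```python
# Minimal decision-diagram-style sketch for non-convex integer sets.
# Hypothetical illustration only; the PBDD described in the paper differs in detail.

class Node:
    """Internal node: test an affine constraint coeffs·x + const >= 0 and branch."""
    def __init__(self, coeffs, const, if_true, if_false):
        self.coeffs, self.const = coeffs, const
        self.if_true, self.if_false = if_true, if_false

    def contains(self, point):
        holds = sum(c * p for c, p in zip(self.coeffs, point)) + self.const >= 0
        child = self.if_true if holds else self.if_false
        return child if isinstance(child, bool) else child.contains(point)

class Diagram:
    """Wrapper with a negation flag so complementation is O(1)."""
    def __init__(self, root, negated=False):
        self.root, self.negated = root, negated

    def contains(self, point):
        return self.root.contains(point) != self.negated

    def complement(self):
        return Diagram(self.root, not self.negated)  # constant time

# {x | x >= 0} union {x | x <= -10}: a non-convex set in a single diagram.
s = Diagram(Node((1,), 0, True, Node((-1,), -10, True, False)))
assert s.contains((3,)) and s.contains((-12,)) and not s.contains((-5,))
assert s.complement().contains((-5,))
```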
-
Polyhedral Scheduling and Relaxation of Synchronous Reactive Systems.
G. Iooss, A. Cohen, D. Potop-Butucaru, M. Pouzet, V. Bregeon, J. Souyris and Ph. Baufreton.
[Abstract]
The design and implementation of reactive, hard real-time systems involves modeling and generating efficient code for the integration of harmonic multi-periodic tasks. The simple principles of synchronous reactive programming met great scientific and engineering success in the area. A synchronous program orchestrates concurrent computations. It does so while maintaining composability, modularity, functional determinism and real-time execution guarantees. In the case of hard real-time systems, a reactive control program is composed of multi-periodic tasks related through integral ratios.
This paper presents a language and optimizing compiler to implement large reactive control systems composed of multi-periodic, synchronous reactive tasks. The same language is used to program a complete control system, from the finest-grained computations to task-level integration, while generating efficient code satisfying real-time constraints. It is evaluated on two real-world applications.
[Paper]
[Slides]
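A small illustration of what "harmonic multi-periodic tasks related through integral ratios" means (the task names and periods below are made up, not from the paper): every period divides a common hyperperiod, so a static schedule only has to cover that hyperperiod.
```python
# Harmonic multi-periodic task set: each period divides the next,
# so a static schedule only has to cover one hyperperiod.
# Task names and periods are invented for illustration.
from math import lcm

periods_ms = {"fast_control": 5, "guidance": 20, "monitoring": 100}

hyperperiod = lcm(*periods_ms.values())                    # 100 ms
instances = {t: hyperperiod // p for t, p in periods_ms.items()}

print(hyperperiod)   # 100
print(instances)     # {'fast_control': 20, 'guidance': 5, 'monitoring': 1}
```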
Lunch (13:00 — 14:00)
Session 2: Short Paper Session - Mapping Techniques (14:00 — 14:45) [10 min + 5 min Q&A]
-
Affine Multibanking for High-Level Synthesis.
I. Lasfar, Ch. Alias, M. Moy, R. Neveu, A. Carré.
[Abstract]
In the last decade, FPGAs appeared as a credible alternative for big data and high-performance computing applications. However, programming an FPGA is tedious: given a function to implement, the circuit must be designed from scratch by the developer. In this short paper, we address the compilation of data placement under parallelism and resource constraints. We propose an HLS algorithm able to partition the data across memory banks, so that parallel accesses will target distinct banks to avoid data transfer serialization. Our algorithm is able to reduce the number of banks and the maximal bank size. Preliminary evaluation shows promising results.
[Paper]
[Slides]
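To give a feel for the problem (a generic illustration, not the paper's algorithm), an affine bank mapping assigns each array cell to a bank via an affine-modular function; a good mapping sends the cells accessed in the same parallel step to pairwise distinct banks.
```python
# Generic affine multibanking illustration (not the algorithm from the paper).
# bank(i, j) = (i + j) mod NB: consecutive cells of a row (or column) land in
# distinct banks, so NB elements read in parallel avoid bank conflicts.

NB = 4  # number of memory banks (arbitrary choice for the example)

def bank(i: int, j: int) -> int:
    return (i + j) % NB

def conflict_free(accesses) -> bool:
    banks = [bank(i, j) for i, j in accesses]
    return len(banks) == len(set(banks))

# NB consecutive cells of one row, accessed in the same parallel step:
row_accesses = [(2, j) for j in range(NB)]
assert conflict_free(row_accesses)
```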
-
Progress Report: A Deep Learning Guided Exploration of Affine Unimodular Loop Transformations.
M. Merouani, Kh. Boudaoud, I. Aouadj, N. Tchoulak, F. Benbouzid-Sitayeb, K. Benatchba, H. Leather and R. Baghdadi.
[Abstract]
In this paper, we present a work in progress about a deep learning based approach for automatic code optimization in polyhedral compilers. The proposed technique explores combinations of affine and non-affine loop transformations to find the sequence of transformations that minimizes the execution time of a given program. This exploration is guided by a deep learning based cost model that evaluates the speedup that each sequence of transformations would yield. Preliminary results show that the proposed techniques achieve a 2.35x geometric mean speedup over state-of-the-art polyhedral compilers (Pluto).
[Paper]
[Slides]
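The general shape of such an exploration, sketched below schematically (the transformation names, beam width and cost model are placeholders and do not reflect the system described in the paper), is a search over transformation sequences in which a learned cost model, rather than actual execution, ranks candidates.
```python
# Schematic cost-model-guided search over loop-transformation sequences.
# Transformation names, beam width and cost model are placeholders only.
import random

TRANSFORMS = ["interchange", "skew", "tile", "unroll", "parallelize"]

def predicted_speedup(sequence):
    # Stand-in for a learned cost model: returns an estimated speedup.
    random.seed(hash(tuple(sequence)) & 0xFFFF)
    return random.uniform(0.5, 4.0)

def beam_search(depth=3, beam_width=4):
    beam = [[]]  # start from the empty transformation sequence
    for _ in range(depth):
        candidates = [seq + [t] for seq in beam for t in TRANSFORMS]
        candidates.sort(key=predicted_speedup, reverse=True)
        beam = candidates[:beam_width]
    return beam[0], predicted_speedup(beam[0])

best, score = beam_search()
print(best, round(score, 2))
```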
-
A pipeline pattern detection technique in Polly.
D. Talaashrafi, J. Doerfert and M. Moreno-Maza
[Abstract]
The polyhedral model has repeatedly shown how it facilitates various loop transformations, including loop parallelization, loop tiling, and software pipelining. However, parallelism is almost exclusively exploited on a per-loop basis without much work on detecting cross-loop parallelization opportunities. While many problems can be scheduled such that loop dimensions are dependence-free, the resulting loop parallelism does not necessarily maximize concurrent execution, especially not for unbalanced problems.
In this work, we introduce a polyhedral-model-based analysis and scheduling algorithm that exposes and utilizes cross-loop parallelization through tasking. This work exploits pipeline patterns between iterations in different loop nests, and it is well suited to handle imbalanced iterations.
Our LLVM/Polly-based prototype performs schedule modifications and code generation targeting a minimal, language-agnostic tasking layer. We present results using an implementation of this API with the OpenMP task construct. For different computation patterns, we achieved speed-ups of up to 3.5× on a quad-core processor while LLVM/Polly alone fails to exploit the parallelism.
[Paper]
[Slides]
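To illustrate the kind of cross-loop pipelining targeted here (a standard-library Python sketch with thread-based tasks, not the LLVM/Polly + OpenMP implementation from the paper), each consumer iteration below depends only on the matching producer iteration, so the two loop nests can overlap instead of running back to back.
```python
# Cross-loop pipelining sketch using thread-based tasks.
# Illustration only; the paper's prototype uses LLVM/Polly with OpenMP tasks.
from concurrent.futures import ThreadPoolExecutor

def produce(i):            # iteration i of the first loop nest
    return i * i

def consume(i, value):     # iteration i of the second loop nest,
    return value + i       # depends only on produce(i)

N = 8
with ThreadPoolExecutor() as pool:
    produced = [pool.submit(produce, i) for i in range(N)]
    # Each consumer task is launched as soon as *its* producer has finished,
    # without waiting for the whole first loop to complete.
    consumed = [pool.submit(consume, i, f.result()) for i, f in enumerate(produced)]
    results = [f.result() for f in consumed]

print(results)  # [0, 2, 6, 12, 20, 30, 42, 56]
```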
Session 3: Short Paper Session - Polyhedral Infrastructure (14:45 — 15:30) [10 min + 5 min Q&A]
-
Rephrasing polyhedral optimizations with trace analysis.
H. Thievenaz, K. Kimura and Ch. Alias.
[Paper]
[Slides]
-
MOM: Matrix Operations in MLIR.
L. Chelini, H. Barthels, P. Bientinesi, M. Copik, T. Grosser and D. Spampinato.
[Abstract]
Modern research in code generators for dense linear algebra computations has shown the ability to produce optimized code whose performance matches and often exceeds that of state-of-the-art implementations by domain experts. However, the underlying infrastructure is often developed in isolation, making the interconnection of logically combinable systems complicated, if not impossible. In this paper, we propose to leverage MLIR as a unifying compiler infrastructure for the optimization of dense linear algebra operations. We propose a new MLIR dialect for expressing linear algebraic computations, including matrix properties, to enable high-level algorithmic transformations. The integration of this new dialect in MLIR enables end-to-end compilation of matrix computations via conversion to existing lower-level dialects already provided by the framework.
[Paper]
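One example of the kind of high-level algorithmic transformation such a dialect can enable (a NumPy toy, not MLIR code and not taken from the paper): when a matrix chain ends in a vector, re-associating the products avoids any matrix-matrix multiplication.
```python
# Toy high-level linear-algebra transformation (NumPy, not MLIR):
# for A @ B @ x with x a vector, evaluating A @ (B @ x) needs only two
# matrix-vector products instead of an O(n^3) matrix-matrix product.
import numpy as np

n = 500
rng = np.random.default_rng(0)
A, B = rng.standard_normal((n, n)), rng.standard_normal((n, n))
x = rng.standard_normal(n)

naive = (A @ B) @ x        # O(n^3) + O(n^2)
optimized = A @ (B @ x)    # O(n^2) + O(n^2)

assert np.allclose(naive, optimized)
```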
-
Bringing Presburger Arithmetic to MLIR with FPL.
A. Pitchanathan, K. Grover, M. Weber and T. Grosser.
[Abstract]
Presburger arithmetic provides the mathematical core for the polyhedral compilation techniques that drive analytical cache models, loop optimization for ML and HPC, formal verification, and hardware design. Despite many efforts to use Presburger arithmetic in production compilers, the most powerful polyhedral compilation techniques have thus far been confined to optional extensions of the compilation flow, as traditional Presburger libraries were developed as standalone projects that were not part of the core compiler toolchain. After developing FPL, a fast new Presburger library, we worked with the LLVM community to make Presburger arithmetic an integral part of the production-focused MLIR compiler framework. We show how developers can use FPL in MLIR and report on our experience in working with the LLVM/MLIR community on integrating FPL into MLIR. With FPL now a native, performant, and approachable component of the LLVM ecosystem, compiler developers can co-develop the Presburger library and polyhedral transformations with ease. As a result, we expect that scientists and compiler engineers will now be able to jointly develop even more IMPACTful polyhedral optimizations.
[Paper]
Break (15:30 — 16:00)
IMPACT'22 Chair Challenge (16:00 — 17:15)
[Slides]
Community news and closing notes (17:15 — 17:30)
Important Dates
Full Papers | |
Abstract deadline (optional): | October 24, 2021 (AoE) |
Paper deadline (extended): | October 31, 2021 (AoE), extended to November 7, 2021 (AoE) |
Author notification: | November 30, 2021 |
Final version due: | December 12, 2021 |
Short Papers | |
Short paper deadline (extended): | December 15, 2021, extended to March 27, 2022 (AoE) |
Author notification: | May 3, 2022 |
Final version due: | June 5, 2022 |
Workshop: | June 20, 2022 (in conjunction with HiPEAC) |
Call for Papers Information
We welcome both theoretical and experimental papers on all aspects of
polyhedral compilation and optimization. We also welcome submissions
describing preliminary results, crazy new ideas, position papers,
experience reports, and available tools, with an aim to stimulate
discussions, collaborations, and advances in the field. The following
illustrate potential IMPACT papers:
- Thorough theoretical discussion of a preliminary idea with an attempt to
place it in context but no experimental results.
- Experimental results comparing two or more existing ideas, followed by a
detailed analysis.
- Presentation of an existing idea in a different way, including
  illustrations of how the idea applies to new use cases, code,
  architectures, etc. Attribution should be as clear as possible.
Topics of interest include, but are not limited to:
- program optimization (automatic parallelization, tiling, etc.);
- code generation;
- data/communication management on GPUs, accelerators and distributed systems;
- hardware/high-level synthesis for affine programs;
- static analysis;
- program verification;
- model checking;
- theoretical foundations of the polyhedral model;
- extensions of the polyhedral model;
- scalability and robustness of polyhedral compilation techniques.
Submissions
Short Papers
Short papers are a good venue for projects that are still work-in-progress, tool demonstrations, etc.
Submissions should not exceed 2 pages, excluding references.
Full Papers
Submissions should not exceed 10 pages (recommended 8 pages), excluding references.
Proceedings
Proceedings will be posted online. If the final version of an accepted paper does not sufficiently address the comments of the reviewers, it may be accompanied by a note from the program committee. Publication at IMPACT will not prevent later publication of the presented work in conferences or journals. However, simultaneous submission to IMPACT and another workshop, conference, or journal is often prohibited by the policies of those venues. For instance, a manuscript overlapping significantly with an IMPACT submission cannot be submitted to PLDI 2022 or any other overlapping SIGPLAN event.
Format
Both full and short papers should be formatted according to the ACM SIGPLAN proceedings format. Please use version 1.54 or above of the following template to prepare your manuscript: ACM Proceedings Template. Make sure to use the sigplan subformat. Visit the SIGPLAN Author Resources page for further information on SIGPLAN manuscript formatting.
Note: older versions of the article template use smaller fonts for submission. Please double-check that you are using a recent style file; in particular, various LaTeX distributions ship older versions of the acmart style file, so please download the most recent one from ACM.
Submissions
Submissions should use PDF format and be printable on US Letter or A4 paper. Please submit your manuscripts through EasyChair:
https://easychair.org/conferences/?conf=impact22.
The same site will be used for both short and full papers. Currently, only the short paper track is open.
Posters
Authors of both full and short papers have the opportunity to present their work in the HiPEAC poster session. The poster format is A0 portrait.
If possible, posters related to IMPACT will be gathered close to each other at the poster session; IMPACT posters will be in the first block of poster boards in the Bartók corridor, closest to the catering area.
Organizers
| Benoit Meister | Qualcomm Technologies, Inc., USA |
| Riyadh Baghdadi | New York University Abu Dhabi, UAE |
Program Committee
| Christophe Alias | INRIA & Ecole Normale Superieure de Lyon, France |
| Cédric Bastoul | Huawei, France |
| Protonu Basu | Facebook, USA |
| Somashekaracharya Bhaskaracharya | NVIDIA, USA |
| Jeronimo Castrillon | TU Dresden, Germany |
| Philippe Clauss | INRIA & U. of Strasbourg, France |
| Albert Cohen | Google, France |
| Paul Feautrier | Ecole Normale Superieure de Lyon, France |
| Tobias Grosser | U. of Edinburgh, UK |
| Armin Größlinger | U. of Passau, Germany |
| Mary Hall | U. of Utah, USA |
| Francois Irigoin | Mines ParisTech, France |
| Paul H J Kelly | Imperial College London, UK |
| Martin Kong | U. of Oklahoma, USA |
| Sriram Krishnamoorthy | Google, USA |
| Michael Kruse | Argonne National Laboratory, USA |
| Louis-Noel Pouchet | Colorado State University, USA |
| Benoit Pradelle | Xilinx, Germany |
| Sanjay Rajopadhye | Colorado State University, USA |
| Ponnuswamy Sadayappan | U. of Utah, USA |
| Adilla Susungi | Huawei, France |
| Sanket Tavarageri | Microsoft, India |
| Ramakrishna Upadrasta | IIT-Hyderabad, India |
| Sven Verdoolaege | Cerebras Systems, Belgium |
| Jie Zhao | State Key Laboratory of Mathematical Engineering and Advanced Computing, China |
| Oleksandr Zinenko | Google, France |
Contact us
Please send any questions to
Benoit Meister and
Riyadh Baghdadi
at
impact-chairs@reservoir.com.