InvasIC

Transregio SFB InvasIC (DFG SFB/TRR89)

The Idea: In the proposed TCRC, we intend to investigate a completely novel paradigm for designing and programming future parallel computing systems called invasive computing. The main idea and novelty of invasive computing is to introduce resource-aware programming support in the sense that a given program gets the ability to explore and dynamically spread its computations to neighbour processors similar to a phase of invasion, then to execute portions of code of high parallelism degree in parallel based on the available (invasible) region on a given multi-processor architecture. Afterwards, once the program terminates or if the degree of parallelism should be lower again, the program may enter a retreat phase, deallocate resources and resume execution again, for example, sequentially on a single processor. In order to support this idea of self-adaptive and resource-aware programming, not only new programming concepts, languages, compilers and operating systems are necessary but also revolutionary architectural changes in the design of MPSoCs (Multi-Processor Systems-on-a-Chip) must be provided so to efficiently support invasion, infection and retreat operations involving concepts for dynamic processor, interconnect and memory reconfiguration.

Necessity and Expected Benefits: Decreasing feature sizes have already led to a rethinking of how to design multi-million transistor system-on-a-chip architectures envisioning dramatically increasing rates of temporary and permanent faults as well as feature variations. The major question will thus be how to deal with this imperfect world [6, 7] in which components will become more and more unreliable. As we can foresee SoCs with 1000 or more processors on a single chip in the year 2020, static and central management concepts to control the execution of all resources might have met their limits long before and are therefore not appropriate. Invasion might provide the required self-organising behaviour to conventional programs for being able not only to tolerate certain types of faults and cope with feature variations, but also to provide scalability, higher resource utilisation numbers and, hopefully, also performance gains by adjusting the amount of allocated resources to the temporal needs of a running application. This thought might open a new way of thinking about parallel algorithm design as well. Based on algorithms utilising invasion and negotiating resources with others, we can imagine that corresponding programs become personalised objects, competing with other applications running simultaneously on an MPSoC.

A1 Subproject

Subproject A1 is divided into two major fields of research. Whereas the first area is devoted to basic concepts, features and analysis of invasive programs in general and independent of a language implementation and target architecture, the second part defines language support for invasive programming concepts. The individual questions addressed are summarised as follows:

The basic concepts of invasion must be tackled and corresponding features and parameters defined. The fundamental system model and programming model for invasion and resource-aware computation must be developed and exercised on various preliminary examples on paper. Different levels of abstraction for invasive processes must be defined, such as static candidates, dynamic instances and executable incarnations (so-called i-lets).
A mathematical model for performance and overhead analysis shall be developed. Dynamic decisions about invasion will depend on data distribution, load balancing and the dynamic state of resources. The definition of corresponding overhead functions for resource utilisation and efficiency is important to show if invasive programming allows to achieve utilisation close to 100%. The overhead functions can be used in the design of invasive algorithms and help to dynamically decide if and where invasion should happen.
An abstract core language, which incorporates the insights from 1.) will be defined. This abstract language will be the basis for a concrete language and its interfaces. The concrete language (which is to be built on top of a host language such as X10) must exploit modern language theory, respect the above-mentioned abstraction levels, and provide interfaces to dynamic resource parameters as well as to existing technology for parallel programming. It should utilise generic mechanisms, instead of introducing lots of low-level, ad-hoc syntactic constructs.
The concrete language must be exercised on real algorithms and problems. Expressiveness and usability must be evaluated for all different types of potential target architectures ranging from tightly- over loosely-coupled Multi-Processor System-on-a-Chip (MPSoC) architectures to high-performance computing (HPC) machines.
Later, an operational semantics of the core language, together with a resource-aware type system will be defined. This allows to prove type safety and is likely to prove other fundamental properties such as resource bounds of the dynamic computational structure, as well.

C3 Subproject

In this subproject, we consider compilation and code generation as well as program transformation and optimisation techniques for non-regular (procedural) as well as task-level and regularly-structured (e. g., loop-level) code. For the concrete language and its interfaces as defined in subproject A1, a compiler has to be developed. The back end will be based on the existing FIRM infrastructure for code optimisation. FIRM provides static single assignment (SSA) form as a basis for program analysis and optimisation.

The compiler must eventually support both loosely- and tightly-coupled as well as heterogeneous and homogeneous invasive multi-processor target architectures. The compiler will probably be based on the existing X10 compiler, but with front end extensions for invasive constructs, and a new or modified back end for SPARC architectures, as well as optimisations for efficient utilisation of invasive hardware, respectively operating system support.

In order to exploit invasive computing at the level of loop programs, symbolic techniques will be necessary that describe sets of processor mappings, schedules and synchronisations of loop computations in dependence of the number P of processors which will only be available after invasion. The main challenge will be to define such parametrised schedules and mappings, together with their proper mathematical foundations, as well as to derive compact case-dependent processor and time mappings. A compiler for invasive loop nests must not only generate functional code, but also control code. In addition, there will be a high potential for several optimisations, such as determination of optimal invasion parameters, generation of the control code for splitting registers, copying of data/program code, inter-processor link configuration and memory-access scheduling based on the run-time parameter P.

It is intended to provide a prototype code generator for loosely-coupled RISC-type of processors as well as for tightly-coupled processor arrays (TCPAs) supporting the acceleration of nested loop programs. The output of a code generator shall thus be assembly-level code for architectures such as investigated in subprojects B2 and B3. The generated codes shall be used as input for the simulation tools developed in subproject C2 as well as for real hardware once the central FPGA demonstrator (subproject Z2) is ready to run first code examples.

Eventually, we envision automatic invasification of a given non-invasive (sequential) program: application of modern program analysis and transformation techniques (i. e., data dependency analysis) might be used to automatically detect candidates for coarse-grained program decomposition, clustering, cloning, infection, invasion, and retreat.

Further informations

Weitere Informationen

Participants

Former Aide
Eduard Frank
Tobias Rapp
Martin Seidel

Department Head
Prof. Gregor Snelting

Former Staff Member
Dr.-Ing. Manuel Mohr
Dr.-Ing. Andreas Zwinkau
Dipl.-Inform. Matthias Braun
Dipl.-Inform. Sebastian Buchwald

Former Students
Julian Oppermann

Publications

2023
A. Fried, M. Stemmer-Grabow, J. Wachter: Register Allocation for Compressed ISAs in LLVM. CC '23
2020
S. Rheindt, A. Fried, O. Lenke, L. Nolte, T. Twardzik, T. Wild, A. Herkersdorf: X-CEL: A Method to Estimate Near-Memory Acceleration Potential in Tile-based MPSoCs. ARCS 2020
2019
S. Rheindt, A. Fried, O. Lenke, L. Nolte, T. Wild, A. Herkersdorf: NEMESYS: Near-Memory Graph Copy Enhanced System-Software. MEMSYS 2019
2018
M. Mohr: Aspects of Code Generation and Data Transfer Techniques for Modern Parallel Architectures.
A. Zwinkau: Resource-aware Programming in a High-level Language -- Improved performance with manageable effort on clustered MPSoCs.
S. Buchwald, A. Fried, S. Hack: Synthesizing an Instruction Selection Rule Library from Semantic Specifications. CGO '18
2017
M. Wagner, D. Lohner: Minimal Static Single Assignment Form. AFP 2017
M. Mohr, C. Tradowsky: Pegasus: Efficient Data Transfers for PGAS Languages on Non-Cache-Coherent Many-Cores. DATE 2017
2016
A. Zwinkau: An X10 Memory Model. X10 2016
S. Ullrich, D. Lohner: Verified Construction of Static Single Assignment Form. AFP 2016
S. Wildermann, M. Bader, L. Bauer, M. Damschen, D. Gabriel, M. Gerndt, M. Glaß, J. Henkel, J. Paul, A. Pöppl, S. Roloff, T. Schwarzer, G. Snelting, W. S. a: Invasive Computing for Timing-Predictable Stream Processing on MPSOCS. it 2016
S. Buchwald, D. Lohner, S. Ullrich: Verified Construction of Static Single Assignment Form. CC 2016
2015
N. C. Böwing: Invasives Verteiltes Job Queue Framework.
C. Jost: Erweiterung des invasiven X10-Compilers um getrennte Übersetzung.
M. Witterauf, A. Tanase, J. Teich, V. Lari, A. Zwinkau, G. Snelting: Adaptive fault tolerance through invasive computing.
S. Buchwald, M. Mohr, I. Rutter: Optimal Shuffle Code with Permutation Instructions. arXiv.org
M. Mohr, S. Buchwald, A. Zwinkau, C. Erhardt, B. Oechslein, J. Schedel, D. Lohmann: Cutting Out the Middleman: OS-Level Support for X10 Activities. X10 2015
S. Buchwald, M. Mohr, A. Zwinkau: Malleable Invasive Applications. ATPS 2015
S. Buchwald: Optgen: A Generator for Local Optimizations. CC 2015
S. Buchwald, M. Mohr, I. Rutter: Optimal Shuffle Code with Permutation Instructions. WADS 2015
2014
M. Braun, S. Buchwald, M. Mohr, A. Zwinkau: Dynamic X10: Resource-Aware Programming for Higher Efficiency.
2013
M. Mohr, A. Grudnitsky, T. Modschiedler, L. Bauer, S. Hack, J. Henkel: Hardware Acceleration for Programs in SSA Form. CASES 2013
H. Bungartz, C. Riesinger, M. Schreiber, G. Snelting, A. Zwinkau: Invasive computing in HPC with X10.
M. Braun, S. Buchwald, S. Hack, R. Leißa, C. Mallon, A. Zwinkau: Simple and Efficient Construction of Static Single Assignment Form. CC 2013
2012
D. Rausch: Implementierung einer verteilten Breitensuche in X10.
K. Fischnaller: Optimierung von Stencil-Algorithmen für invasive Architekturen.
C. Frieler: Entwicklung eines parallelen PBQP-Lösers mit X10.
M. Braun, S. Buchwald, M. Mohr, A. Zwinkau: An X10 Compiler for Invasive Architectures.
A. Zwinkau: Resource Awareness for Efficiency in High-Level Programming Languages.
2011
F. Hannig, S. Roloff, G. Snelting, J. Teich, A. Zwinkau: Resource-aware programming and simulation of MPSoC architectures through extension of X10. SCOPES 2011