Telescoping Languages Project Description
NOTE: The grant supporting the Telescoping Languages project ended in 2008. This site is retained for archival purposes, but is not actively updated.
Telescoping Languages: Support for Rapid Construction of High-Performance Applications
At Rice we have originated the telescoping languages strategy to generate high-performance compilers for scientific domain languages. The basic idea is to preprocess a library of components for a target domain to produce a compiler for that domain that understands and optimizes the library components as if they were primitive operations in the base language. This strategy is depicted in the figure below.
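As a rough illustration of the preprocessing idea, the following Python sketch separates an offline step, which generates specialized variants of a library routine, from a compile-time step, which selects the variant that matches what is known about the arguments. This is only a sketch under stated assumptions: the routine name `eigs`, the property axes, and the naming scheme are hypothetical and are not taken from the Palomar system.

```python
from itertools import product

# Hypothetical argument properties that a library preprocessor might specialize on.
PROPERTIES = {
    "field": ("real", "complex"),
    "storage": ("dense", "sparse"),
    "symmetry": ("symmetric", "nonsymmetric"),
}


def generate_variants(routine_name):
    """Offline step: name one specialized variant per property combination."""
    variants = {}
    for combo in product(*PROPERTIES.values()):
        variants[combo] = routine_name + "_" + "_".join(combo)
    return variants


def select_variant(variants, inferred):
    """Compile-time step: pick the variant matching the inferred properties."""
    key = tuple(inferred[name] for name in PROPERTIES)
    return variants[key]


# "eigs" is a placeholder routine name used only for illustration.
library = generate_variants("eigs")
chosen = select_variant(
    library,
    {"field": "real", "storage": "sparse", "symmetry": "symmetric"},
)
print(len(library), chosen)
```

In a real system the variants would be generated code rather than names, and the inferred properties would come from program analysis of the calling script rather than from a hand-written dictionary.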
Our work to date has focused on extensions to Matlab as defined by toolboxes such as signal processing, image processing, etc. We have built a very preliminary prototype of the Palomar compiler generation system, but we plan to expand the work in several different directions:
- Component integration systems. Preprocessing a component library makes it possible to produce versions specialized for the contexts in which calling programs are likely to invoke those components. In one preliminary study, we generated Fortran versions of a library for the solution of large-scale eigenvalue problems (ARPACK) from a Matlab prototype. Our current system, called LibGen, produces eight different versions of each Matlab prototype routine, specialized on input matrix type (real versus complex, sparse versus dense, symmetric versus nonsymmetric).
- Compilation of statistical languages. We have embarked on a project to compile programs written in the S family of languages, which includes S-PLUS and R, into efficient C equivalents. The strategy is to preprocess the open-source run-time library for R to produce versions specialized to the different contexts that a calling program might need, and then to use type analysis to select the right variant when an R application is presented to the compiler.
- Parallel Matlab. We are exploring the use of telescoping languages to effect a compiler-based parallelization of Matlab. The underlying idea is to distribute the Matlab arrays across a parallel machine using different data distribution strategies. The approach is to use compiler technology developed for High Performance Fortran (HPF) to preprocess the base library routines that carry out operations on distributed arrays and translate them into scalable parallel code, including communication, that implements these operations.
- Automatic pretuning of components. One form of context to which components can be specialized is the platform on which they will be run. Thus the telescoping language concept encompasses in-advance optimization of components for a particular machine architecture. We are exploring a strategy that selects loop optimization parameters, such as factors by which to block and unroll, by running the code and selecting the best variant using heuristic search. This makes it possible to select and integrate an optimized version of each component invoked by a user program once the target architecture is known.
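
The pretuning strategy described in the last item can be sketched in a few lines of Python. The description above does not say which heuristic search is used, so this sketch substitutes a simple greedy hill climb over a small grid of candidates. The candidate lists, the toy kernel `blocked_sum`, and all function names are assumptions made for illustration. Python cannot unroll loops the way a compiler does, so the kernel only imitates the shape of blocked, unrolled code.

```python
import time

BLOCK_SIZES = [8, 16, 32, 64, 128]  # candidate blocking factors (assumed)
UNROLL_FACTORS = [1, 2, 4, 8]       # candidate unroll factors (assumed)


def blocked_sum(data, block, unroll):
    """Toy kernel: sum a list block by block, `unroll` elements per step."""
    total = 0.0
    n = len(data)
    for start in range(0, n, block):
        end = min(start + block, n)
        i = start
        # Main loop: process `unroll` elements per iteration.
        while i + unroll <= end:
            total += sum(data[i:i + unroll])
            i += unroll
        # Remainder loop for the leftover elements of the block.
        while i < end:
            total += data[i]
            i += 1
    return total


def measure(block, unroll, data, repeats=3):
    """Cost of one variant: best wall-clock time over a few runs."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        blocked_sum(data, block, unroll)
        best = min(best, time.perf_counter() - t0)
    return best


def hill_climb(cost, start=(0, 0)):
    """Greedy neighbor search over the (block, unroll) index grid."""
    cache = {}  # evaluate each configuration once, so the search terminates

    def cost_at(idx):
        if idx not in cache:
            cache[idx] = cost(BLOCK_SIZES[idx[0]], UNROLL_FACTORS[idx[1]])
        return cache[idx]

    current = start
    improved = True
    while improved:
        improved = False
        bi, ui = current
        for nb, nu in ((bi - 1, ui), (bi + 1, ui), (bi, ui - 1), (bi, ui + 1)):
            in_grid = 0 <= nb < len(BLOCK_SIZES) and 0 <= nu < len(UNROLL_FACTORS)
            if in_grid and cost_at((nb, nu)) < cost_at(current):
                current = (nb, nu)
                improved = True
    return BLOCK_SIZES[current[0]], UNROLL_FACTORS[current[1]], cost_at(current)


# Tune the toy kernel on this machine by actually running each variant.
data = [float(i) for i in range(20000)]
block, unroll, seconds = hill_climb(lambda b, u: measure(b, u, data))
print(f"selected block={block}, unroll={unroll} ({seconds:.6f} s)")
```

Because the cost is measured wall-clock time, the selected parameters vary from machine to machine and from run to run, which is the point of tuning on the target platform. A greedy search like this can stop at a local optimum, so a production tuner would likely use restarts or a more robust search.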