The main function of parallelizing compilers is to analyze sequential programs, in particular the loop structure, to detect hidden parallelism and automatically restructure sequential programs into parallel subtasks that are executed on a multiprocessor. This article describes the design and implementation of an efficient parallelizing compiler to parallelize loops and achieve high speedup rates on multiprocessor systems. It is well known that the execution efficiency of a loop can be enhanced if the loop is executed in parallel or partially parallel, such as in a DOALL or DOACROSS loop. This article also reviews a practical parallel loop detector (PPD) that is implemented in our PFPC on finding the parallelism in loops. The PPD can extract the potential DOALL and DOACROSS loops in a program by verifying array subscripts. In addition, a new model by using knowledge-based approach is proposed to exploit more loop parallelisms in this paper. The knowledge-based approach integrates existing loop transformations and loop scheduling algorithms to make good use of their ability to extract loop parallelisms. Two rule-based systems, called the KPLT and IPLS, are then developed using repertory grid analysis and attribute-ordering tables respectively, to construct the knowledge bases. These systems can choose an appropriate transform and loop schedule, and then apply the resulting methods to perform loop parallelization and obtain a high speedup rate. For example, the IPLS system can choose an appropriate loop schedule for running on multiprocessor systems. Finally, a runtime technique based on the inspector/executor scheme is proposed in this article for finding available parallelism on loops. Our inspector can determine the wavefronts of a loop with any complex indirected array-indexing pattern by building a DEF-USE table. The inspector is fully parallel without any synchronization. Experimental results show that the new method can resolve any complex data dependence patterns where no previous research can. One of the ultimate goals is to construct a high-performance and portable FORTRAN parallelizing compiler on shared-memory multiprocessors. We believe that our research may provide more insight into the development of a high-performance parallelizing compiler.
Concurrency and Computation: Practice and Experience 13(3):181-208