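Departmental Bulletin Paper: Development of an Automatic Parallelizing Translator for the C Language: Realization of a Code Restructuring Method Focusing on Task Granularity (C言語自動並列化トランスレータの開発 : タスク粒度に着目したコードリストラクチャリング手法の実現)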
Implementation of Parallel Code Generator under Static Execution Control and Proposal of Performance Tuning Tool for Automatic Parallelizing Translator for C Programs

近藤, 竜也  ,  甲斐, 宗徳

In our MPI-based automatic parallelizing translator for sequential C programs, the set of all statements in a block scope is defined as a compound task. In this paper, we first implemented a parallelism analyzer for the inner levels of the scope hierarchy within any compound task. Using this analyzer, we analyzed single loops and nested loops, which in general account for most of a program's total processing time. Even when a loop appears at first glance to have no parallelism, it can sometimes be restructured to expose parallelism through loop distribution, a transformation that removes the data dependencies preventing parallel execution. In addition, to further reduce the processing time of for-loops, we implemented a code restructuring method that makes efficient use of cache memory. These implementations reduced parallel processing time remarkably.
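As an illustration of loop distribution, the following is a minimal sketch in plain C and not the translator's actual output. The function names `loop_fused` and `loop_distributed` and the array names are hypothetical, and the MPI code generation is omitted. In the original loop, the statement that carries a dependence across iterations forces the whole loop body to run sequentially; after distribution, the second loop has no loop-carried dependence and its iterations could be divided among processes.

```c
#include <stddef.h>

/* Original loop (hypothetical example).
   S1 reads a[i - 1], which the previous iteration wrote, so the loop
   as a whole cannot be split across processes. */
void loop_fused(double *a, double *b, const double *c, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        a[i] = a[i - 1] + c[i];   /* S1: loop-carried dependence */
        b[i] = 2.0 * a[i];        /* S2: depends only on a[i] of this iteration */
    }
}

/* After loop distribution.
   S1 and S2 are placed in separate loops. This is legal here because
   S2 only needs a[i], and the first loop has finished writing every
   a[i] before the second loop starts. */
void loop_distributed(double *a, double *b, const double *c, size_t n)
{
    /* Still sequential: each iteration needs the previous a[i - 1]. */
    for (size_t i = 1; i < n; i++) {
        a[i] = a[i - 1] + c[i];
    }

    /* No loop-carried dependence: iterations are independent and are
       candidates for parallel execution. */
    for (size_t i = 1; i < n; i++) {
        b[i] = 2.0 * a[i];
    }
}
```

Both functions compute the same values of `a` and `b`; only the second exposes a loop whose iterations can run independently.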

