The OptiMMA project will enable the mapping of emerging, dynamic software applications on complex Multi-Processor Systems-on-Chip (MP-SoC).
During the first 2.5 years of the project we have realized a prototype tool-assisted approach that aims at optimizing applications for heterogeneous systems-on-chip. The input to the toolchain consists of a software description and a platform description. The software description is ultimately a high-level synchronous data flow model annotated with timing information. This model can either be written by hand (with timing estimates in that case) or constructed from an existing application. The platform description is a high-level description of the processing elements, memories and interconnect that make up the target hardware.
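To make the two inputs concrete, here is a minimal sketch of how a timing-annotated synchronous data flow model and a platform description could be represented. All class, field and function names (`Actor`, `Channel`, `Platform`, `chain_repetitions`) are hypothetical and chosen for illustration only; they are not the toolchain's actual input format. The helper solves the data flow balance equations for the simple case of a chain of actors.

```python
from dataclasses import dataclass
from fractions import Fraction
from math import lcm


@dataclass
class Actor:
    name: str
    exec_time_us: float  # timing annotation (measured or estimated by hand)


@dataclass
class Channel:
    src: str
    dst: str
    produce: int  # tokens written per firing of src
    consume: int  # tokens read per firing of dst


@dataclass
class ProcessingElement:
    name: str
    kind: str  # e.g. "RISC" or "VLIW"


@dataclass
class Platform:
    processing_elements: list
    memories: list  # left empty here: the text gives no memory details
    interconnect: str


def chain_repetitions(actors, channels):
    """Integer firing counts that balance every channel of a simple chain.

    Assumes the channels are listed in chain order, starting at actors[0].
    """
    rates = {actors[0].name: Fraction(1)}
    for ch in channels:
        # Balance equation: firings(src) * produce == firings(dst) * consume
        rates[ch.dst] = rates[ch.src] * ch.produce / ch.consume
    scale = lcm(*(r.denominator for r in rates.values()))
    return {name: int(r * scale) for name, r in rates.items()}


# Hypothetical three-actor application with made-up timing annotations.
actors = [Actor("a", 10.0), Actor("b", 25.0), Actor("c", 5.0)]
channels = [Channel("a", "b", produce=2, consume=3),
            Channel("b", "c", produce=1, consume=2)]
reps = chain_repetitions(actors, channels)

# Platform shaped like the one used in the case studies (names are illustrative).
platform = Platform(
    processing_elements=[ProcessingElement(f"arm{i}", "RISC") for i in range(4)]
    + [ProcessingElement(f"vliw{i}", "VLIW") for i in range(2)],
    memories=[],
    interconnect="bus",
)
```

With these rates, actor `a` fires three times and `b` twice for every firing of `c`, so each channel receives exactly as many tokens as it loses per iteration.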
The toolchain was validated on two application case studies: a 3D game engine and a medical imaging application. First, we applied it to a Wavelet Subdivision Surface (WSS) based scalable 3D graphics game engine (3D-WSS) to demonstrate the research objectives on a generic heterogeneous multiprocessor SoC. Starting from the source code of the application, we used the toolchain to map it to an extended TI-OMAP3530 multiprocessor comprising four RISC processors (StrongARM 1100x) and two VLIWs with eight functional units (FUs) each (TI-C64X+), connected by a bus. The StrongARM 1100x supports DVFS (Dynamic Voltage and Frequency Scaling) knobs with two operating points: (1.48V, 133MHz) and (1.96V, 200MHz). The TI-C64X+ also supports DVFS knobs, with operating points (1.8V, 500MHz) and (1.2V, 200MHz).
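To give a feel for what these DVFS knobs are worth, the sketch below compares the two operating points of each processor using the common first-order CMOS approximation that dynamic power scales with V²·f and dynamic energy per cycle with V². This is an assumption made for illustration: it ignores static (leakage) power and is not the energy model used by the toolchain. The function names are hypothetical.

```python
# Operating points quoted in the text: (supply voltage in V, clock in MHz).
OPERATING_POINTS = {
    "StrongARM 1100x": {"low": (1.48, 133), "high": (1.96, 200)},
    "TI-C64X+": {"low": (1.2, 200), "high": (1.8, 500)},
}


def relative_power(volts, mhz):
    """Dynamic power up to a constant factor (assumed P ~ V^2 * f)."""
    return volts ** 2 * mhz


def relative_energy_per_cycle(volts):
    """Dynamic energy per clock cycle up to a constant factor (assumed E ~ V^2)."""
    return volts ** 2


def high_to_low_ratios(points):
    """How much more the high operating point costs than the low one."""
    (v_lo, f_lo), (v_hi, f_hi) = points["low"], points["high"]
    return {
        "power": relative_power(v_hi, f_hi) / relative_power(v_lo, f_lo),
        "energy_per_cycle": relative_energy_per_cycle(v_hi)
        / relative_energy_per_cycle(v_lo),
    }


ratios = {name: high_to_low_ratios(pts) for name, pts in OPERATING_POINTS.items()}
for name, r in ratios.items():
    print(f"{name}: power x{r['power']:.2f}, energy/cycle x{r['energy_per_cycle']:.2f}")
```

Under this approximation, running the TI-C64X+ at its high operating point costs 2.25 times the dynamic energy per cycle of the low one, and the StrongARM 1100x roughly 1.75 times. This is why a mapping that can meet its deadlines at the lower operating points has room for substantial energy savings.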
Our experimental results for the game engine show energy savings from 50% up to a factor of 8 while satisfying all application constraints.
In another experiment we focused on the performance of the constraint-based optimization process and compared it to state-of-the-art design-time optimization tools. These can roughly be subdivided into two groups: industrial tools and academic tools. The former explore mappings of tasks to processing elements and make worst-case assumptions for memories and communication. The academic tools go one step further and take either memory or communication cost into account while using worst-case assumptions for the other. This contrasts with our approach, which takes all of these into account simultaneously. As software we used a medical imaging application that helps physicians detect brain tumors by extracting contours from images of the brain (the cavity detector application). We used the toolchain to perform the design-time exploration. The hardware platform was the one described above. Our evaluation shows that, compared to industrial tools, we gain from about 70% up to 4× on the performance axis, or between 25% and 70% on the energy axis. Compared to academic tools, we gain from 30% up to 2× on the performance axis, or between 5% and 25% on the energy axis. Besides producing better results, our tools also produce them faster.
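The difference between worst-case assumptions and joint optimization can be illustrated with a toy example. The sketch below exhaustively maps a chain of three tasks onto two processing elements, once ranking mappings under a worst-case communication assumption and once using the actual communication cost. All task names, execution times and costs are made up for illustration, and the brute-force search is a stand-in, not the constraint-based optimization used in the project.

```python
from itertools import product

# Hypothetical execution times (arbitrary units) of each task on each PE.
EXEC_TIME = {
    "t0": {"A": 4, "B": 5},
    "t1": {"A": 5, "B": 4},
    "t2": {"A": 4, "B": 5},
}
TASKS = ["t0", "t1", "t2"]  # executed as a chain: t0 -> t1 -> t2
PES = ["A", "B"]
COMM_COST = 3  # paid only when consecutive tasks sit on different PEs


def exec_time(mapping):
    return sum(EXEC_TIME[task][pe] for task, pe in zip(TASKS, mapping))


def actual_latency(mapping):
    """Execution time plus the communication the mapping really incurs."""
    comm = sum(COMM_COST for a, b in zip(mapping, mapping[1:]) if a != b)
    return exec_time(mapping) + comm


def worst_case_latency(mapping):
    """Execution time plus communication assumed on every edge."""
    return exec_time(mapping) + COMM_COST * (len(mapping) - 1)


candidates = list(product(PES, repeat=len(TASKS)))
worst_case_choice = min(candidates, key=worst_case_latency)
joint_choice = min(candidates, key=actual_latency)

print(worst_case_choice, actual_latency(worst_case_choice))
print(joint_choice, actual_latency(joint_choice))
```

Because the worst-case assumption adds the same communication penalty to every mapping, it cannot distinguish mappings that avoid communication from those that do not, and it simply picks the fastest processing element for each task. Taking the actual communication cost into account keeps all three tasks on one element and yields a shorter overall latency in this toy setup.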