Architecture Virtualization on Field Programmable Gate Arrays For High-Performance Computing and Productivity Improvement

Speaker:  Christophe Bobda – Gainesville, FL, United States
Topic(s):  Applied Computing


The continuously decreasing size of transistors and the ability to pack more transistors into a smaller area have led to the transition from high-clocked single processors to multicores. However, pure multicore solutions without customized computing components will not be able to provide the required performance in many computation fields. Studies predict that the increase in power consumption resulting from higher chip density will drastically limit the usage of manycore processors to a maximum of 75%.
Increasing the number of cores on a chip will not be enough to meet the performance requirements in application fields like image processing, oil and gas exploration, and programmatic financial trading, which require complex 3D convolutions of several large data arrays and impose tight requirements on memory and I/O. Heterogeneous architectures built from general-purpose processors and specialized computing components, together with intelligent methods that dynamically adapt chip resources to run-time computational and power consumption requirements, can improve the resource usage of future heterogeneous multicores. Reconfigurable logic such as FPGAs can be used for this purpose, either as part of a multiprocessor on the same die or as a separate co-processor. For such a platform to be successful, however, viable programming environments must be provided to the huge community of software developers, so that they can port existing programs or write new ones for the target platform without having to change their habits or learn special languages.

Attempts to reduce the programming burden of reconfigurable systems have been mostly limited to the development of C-like languages and compilers capable of compiling a subset of the C language, extended with special constructs to capture low-level hardware behavior.
While those languages reduce the effort of building the target hardware from a reference C implementation, as opposed to recoding it in a hardware description language, their main focus is still the generation of a piece of hardware. Designers must therefore be well aware of hardware design techniques and of issues such as signal behavior, bit manipulation, timing constraints, and resource limitations in order to produce efficient implementations. This explains the reluctance of the software community to adopt the available design tools and environments. Low-level languages such as CUDA, annotations and language extensions like Microsoft AMP, libraries such as pthreads, and new languages like OpenCL all require a good understanding of the target architecture to capture efficient designs.

In this talk, we present a novel approach, along with ongoing work, for programming heterogeneous multicore architectures with reconfigurable components. The approach consists of 1) virtualizing domain-specific, massively parallel, coarse-grained architectures on field programmable gate arrays and 2) seamlessly compiling single programs for the virtualized architecture to achieve high performance with the same effort required in software development.

About this Lecture

Number of Slides:  25
Duration:  30 minutes
Languages Available:  English
Last Updated: 
