

The proliferation of heterogeneous computing platforms presents the parallel computing community with new challenges. One such challenge entails evaluating the efficacy of such parallel architectures and identifying the architectural innovations that ultimately benefit applications. To address this challenge, we need benchmarks that capture the execution patterns (i.e., dwarfs or motifs) of applications, both present and future, in order to guide future hardware design. Furthermore, we desire a common programming model for the benchmarks that facilitates code portability across a wide variety of processors (e.g., CPU, APU, GPU, FPGA, DSP) and computing environments (e.g., embedded, mobile, desktop, server).

As such, we present the latest release of OpenDwarfs, a benchmark suite that currently realizes the Berkeley dwarfs in OpenCL, a vendor-agnostic and open-standard language for parallel computing. Using OpenDwarfs, we characterize a diverse set of modern fixed and reconfigurable parallel platforms: multi-core CPUs, discrete and integrated GPUs, the Intel Xeon Phi co-processor, and an FPGA. We describe the computation and communication patterns exposed by a representative set of dwarfs, obtain relevant profiling data and execution information, and draw conclusions that highlight the complex interplay between the dwarfs' patterns and the underlying hardware architecture of modern parallel platforms.
