Dark Silicon-Aware Design Exploration of GPU-Based Systems
Abstract
This paper proposes an infrastructure for design space exploration of platforms that combine graphics processing units (GPUs) and general-purpose cores. The goal is to mitigate dark silicon and increase system performance at design time. The GPGPU-Sim simulator has been extended to estimate dark silicon on GPU platforms and then integrated into the MultiExplorer framework. Additionally, we propose a strategy to estimate the performance of GPU platforms, and we model a database that includes both GPUs and general-purpose cores, thus enabling design space exploration for heterogeneous GP-GPU architectures.