Evaluating DTM on Superscalar Architectures Configured with Different Widths

Amarildo T. da Costa; Felipe M. G. França; Eliseu M. C. Filho

doi:10.5753/wscad.2001.19124

Amarildo T. da Costa IME / UFRJ
Felipe M. G. França UFRJ
Eliseu M. C. Filho UFRJ

Abstract

This work evaluates the exploitation of trace-level redundancy (a trace being a sequence of dynamic instructions) in superscalar processors of different widths. The redundancy present in programs was exploited through a reuse mechanism called Dynamic Trace Memoization (DTM). Simulations of processors configured with different widths and incorporating the DTM mechanism showed, for the SPEC95 programs: reuse percentages ranging from 28% to 60% (harmonic mean); performance gains ranging from 6.3% to 25% (harmonic mean); and that a superscalar processor of width 4 incorporating the DTM mechanism yields performance gains over the same base superscalar processor of width 8. This last result provides strong evidence that exploiting trace-level redundancy is a viable alternative to widening superscalar processors in order to execute more instructions per clock cycle.

Keywords: Trace Reuse, Memoization, Instruction Reuse, Superscalar Processors
