Evaluating DTM in Superscalar Architectures Configured with Different Widths

  • Amarildo T. da Costa IME / UFRJ
  • Felipe M. G. França UFRJ
  • Eliseu M. C. Filho UFRJ

Abstract


This work evaluates the exploitation of trace-level redundancy (a trace being a sequence of dynamic instructions) in superscalar processors of different widths. The redundancy present in programs was exploited through a reuse mechanism called Dynamic Trace Memoization (DTM). Simulations of processors configured with different widths and incorporating the DTM mechanism showed, for the SPEC95 benchmarks: reuse rates from 28% to 60% (harmonic mean); speedups from 6.3% to 25% (harmonic mean); and that a width-4 superscalar processor incorporating the DTM mechanism outperforms the same base superscalar processor configured with width 8. This last result strongly indicates that exploiting redundancy at the trace level is a viable alternative to widening superscalar processors in order to increase the number of instructions executed per clock cycle.

Keywords: Trace Reuse, Memoization, Instruction Reuse, Superscalar Processor

Published
2001-09-10
COSTA, Amarildo T. da; FRANÇA, Felipe M. G.; C. FILHO, Eliseu M. Avaliando DTM em Arquiteturas Superescalares Configuradas com Diferentes Larguras. In: SYMPOSIUM ON HIGH PERFORMANCE COMPUTING SYSTEMS (SSCAD), 2., 2001, Pirenópolis. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2001. p. 63-70. DOI: https://doi.org/10.5753/wscad.2001.19124.