Optimal low-latency network topologies for cluster performance enhancement

Carregando...
Imagem de Miniatura
Citações na Scopus
10
Tipo de produção
article
Data de publicação
2020
Título da Revista
ISSN da Revista
Título do Volume
Editora
SPRINGER
Autores
DENG, Yuefan
GUO, Meng
HUANG, Xiaolong
XU, Zhipeng
LIU, Weifeng
Citação
JOURNAL OF SUPERCOMPUTING, v.76, n.12, p.9558-9584, 2020
Projetos de Pesquisa
Unidades Organizacionais
Fascículo
Resumo
We propose that clusters interconnected with network topologies having minimal mean path length will increase their processing speeds. We approach our heuristic by constructing clusters of up to 32 nodes having torus, ring, Chvatal, Wagner, Bidiakis and optimal topology for minimal mean path length and by simulating the performance of 256 nodes clusters with the same network topologies. The optimal (or near-optimal) low-latency network topologies are found by minimizing the mean path length of regular graphs. The selected topologies are benchmarked using ping-pong messaging, the MPI collective communications and the standard parallel applications including effective bandwidth, FFTE, Graph 500 and NAS parallel benchmarks. We established strong correlations between the clusters' performances and the network topologies, especially the mean path lengths, for a wide range of applications. In communication-intensive benchmarks, optimal graphs enabled network topologies with multifold performance enhancement in comparison with mainstream graphs. It is striking that mere adjustment of the network topology suffices to reclaim performance from the same computing hardware.
Palavras-chave
Network topology, Graph theory, Latency, Benchmarks
Referências
  1. Abd-El-Barr M, 2011, J ELECTR COMPUT ENG, V2011, DOI 10.1155/2011/189434
  2. Adiga NR, 2005, IBM J RES DEV, V49, P265, DOI 10.1147/rd.492.0265
  3. Ajima Y, 2009, COMPUTER, V42, P36, DOI 10.1109/MC.2009.370
  4. Alverson R, 2010, 2010 18 IEEE S HIGH, DOI [10.1109/hoti.2010.23, DOI 10.1109/H0TI.2010.23]
  5. [Anonymous], 2019, EFFECTIVE BANDWIDTH
  6. [Anonymous], 2019, NPB NAS PARALLEL BEN
  7. [Anonymous], 2019, TOP 500 SUPERCOMPUTE
  8. [Anonymous], 2019, FFTE FAST FOURIER TR
  9. [Anonymous], 2008, IBM J RES DEV, V52, P199, DOI 10.1147/RD.521.0199
  10. [Anonymous], 2019, HPC CHALLENGE BENCHM
  11. Ardagna D, 2018, P 2018 ACM SPEC INT, P192, DOI [10.1145/3184407.3184420, DOI 10.1145/3184407.3184420]
  12. Awal MR, 2015, J CIRCUIT SYST COMP, V24, DOI 10.1142/S021812661540006X
  13. Bailey D. H., 1992, Proceedings. Supercomputing '92. (Cat. No.92CH3216-9), P386, DOI 10.1109/SUPERC.1992.236665
  14. BAILEY DH, 1991, INT J SUPERCOMPUT AP, V5, P63, DOI 10.1177/109434209100500306
  15. Barriere L, 2009, DISCRETE MATH, V309, P3871, DOI 10.1016/j.disc.2008.10.028
  16. Barriere L, 2009, DISCRETE APPL MATH, V157, P36, DOI 10.1016/j.dam.2008.04.018
  17. Garzon DB, 2012, LECT NOTES COMPUT SC, V7484, P716, DOI 10.1007/978-3-642-32820-6_71
  18. Besta M, 2014, INT CONF HIGH PERFOR, P348, DOI 10.1109/SC.2014.34
  19. Bondy J., 1976, GRAPH THEORY APPL
  20. Brightwell R, 2006, IEEE MICRO, V26, P41, DOI 10.1109/MM.2006.65
  21. Brinkmann G, 2017, J GRAPH THEOR, V86, P255, DOI 10.1002/jgt.22125
  22. Brinkmann G, 2013, DISCRETE APPL MATH, V161, P311, DOI 10.1016/j.dam.2012.07.018
  23. Brinkmann G, 2011, DISCRETE MATH THEOR, V13, P69
  24. Casanova H, 2014, J PARALLEL DISTR COM, V74, P2899, DOI 10.1016/j.jpdc.2014.06.008
  25. Cerf V. G., 1974, Networks, V4, P335, DOI 10.1002/net.3230040405
  26. Cerf V. G., 1975, COMBINATORIAL MATH, P1, DOI 10.1007/BFB0069540
  27. Chen D., 2011, P 2011 INT C HIGH PE, P1
  28. Dally W., 2003, PRINCIPLES PRACTICES
  29. DALLY WJ, 1990, IEEE T COMPUT, V39, P775, DOI 10.1109/12.53599
  30. DALLY WJ, 1991, IEEE T COMPUT, V40, P1016, DOI 10.1109/12.83652
  31. Day K, 1997, IEEE T PARALL DISTR, V8, P109, DOI 10.1109/71.577251
  32. Deng YF, 2012, INT J MOD PHYS B, V26, DOI 10.1142/S021797921250169X
  33. Domke J, 2019, PROCEEDINGS OF SC19: THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS, DOI 10.1145/3295500.3356140
  34. EFE K, 1991, IEEE T COMPUT, V40, P1312, DOI 10.1109/12.102840
  35. Faanes G., 2012, HIGH PERF COMP NETW, P1, DOI [10.1109/SC.2012.39, DOI 10.1109/SC.2012.39]
  36. FLOYD RW, 1962, COMMUN ACM, V5, P345, DOI 10.1145/367766.368168
  37. Foroutan S, 2010, DES AUT TEST EUROPE, P1629
  38. Freund R., 2010, STAT METHODS
  39. Fu HH, 2016, SCI CHINA INFORM SCI, V59, DOI 10.1007/s11432-016-5588-7
  40. Gupta A. K., 2006, IEEE Computer Architecture Letters, V5, P10, DOI 10.1109/L-CA.2006.8
  41. HARARY F, 1988, COMPUT MATH APPL, V15, P277, DOI 10.1016/0898-1221(88)90213-1
  42. Harwoodt A, 1998, INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED PROCESSING TECHNIQUES AND APPLICATIONS, VOLS I-IV, PROCEEDINGS, P682
  43. HAYES JP, 1989, P IEEE, V77, P1829, DOI 10.1109/5.48826
  44. Hill MD, 1999, READINGS COMPUTER AR
  45. InfiniBand @Trade Association, 2016, INF ACRCH SPEC REL
  46. Inoguchi Y, 1997, HIGH PERFORMANCE COMPUTING ON THE INFORMATION SUPERHIGHWAY - HPC ASIA '97, PROCEEDINGS, P61, DOI 10.1109/HPC.1997.592123
  47. Jan GE, 2004, J INF SCI ENG, V20, P1213
  48. Kim J, 2008, CONF PROC INT SYMP C, P77, DOI 10.1109/ISCA.2008.19
  49. Kitasuka T, 2016, 2016 TENTH IEEE/ACM INTERNATIONAL SYMPOSIUM ON NETWORKS-ON-CHIP (NOCS)
  50. Koniges A, 2001, P 15 INT PAR DISTR P, DOI [10.1109/ipdps.2001.925208, DOI 10.1109/IPDPS.2001.925208]
  51. LEISERSON CE, 1985, IEEE T COMPUT, V34, P892, DOI 10.1109/TC.1985.6312192
  52. Lenzen C, 2016, ARXIV160700298V1
  53. Liao XK, 2015, J COMPUT SCI TECH-CH, V30, P259, DOI 10.1007/s11390-015-1520-7
  54. Liu V., 2013, P 10 USENIX C NETW S, P399
  55. Liu YJ, 2014, ACM SIGCOMM COMP COM, V44, P283, DOI [10.1145/2619239.2626332, 10.1145/2740070.2626332]
  56. Luszczek Piotr R., 2006, P 2006 ACM IEEE C SU, P213
  57. Matsutani H, 2009, INT S HIGH PERF COMP, P367, DOI 10.1109/HPCA.2009.4798274
  58. Meringer M, 1999, J GRAPH THEOR, V30, P137, DOI 10.1002/(SICI)1097-0118(199902)30:2<137::AID-JGT7>3.0.CO;2-G
  59. Mizuno R, 2016, 2016 TENTH IEEE/ACM INTERNATIONAL SYMPOSIUM ON NETWORKS-ON-CHIP (NOCS)
  60. Moore G. E., 1965, ELECTRONICS, V38, P114, DOI 10.1109/JPROC.1998.658762
  61. Murphy R.C., 2010, CRAY USERS GROUP, V19, P45
  62. Nakao M, 2019, PROCEEDINGS OF INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING IN ASIA-PACIFIC REGION (HPC ASIA 2019), P128, DOI 10.1145/3293320.3293325
  63. Sabino AU, 2018, INT J MOD PHYS C, V29, DOI 10.1142/S0129183118500481
  64. Sanders Peter, 2013, Experimental Algorithms 12th International Symposium, SEA 2013. Proceedings, P164, DOI 10.1007/978-3-642-38527-8_16
  65. Scott S. L., 1996, CRAY T3E NETWORK ADA
  66. Seo JH, 2018, J SUPERCOMPUT, V74, P1636, DOI 10.1007/s11227-017-2186-4
  67. Shimizu N, 2016, 2016 TENTH IEEE/ACM INTERNATIONAL SYMPOSIUM ON NETWORKS-ON-CHIP (NOCS)
  68. Takahashi D, 2002, LECT NOTES COMPUT SC, V2367, P380
  69. Takahashi D, 2000, J SUPERCOMPUT, V15, P207, DOI 10.1023/A:1008160021085
  70. Wang S, 2019, IEEE INFOCOM SER, P1729, DOI 10.1109/INFOCOM.2019.8737595
  71. Weisstein EW, 2018, BIDIAKIS CUBE
  72. Xu J., 2013, TOPOLOGICAL STRUCTUR
  73. Xu ZP, 2019, MATHEMATICS-BASEL, V7, DOI 10.3390/math7121214
  74. Yang YL, 2001, IEEE T PARALL DISTR, V12, P701, DOI 10.1109/71.940745
  75. Zhang P, 2015, IEEE T PARALL DISTR, V26, P984, DOI 10.1109/TPDS.2014.2315201
  76. Zhang P, 2011, IEEE T PARALL DISTR, V22, P287, DOI 10.1109/TPDS.2010.89