Benchmarking on Tianhe-2!

On the TH-2G cluster, run the HPL and HPCG benchmarks using either 4 nodes (CPU only) or 2 nodes (GPU, 2 machines with 8 cards in total).

Basic requirements

Run the benchmarks on CPU; any method may be tried to push the score as high as possible.

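I already did this once over the summer, so this time it is fairly familiar territory~ Back then I ran it both locally and on the kn103 node we were given, so this time I am doing the task a different way: with slurm batch scripts, instead of requesting nodes and ssh-ing in by hand (honestly, I am just lazy).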

First, grab the CPU version of the benchmark files straight from the Intel compiler directory.

cp -r /BIGDATA1/app/intelcompiler/18.0.0/compilers_and_libraries_2018.0.128/linux/mkl/benchmarks ~/WuK

HPL

cd ~/WuK/benchmarks/mp_linpack
sbatch mp_linpack.slurm

mp_linpack.slurm

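This is the batch script that got this benchmark running. It requests 4 nodes and uses 20 cores on each, 80 cores in total, with message passing (MPI) between nodes and shared memory (OpenMP) within a node.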

#!/bin/bash
#SBATCH -J WuK_mp_linpack              # job name
#SBATCH -N 4                     # request 4 nodes
#SBATCH --ntasks-per-node=1     # 1 process per node
#SBATCH --cpus-per-task=20        # 20 cores per process
module load intelcompiler/18.0.0  # load the intelcompiler/18.0.0 module
export I_MPI_FABRICS=shm:ofa      # shared memory within a node, OFA fabric between nodes
export OMP_NUM_THREADS=20         # 20 OpenMP threads per process
mpirun ./xhpl_intel64_static

HPL.dat

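There is a rather interesting website that suggests basic Linpack parameters from your platform's performance specs. The table below shows what I entered, based on information I found on the platform and online (see the appendix); hopefully I got it right.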

Number of Nodes              4
Cores Per Node               20
Speed Per Core (GHz)         2.6
Memory Per Node (GB)         256
Instructions Per Cycle       10

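Following the website's suggestions (90% is usually the turning point; 192 is the block size Intel officially recommends, and trying 256 and 128 gave no noticeable improvement), plus some black-magic tuning and everyday superstition (read: messing around), the parameters currently look like the mess below. (It turns out that parameters other than N and NB have almost no effect, and some even lower the score slightly.)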

HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any)
6            device out (6=stdout,7=stderr,file)
1            # of problems sizes (N)
333504       Ns
1            # of NBs
192          NBs
1            PMAP process mapping (0=Row-,1=Column-major)
1            # of process grids (P x Q)
2            Ps
2            Qs
16.0         threshold
1            # of panel fact
2            PFACTs (0=left, 1=Crout, 2=Right)
1            # of recursive stopping criterium
2            NBMINs (>= 1)
1            # of panels in recursion
2            NDIVs
1            # of recursive panel fact.
1            RFACTs (0=left, 1=Crout, 2=Right)
1            # of broadcast
0            BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1            # of lookahead depth
0            DEPTHs (>=0)
0            SWAP (0=bin-exch,1=long,2=mix)
1            swapping threshold
1            L1 in (0=transposed,1=no-transposed) form
1            U  in (0=transposed,1=no-transposed) form
0            Equilibration (0=no,1=yes)
8            memory alignment in double (> 0)
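
The N in the file above can be reproduced from the table values. This is a minimal sketch, assuming the 90% factor scales the matrix order itself (not the memory footprint) and that "GB" means GiB; `hpl_problem_size` is a hypothetical helper name, not something provided by HPL or by the website.

```python
import math

def hpl_problem_size(nodes, mem_per_node_gib, fraction, nb):
    """Largest multiple of nb not exceeding fraction * sqrt(total_bytes / 8)."""
    total_bytes = nodes * mem_per_node_gib * 1024**3
    # An N x N matrix of doubles needs 8 * N^2 bytes.
    n_max = math.sqrt(total_bytes / 8)
    # Assumption: the fraction is applied to N, which matches the N used here.
    n = int(n_max * fraction)
    # Round down so that N is a whole number of NB-sized blocks.
    return n - n % nb

print(hpl_problem_size(nodes=4, mem_per_node_gib=256, fraction=0.9, nb=192))
```

With the table values this yields 333504, the N used in HPL.dat. Under this reading the matrix itself occupies roughly 81% of the total memory, since the footprint grows with the square of N.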

slurm-87953.out

P x Q is set to 2 x 2 for now; still to be tuned.

Number of Intel(R) Xeon Phi(TM) coprocessors : 0
================================================================================
HPLinpack 2.1  --  High-Performance Linpack benchmark  --   October 26, 2012
Written by A. Petitet and R. Clint Whaley,  Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N        :  333504
NB       :     192
PMAP     : Column-major process mapping
P        :       2
Q        :       2
PFACT    :   Right
NBMIN    :       2
NDIV     :       2
RFACT    :   Crout
BCAST    :   1ring
DEPTH    :       0
SWAP     : Binary-exchange
L1       : no-transposed form
U        : no-transposed form
EQUIL    : no
ALIGN    :    8 double precision words

--------------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
      ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be               1.110223e-16
- Computational tests pass if scaled residuals are less than                16.0

gn18            : Column=001728 Fraction=0.005 Kernel=   15.47 Mflops=3001133.12
gn15            : Column=003456 Fraction=0.010 Kernel=2697627.07 Mflops=2849905.94
gn18            : Column=005184 Fraction=0.015 Kernel=2680946.82 Mflops=2776800.09
gn18            : Column=006720 Fraction=0.020 Kernel=2683301.96 Mflops=2755200.37
gn15            : Column=008448 Fraction=0.025 Kernel=2689752.86 Mflops=2750017.58
gn18            : Column=010176 Fraction=0.030 Kernel=2668387.20 Mflops=2728324.63
gn18            : Column=011712 Fraction=0.035 Kernel=2665144.34 Mflops=2720129.21
gn15            : Column=013440 Fraction=0.040 Kernel=2668177.00 Mflops=2719690.23
gn18            : Column=015168 Fraction=0.045 Kernel=2675456.13 Mflops=2709500.12
gn18            : Column=016704 Fraction=0.050 Kernel=2678187.17 Mflops=2706724.86
gn15            : Column=018432 Fraction=0.055 Kernel=2675676.97 Mflops=2709560.97
gn18            : Column=020160 Fraction=0.060 Kernel=2689596.63 Mflops=2704764.05
gn18            : Column=021696 Fraction=0.065 Kernel=2676089.51 Mflops=2702840.53
gn15            : Column=023424 Fraction=0.070 Kernel=2693546.27 Mflops=2705320.74
gn18            : Column=025152 Fraction=0.075 Kernel=2684329.99 Mflops=2700866.65
gn18            : Column=026688 Fraction=0.080 Kernel=2690321.44 Mflops=2700304.72
gn15            : Column=028416 Fraction=0.085 Kernel=2689399.78 Mflops=2702316.51
gn18            : Column=030144 Fraction=0.090 Kernel=2693488.67 Mflops=2698741.51
gn15            : Column=031872 Fraction=0.095 Kernel=2706411.31 Mflops=2700693.54
gn15            : Column=033408 Fraction=0.100 Kernel=2677795.37 Mflops=2699737.11
gn18            : Column=035136 Fraction=0.105 Kernel=2683891.43 Mflops=2696954.27
gn15            : Column=036864 Fraction=0.110 Kernel=2678046.17 Mflops=2698265.16
gn15            : Column=038400 Fraction=0.115 Kernel=2668031.99 Mflops=2697183.15
gn18            : Column=040128 Fraction=0.120 Kernel=2686918.51 Mflops=2695140.71
gn15            : Column=041856 Fraction=0.125 Kernel=2698545.39 Mflops=2697154.17
gn15            : Column=043392 Fraction=0.130 Kernel=2673540.01 Mflops=2696421.67
gn18            : Column=045120 Fraction=0.135 Kernel=2686818.09 Mflops=2694269.93
gn15            : Column=046848 Fraction=0.140 Kernel=2694293.54 Mflops=2696352.96
gn15            : Column=048384 Fraction=0.145 Kernel=2666305.02 Mflops=2695530.81
gn18            : Column=050112 Fraction=0.150 Kernel=2697819.85 Mflops=2694513.93
gn15            : Column=051840 Fraction=0.155 Kernel=2692854.56 Mflops=2695696.51
gn15            : Column=053376 Fraction=0.160 Kernel=2670024.37 Mflops=2695073.05
gn18            : Column=055104 Fraction=0.165 Kernel=2688153.70 Mflops=2693835.83
gn15            : Column=056832 Fraction=0.170 Kernel=2698158.39 Mflops=2695265.40
gn15            : Column=058368 Fraction=0.175 Kernel=2668135.35 Mflops=2694674.53
gn18            : Column=060096 Fraction=0.180 Kernel=2692667.76 Mflops=2693470.39
gn15            : Column=061824 Fraction=0.185 Kernel=2693972.28 Mflops=2694591.94
gn18            : Column=063552 Fraction=0.190 Kernel=2674871.07 Mflops=2693040.96
gn18            : Column=065088 Fraction=0.195 Kernel=2695473.54 Mflops=2693086.66
gn15            : Column=066816 Fraction=0.200 Kernel=2679727.10 Mflops=2693618.73
gn18            : Column=068544 Fraction=0.205 Kernel=2691928.47 Mflops=2692825.88
gn18            : Column=070080 Fraction=0.210 Kernel=2679372.67 Mflops=2692594.77
gn15            : Column=071808 Fraction=0.215 Kernel=2687959.38 Mflops=2693474.93
gn18            : Column=073536 Fraction=0.220 Kernel=2674865.13 Mflops=2691893.43
gn18            : Column=075072 Fraction=0.225 Kernel=2682160.07 Mflops=2691740.96
gn15            : Column=076800 Fraction=0.230 Kernel=2675731.88 Mflops=2692658.60
gn18            : Column=078528 Fraction=0.235 Kernel=2681156.27 Mflops=2691417.71
gn18            : Column=080064 Fraction=0.240 Kernel=2680599.16 Mflops=2691262.33
gn15            : Column=081792 Fraction=0.245 Kernel=2650965.13 Mflops=2691752.17
gn18            : Column=083520 Fraction=0.250 Kernel=2673149.84 Mflops=2690871.80
gn18            : Column=085056 Fraction=0.255 Kernel=2693006.87 Mflops=2690899.86
gn15            : Column=086784 Fraction=0.260 Kernel=2685961.25 Mflops=2691756.77
gn18            : Column=088512 Fraction=0.265 Kernel=2682420.47 Mflops=2690554.23
gn18            : Column=090048 Fraction=0.270 Kernel=2686986.59 Mflops=2690510.91
gn15            : Column=091776 Fraction=0.275 Kernel=2691305.05 Mflops=2691201.50
gn18            : Column=093504 Fraction=0.280 Kernel=2687603.70 Mflops=2690184.44
gn15            : Column=095232 Fraction=0.285 Kernel=2657215.82 Mflops=2690620.17
gn15            : Column=096768 Fraction=0.290 Kernel=2684556.69 Mflops=2690553.88
gn18            : Column=098496 Fraction=0.295 Kernel=2683701.23 Mflops=2689729.28
gn15            : Column=100224 Fraction=0.300 Kernel=2684444.67 Mflops=2690447.72
gn15            : Column=101760 Fraction=0.305 Kernel=2634969.03 Mflops=2689875.32
gn18            : Column=103488 Fraction=0.310 Kernel=2694567.26 Mflops=2689306.15
gn15            : Column=105216 Fraction=0.315 Kernel=2688455.94 Mflops=2689926.54
gn15            : Column=106752 Fraction=0.320 Kernel=2655613.71 Mflops=2689600.63
gn18            : Column=108480 Fraction=0.325 Kernel=2681805.49 Mflops=2688772.27
gn15            : Column=110208 Fraction=0.330 Kernel=2653358.42 Mflops=2689142.12
gn15            : Column=111744 Fraction=0.335 Kernel=2682697.89 Mflops=2689085.83
gn18            : Column=113472 Fraction=0.340 Kernel=2693037.90 Mflops=2688654.23
gn15            : Column=115200 Fraction=0.345 Kernel=2690310.80 Mflops=2689146.65
gn15            : Column=116736 Fraction=0.350 Kernel=2694534.26 Mflops=2689190.22
gn18            : Column=118464 Fraction=0.355 Kernel=2683886.73 Mflops=2688674.60
gn15            : Column=120192 Fraction=0.360 Kernel=2693565.06 Mflops=2689079.10
gn18            : Column=121920 Fraction=0.365 Kernel=2680499.63 Mflops=2688652.16
gn18            : Column=123456 Fraction=0.370 Kernel=2680328.96 Mflops=2688590.72
gn15            : Column=125184 Fraction=0.375 Kernel=2685894.98 Mflops=2689106.38
gn18            : Column=126912 Fraction=0.380 Kernel=2683916.08 Mflops=2688457.15
gn18            : Column=128448 Fraction=0.385 Kernel=2690861.11 Mflops=2688473.62
gn15            : Column=130176 Fraction=0.390 Kernel=2693917.35 Mflops=2688833.56
gn18            : Column=131904 Fraction=0.395 Kernel=2690119.23 Mflops=2688437.49
gn18            : Column=133440 Fraction=0.400 Kernel=2692459.06 Mflops=2688463.15
gn15            : Column=135168 Fraction=0.405 Kernel=2686365.94 Mflops=2688820.74
gn18            : Column=136896 Fraction=0.410 Kernel=2679060.99 Mflops=2688261.25
gn18            : Column=138432 Fraction=0.415 Kernel=2682947.32 Mflops=2688229.53
gn15            : Column=140160 Fraction=0.420 Kernel=2680967.19 Mflops=2688515.42
gn18            : Column=141888 Fraction=0.425 Kernel=2660655.87 Mflops=2688020.76
gn18            : Column=143424 Fraction=0.430 Kernel=2689113.63 Mflops=2688026.83
gn15            : Column=145152 Fraction=0.435 Kernel=2676167.79 Mflops=2688359.21
gn18            : Column=146880 Fraction=0.440 Kernel=2681651.04 Mflops=2687964.37
gn18            : Column=148416 Fraction=0.445 Kernel=2689010.05 Mflops=2687969.78
gn15            : Column=150144 Fraction=0.450 Kernel=2691793.54 Mflops=2688134.22
gn18            : Column=151872 Fraction=0.455 Kernel=2692663.10 Mflops=2687925.02
gn15            : Column=153600 Fraction=0.460 Kernel=2681482.76 Mflops=2688104.87
gn15            : Column=155136 Fraction=0.465 Kernel=2683047.94 Mflops=2688081.02
gn18            : Column=156864 Fraction=0.470 Kernel=2669306.06 Mflops=2687641.68
gn15            : Column=158592 Fraction=0.475 Kernel=2653914.84 Mflops=2687847.25
gn15            : Column=160128 Fraction=0.480 Kernel=2634178.69 Mflops=2687607.25
gn18            : Column=161856 Fraction=0.485 Kernel=2691084.23 Mflops=2687456.28
gn15            : Column=163584 Fraction=0.490 Kernel=2698385.42 Mflops=2687703.49
gn15            : Column=165120 Fraction=0.495 Kernel=2690717.72 Mflops=2687715.77
gn18            : Column=171840 Fraction=0.515 Kernel=2667654.04 Mflops=2687029.93
gn15            : Column=178560 Fraction=0.535 Kernel=2671204.00 Mflops=2687074.05
gn18            : Column=185280 Fraction=0.555 Kernel=2664279.33 Mflops=2686514.67
gn18            : Column=191808 Fraction=0.575 Kernel=2669457.10 Mflops=2686308.39
gn15            : Column=198528 Fraction=0.595 Kernel=2668700.16 Mflops=2686260.99
gn18            : Column=205248 Fraction=0.615 Kernel=2660231.44 Mflops=2685861.59
gn18            : Column=211776 Fraction=0.635 Kernel=2651166.64 Mflops=2685556.82
gn15            : Column=218496 Fraction=0.655 Kernel=2635120.20 Mflops=2685350.17
gn18            : Column=225216 Fraction=0.675 Kernel=2648355.77 Mflops=2685015.50
gn15            : Column=231936 Fraction=0.695 Kernel=2651497.77 Mflops=2684880.93
gn18            : Column=265152 Fraction=0.795 Kernel=2621334.53 Mflops=2683484.84
gn18            : Column=298560 Fraction=0.895 Kernel=2566484.19 Mflops=2682571.65
gn18            : Column=331968 Fraction=0.995 Kernel=2132550.46 Mflops=2681776.08
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WC00C2R2      333504   192     2     2            9223.72            2.68107e+03
HPL_pdgesv() start time Thu Nov 14 14:13:51 2019

HPL_pdgesv() end time   Thu Nov 14 16:47:34 2019

--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0032045 ...... PASSED
================================================================================

Finished      1 tests with the following results:
              1 tests completed and passed residual checks,
              0 tests completed and failed residual checks,
              0 tests skipped because of illegal input values.
--------------------------------------------------------------------------------

End of Tests.
================================================================================
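
As a sanity check on the result above, the reported rate can be recomputed from N and the wall time, then compared with a theoretical peak. This is a rough sketch: the operation count 2/3 N^3 + 3/2 N^2 is the one HPL uses for its Gflops figure, the first peak uses the table values as entered, and the 16 flops/cycle figure is an assumption (typical for AVX2 with FMA), not something stated in this post.

```python
def hpl_flops(n):
    """Operation count HPL credits for an order-n solve: 2/3 n^3 + 3/2 n^2."""
    return (2.0 / 3.0) * n**3 + 1.5 * n**2

# Values copied from the run output above.
n, seconds, reported_gflops = 333504, 9223.72, 2681.07
gflops = hpl_flops(n) / seconds / 1e9

# Peak from the table: nodes * cores * GHz * flops per cycle.
peak_table = 4 * 20 * 2.6 * 10
# Assumption (not from the post): 16 double-precision flops per cycle.
peak_avx2 = 4 * 20 * 2.6 * 16

print(f"recomputed: {gflops:.2f} GFLOPS")
print(f"vs table peak: {reported_gflops / peak_table:.1%}")
print(f"vs 16 flops/cycle peak: {reported_gflops / peak_avx2:.1%}")
```

The ratio against the table-based peak comes out above 100%, which suggests that the "Instructions Per Cycle" entry of 10 understates what these cores can do; under the 16 flops/cycle assumption the efficiency lands at around 80%.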

HPCG

cd ~/WuK/benchmarks/hpcg
cp setup/Make.OPENMPI_IOMP_AVX2 setup/Make.WuK
sbatch hpcg.slurm

hpcg.slurm

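Unlike the previous benchmark, here two processes per node with ten threads each works better; otherwise the result drops to 45 GFLOP/s. My guess is that in HPCG, sharing memory across the different CPUs within a node is very inefficient.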

#!/bin/bash
#SBATCH -J WuK_hpcg              # job name
#SBATCH -N 4                     # request 4 nodes
#SBATCH --ntasks-per-node=2     # 2 processes per node
#SBATCH --cpus-per-task=10        # 10 cores per process
module load intelcompiler/18.0.0
module load MPI/impi/2018.0.128
export OMP_NUM_THREADS=10         # 10 OpenMP threads per process
export I_MPI_FABRICS=shm:ofa      # shared memory within a node, OFA fabric between nodes
make clean
./configure IMPI_IOMP_AVX2
make -j
cd bin
mpirun ./xhpcg_avx2

hpcg.dat

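Note that a standard HPCG run must last at least half an hour, so the fourth line must be set to at least 1800. According to the README, all three problem dimensions must be multiples of 8 and at least 24.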

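(I peeked at @zzc's parameters: the 40 40 40 configuration reached 72 GFLOP/s for him, while with the same parameters I only get 67 GFLOP/s. Still to be investigated…)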

HPCG benchmark input file
Sandia National Laboratories; University of Tennessee, Knoxville
48 40 40
1800
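
The rules quoted above are easy to check mechanically before submitting a long job. This is a small sketch, assuming the four-line layout shown here; `parse_hpcg_dat` and `check_hpcg_dat` are hypothetical helper names, and the 2 x 2 x 2 process grid is taken from the YAML output of the 8-rank run.

```python
def parse_hpcg_dat(text):
    """Return ((nx, ny, nz), seconds) from the 3rd and 4th lines of hpcg.dat."""
    lines = text.strip().splitlines()
    nx, ny, nz = (int(tok) for tok in lines[2].split())
    seconds = int(lines[3].split()[0])
    return (nx, ny, nz), seconds

def check_hpcg_dat(dims, seconds):
    """Collect violations of the rules quoted above."""
    problems = []
    for name, d in zip("xyz", dims):
        if d % 8 != 0 or d < 24:
            problems.append(f"n{name}={d} must be a multiple of 8 and >= 24")
    if seconds < 1800:
        problems.append(f"run time {seconds}s is below 1800s")
    return problems

HPCG_DAT = """HPCG benchmark input file
Sandia National Laboratories; University of Tennessee, Knoxville
48 40 40
1800
"""

dims, seconds = parse_hpcg_dat(HPCG_DAT)
print(check_hpcg_dat(dims, seconds))  # empty list when all rules hold

# Global grid for a 2 x 2 x 2 process layout (8 ranks).
proc_grid = (2, 2, 2)
global_dims = tuple(d * p for d, p in zip(dims, proc_grid))
print(global_dims, global_dims[0] * global_dims[1] * global_dims[2])
```

For the file above the check passes, and the global grid is 96 x 80 x 80 with 614400 equations, which agrees with the "Global Problem Dimensions" and "Number of Equations" entries in the YAML output.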

n48-8p-10t-V3.0_2019.11.11.08.23.15.yaml

n48-8p-10t version: V3.0
Release date: November 11, 2015
Machine Summary:
  Distributed Processes: 8
  Threads per processes: 10
Global Problem Dimensions:
  Global nx: 96
  Global ny: 80
  Global nz: 80
Processor Dimensions:
  npx: 2
  npy: 2
  npz: 2
Local Domain Dimensions:
  nx: 48
  ny: 40
  nz: 40
########## Problem Summary  ##########:
Setup Information:
  Setup Time: 0.042238
Linear System Information:
  Number of Equations: 614400
  Number of Nonzero Terms: 16200184
Multigrid Information:
  Number of coarse grid levels: 3
  Coarse Grids:
    Grid Level: 1
    Number of Equations: 76800
    Number of Nonzero Terms: 1977208
    Number of Presmoother Steps: 1
    Number of Postsmoother Steps: 1
    Grid Level: 2
    Number of Equations: 9600
    Number of Nonzero Terms: 235480
    Number of Presmoother Steps: 1
    Number of Postsmoother Steps: 1
    Grid Level: 3
    Number of Equations: 1200
    Number of Nonzero Terms: 26656
    Number of Presmoother Steps: 1
    Number of Postsmoother Steps: 1
########## Memory Use Summary  ##########:
Memory Use Information:
  Total memory used for data (Gbytes): 0.440086
  Memory used for OptimizeProblem data (Gbytes): 0
  Bytes per equation (Total memory / Number of Equations): 716.285
  Memory used for linear system and CG (Gbytes): 0.38717
  Coarse Grids:
    Grid Level: 1
    Memory used: 0.0463569
    Grid Level: 2
    Memory used: 0.00582238
    Grid Level: 3
    Memory used: 0.00073604
########## V&V Testing Summary  ##########:
Spectral Convergence Tests:
  Result: PASSED
  Unpreconditioned:
    Maximum iteration count: 11
    Expected iteration count: 12
  Preconditioned:
    Maximum iteration count: 2
    Expected iteration count: 2
Departure from Symmetry |x'Ay-y'Ax|/(2*||x||*||A||*||y||)/epsilon:
  Result: PASSED
  Departure for SpMV: 2.45495e-09
  Departure for MG: 0
########## Iterations Summary  ##########:
Iteration Count Information:
  Result: PASSED
  Reference CG iterations per set: 50
  Optimized CG iterations per set: 50
  Total number of reference iterations: 50
  Total number of optimized iterations: 50
########## Reproducibility Summary  ##########:
Reproducibility Information:
  Result: PASSED
  Scaled residual mean: 3.83307e-08
  Scaled residual variance: 0
########## Performance Summary (times in sec) ##########:
Benchmark Time Summary:
  Optimization phase: 0.0375197
  DDOT: 0.000533819
  WAXPBY: 0.00136781
  SpMV: 0.0266542
  MG: 0.116045
  ALL_reduce: 0.00831723
  Total: 0.152972
Floating Point Operations Summary:
  Raw DDOT: 1.85549e+08
  Raw WAXPBY: 1.85549e+08
  Raw SpMV: 1.65242e+09
  Raw MG: 9.21177e+09
  Total: 1.12353e+10
  Total with convergence overhead: 1.12353e+10
GB/s Summary:
  Raw Read B/W: 452.671
  Raw Write B/W: 104.62
  Raw Total B/W: 557.291
  Total with convergence and optimization phase overhead: 529.675
GFLOP/s Summary:
  Raw DDOT: 347.587
  Raw WAXPBY: 135.654
  Raw SpMV: 61.9946
  Raw MG: 79.381
  Raw Total: 73.4467
  Total with convergence overhead: 73.4467
  Total with convergence and optimization phase overhead: 69.807
User Optimization Overheads:
  Problem setup time (sec): 0.042238
  Optimization phase time (sec): 0.0375197
  Optimization phase time vs reference SpMV+MG time: 2.04868
DDOT Timing Variations:
  Min DDOT MPI_Allreduce time: 0.00318003
  Max DDOT MPI_Allreduce time: 0.00888991
  Avg DDOT MPI_Allreduce time: 0.00730839
__________ Final Summary __________:
  HPCG result is VALID with a GFLOP/s rating of: 69.807
      HPCG 2.4 Rating (for historical value) is: 71.6884
  Reference version of ComputeDotProduct used: Performance results are most likely suboptimal
  Results are valid but execution time (sec) is: 0.152972
       You have selected the QuickPath option: Results are official for legacy installed systems with confirmation from the HPCG Benchmark leaders.
       After confirmation please upload results from the YAML file contents to: http://hpcg-benchmark.org
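
The headline numbers in this YAML can be cross-checked with a few lines of arithmetic. The sketch below only uses values copied from the output above plus the CPU HPL result from the earlier section; it recomputes the raw total rate and shows how small the HPCG rating is relative to HPL on the same four nodes.

```python
# Values copied from the YAML above.
raw_flops = {
    "DDOT": 1.85549e8,
    "WAXPBY": 1.85549e8,
    "SpMV": 1.65242e9,
    "MG": 9.21177e9,
}
total_time = 0.152972   # "Total" under Benchmark Time Summary, in seconds
hpcg_rating = 69.807    # final GFLOP/s rating
hpl_gflops = 2681.07    # CPU HPL result from the earlier section

total_flops = sum(raw_flops.values())
raw_gflops = total_flops / total_time / 1e9

print(f"raw total: {raw_gflops:.2f} GFLOP/s")
print(f"HPCG / HPL: {hpcg_rating / hpl_gflops:.2%}")
```

The recomputed raw total agrees with the 73.4467 GFLOP/s reported in the file, and the final rating is under 3% of the HPL figure. Note that the timed portion of this particular run lasted only about 0.15 s, which is why the file mentions the QuickPath option.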

Advanced requirements

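(Optional, for those with spare capacity or interest; grades still matter more either way -_-) You may run the benchmarks on GPU. This task is somewhat difficult (and likewise, any method may be tried to push the score as high as possible).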

HPL

cd ~/WuK/wyf_benchmarks/gpu/Day1-HPL/HPL
sbatch benchmark.slurm

benchmark.slurm

Starting from the original myrun, I swapped MPI/openmpi/1.10.7-gcc-4.8.5-dynamic for MPI/openmpi/1.10.7-icc-14.0.2-dynamic (which made absolutely no difference).

#!/bin/bash
#SBATCH -J WuK_hpl # job name
#SBATCH -N 2 # request 2 nodes
#SBATCH --ntasks-per-node=4 # 4 processes per node
#SBATCH --exclusive
HPL_DIR=`pwd`

#-------------------module load-------------------------------------
module load MPI/openmpi/1.10.7-icc-14.0.2-dynamic
#module load MPI/openmpi/1.10.7-gcc-4.8.5-dynamic
#module load opt/cuda/10.1
#module load opt/openmpi/openmpi3-x86_64
#module load nvme/openmpi/3.1.3-gcc4.8.5
#module load opt/intelcompilers/2016.4-compilers
module list

#-------------------------------------------------------------------
export HOSTNAME=`hostname`
echo "HOSTNAME=$HOSTNAME"

DATETIME=`hostname`.`date +"%m%d.%H%M%S"`

echo "Results in ./results/xhpl_8_gpu-$DATETIME-output.txt"

#-------------------------------------------------------------------

nvidia-smi

MPI_WHERE=`which mpirun`
echo "FINDMPI=$MPI_WHERE"

$MPI_WHERE --version

#--------------------run openmpi------------------------------------

export I_MPI_FABRICS=shm:ofa
export CUDA_VISIBLE_DEVICES="0,1,2,3"
##HOSTFILE
#$MPI_WHERE -np 4 --hostfile ./hostfiles/host2_4 -bind-to none -x LD_LIBRARY_PATH ./run_linpack_GPU_2xv100 | tee ./results/xhpl_4_gpu-$DATETIME-output.txt

##RANKFILE
#$MPI_WHERE -np 8 -bind-to none -x LD_LIBRARY_PATH ./run_linpack_GPU_2xv100 | tee ./results/xhpl_4_gpu-$DATETIME-output.txt
$MPI_WHERE -np 8 -bind-to none -x LD_LIBRARY_PATH --mca btl_tcp_if_include ib0  ./run_linpack_GPU_4xgpu | tee ./results/xhpl_8_gpu-$DATETIME-output.txt
#$MPI_WHERE -np 8 -bind-to none -x LD_LIBRARY_PATH ./xhpl_GPU_cuda90103_static_mkl_2016_static_ompi_1.10.2_sm35_sm60_sm70 | tee ./results/xhpl_8_gpu-$DATETIME-output.txt

HPL.dat

HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any)
6            device out (6=stdout,7=stderr,file)
1            # of problems sizes (N)
131328 179712 120000 178176 153600 127488 124416 131328 178176 153600 124416 138240 142848 1244160 178176 44160 88320 87552 86784 44160 90624 65280 96000 102400 168960 153600  142848 153600 142848 124416 96256 142848 115200  Ns
1            # of NBs
384 1024 384 320 320 384 256 256 384 384 768 1024 768 896 768 1024 512 384 640 768 896 960 1024 1152 1280 384 640 960 768 640 256  960 512 768 1152         NBs
1            PMAP process mapping (0=Row-,1=Column-major)
1            # of process grids (P x Q)
2 1 2 1        Ps
4 2 2 4        Qs
16.0         threshold
1            # of panel fact
0 1 2        PFACTs (0=left, 1=Crout, 2=Right)
1            # of recursive stopping criterium
2 8          NBMINs (>= 1)
1            # of panels in recursion
2            NDIVs
1            # of recursive panel fact.
0 1 2        RFACTs (0=left, 1=Crout, 2=Right)
1            # of broadcast
1 0 2          BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1            # of lookahead depth
0            DEPTHs (>=0)
0            SWAP (0=bin-exch,1=long,2=mix)
192          swapping threshold
1            L1 in (0=transposed,1=no-transposed) form
0            U  in (0=transposed,1=no-transposed) form
1            Equilibration (0=no,1=yes)
8            memory alignment in double (> 0)
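
This HPL.dat lists many candidate values per line while every count line says 1. The run output that follows reports N = 131328, NB = 384, P = 2 and Q = 4, which is consistent with only the first `count` entries of each line being used. The sketch below illustrates that reading; `active_values` is a hypothetical helper, not part of HPL, and the value lists are shortened for brevity.

```python
def active_values(count_line, values_line):
    """Return the first `count` numeric entries of an HPL.dat values line."""
    count = int(count_line.split()[0])
    values = []
    for tok in values_line.split():
        if not tok.isdigit():
            break  # reached the trailing label such as "Ns" or "NBs"
        values.append(int(tok))
    return values[:count]

ns = active_values("1            # of problems sizes (N)",
                   "131328 179712 120000 178176 Ns")
nbs = active_values("1            # of NBs", "384 1024 384 320 NBs")
ps = active_values("1            # of process grids (P x Q)", "2 1 2 1        Ps")
qs = active_values("1            # of process grids (P x Q)", "4 2 2 4        Qs")

print(ns, nbs, ps, qs)
print("ranks needed:", ps[0] * qs[0])
```

The active grid is 2 x 4, so 8 ranks are needed, matching the `-np 8` passed to mpirun in benchmark.slurm. To sweep several values in one job, the corresponding count line would have to be raised.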

xhpl_8_gpu-gn21.1113.180456-output.txt

================================================================================
HPLinpack 2.1  --  High-Performance Linpack benchmark  --   October 26, 2012
Written by A. Petitet and R. Clint Whaley,  Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      :  131328
NB     :     384
PMAP   : Column-major process mapping
P      :       2
Q      :       4
PFACT  :    Left
NBMIN  :       2
NDIV   :       2
RFACT  :    Left
BCAST  :  1ringM
DEPTH  :       0
SWAP   : Binary-exchange
L1     : no-transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words

--------------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
      ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be               1.110223e-16
- Computational tests pass if scaled residuals are less than                16.0

monitor_gpu from environment variable 1
gpu_temp_warning from environment variable 60
gpu_clock_warning from environment variable 1200

	******** TESTING SYSTEM PARAMETERS ********
	PARAM 	[UNITS] 	MIN 	MAX 	AVG
	----- 	------- 	--- 	--- 	---
CPU :
	CPU_BW	[GB/s ] 	16.7 	16.9 	16.8
	CPU_FP	[GFLPS] 	134.2 	140.6 	138.6
PCIE :
	H2D_BW	[GB/s ] 	6.1 	6.2 	6.2
	D2H_BW	[GB/s ] 	6.6 	6.6 	6.6
	BID_BW	[GB/s ] 	8.5 	8.5 	8.5
GPU :
	GPU_BW	[GB/s ] 	160 	160 	160
	GPU_FP	[GFLPS]
	     	NB =  128 	 977 	1000 	 985
	     	NB =  256 	1087 	1221 	1161
	     	NB =  384 	1002 	1141 	1061
	     	NB =  512 	 851 	1051 	 915
	     	NB =  640 	 925 	1029 	 953
	     	NB =  768 	 944 	1063 	 979
	     	NB =  896 	 937 	1117 	 985
	     	NB = 1024 	 940 	1068 	 992
NET :
	NET_BW	[MB/s ]
		     8 B  	   1 	   1 	   1
		    64 B  	  11 	  14 	  13
		   512 B  	  77 	 127 	  88
		     4 KB 	 204 	 265 	 227
		    32 KB 	 369 	 738 	 509
		   256 KB 	 426 	 524 	 480
		  2048 KB 	 583 	 751 	 667
		 16384 KB 	1663 	1784 	1706
	NET_LAT	[ us  ] 	0.7 	0.9 	0.8

displaying Prog:%complete, N:columns, Time:seconds
iGF:instantaneous GF, GF:avg GF, GF_per: process GF


Per-Process Host Memory Estimate: 17.65 GB (MAX) 17.45 GB (MIN)

PCOL: 3 GPU_COLS: 19969 CPU_COLS: 12672
PCOL: 2 GPU_COLS: 19969 CPU_COLS: 12672
PCOL: 1 GPU_COLS: 19969 CPU_COLS: 13056
PCOL: 0 GPU_COLS: 19969 CPU_COLS: 13056
test_loop: 1 of 1
 Prog= 1.74%	N_left= 130560	Time= 4.92	Time_left= 277.21	iGF=  5352.27	GF=  5352.27	iGF_per= 669.03 	GF_per= 669.03 
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 64 C 	Power: 150 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 601 MHz 	Temp: 53 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 65 C 	Power: 144 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 65 C 	Power: 141 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 49 C 	Power: 146 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 52 C 	Power: 149 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 692 MHz 	Temp: 65 C 	Power: 149 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 49 C 	Power: 148 W 	PCIe  gen 3 	 x16
 Prog= 3.47%	N_left= 129792	Time= 8.61	Time_left= 239.56	iGF=  7062.41	GF=  6084.62	iGF_per= 882.80 	GF_per= 760.58 
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 54 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 65 C 	Power: 107 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 66 C 	Power: 96 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 50 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 50 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 666 MHz 	Temp: 66 C 	Power: 61 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 54 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 66 C 	Power: 111 W 	PCIe  gen 3 	 x16
 Prog= 5.17%	N_left= 129024	Time= 12.24	Time_left= 224.36	iGF=  7088.48	GF=  6382.35	iGF_per= 886.06 	GF_per= 797.79 
 Prog= 6.02%	N_left= 128640	Time= 14.04	Time_left= 219.36	iGF=  7061.22	GF=  6469.64	iGF_per= 882.65 	GF_per= 808.70 
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 68 C 	Power: 142 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 51 C 	Power: 144 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 692 MHz 	Temp: 68 C 	Power: 149 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 51 C 	Power: 145 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 67 C 	Power: 150 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 601 MHz 	Temp: 55 C 	Power: 149 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 67 C 	Power: 142 W 	PCIe  gen 3 	 x16
 Prog= 7.69%	N_left= 127872	Time= 17.60	Time_left= 211.32	iGF=  7095.53	GF=  6596.26	iGF_per= 886.94 	GF_per= 824.53 
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 55 C 	Power: 148 W 	PCIe  gen 3 	 x16
 Prog= 9.34%	N_left= 127104	Time= 21.14	Time_left= 205.18	iGF=  7048.65	GF=  6672.05	iGF_per= 881.08 	GF_per= 834.01 
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 52 C 	Power: 144 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 68 C 	Power: 95 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 51 C 	Power: 146 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 666 MHz 	Temp: 68 C 	Power: 102 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 67 C 	Power: 120 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 601 MHz 	Temp: 56 C 	Power: 144 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 55 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 640 MHz 	Temp: 68 C 	Power: 99 W 	PCIe  gen 3 	 x16
 Prog= 10.16%	N_left= 126720	Time= 22.89	Time_left= 202.42	iGF=  7062.50	GF=  6701.92	iGF_per= 882.81 	GF_per= 837.74 
 Prog= 11.78%	N_left= 125952	Time= 26.34	Time_left= 197.18	iGF=  7110.82	GF=  6755.43	iGF_per= 888.85 	GF_per= 844.43 
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 653 MHz 	Temp: 70 C 	Power: 145 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 52 C 	Power: 145 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 692 MHz 	Temp: 69 C 	Power: 149 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 52 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 69 C 	Power: 146 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 56 C 	Power: 146 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 69 C 	Power: 146 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 56 C 	Power: 149 W 	PCIe  gen 3 	 x16
 Prog= 13.39%	N_left= 125184	Time= 29.78	Time_left= 192.62	iGF=  7052.57	GF=  6789.70	iGF_per= 881.57 	GF_per= 848.71 
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 70 C 	Power: 79 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 53 C 	Power: 145 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 52 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 69 C 	Power: 63 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 56 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 69 C 	Power: 87 W 	PCIe  gen 3 	 x16
 Prog= 14.97%	N_left= 124416	Time= 33.16	Time_left= 188.28	iGF=  7079.24	GF=  6819.21	iGF_per= 884.91 	GF_per= 852.40 
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 601 MHz 	Temp: 57 C 	Power: 149 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 719 MHz 	Temp: 69 C 	Power: 129 W 	PCIe  gen 3 	 x16
 Prog= 15.76%	N_left= 124032	Time= 34.84	Time_left= 186.24	iGF=  7050.23	GF=  6830.36	iGF_per= 881.28 	GF_per= 853.80 
 Prog= 17.31%	N_left= 123264	Time= 38.14	Time_left= 182.14	iGF=  7117.56	GF=  6855.21	iGF_per= 889.69 	GF_per= 856.90 
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 71 C 	Power: 100 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 53 C 	Power: 146 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 53 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 666 MHz 	Temp: 70 C 	Power: 83 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 56 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 70 C 	Power: 77 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 601 MHz 	Temp: 58 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 719 MHz 	Temp: 70 C 	Power: 142 W 	PCIe  gen 3 	 x16
 Prog= 18.85%	N_left= 122496	Time= 41.44	Time_left= 178.41	iGF=  7020.03	GF=  6868.35	iGF_per= 877.50 	GF_per= 858.54 
 Prog= 19.61%	N_left= 122112	Time= 43.06	Time_left= 176.52	iGF=  7098.84	GF=  6877.01	iGF_per= 887.36 	GF_per= 859.63 
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 72 C 	Power: 106 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 53 C 	Power: 143 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 53 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 70 C 	Power: 82 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 57 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 70 C 	Power: 102 W 	PCIe  gen 3 	 x16
 Prog= 21.12%	N_left= 121344	Time= 46.27	Time_left= 172.83	iGF=  7091.03	GF=  6891.86	iGF_per= 886.38 	GF_per= 861.48 
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 692 MHz 	Temp: 71 C 	Power: 137 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 601 MHz 	Temp: 58 C 	Power: 149 W 	PCIe  gen 3 	 x16
 Prog= 22.61%	N_left= 120576	Time= 49.45	Time_left= 169.32	iGF=  7053.75	GF=  6902.29	iGF_per= 881.72 	GF_per= 862.79 
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 72 C 	Power: 97 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 53 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 53 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 71 C 	Power: 65 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 57 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 71 C 	Power: 104 W 	PCIe  gen 3 	 x16
 Prog= 24.07%	N_left= 119808	Time= 52.58	Time_left= 165.83	iGF=  7092.80	GF=  6913.62	iGF_per= 886.60 	GF_per= 864.20 
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 601 MHz 	Temp: 58 C 	Power: 150 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 679 MHz 	Temp: 71 C 	Power: 138 W 	PCIe  gen 3 	 x16
 Prog= 24.80%	N_left= 119424	Time= 54.14	Time_left= 164.15	iGF=  7042.32	GF=  6917.33	iGF_per= 880.29 	GF_per= 864.67 
 Prog= 26.24%	N_left= 118656	Time= 57.19	Time_left= 160.72	iGF=  7145.81	GF=  6929.50	iGF_per= 893.23 	GF_per= 866.19 
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 124 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 53 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 53 C 	Power: 149 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 692 MHz 	Temp: 71 C 	Power: 72 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 56 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 71 C 	Power: 92 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 653 MHz 	Temp: 71 C 	Power: 126 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 601 MHz 	Temp: 58 C 	Power: 150 W 	PCIe  gen 3 	 x16
 Prog= 27.67%	N_left= 117888	Time= 60.25	Time_left= 157.53	iGF=  7012.78	GF=  6933.74	iGF_per= 876.60 	GF_per= 866.72 
 Prog= 28.37%	N_left= 117504	Time= 61.75	Time_left= 155.89	iGF=  7118.41	GF=  6938.21	iGF_per= 889.80 	GF_per= 867.28 
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 72 C 	Power: 116 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 640 MHz 	Temp: 53 C 	Power: 146 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 53 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 732 MHz 	Temp: 71 C 	Power: 61 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 57 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 71 C 	Power: 109 W 	PCIe  gen 3 	 x16
 Prog= 29.77%	N_left= 116736	Time= 64.71	Time_left= 152.68	iGF=  7113.55	GF=  6946.23	iGF_per= 889.19 	GF_per= 868.28 
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 601 MHz 	Temp: 58 C 	Power: 149 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 131 W 	PCIe  gen 3 	 x16
 Prog= 31.14%	N_left= 115968	Time= 67.65	Time_left= 149.57	iGF=  7064.22	GF=  6951.37	iGF_per= 883.03 	GF_per= 868.92 
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 53 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 71 C 	Power: 78 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 679 MHz 	Temp: 73 C 	Power: 119 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 53 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 57 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 104 W 	PCIe  gen 3 	 x16
 Prog= 32.50%	N_left= 115200	Time= 70.54	Time_left= 146.49	iGF=  7103.96	GF=  6957.61	iGF_per= 887.99 	GF_per= 869.70 
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 58 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 118 W 	PCIe  gen 3 	 x16
 Prog= 33.18%	N_left= 114816	Time= 71.99	Time_left= 145.01	iGF=  6999.33	GF=  6958.46	iGF_per= 874.92 	GF_per= 869.81 
 Prog= 34.51%	N_left= 114048	Time= 74.80	Time_left= 141.97	iGF=  7156.42	GF=  6965.89	iGF_per= 894.55 	GF_per= 870.74 
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 53 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 72 C 	Power: 145 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 72 C 	Power: 109 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 53 C 	Power: 144 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 57 C 	Power: 149 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 85 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 58 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 123 W 	PCIe  gen 3 	 x16
 Prog= 35.82%	N_left= 113280	Time= 77.64	Time_left= 139.09	iGF=  7006.26	GF=  6967.37	iGF_per= 875.78 	GF_per= 870.92 
 Prog= 36.47%	N_left= 112896	Time= 79.01	Time_left= 137.61	iGF=  7163.52	GF=  6970.77	iGF_per= 895.44 	GF_per= 871.35 
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 53 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 72 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 117 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 53 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 56 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 88 W 	PCIe  gen 3 	 x16
 Prog= 37.76%	N_left= 112128	Time= 81.75	Time_left= 134.74	iGF=  7099.54	GF=  6975.08	iGF_per= 887.44 	GF_per= 871.89 
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 57 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 692 MHz 	Temp: 72 C 	Power: 118 W 	PCIe  gen 3 	 x16
 Prog= 39.03%	N_left= 111360	Time= 84.46	Time_left= 131.93	iGF=  7073.07	GF=  6978.23	iGF_per= 884.13 	GF_per= 872.28 
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 53 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 72 C 	Power: 146 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 653 MHz 	Temp: 73 C 	Power: 136 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 53 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 56 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 93 W 	PCIe  gen 3 	 x16
 Prog= 40.28%	N_left= 110592	Time= 87.12	Time_left= 129.15	iGF=  7109.17	GF=  6982.23	iGF_per= 888.65 	GF_per= 872.78 
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 57 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 84 W 	PCIe  gen 3 	 x16
 Prog= 40.90%	N_left= 110208	Time= 88.46	Time_left= 127.81	iGF=  6974.82	GF=  6982.12	iGF_per= 871.85 	GF_per= 872.76 
 Prog= 42.13%	N_left= 109440	Time= 91.02	Time_left= 125.03	iGF=  7223.64	GF=  6988.92	iGF_per= 902.96 	GF_per= 873.62 
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 53 C 	Power: 149 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 72 C 	Power: 145 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 73 C 	Power: 138 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 53 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 56 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 692 MHz 	Temp: 72 C 	Power: 92 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 57 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 72 C 	Power: 109 W 	PCIe  gen 3 	 x16
 Prog= 43.34%	N_left= 108672	Time= 93.64	Time_left= 122.43	iGF=  6974.26	GF=  6988.51	iGF_per= 871.78 	GF_per= 873.56 
 Prog= 43.94%	N_left= 108288	Time= 94.90	Time_left= 121.08	iGF=  7202.91	GF=  6991.35	iGF_per= 900.36 	GF_per= 873.92 
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 53 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 692 MHz 	Temp: 73 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 73 C 	Power: 124 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 53 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 56 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 98 W 	PCIe  gen 3 	 x16
 Prog= 45.12%	N_left= 107520	Time= 97.41	Time_left= 118.47	iGF=  7126.06	GF=  6994.82	iGF_per= 890.76 	GF_per= 874.35 
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 57 C 	Power: 146 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 679 MHz 	Temp: 72 C 	Power: 114 W 	PCIe  gen 3 	 x16
 Prog= 46.29%	N_left= 106752	Time= 99.89	Time_left= 115.90	iGF=  7115.59	GF=  6997.81	iGF_per= 889.45 	GF_per= 874.73 
 Prog= 46.87%	N_left= 106368	Time= 101.10	Time_left= 114.61	iGF=  7188.08	GF=  7000.10	iGF_per= 898.51 	GF_per= 875.01 
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 72 C 	Power: 69 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 52 C 	Power: 113 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 73 C 	Power: 141 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 640 MHz 	Temp: 53 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 144 W 	PCIe  gen 3 	 x16
 Prog= 48.01%	N_left= 105600	Time= 103.57	Time_left= 112.16	iGF=  6976.27	GF=  6999.53	iGF_per= 872.03 	GF_per= 874.94 
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 56 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 72 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 57 C 	Power: 150 W 	PCIe  gen 3 	 x16
 Prog= 49.14%	N_left= 104832	Time= 105.92	Time_left= 109.64	iGF=  7250.43	GF=  7005.08	iGF_per= 906.30 	GF_per= 875.64 
 Prog= 50.25%	N_left= 104064	Time= 108.32	Time_left= 107.26	iGF=  6970.42	GF=  7004.31	iGF_per= 871.30 	GF_per= 875.54 
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 52 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 692 MHz 	Temp: 72 C 	Power: 146 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 53 C 	Power: 149 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 73 C 	Power: 141 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 56 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 144 W 	PCIe  gen 3 	 x16
 Prog= 50.79%	N_left= 103680	Time= 109.46	Time_left= 106.03	iGF=  7288.19	GF=  7007.26	iGF_per= 911.02 	GF_per= 875.91 
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 72 C 	Power: 117 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 57 C 	Power: 148 W 	PCIe  gen 3 	 x16
 Prog= 51.88%	N_left= 102912	Time= 111.75	Time_left= 103.65	iGF=  7162.64	GF=  7010.44	iGF_per= 895.33 	GF_per= 876.31 
 Prog= 52.95%	N_left= 102144	Time= 114.03	Time_left= 101.32	iGF=  7081.88	GF=  7011.87	iGF_per= 885.24 	GF_per= 876.48 
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 72 C 	Power: 90 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 52 C 	Power: 120 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 72 C 	Power: 73 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 53 C 	Power: 102 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 69 W 	PCIe  gen 3 	 x16
 Prog= 53.48%	N_left= 101760	Time= 115.21	Time_left= 100.22	iGF=  6750.56	GF=  7009.19	iGF_per= 843.82 	GF_per= 876.15 
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 692 MHz 	Temp: 56 C 	Power: 123 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 72 C 	Power: 146 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 57 C 	Power: 148 W 	PCIe  gen 3 	 x16
 Prog= 54.52%	N_left= 100992	Time= 117.44	Time_left= 97.96	iGF=  7069.73	GF=  7010.34	iGF_per= 883.72 	GF_per= 876.29 
 Prog= 55.55%	N_left= 100224	Time= 119.67	Time_left= 95.75	iGF=  6976.79	GF=  7009.72	iGF_per= 872.10 	GF_per= 876.21 
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 52 C 	Power: 149 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 72 C 	Power: 145 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 56 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 142 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 73 C 	Power: 105 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 53 C 	Power: 149 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 57 C 	Power: 149 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 719 MHz 	Temp: 72 C 	Power: 109 W 	PCIe  gen 3 	 x16
 Prog= 56.57%	N_left= 99456	Time= 121.87	Time_left= 93.57	iGF=  6960.50	GF=  7008.83	iGF_per= 870.06 	GF_per= 876.10 
 Prog= 57.07%	N_left= 99072	Time= 122.90	Time_left= 92.45	iGF=  7377.51	GF=  7011.91	iGF_per= 922.19 	GF_per= 876.49 
 Prog= 58.06%	N_left= 98304	Time= 125.03	Time_left= 90.32	iGF=  7008.82	GF=  7011.85	iGF_per= 876.10 	GF_per= 876.48 
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 732 MHz 	Temp: 72 C 	Power: 80 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 52 C 	Power: 122 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 73 C 	Power: 141 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 53 C 	Power: 146 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 145 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 56 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 72 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 57 C 	Power: 148 W 	PCIe  gen 3 	 x16
 Prog= 59.03%	N_left= 97536	Time= 127.11	Time_left= 88.20	iGF=  7092.94	GF=  7013.18	iGF_per= 886.62 	GF_per= 876.65 
 Prog= 59.52%	N_left= 97152	Time= 128.18	Time_left= 87.19	iGF=  6808.10	GF=  7011.47	iGF_per= 851.01 	GF_per= 876.43 
 Prog= 60.47%	N_left= 96384	Time= 130.21	Time_left= 85.13	iGF=  7058.03	GF=  7012.20	iGF_per= 882.25 	GF_per= 876.52 
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 52 C 	Power: 146 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 72 C 	Power: 145 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 73 C 	Power: 141 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 53 C 	Power: 149 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 55 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 143 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 72 C 	Power: 113 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 57 C 	Power: 148 W 	PCIe  gen 3 	 x16
 Prog= 61.41%	N_left= 95616	Time= 132.23	Time_left= 83.11	iGF=  7010.61	GF=  7012.17	iGF_per= 876.33 	GF_per= 876.52 
 Prog= 62.33%	N_left= 94848	Time= 134.23	Time_left= 81.13	iGF=  6971.29	GF=  7011.56	iGF_per= 871.41 	GF_per= 876.45 
 Prog= 62.78%	N_left= 94464	Time= 135.15	Time_left= 80.11	iGF=  7462.89	GF=  7014.64	iGF_per= 932.86 	GF_per= 876.83 
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 692 MHz 	Temp: 72 C 	Power: 93 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 52 C 	Power: 146 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 72 C 	Power: 86 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 53 C 	Power: 107 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 55 C 	Power: 145 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 144 W 	PCIe  gen 3 	 x16
 Prog= 63.68%	N_left= 93696	Time= 137.12	Time_left= 78.19	iGF=  6920.94	GF=  7013.30	iGF_per= 865.12 	GF_per= 876.66 
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 57 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 72 C 	Power: 88 W 	PCIe  gen 3 	 x16
 Prog= 64.57%	N_left= 92928	Time= 139.00	Time_left= 76.27	iGF=  7099.44	GF=  7014.47	iGF_per= 887.43 	GF_per= 876.81 
 Prog= 65.01%	N_left= 92544	Time= 139.99	Time_left= 75.35	iGF=  6678.28	GF=  7012.09	iGF_per= 834.79 	GF_per= 876.51 
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 72 C 	Power: 66 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 52 C 	Power: 119 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 73 C 	Power: 141 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 53 C 	Power: 148 W 	PCIe  gen 3 	 x16
 Prog= 65.87%	N_left= 91776	Time= 141.84	Time_left= 73.49	iGF=  7059.59	GF=  7012.71	iGF_per= 882.45 	GF_per= 876.59 
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 55 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 143 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 72 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 57 C 	Power: 148 W 	PCIe  gen 3 	 x16
 Prog= 66.72%	N_left= 91008	Time= 143.61	Time_left= 71.63	iGF=  7250.03	GF=  7015.64	iGF_per= 906.25 	GF_per= 876.95 
 Prog= 67.56%	N_left= 90240	Time= 145.42	Time_left= 69.84	iGF=  6946.37	GF=  7014.77	iGF_per= 868.30 	GF_per= 876.85 
 Prog= 67.97%	N_left= 89856	Time= 146.25	Time_left= 68.92	iGF=  7524.52	GF=  7017.66	iGF_per= 940.57 	GF_per= 877.21 
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 72 C 	Power: 83 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 52 C 	Power: 144 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 72 C 	Power: 80 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 52 C 	Power: 104 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 55 C 	Power: 144 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 143 W 	PCIe  gen 3 	 x16
 Prog= 68.78%	N_left= 89088	Time= 148.07	Time_left= 67.20	iGF=  6763.41	GF=  7014.53	iGF_per= 845.43 	GF_per= 876.82 
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 57 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 72 C 	Power: 147 W 	PCIe  gen 3 	 x16
 Prog= 69.58%	N_left= 88320	Time= 149.77	Time_left= 65.47	iGF=  7123.46	GF=  7015.77	iGF_per= 890.43 	GF_per= 876.97 
 Prog= 69.98%	N_left= 87936	Time= 150.64	Time_left= 64.63	iGF=  6827.20	GF=  7014.67	iGF_per= 853.40 	GF_per= 876.83 
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 72 C 	Power: 75 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 52 C 	Power: 115 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 653 MHz 	Temp: 73 C 	Power: 145 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 53 C 	Power: 149 W 	PCIe  gen 3 	 x16
 Prog= 70.76%	N_left= 87168	Time= 152.33	Time_left= 62.95	iGF=  6965.59	GF=  7014.13	iGF_per= 870.70 	GF_per= 876.77 
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 55 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 143 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 72 C 	Power: 99 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 601 MHz 	Temp: 57 C 	Power: 149 W 	PCIe  gen 3 	 x16
 Prog= 71.52%	N_left= 86400	Time= 153.99	Time_left= 61.31	iGF=  6977.44	GF=  7013.73	iGF_per= 872.18 	GF_per= 876.72 
 Prog= 71.90%	N_left= 86016	Time= 154.78	Time_left= 60.48	iGF=  7186.38	GF=  7014.62	iGF_per= 898.30 	GF_per= 876.83 
 Prog= 72.65%	N_left= 85248	Time= 156.36	Time_left= 58.87	iGF=  7159.71	GF=  7016.08	iGF_per= 894.96 	GF_per= 877.01 
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 52 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 72 C 	Power: 145 W 	PCIe  gen 3 	 x16
 Prog= 73.38%	N_left= 84480	Time= 157.95	Time_left= 57.29	iGF=  6951.39	GF=  7015.43	iGF_per= 868.92 	GF_per= 876.93 
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 653 MHz 	Temp: 73 C 	Power: 106 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 53 C 	Power: 144 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 143 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 56 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 72 C 	Power: 146 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 56 C 	Power: 147 W 	PCIe  gen 3 	 x16
 Prog= 74.10%	N_left= 83712	Time= 159.46	Time_left= 55.73	iGF=  7175.02	GF=  7016.94	iGF_per= 896.88 	GF_per= 877.12 
 Prog= 74.46%	N_left= 83328	Time= 160.27	Time_left= 54.99	iGF=  6642.56	GF=  7015.06	iGF_per= 830.32 	GF_per= 876.88 
 Prog= 75.16%	N_left= 82560	Time= 161.79	Time_left= 53.48	iGF=  6954.71	GF=  7014.49	iGF_per= 869.34 	GF_per= 876.81 
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 745 MHz 	Temp: 72 C 	Power: 119 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 52 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 53 C 	Power: 116 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 653 MHz 	Temp: 73 C 	Power: 73 W 	PCIe  gen 3 	 x16
 Prog= 75.84%	N_left= 81792	Time= 163.24	Time_left= 52.00	iGF=  7130.12	GF=  7015.52	iGF_per= 891.27 	GF_per= 876.94 
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 55 C 	Power: 143 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 71 C 	Power: 144 W 	PCIe  gen 3 	 x16
 Prog= 76.18%	N_left= 81408	Time= 163.94	Time_left= 51.26	iGF=  7300.90	GF=  7016.74	iGF_per= 912.61 	GF_per= 877.09 
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 56 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 72 C 	Power: 147 W 	PCIe  gen 3 	 x16
 Prog= 76.85%	N_left= 80640	Time= 165.35	Time_left= 49.81	iGF=  7165.10	GF=  7018.01	iGF_per= 895.64 	GF_per= 877.25 
 Prog= 77.50%	N_left= 79872	Time= 166.79	Time_left= 48.41	iGF=  6849.29	GF=  7016.54	iGF_per= 856.16 	GF_per= 877.07 
 Prog= 78.15%	N_left= 79104	Time= 168.15	Time_left= 47.02	iGF=  7134.45	GF=  7017.50	iGF_per= 891.81 	GF_per= 877.19 
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 72 C 	Power: 122 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 52 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 91 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 53 C 	Power: 128 W 	PCIe  gen 3 	 x16
 Prog= 78.46%	N_left= 78720	Time= 168.86	Time_left= 46.35	iGF=  6744.68	GF=  7016.35	iGF_per= 843.08 	GF_per= 877.04 
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 55 C 	Power: 143 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 71 C 	Power: 145 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 56 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 653 MHz 	Temp: 71 C 	Power: 142 W 	PCIe  gen 3 	 x16
 Prog= 79.09%	N_left= 77952	Time= 170.18	Time_left= 45.00	iGF=  7180.44	GF=  7017.62	iGF_per= 897.55 	GF_per= 877.20 
 Prog= 79.70%	N_left= 77184	Time= 171.56	Time_left= 43.70	iGF=  6665.35	GF=  7014.77	iGF_per= 833.17 	GF_per= 876.85 
 Prog= 80.00%	N_left= 76800	Time= 172.20	Time_left= 43.05	iGF=  7161.89	GF=  7015.31	iGF_per= 895.24 	GF_per= 876.91 
 Prog= 80.59%	N_left= 76032	Time= 173.46	Time_left= 41.76	iGF=  7128.27	GF=  7016.13	iGF_per= 891.03 	GF_per= 877.02 
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 72 C 	Power: 144 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 52 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 53 C 	Power: 145 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 653 MHz 	Temp: 73 C 	Power: 96 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 55 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 145 W 	PCIe  gen 3 	 x16
 Prog= 81.18%	N_left= 75264	Time= 174.68	Time_left= 40.50	iGF=  7198.21	GF=  7017.41	iGF_per= 899.78 	GF_per= 877.18 
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 56 C 	Power: 146 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 72 C 	Power: 147 W 	PCIe  gen 3 	 x16
 Prog= 81.75%	N_left= 74496	Time= 175.88	Time_left= 39.27	iGF=  7186.51	GF=  7018.56	iGF_per= 898.31 	GF_per= 877.32 
 Prog= 82.03%	N_left= 74112	Time= 176.56	Time_left= 38.68	iGF=  6231.23	GF=  7015.52	iGF_per= 778.90 	GF_per= 876.94 
 Prog= 82.58%	N_left= 73344	Time= 177.71	Time_left= 37.49	iGF=  7215.65	GF=  7016.83	iGF_per= 901.96 	GF_per= 877.10 
 Prog= 83.12%	N_left= 72576	Time= 178.92	Time_left= 36.33	iGF=  6790.76	GF=  7015.31	iGF_per= 848.84 	GF_per= 876.91 
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 52 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 72 C 	Power: 146 W 	PCIe  gen 3 	 x16
 Prog= 83.39%	N_left= 72192	Time= 179.47	Time_left= 35.75	iGF=  7313.71	GF=  7016.22	iGF_per= 914.21 	GF_per= 877.03 
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 73 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 53 C 	Power: 146 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 71 C 	Power: 142 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 71 C 	Power: 81 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 56 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 55 C 	Power: 147 W 	PCIe  gen 3 	 x16
 Prog= 83.91%	N_left= 71424	Time= 180.58	Time_left= 34.62	iGF=  7099.90	GF=  7016.74	iGF_per= 887.49 	GF_per= 877.09 
 Prog= 84.43%	N_left= 70656	Time= 181.68	Time_left= 33.51	iGF=  7037.08	GF=  7016.86	iGF_per= 879.63 	GF_per= 877.11 
 Prog= 84.93%	N_left= 69888	Time= 182.75	Time_left= 32.43	iGF=  7115.52	GF=  7017.44	iGF_per= 889.44 	GF_per= 877.18 
 Prog= 85.18%	N_left= 69504	Time= 183.35	Time_left= 31.91	iGF=  6232.89	GF=  7014.88	iGF_per= 779.11 	GF_per= 876.86 
 Prog= 85.66%	N_left= 68736	Time= 184.38	Time_left= 30.86	iGF=  7113.47	GF=  7015.43	iGF_per= 889.18 	GF_per= 876.93 
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 692 MHz 	Temp: 71 C 	Power: 69 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 52 C 	Power: 154 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 53 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 73 C 	Power: 140 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 71 C 	Power: 111 W 	PCIe  gen 3 	 x16
 Prog= 86.14%	N_left= 67968	Time= 185.39	Time_left= 29.84	iGF=  7129.80	GF=  7016.05	iGF_per= 891.22 	GF_per= 877.01 
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 55 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 72 C 	Power: 145 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 56 C 	Power: 148 W 	PCIe  gen 3 	 x16
 Prog= 86.37%	N_left= 67584	Time= 185.89	Time_left= 29.33	iGF=  7060.74	GF=  7016.17	iGF_per= 882.59 	GF_per= 877.02 
 Prog= 86.83%	N_left= 66816	Time= 186.87	Time_left= 28.34	iGF=  7084.09	GF=  7016.52	iGF_per= 885.51 	GF_per= 877.07 
 Prog= 87.28%	N_left= 66048	Time= 187.83	Time_left= 27.38	iGF=  7031.13	GF=  7016.60	iGF_per= 878.89 	GF_per= 877.07 
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 72 C 	Power: 144 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 640 MHz 	Temp: 73 C 	Power: 141 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 640 MHz 	Temp: 53 C 	Power: 147 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 52 C 	Power: 131 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 55 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 71 C 	Power: 142 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 72 C 	Power: 150 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 57 C 	Power: 147 W 	PCIe  gen 3 	 x16
 Prog= 88.77%	N_left= 63360	Time= 191.09	Time_left= 24.17	iGF=  6900.13	GF=  7014.61	iGF_per= 862.52 	GF_per= 876.83 
 Prog= 89.95%	N_left= 61056	Time= 193.60	Time_left= 21.63	iGF=  7116.60	GF=  7015.93	iGF_per= 889.58 	GF_per= 876.99 
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 72 C 	Power: 144 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 653 MHz 	Temp: 73 C 	Power: 146 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 614 MHz 	Temp: 53 C 	Power: 143 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 52 C 	Power: 96 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 55 C 	Power: 148 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 71 C 	Power: 143 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 72 C 	Power: 151 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 56 C 	Power: 155 W 	PCIe  gen 3 	 x16
 Prog= 91.22%	N_left= 58368	Time= 196.47	Time_left= 18.91	iGF=  6677.63	GF=  7010.99	iGF_per= 834.70 	GF_per= 876.37 
 Prog= 92.38%	N_left= 55680	Time= 199.20	Time_left= 16.43	iGF=  6400.48	GF=  7002.61	iGF_per= 800.06 	GF_per= 875.33 
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 705 MHz 	Temp: 72 C 	Power: 67 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 601 MHz 	Temp: 52 C 	Power: 144 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 52 C 	Power: 91 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 77 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 55 C 	Power: 97 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 653 MHz 	Temp: 71 C 	Power: 84 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 56 C 	Power: 152 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 679 MHz 	Temp: 71 C 	Power: 151 W 	PCIe  gen 3 	 x16
 Prog= 93.43%	N_left= 52992	Time= 201.56	Time_left= 14.17	iGF=  6718.04	GF=  6999.28	iGF_per= 839.75 	GF_per= 874.91 
 Prog= 94.38%	N_left= 50304	Time= 203.77	Time_left= 12.13	iGF=  6496.96	GF=  6993.84	iGF_per= 812.12 	GF_per= 874.23 
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 705 MHz 	Temp: 72 C 	Power: 77 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 614 MHz 	Temp: 52 C 	Power: 153 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 653 MHz 	Temp: 72 C 	Power: 79 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 640 MHz 	Temp: 52 C 	Power: 88 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 71 C 	Power: 69 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 627 MHz 	Temp: 54 C 	Power: 91 W 	PCIe  gen 3 	 x16
 Prog= 95.23%	N_left= 47616	Time= 205.94	Time_left= 10.31	iGF=  5959.95	GF=  6982.98	iGF_per= 744.99 	GF_per= 872.87 
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 666 MHz 	Temp: 71 C 	Power: 81 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 627 MHz 	Temp: 56 C 	Power: 85 W 	PCIe  gen 3 	 x16
 Prog= 96.00%	N_left= 44928	Time= 207.88	Time_left= 8.67	iGF=  5924.73	GF=  6973.09	iGF_per= 740.59 	GF_per= 871.64 
 Prog= 96.58%	N_left= 42624	Time= 209.31	Time_left= 7.41	iGF=  6182.29	GF=  6967.69	iGF_per= 772.79 	GF_per= 870.96 
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 758 MHz 	Temp: 71 C 	Power: 83 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 653 MHz 	Temp: 51 C 	Power: 155 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 719 MHz 	Temp: 52 C 	Power: 99 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 679 MHz 	Temp: 72 C 	Power: 103 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 692 MHz 	Temp: 54 C 	Power: 94 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 679 MHz 	Temp: 70 C 	Power: 78 W 	PCIe  gen 3 	 x16
 Prog= 97.19%	N_left= 39936	Time= 210.96	Time_left= 6.10	iGF=  5531.08	GF=  6956.40	iGF_per= 691.38 	GF_per= 869.55 
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 745 MHz 	Temp: 71 C 	Power: 134 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 692 MHz 	Temp: 55 C 	Power: 140 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 810 MHz 	Temp: 71 C 	Power: 101 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 679 MHz 	Temp: 50 C 	Power: 150 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 719 MHz 	Temp: 71 C 	Power: 100 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 810 MHz 	Temp: 52 C 	Power: 100 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 771 MHz 	Temp: 54 C 	Power: 94 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 797 MHz 	Temp: 70 C 	Power: 106 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 2 : gn21 : GPU 0000:84:00.0 	Clock: 745 MHz 	Temp: 71 C 	Power: 110 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 3 : gn21 : GPU 0000:85:00.0 	Clock: 797 MHz 	Temp: 55 C 	Power: 129 W 	PCIe  gen 3 	 x16
 Prog= 99.14%	N_left= 26880	Time= 216.79	Time_left= 1.87	iGF=  5063.49	GF=  6905.51	iGF_per= 632.94 	GF_per= 863.19 
 Prog= 99.89%	N_left= 13440	Time= 220.17	Time_left= 0.24	iGF=  3356.15	GF=  6851.09	iGF_per= 419.52 	GF_per= 856.39 
!!!! WARNING: Rank: 4 : gn22 : GPU 0000:04:00.0 	Clock: 797 MHz 	Temp: 71 C 	Power: 94 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 5 : gn22 : GPU 0000:05:00.0 	Clock: 797 MHz 	Temp: 50 C 	Power: 99 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 0 : gn21 : GPU 0000:04:00.0 	Clock: 823 MHz 	Temp: 69 C 	Power: 89 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 1 : gn21 : GPU 0000:05:00.0 	Clock: 771 MHz 	Temp: 53 C 	Power: 96 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 6 : gn22 : GPU 0000:84:00.0 	Clock: 862 MHz 	Temp: 71 C 	Power: 91 W 	PCIe  gen 3 	 x16
!!!! WARNING: Rank: 7 : gn22 : GPU 0000:85:00.0 	Clock: 653 MHz 	Temp: 50 C 	Power: 95 W 	PCIe  gen 3 	 x16
 Prog= 100.00%	N_left= 384	Time= 221.29	Time_left= 0.00	iGF=  1445.67	GF=  6823.74	iGF_per= 180.71 	GF_per= 852.97 
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WC01L2L2      131328   384     2     4             221.65              6.813e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0029561 ...... PASSED
================================================================================

Finished      1 tests with the following results:
              1 tests completed and passed residual checks,
              0 tests completed and failed residual checks,
              0 tests skipped because of illegal input values.
--------------------------------------------------------------------------------

End of Tests.
================================================================================
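
As a sanity check on the summary row above, the Gflops figure can be recomputed from N and the wall time using the operation count HPL conventionally reports, 2/3·N³ + 3/2·N². This is only a sketch and not part of the benchmark: `parse_hpl_result` and `hpl_gflops` are hypothetical helper names, and the per-rank figure simply divides by the 8 ranks of the 2 x 4 process grid.

```python
def parse_hpl_result(line):
    """Split an HPL result row into its fields (T/V, N, NB, P, Q, Time, Gflops)."""
    tv, n, nb, p, q, time_s, gflops = line.split()
    return {"tv": tv, "n": int(n), "nb": int(nb), "p": int(p), "q": int(q),
            "time": float(time_s), "gflops": float(gflops)}


def hpl_gflops(n, seconds):
    """Gflops from the standard HPL operation count 2/3*N^3 + 3/2*N^2."""
    flops = (2.0 / 3.0) * n**3 + 1.5 * n**2
    return flops / seconds / 1e9


# Result row copied from the log above.
row = parse_hpl_result(
    "WC01L2L2      131328   384     2     4             221.65              6.813e+03"
)
recomputed = hpl_gflops(row["n"], row["time"])
per_rank = recomputed / (row["p"] * row["q"])  # one rank per GPU in this run
print(f"reported {row['gflops']:.0f} GF, recomputed {recomputed:.1f} GF, "
      f"{per_rank:.1f} GF per rank")
```

The recomputed value agrees with the reported 6.813e+03 to the precision printed, which confirms that the final figure is just the fixed operation count divided by the wall time.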

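Problems encountered

I first requested gn[27-28], but the job would not start there, and I have not found the cause yet. (27 is my lucky number.) The output from that attempt follows.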

Currently Loaded Modulefiles:
 1) MPI/openmpi/1.10.7-gcc-4.8.5-dynamic
HOSTNAME=gn27
Results in ./results/xhpl_8_gpu-gn27.1113.170352-output.txt
Wed Nov 13 17:04:03 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.30                 Driver Version: 390.30                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:04:00.0 Off |                    0 |
| N/A   57C    P0    57W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           Off  | 00000000:05:00.0 Off |                    0 |
| N/A   43C    P0    71W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K80           Off  | 00000000:84:00.0 Off |                    0 |
| N/A   51C    P0    60W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K80           Off  | 00000000:85:00.0 Off |                    0 |
| N/A   41C    P0    73W / 149W |      0MiB / 11441MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
FINDMPI=/BIGDATA1/app/MPI/openmpi/1.10.7-gcc-4.8.5-dynamic/bin/mpirun
mpirun (Open MPI) 1.10.7

Report bugs to http://www.open-mpi.org/community/help/
[gn28:16924] [[49228,0],1] tcp_peer_send_blocking: send() to socket 9 failed: Broken pipe (32)
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:

* not finding the required libraries and/or binaries on
  one or more nodes. Please check your PATH and LD_LIBRARY_PATH
  settings, or configure OMPI with --enable-orterun-prefix-by-default

* lack of authority to execute on one or more specified nodes.
  Please verify your allocation and authorities.

* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
  Please check with your sys admin to determine the correct location to use.

*  compilation of the orted with dynamic libraries when static are required
  (e.g., on Cray). Please check your configure cmd line and consider using
  one of the contrib/platform definitions for your system type.

* an inability to create a connection back to mpirun due to a
  lack of common network interfaces and/or no route found between
  them. Please check network connectivity (including firewalls
  and network routing requirements).
--------------------------------------------------------------------------

HPCG

The newest CUDA version on the cluster is only 8.0, so I chose hpcg-3.1_cuda8_ompi1.10.2_gcc485_sm_35_sm_50_sm_60_ver_3_28_17.tgz.

Upload it to a directory on the cluster and unpack it.

cd ~/WuK/benchmarks/gpu/
tar -xf hpcg-3.1_cuda8_ompi1.10.2_gcc485_sm_35_sm_50_sm_60_ver_3_28_17.tgz
hpcg-3.1_cuda8_ompi1.10.2_gcc485_sm_35_sm_50_sm_60_ver_3_28_17
cd hpcg-3.1_cuda8_ompi1.10.2_gcc485_sm_35_sm_50_sm_60_ver_3_28_17
sbatch benchmark.slurm

benchmark.slurm

Written by adapting the HPL script above.

#!/bin/bash
#SBATCH -J WuK_hpcg # job name
#SBATCH -N 2 # request 2 nodes
#SBATCH --ntasks-per-node=4 # start 4 processes per node
#SBATCH --exclusive

HPCG_DIR=`pwd`

#-------------------module load-------------------------------------
module load MPI/openmpi/1.10.7-icc-14.0.2-dynamic
#module load MPI/openmpi/1.10.7-gcc-4.8.5-dynamic
module load CUDA/8.0
#module load opt/cuda/10.1
#module load opt/openmpi/openmpi3-x86_64
#module load nvme/openmpi/3.1.3-gcc4.8.5
#module load opt/intelcompilers/2016.4-compilers
module list

#-------------------------------------------------------------------
export HOSTNAME=`hostname`
echo "HOSTNAME=$HOSTNAME"

DATETIME=`hostname`.`date +"%m%d.%H%M%S"`

echo "Results in ./results/xhpcg_8_gpu-$DATETIME-output.txt"

#-------------------------------------------------------------------

nvidia-smi

MPI_WHERE=`which mpirun`
echo "FINDMPI=$MPI_WHERE"

$MPI_WHERE --version

#--------------------run openmpi------------------------------------

export I_MPI_FABRICS=shm:ofa # Intel MPI setting; the Open MPI mpirun used below does not read it
export CUDA_VISIBLE_DEVICES="0,1,2,3"

$MPI_WHERE -n 8 -bind-to none -x LD_LIBRARY_PATH --mca btl_tcp_if_include ib0 ./xhpcg-3.1_gcc_485_cuda8061_ompi_1_10_2_sm_35_sm_50_sm_60_ver_3_28_17 | tee ./results/xhpcg_8_gpu-$DATETIME-output.txt

HPCG.dat

According to readme_hpcg_running_notes.txt, the best local grid dimensions for the K80s on this cluster are 256 256 256. The notes also state: "Official submission running times for hpcg should be at least 1 hr (3600) so examples provided use 3660 to be safe."

HPCG benchmark input file
Sandia National Laboratories; University of Tennessee, Knoxville
256 256 256
3660
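
To make the input concrete: the third line of HPCG.dat is the local grid owned by each MPI rank, and the fourth is the requested run time in seconds. The sketch below derives the global problem size from it. The helper `parse_hpcg_dat` is hypothetical, and the 2x2x2 process grid is an assumption taken from what the run log reports for this 8-rank job, not something the sketch computes.

```python
HPCG_DAT = """HPCG benchmark input file
Sandia National Laboratories; University of Tennessee, Knoxville
256 256 256
3660
"""


def parse_hpcg_dat(text):
    """Return ((nx, ny, nz), run_seconds) from the text of an HPCG.dat file."""
    lines = text.strip().splitlines()
    nx, ny, nz = (int(v) for v in lines[2].split())
    return (nx, ny, nz), int(lines[3])


local, seconds = parse_hpcg_dat(HPCG_DAT)
process_grid = (2, 2, 2)  # assumption: as reported in the log for 8 ranks
global_dims = tuple(n * p for n, p in zip(local, process_grid))
rows = global_dims[0] * global_dims[1] * global_dims[2]
print(f"global grid {global_dims}, {rows:,} unknowns, "
      f"{seconds / 3600:.2f} h requested")
```

With 256 points per rank in each direction and two ranks per direction, the global grid is 512 points on a side.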

xhpcg_8_gpu-gn21.1113.192827-output.txt

The official sample, xhpcg_8_gpu-hsw214.0402.150331-output.txt, reached 227.9 GF with the same configuration. I'm such a noob…
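
To put a number on the gap: the run log below reports about 210.2 GF for this job, against 227.9 GF for the official sample. A quick calculation, with both figures copied by hand from the text and the log:

```python
official_gf = 227.9   # official sample figure quoted above
measured_gf = 210.2   # "expected Perf" reported in this run's log
ranks = 8             # 2 nodes x 4 processes, one per GPU

ratio = measured_gf / official_gf
print(f"{ratio:.1%} of the official sample, "
      f"{measured_gf / ranks:.1f} GF per rank")
```

This run therefore lands roughly 8% below the sample, so the shortfall is modest rather than dramatic.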

start of application (5 OMP threads)...
2019-11-13 19:28:54.560

Problem setup...
Setup time: 0.605619 sec

GPU: 'Tesla K80'
Memory use: 8142 MB / 11441 MB
2x2x2 process grid
256x256x256 local domain

Reference SpMV+MG...

Reference CG...
Initial Residual: 1.130436e+04 Max_err: 1.000000e+00 tot_err: 1.158524e+04
REF  Iter = 1 Scaled Residual: 1.866786e-01 Max error: 1.000000e+00 tot_error: 9.849293e-01
REF  Iter = 2 Scaled Residual: 1.025624e-01 Max error: 1.000000e+00 tot_error: 9.709075e-01
REF  Iter = 3 Scaled Residual: 7.081692e-02 Max error: 1.000000e+00 tot_error: 9.571438e-01
REF  Iter = 4 Scaled Residual: 5.410908e-02 Max error: 1.000000e+00 tot_error: 9.434838e-01
REF  Iter = 5 Scaled Residual: 4.375764e-02 Max error: 1.000000e+00 tot_error: 9.298920e-01
REF  Iter = 6 Scaled Residual: 3.671068e-02 Max error: 1.000000e+00 tot_error: 9.163611e-01
REF  Iter = 7 Scaled Residual: 3.161214e-02 Max error: 1.000000e+00 tot_error: 9.028886e-01
REF  Iter = 8 Scaled Residual: 2.776606e-02 Max error: 1.000000e+00 tot_error: 8.894770e-01
REF  Iter = 9 Scaled Residual: 2.477195e-02 Max error: 1.000000e+00 tot_error: 8.761282e-01
REF  Iter = 10 Scaled Residual: 2.237653e-02 Max error: 1.000000e+00 tot_error: 8.628447e-01
REF  Iter = 11 Scaled Residual: 2.041054e-02 Max error: 1.000000e+00 tot_error: 8.496263e-01
REF  Iter = 12 Scaled Residual: 1.875762e-02 Max error: 1.000000e+00 tot_error: 8.364720e-01
REF  Iter = 13 Scaled Residual: 1.733753e-02 Max error: 1.000000e+00 tot_error: 8.233799e-01
REF  Iter = 14 Scaled Residual: 1.609948e-02 Max error: 1.000000e+00 tot_error: 8.103479e-01
REF  Iter = 15 Scaled Residual: 1.501260e-02 Max error: 1.000000e+00 tot_error: 7.973761e-01
REF  Iter = 16 Scaled Residual: 1.405666e-02 Max error: 1.000000e+00 tot_error: 7.844660e-01
REF  Iter = 17 Scaled Residual: 1.321713e-02 Max error: 1.000000e+00 tot_error: 7.716205e-01
REF  Iter = 18 Scaled Residual: 1.248015e-02 Max error: 1.000000e+00 tot_error: 7.588430e-01
REF  Iter = 19 Scaled Residual: 1.182996e-02 Max error: 1.000000e+00 tot_error: 7.461364e-01
REF  Iter = 20 Scaled Residual: 1.125051e-02 Max error: 1.000000e+00 tot_error: 7.335016e-01
REF  Iter = 21 Scaled Residual: 1.072619e-02 Max error: 1.000000e+00 tot_error: 7.209387e-01
REF  Iter = 22 Scaled Residual: 1.024404e-02 Max error: 1.000000e+00 tot_error: 7.084458e-01
REF  Iter = 23 Scaled Residual: 9.797113e-03 Max error: 1.000000e+00 tot_error: 6.960213e-01
REF  Iter = 24 Scaled Residual: 9.383760e-03 Max error: 1.000000e+00 tot_error: 6.836645e-01
REF  Iter = 25 Scaled Residual: 9.005280e-03 Max error: 1.000000e+00 tot_error: 6.713770e-01
REF  Iter = 26 Scaled Residual: 8.664336e-03 Max error: 1.000000e+00 tot_error: 6.591612e-01
REF  Iter = 27 Scaled Residual: 8.361679e-03 Max error: 1.000000e+00 tot_error: 6.470217e-01
REF  Iter = 28 Scaled Residual: 8.094356e-03 Max error: 1.000000e+00 tot_error: 6.349618e-01
REF  Iter = 29 Scaled Residual: 7.856248e-03 Max error: 1.000000e+00 tot_error: 6.229842e-01
REF  Iter = 30 Scaled Residual: 7.638411e-03 Max error: 1.000000e+00 tot_error: 6.110895e-01
REF  Iter = 31 Scaled Residual: 7.431725e-03 Max error: 1.000000e+00 tot_error: 5.992771e-01
REF  Iter = 32 Scaled Residual: 7.230632e-03 Max error: 1.000000e+00 tot_error: 5.875451e-01
REF  Iter = 33 Scaled Residual: 7.033108e-03 Max error: 1.000000e+00 tot_error: 5.758935e-01
REF  Iter = 34 Scaled Residual: 6.839429e-03 Max error: 1.000000e+00 tot_error: 5.643230e-01
REF  Iter = 35 Scaled Residual: 6.649585e-03 Max error: 1.000000e+00 tot_error: 5.528367e-01
REF  Iter = 36 Scaled Residual: 6.460266e-03 Max error: 1.000000e+00 tot_error: 5.414384e-01
REF  Iter = 37 Scaled Residual: 6.266134e-03 Max error: 1.000000e+00 tot_error: 5.301323e-01
REF  Iter = 38 Scaled Residual: 6.064667e-03 Max error: 1.000000e+00 tot_error: 5.189197e-01
REF  Iter = 39 Scaled Residual: 5.859845e-03 Max error: 1.000000e+00 tot_error: 5.077945e-01
REF  Iter = 40 Scaled Residual: 5.660937e-03 Max error: 1.000000e+00 tot_error: 4.967385e-01
REF  Iter = 41 Scaled Residual: 5.474462e-03 Max error: 1.000000e+00 tot_error: 4.857365e-01
REF  Iter = 42 Scaled Residual: 5.303830e-03 Max error: 1.000000e+00 tot_error: 4.747937e-01
REF  Iter = 43 Scaled Residual: 5.155532e-03 Max error: 1.000000e+00 tot_error: 4.639201e-01
REF  Iter = 44 Scaled Residual: 5.029801e-03 Max error: 1.000000e+00 tot_error: 4.531153e-01
REF  Iter = 45 Scaled Residual: 4.919530e-03 Max error: 1.000000e+00 tot_error: 4.423834e-01
REF  Iter = 46 Scaled Residual: 4.819439e-03 Max error: 1.000000e+00 tot_error: 4.317252e-01
REF  Iter = 47 Scaled Residual: 4.721621e-03 Max error: 1.000000e+00 tot_error: 4.211404e-01
REF  Iter = 48 Scaled Residual: 4.624066e-03 Max error: 1.000000e+00 tot_error: 4.106276e-01
REF  Iter = 49 Scaled Residual: 4.524096e-03 Max error: 1.000000e+00 tot_error: 4.001856e-01
REF  Iter = 50 Scaled Residual: 4.425263e-03 Max error: 1.000000e+00 tot_error: 3.898165e-01

Optimization...
Optimization time: 6.379558e-01 sec

Validation...

Optimized CG Setup...
Initial Residual: 1.130436e+04 Max_err: 1.000000e+00 tot_err: 1.158524e+04
Iteration = 1 Scaled Residual: 2.207820e-01 Max error: 1.000000e+00 tot_error: 9.848263e-01
Iteration = 2 Scaled Residual: 1.195819e-01 Max error: 1.000000e+00 tot_error: 9.706033e-01
Iteration = 3 Scaled Residual: 8.139207e-02 Max error: 1.000000e+00 tot_error: 9.566972e-01
Iteration = 4 Scaled Residual: 6.165808e-02 Max error: 1.000000e+00 tot_error: 9.429398e-01
Iteration = 5 Scaled Residual: 4.967278e-02 Max error: 1.000000e+00 tot_error: 9.292521e-01
Iteration = 6 Scaled Residual: 4.158389e-02 Max error: 1.000000e+00 tot_error: 9.156258e-01
Iteration = 7 Scaled Residual: 3.576953e-02 Max error: 1.000000e+00 tot_error: 9.020490e-01
Iteration = 8 Scaled Residual: 3.139175e-02 Max error: 1.000000e+00 tot_error: 8.885251e-01
Iteration = 9 Scaled Residual: 2.796567e-02 Max error: 1.000000e+00 tot_error: 8.750543e-01
Iteration = 10 Scaled Residual: 2.521453e-02 Max error: 1.000000e+00 tot_error: 8.616418e-01
Iteration = 11 Scaled Residual: 2.295672e-02 Max error: 1.000000e+00 tot_error: 8.482859e-01
Iteration = 12 Scaled Residual: 2.106689e-02 Max error: 1.000000e+00 tot_error: 8.349914e-01
Iteration = 13 Scaled Residual: 1.946574e-02 Max error: 1.000000e+00 tot_error: 8.217580e-01
Iteration = 14 Scaled Residual: 1.809035e-02 Max error: 1.000000e+00 tot_error: 8.085880e-01
Iteration = 15 Scaled Residual: 1.689473e-02 Max error: 1.000000e+00 tot_error: 7.954822e-01
Iteration = 16 Scaled Residual: 1.584746e-02 Max error: 1.000000e+00 tot_error: 7.824436e-01
Iteration = 17 Scaled Residual: 1.492268e-02 Max error: 1.000000e+00 tot_error: 7.694711e-01
Iteration = 18 Scaled Residual: 1.410121e-02 Max error: 1.000000e+00 tot_error: 7.565683e-01
Iteration = 19 Scaled Residual: 1.336749e-02 Max error: 1.000000e+00 tot_error: 7.437350e-01
Iteration = 20 Scaled Residual: 1.270845e-02 Max error: 1.000000e+00 tot_error: 7.309738e-01
Iteration = 21 Scaled Residual: 1.211470e-02 Max error: 1.000000e+00 tot_error: 7.182850e-01
Iteration = 22 Scaled Residual: 1.157777e-02 Max error: 1.000000e+00 tot_error: 7.056725e-01
Iteration = 23 Scaled Residual: 1.109062e-02 Max error: 1.000000e+00 tot_error: 6.931348e-01
Iteration = 24 Scaled Residual: 1.064638e-02 Max error: 1.000000e+00 tot_error: 6.806764e-01
Iteration = 25 Scaled Residual: 1.023819e-02 Max error: 1.000000e+00 tot_error: 6.682961e-01
Iteration = 26 Scaled Residual: 9.860575e-03 Max error: 1.000000e+00 tot_error: 6.559970e-01
Iteration = 27 Scaled Residual: 9.508176e-03 Max error: 1.000000e+00 tot_error: 6.437776e-01
Iteration = 28 Scaled Residual: 9.175130e-03 Max error: 1.000000e+00 tot_error: 6.316402e-01
Iteration = 29 Scaled Residual: 8.857803e-03 Max error: 1.000000e+00 tot_error: 6.195809e-01
Iteration = 30 Scaled Residual: 8.552436e-03 Max error: 1.000000e+00 tot_error: 6.076019e-01
Iteration = 31 Scaled Residual: 8.257667e-03 Max error: 1.000000e+00 tot_error: 5.956986e-01
Iteration = 32 Scaled Residual: 7.973539e-03 Max error: 1.000000e+00 tot_error: 5.838727e-01
Iteration = 33 Scaled Residual: 7.700550e-03 Max error: 1.000000e+00 tot_error: 5.721214e-01
Iteration = 34 Scaled Residual: 7.439293e-03 Max error: 1.000000e+00 tot_error: 5.604475e-01
Iteration = 35 Scaled Residual: 7.191580e-03 Max error: 1.000000e+00 tot_error: 5.488508e-01
Iteration = 36 Scaled Residual: 6.957137e-03 Max error: 1.000000e+00 tot_error: 5.373380e-01
Iteration = 37 Scaled Residual: 6.736682e-03 Max error: 1.000000e+00 tot_error: 5.259116e-01
Iteration = 38 Scaled Residual: 6.530211e-03 Max error: 1.000000e+00 tot_error: 5.145791e-01
Iteration = 39 Scaled Residual: 6.338444e-03 Max error: 1.000000e+00 tot_error: 5.033395e-01
Iteration = 40 Scaled Residual: 6.162254e-03 Max error: 1.000000e+00 tot_error: 4.921890e-01
Iteration = 41 Scaled Residual: 6.002128e-03 Max error: 1.000000e+00 tot_error: 4.811143e-01
Iteration = 42 Scaled Residual: 5.853732e-03 Max error: 1.000000e+00 tot_error: 4.701054e-01
Iteration = 43 Scaled Residual: 5.711831e-03 Max error: 1.000000e+00 tot_error: 4.591599e-01
Iteration = 44 Scaled Residual: 5.572338e-03 Max error: 1.000000e+00 tot_error: 4.482895e-01
Iteration = 45 Scaled Residual: 5.436712e-03 Max error: 1.000000e+00 tot_error: 4.374971e-01
Iteration = 46 Scaled Residual: 5.305107e-03 Max error: 1.000000e+00 tot_error: 4.267747e-01
Iteration = 47 Scaled Residual: 5.173265e-03 Max error: 1.000000e+00 tot_error: 4.161159e-01
Iteration = 48 Scaled Residual: 5.039648e-03 Max error: 1.000000e+00 tot_error: 4.055277e-01
Iteration = 49 Scaled Residual: 4.909518e-03 Max error: 9.999999e-01 tot_error: 3.950059e-01
Iteration = 50 Scaled Residual: 4.781372e-03 Max error: 9.999998e-01 tot_error: 3.845456e-01
Iteration = 51 Scaled Residual: 4.656561e-03 Max error: 9.999997e-01 tot_error: 3.741512e-01
Iteration = 52 Scaled Residual: 4.539125e-03 Max error: 9.999994e-01 tot_error: 3.638202e-01
Iteration = 53 Scaled Residual: 4.427002e-03 Max error: 9.999989e-01 tot_error: 3.535533e-01
Iteration = 54 Scaled Residual: 4.322927e-03 Max error: 9.999978e-01 tot_error: 3.433555e-01

Starting Benchmarking Phase...
Performing 311 CG sets	 expected time: 3660.0 seconds	 expected Perf:     210.2 GF (26.3 GF_per)
2019-11-13 19:32:47.686
progress = 1.3% 	   47.2 / 3660.0 sec elapsed 	 3612.8 sec remain 	   210.238 GF 	 26.280 GF_per
progress = 2.6% 	   94.4 / 3660.0 sec elapsed 	 3565.6 sec remain 	   210.222 GF 	 26.278 GF_per
progress = 3.9% 	  141.6 / 3660.0 sec elapsed 	 3518.4 sec remain 	   210.216 GF 	 26.277 GF_per
progress = 5.2% 	  188.9 / 3660.0 sec elapsed 	 3471.1 sec remain 	   210.221 GF 	 26.278 GF_per
progress = 6.5% 	  236.1 / 3660.0 sec elapsed 	 3423.9 sec remain 	   210.220 GF 	 26.278 GF_per
progress = 7.7% 	  283.3 / 3660.0 sec elapsed 	 3376.7 sec remain 	   210.223 GF 	 26.278 GF_per
progress = 9.0% 	  330.5 / 3660.0 sec elapsed 	 3329.5 sec remain 	   210.221 GF 	 26.278 GF_per
progress = 10.3% 	  377.7 / 3660.0 sec elapsed 	 3282.3 sec remain 	   210.223 GF 	 26.278 GF_per
progress = 11.6% 	  424.9 / 3660.0 sec elapsed 	 3235.1 sec remain 	   210.223 GF 	 26.278 GF_per
progress = 12.9% 	  472.1 / 3660.0 sec elapsed 	 3187.9 sec remain 	   210.224 GF 	 26.278 GF_per
progress = 14.2% 	  519.4 / 3660.0 sec elapsed 	 3140.6 sec remain 	   210.224 GF 	 26.278 GF_per
progress = 15.5% 	  566.6 / 3660.0 sec elapsed 	 3093.4 sec remain 	   210.226 GF 	 26.278 GF_per
progress = 16.8% 	  613.8 / 3660.0 sec elapsed 	 3046.2 sec remain 	   210.227 GF 	 26.278 GF_per
progress = 18.1% 	  661.0 / 3660.0 sec elapsed 	 2999.0 sec remain 	   210.226 GF 	 26.278 GF_per
progress = 19.3% 	  708.2 / 3660.0 sec elapsed 	 2951.8 sec remain 	   210.226 GF 	 26.278 GF_per
progress = 20.6% 	  755.4 / 3660.0 sec elapsed 	 2904.6 sec remain 	   210.224 GF 	 26.278 GF_per
progress = 21.9% 	  802.6 / 3660.0 sec elapsed 	 2857.4 sec remain 	   210.226 GF 	 26.278 GF_per
progress = 23.2% 	  849.8 / 3660.0 sec elapsed 	 2810.2 sec remain 	   210.226 GF 	 26.278 GF_per
progress = 24.5% 	  897.1 / 3660.0 sec elapsed 	 2762.9 sec remain 	   210.226 GF 	 26.278 GF_per
progress = 25.8% 	  944.3 / 3660.0 sec elapsed 	 2715.7 sec remain 	   210.225 GF 	 26.278 GF_per
progress = 27.1% 	  991.5 / 3660.0 sec elapsed 	 2668.5 sec remain 	   210.225 GF 	 26.278 GF_per
progress = 28.4% 	 1038.7 / 3660.0 sec elapsed 	 2621.3 sec remain 	   210.226 GF 	 26.278 GF_per
progress = 29.7% 	 1085.9 / 3660.0 sec elapsed 	 2574.1 sec remain 	   210.225 GF 	 26.278 GF_per
progress = 31.0% 	 1133.1 / 3660.0 sec elapsed 	 2526.9 sec remain 	   210.226 GF 	 26.278 GF_per
progress = 32.2% 	 1180.3 / 3660.0 sec elapsed 	 2479.7 sec remain 	   210.227 GF 	 26.278 GF_per
progress = 33.5% 	 1227.5 / 3660.0 sec elapsed 	 2432.5 sec remain 	   210.227 GF 	 26.278 GF_per
progress = 34.8% 	 1274.8 / 3660.0 sec elapsed 	 2385.2 sec remain 	   210.227 GF 	 26.278 GF_per
progress = 36.1% 	 1322.0 / 3660.0 sec elapsed 	 2338.0 sec remain 	   210.227 GF 	 26.278 GF_per
progress = 37.4% 	 1369.2 / 3660.0 sec elapsed 	 2290.8 sec remain 	   210.228 GF 	 26.278 GF_per
progress = 38.7% 	 1416.4 / 3660.0 sec elapsed 	 2243.6 sec remain 	   210.228 GF 	 26.278 GF_per
progress = 40.0% 	 1463.6 / 3660.0 sec elapsed 	 2196.4 sec remain 	   210.228 GF 	 26.279 GF_per
progress = 41.3% 	 1510.8 / 3660.0 sec elapsed 	 2149.2 sec remain 	   210.228 GF 	 26.279 GF_per
progress = 42.6% 	 1558.0 / 3660.0 sec elapsed 	 2102.0 sec remain 	   210.228 GF 	 26.279 GF_per
progress = 43.9% 	 1605.2 / 3660.0 sec elapsed 	 2054.8 sec remain 	   210.229 GF 	 26.279 GF_per
progress = 45.1% 	 1652.4 / 3660.0 sec elapsed 	 2007.6 sec remain 	   210.230 GF 	 26.279 GF_per
progress = 46.4% 	 1699.7 / 3660.0 sec elapsed 	 1960.3 sec remain 	   210.230 GF 	 26.279 GF_per
progress = 47.7% 	 1746.9 / 3660.0 sec elapsed 	 1913.1 sec remain 	   210.230 GF 	 26.279 GF_per
progress = 49.0% 	 1794.1 / 3660.0 sec elapsed 	 1865.9 sec remain 	   210.231 GF 	 26.279 GF_per
progress = 50.3% 	 1841.3 / 3660.0 sec elapsed 	 1818.7 sec remain 	   210.231 GF 	 26.279 GF_per
progress = 51.6% 	 1888.5 / 3660.0 sec elapsed 	 1771.5 sec remain 	   210.231 GF 	 26.279 GF_per
progress = 52.9% 	 1935.7 / 3660.0 sec elapsed 	 1724.3 sec remain 	   210.231 GF 	 26.279 GF_per
progress = 54.2% 	 1982.9 / 3660.0 sec elapsed 	 1677.1 sec remain 	   210.231 GF 	 26.279 GF_per
progress = 55.5% 	 2030.1 / 3660.0 sec elapsed 	 1629.9 sec remain 	   210.231 GF 	 26.279 GF_per
progress = 56.8% 	 2077.4 / 3660.0 sec elapsed 	 1582.6 sec remain 	   210.231 GF 	 26.279 GF_per
progress = 58.0% 	 2124.6 / 3660.0 sec elapsed 	 1535.4 sec remain 	   210.231 GF 	 26.279 GF_per
progress = 59.3% 	 2171.8 / 3660.0 sec elapsed 	 1488.2 sec remain 	   210.231 GF 	 26.279 GF_per
progress = 60.6% 	 2219.0 / 3660.0 sec elapsed 	 1441.0 sec remain 	   210.231 GF 	 26.279 GF_per
progress = 61.9% 	 2266.2 / 3660.0 sec elapsed 	 1393.8 sec remain 	   210.231 GF 	 26.279 GF_per
progress = 63.2% 	 2313.4 / 3660.0 sec elapsed 	 1346.6 sec remain 	   210.232 GF 	 26.279 GF_per
progress = 64.5% 	 2360.6 / 3660.0 sec elapsed 	 1299.4 sec remain 	   210.232 GF 	 26.279 GF_per
progress = 65.8% 	 2407.8 / 3660.0 sec elapsed 	 1252.2 sec remain 	   210.233 GF 	 26.279 GF_per
progress = 67.1% 	 2455.0 / 3660.0 sec elapsed 	 1205.0 sec remain 	   210.233 GF 	 26.279 GF_per
progress = 68.4% 	 2502.2 / 3660.0 sec elapsed 	 1157.8 sec remain 	   210.233 GF 	 26.279 GF_per
progress = 69.7% 	 2549.5 / 3660.0 sec elapsed 	 1110.5 sec remain 	   210.233 GF 	 26.279 GF_per
progress = 70.9% 	 2596.7 / 3660.0 sec elapsed 	 1063.3 sec remain 	   210.234 GF 	 26.279 GF_per
progress = 72.2% 	 2643.9 / 3660.0 sec elapsed 	 1016.1 sec remain 	   210.234 GF 	 26.279 GF_per
progress = 73.5% 	 2691.1 / 3660.0 sec elapsed 	  968.9 sec remain 	   210.234 GF 	 26.279 GF_per
progress = 74.8% 	 2738.3 / 3660.0 sec elapsed 	  921.7 sec remain 	   210.234 GF 	 26.279 GF_per
progress = 76.1% 	 2785.5 / 3660.0 sec elapsed 	  874.5 sec remain 	   210.233 GF 	 26.279 GF_per
progress = 77.4% 	 2832.7 / 3660.0 sec elapsed 	  827.3 sec remain 	   210.234 GF 	 26.279 GF_per
progress = 78.7% 	 2879.9 / 3660.0 sec elapsed 	  780.1 sec remain 	   210.233 GF 	 26.279 GF_per
progress = 80.0% 	 2927.1 / 3660.0 sec elapsed 	  732.9 sec remain 	   210.234 GF 	 26.279 GF_per
progress = 81.3% 	 2974.3 / 3660.0 sec elapsed 	  685.7 sec remain 	   210.234 GF 	 26.279 GF_per
progress = 82.6% 	 3021.6 / 3660.0 sec elapsed 	  638.4 sec remain 	   210.234 GF 	 26.279 GF_per
progress = 83.8% 	 3068.8 / 3660.0 sec elapsed 	  591.2 sec remain 	   210.234 GF 	 26.279 GF_per
progress = 85.1% 	 3116.0 / 3660.0 sec elapsed 	  544.0 sec remain 	   210.234 GF 	 26.279 GF_per
progress = 86.4% 	 3163.2 / 3660.0 sec elapsed 	  496.8 sec remain 	   210.234 GF 	 26.279 GF_per
progress = 87.7% 	 3210.4 / 3660.0 sec elapsed 	  449.6 sec remain 	   210.235 GF 	 26.279 GF_per
progress = 89.0% 	 3257.6 / 3660.0 sec elapsed 	  402.4 sec remain 	   210.235 GF 	 26.279 GF_per
progress = 90.3% 	 3304.8 / 3660.0 sec elapsed 	  355.2 sec remain 	   210.235 GF 	 26.279 GF_per
progress = 91.6% 	 3352.0 / 3660.0 sec elapsed 	  308.0 sec remain 	   210.235 GF 	 26.279 GF_per
progress = 92.9% 	 3399.2 / 3660.0 sec elapsed 	  260.8 sec remain 	   210.235 GF 	 26.279 GF_per
progress = 94.2% 	 3446.4 / 3660.0 sec elapsed 	  213.6 sec remain 	   210.235 GF 	 26.279 GF_per
progress = 95.5% 	 3493.7 / 3660.0 sec elapsed 	  166.3 sec remain 	   210.236 GF 	 26.279 GF_per
progress = 96.7% 	 3540.9 / 3660.0 sec elapsed 	  119.1 sec remain 	   210.236 GF 	 26.279 GF_per
progress = 98.0% 	 3588.1 / 3660.0 sec elapsed 	   71.9 sec remain 	   210.236 GF 	 26.279 GF_per
progress = 99.3% 	 3635.3 / 3660.0 sec elapsed 	   24.7 sec remain 	   210.236 GF 	 26.279 GF_per

Completed Benchmarking Phase... elapsed time: 3671.2 seconds
2019-11-13 20:33:58.935

Number of CG sets:	311
Iterations per set:	54
scaled res mean:	4.322927e-03
scaled res variance:	0.000000e+00

Total Time: 3.671209e+03 sec
Setup        Overhead: 0.51%
Optimization Overhead: 0.54%
Convergence  Overhead: 7.41%

2x2x2 process grid
256x256x256 local domain
SpMV  =  178.8 GF (1125.7 GB/s Effective)   22.3 GF_per ( 140.7 GB/s Effective)
SymGS =  248.6 GF (1919.1 GB/s Effective)   31.1 GF_per ( 239.9 GB/s Effective)
total =  229.4 GF (1739.8 GB/s Effective)   28.7 GF_per ( 217.5 GB/s Effective)
final =  210.2 GF (1594.1 GB/s Effective)   26.3 GF_per ( 199.3 GB/s Effective)

end of application...
2019-11-13 20:33:59.268
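
As a quick sanity check on the summary above: the per-process figure should simply be the final score divided by the number of ranks in the 2x2x2 grid, and the global problem is the local 256x256x256 domain replicated across that grid. Here is a minimal shell sketch with the numbers copied by hand from the log (the variable names are mine, not HPCG's):

```shell
#!/bin/bash
# Values copied by hand from the HPCG summary above.
px=2; py=2; pz=2            # "2x2x2 process grid"
nx=256; ny=256; nz=256      # "256x256x256 local domain"
final_gf=210.2              # "final" line, in GF

# Number of MPI ranks and total grid points of the global problem.
ranks=$((px * py * pz))
global_points=$((px * nx * py * ny * pz * nz))

# Bash has no floating point, so awk does the division.
per_rank_gf=$(awk -v gf="$final_gf" -v r="$ranks" 'BEGIN { printf "%.1f", gf / r }')

echo "ranks=$ranks global_points=$global_points per_rank_gf=$per_rank_gf"
# prints: ranks=8 global_points=134217728 per_rank_gf=26.3
```

The 26.3 GF per rank matches the GF_per column of the "final" line, so the aggregate and per-process numbers in the log agree with each other.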

Appendix

A few things I keep having to look up, collected here.

Slurm syntax
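
For reference, these are the directives that the job scripts in this write-up rely on. The sketch below is a hypothetical template that can be dry-run, not one of the scripts actually submitted; the job name and node count are placeholders. `SLURM_JOB_NUM_NODES` and `SLURM_CPUS_PER_TASK` are environment variables Slurm sets inside a job (the latter only when `--cpus-per-task` is given), so the fallbacks let the same file run in a plain shell as well.

```shell
#!/bin/bash
#SBATCH -J demo_job               # job name (placeholder)
#SBATCH -N 2                      # number of nodes (placeholder)
#SBATCH --ntasks-per-node=1       # one MPI process per node
#SBATCH --cpus-per-task=20        # cores given to each process

# Inside a job Slurm exports these; outside, fall back to 1 for a dry run.
nodes=${SLURM_JOB_NUM_NODES:-1}
threads=${SLURM_CPUS_PER_TASK:-1}

# Derive the OpenMP thread count from the allocation instead of hardcoding it.
export OMP_NUM_THREADS=$threads

echo "nodes=$nodes threads_per_task=$threads"
```

Taking `OMP_NUM_THREADS` from the allocation keeps the thread count and `--cpus-per-task` from drifting apart when one of them is edited.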

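Hardware configuration of a single node

TH-2G is the K80 cluster of Tianhe-2. Each node has two Intel Xeon E5-2660 v3 processors, 256 GB of memory, and four NVIDIA Tesla K80 GPU devices; the interconnect is 56 Gb InfiniBand.

I checked, and each node does indeed have 20 cores.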

$ yhcontrol show node gn21
NodeName=gn21 Arch=x86_64 CoresPerSocket=10
   CPUAlloc=0 CPUErr=0 CPUTot=20 CPULoad=1.84 Features=(null)
   Gres=(null)
   NodeAddr=gn21 NodeHostName=gn21
   OS=Linux RealMemory=256000 AllocMem=0 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1
   BootTime=2019-08-05T09:29:10 SlurmdStartTime=2019-08-05T09:29:34
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
$ nvidia-smi
Wed Nov 13 20:41:55 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.30                 Driver Version: 390.30                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:04:00.0 Off |                    0 |
| N/A   51C    P0    59W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           Off  | 00000000:05:00.0 Off |                    0 |
| N/A   40C    P0    71W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K80           Off  | 00000000:84:00.0 Off |                    0 |
| N/A   50C    P0    60W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K80           Off  | 00000000:85:00.0 Off |                    0 |
| N/A   39C    P0    71W / 149W |      0MiB / 11441MiB |     98%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
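
The node description above is enough for a back-of-the-envelope CPU peak, which is handy when judging how good a Linpack score is. The sketch below pulls the core count out of `yhcontrol show node` style output (a trimmed copy of the listing above is embedded as sample data) and multiplies it out. The 2.6 GHz clock is the nominal base frequency, and the 16 double-precision FLOPs per cycle per core is my assumption for AVX2 with FMA, so treat the result as an upper bound: the clock under sustained AVX load is typically lower. `node_field` is a helper name made up for this example.

```shell
#!/bin/bash
# Extract one KEY=VALUE field from `yhcontrol show node` style output on stdin.
node_field() {   # usage: node_field KEY < output
  tr ' ' '\n' | awk -F= -v key="$1" '$1 == key { print $2; exit }'
}

# Trimmed sample of the listing above.
sample='NodeName=gn21 Arch=x86_64 CoresPerSocket=10
   CPUAlloc=0 CPUErr=0 CPUTot=20 CPULoad=1.84 Features=(null)
   OS=Linux RealMemory=256000 AllocMem=0 Sockets=2 Boards=1'

cores_per_socket=$(printf '%s\n' "$sample" | node_field CoresPerSocket)
sockets=$(printf '%s\n' "$sample" | node_field Sockets)
cores=$((cores_per_socket * sockets))

ghz=2.6              # ASSUMPTION: nominal base clock, not the AVX clock
flops_per_cycle=16   # ASSUMPTION: AVX2 FMA, double precision
nodes=4              # node count of the CPU-only run

# Peak = cores x GHz x FLOPs per cycle, in GFLOPS.
peak_node=$(awk -v c="$cores" -v f="$ghz" -v w="$flops_per_cycle" \
  'BEGIN { printf "%.0f", c * f * w }')
peak_total=$(awk -v p="$peak_node" -v n="$nodes" 'BEGIN { printf "%.0f", p * n }')

echo "cores=$cores peak_node=${peak_node} GFLOPS peak_total=${peak_total} GFLOPS"
```

Under these assumptions one node tops out at 832 GFLOPS and four nodes at 3328 GFLOPS; dividing a measured HPL result by that total gives a rough efficiency figure.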

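Modules available on TH-2G nodes

I kept having to check whether a given package is installed on the cluster and which version it is, and the listing is far too long to query every time. The full output is pasted below so I don't have to look it up again.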

$ module av
------- /BIGDATA1/app_GPU/modulefiles -------
anaconda2/5.1.0
anaconda3/5.1.0
bazel/0.9.0
bazel/0.13.0
caffe/1.0.0
CUDA/7.5
CUDA/8.0
cudnn/5.1-CUDA8.0
cudnn/6.0-CUDA8.0
deeplearning/18Q2
deeplearning/18Q2_py36
expect/5.45.3
intelcompiler/13.0.1
intelcompiler/14.0.2
intelcompiler/15.0.1
intelcompiler/17.0.6
intelcompiler/18.0.0
intelcompiler/mkl-14
intelcompiler/mkl-15
jdk/8u141-gcc-4.8.5
MPICH/Gnu/3.2-gcc4.8.5-dyn-noglex
mvapich2/2.2-gcc4.8.5
mvapich2/2.2-icc14
mvapich2/2.2-pgi17.1
nccl/2.1.15-cuda8.0
opencv/3.3.0
openmpi/1.10.2
openmpi/2.1.1-gcc4.8.5
PGIcompiler/17.1
protobuf/3.2.0
Python/2.7.14-anaconda2
Python/3.6.4-anaconda2
PyTorch/0.5a
tcl/8.4.19
tcl/8.6.8-gcc-4.8.5
TensorFlow/1.3-gpu-py2.7
TensorFlow/1.3-gpu-py3.6
TensorFlow/1.4-gpu-py3.6
TensorFlow/1.5-gpu-py2.7
TensorFlow/1.6-gpu-py2.7

--------- /BIGDATA1/app/modulefiles ---------
abacus/2.0
abinit/7.10.4-icc-14.0.2
ActiveTcl/8.6.4.1.299124-gcc-4.8.5
anaconda2/4.2.0-gcc-4.8.5
anaconda2/5.0.1-gcc-4.8.5
anaconda2/5.2.0-gcc-4.8.5
anaconda3/4.2.0-gcc-4.8.5
anaconda3/5.0.1-gcc-4.8.5
annovar/20160201
apache-ant/1.9.9-gcc-4.8.5
apr-util/1.6.0-gcc-4.8.5
apr/1.6.2-gcc-4.8.5
apr/1.6.2-icc-14.0.2
arpack/96-gcc-4.8.5
arpack/96-icc-14.0.2
arpack/96-icc-15.0.1
atk/2.16.0-gcc-4.8.5
atlas/3.10.2-gcc-4.8.5
atompaw/4.0.1.0-gcc-4.8.5
autoconf/2.69-gcc-4.8.5
autodock-vina/1.1.2-gcc-4.8.5
autodock/4.2.6-gcc-4.8.5
automake/1.15.1-gcc-4.8.5
bamtools/2.5.1-gcc-4.8.5
bazel/2.0-gcc-4.9.2
bcbio/1.0.9
bcftools/1.3.1-gcc-4.8.5
bcl2fastq/2.20-gcc-4.8.5
bdw-gc/7.6.0-gcc-4.8.5
bdw-gc/7.6.0-icc-14.0.2
beagle/2.1-gcc-4.8.5
beast/1.8.4-gcc-4.8.5
beast/2.4.8-gcc-4.8.5
bedops/2.4.20-gcc-4.9.2
bedtools2/2.23.0-gcc-4.8.5
bedtools2/2.25.0-gcc-4.8.5
bedtools2/2.26.0-gcc-4.8.5
BerkeleyGW/1.2.0-icc-14
BerkeleyGW/2.0.0-icc-14
binutils/2.29.1-gcc-4.8.5
bismark/0.19.0-gcc-4.8.5
bison/3.0.4-gcc-4.8.5
bison/3.0.4-icc-14.0.2
blacs/1.1-icc-14.0.2-mpich
blas/3.5.0-gcc-4.8.5
blas/3.5.0-icc-14.0.2
blas/3.5.0-icc-15.0.1
blasr/5.3-gcc-7.2.0
blast/2.2.30-gcc-4.8.5
blast/2.5.0-gcc-4.8.5
blat/36-gcc-4.8.5
blitz/0.10-gcc-4.8.5
boost/1.41.0-gcc-4.4.7
boost/1.41.0-gcc-4.8.5
boost/1.41.0-icc-13.0.1-mpich-3.2.1
boost/1.41.0-icc-14.0.2-mpich-3.2.1
boost/1.54.0-gcc-4.8.5
boost/1.59.0-gcc-5.3.0
boost/1.59.0-icc-15.0.1-mpich-3.2.1
boost/1.65.0-gcc-4.8.5
boost/1.66.0-gcc-4.8.5
boost/1.66.0-icc-14.0.2-mpich-3.2.1
bowtie/1.2.2-gcc-4.8.5
bowtie2/2.3.4.1-gcc-4.8.5
breakdancer/20160325-gcc-4.8.5
bsddb/4.7.25-gcc-4.8.5
bwa/0.7.10-gcc-4.8.5
bwa/0.7.12-gcc-4.8.5
bwa/0.7.15-gcc-4.8.5-bwakit
bzip2/1.0.6-gcc-4.8.5
bzip2/1.0.6-icc-14.0.2
cairo/1.14.12-gcc-4.8.5
CAMx/6.30-icc-14.0.2-mpich-3.2.1
canu/1.8-gcc-4.8.5
cblas/3.5.0-gcc-4.8.5
cdo/1.7.1-icc-14.0.2
cdo/1.7.2-icc-14.0.2
cdo/1.7.2-icc-14.0.2-netcdf_4.3.2
cellsys/5.0-gcc-4.8.5
cesm/1.2.2
cgnslib/3.2.1-gcc-4.8.5
chemsh/3.6.0-icc-14.0.2-mpich-3.2.1
cif2cell/1.2.10-gcc-4.8.5-python-2.7.9-fPIC
clapack/3.2.1-gcc-4.8.5
cmake/3.0.2-gcc-4.8.5
cmake/3.10.1-gcc-4.8.5
cmake/3.12.3-gcc-4.8.5
cmake/3.13.3-gcc-4.9.2
CNVnator/0.3.3
code_saturne/4.0.1-icc-14.0.2
code_saturne/4.0.3-icc-14.0.2
code_saturne/4.0.3-med-icc-14.0.2
code_saturne/4.0.4-icc-14.0.2
code_saturne/4.0.5-icc-14.0.2
collectl/4.3.0-gcc-4.8.5
Control-FREEC/9.2-gcc-4.8.5
copasi/4.15-gcc-4.8.5
cp2k/2.6.2-icc-14.0.2-popt
cp2k/3.0-icc-14.0.2
cp2k/4.1-icc-14.0.2
cp2k/5.1-icc-14.0.2
cp2k/6.1-icc-14.0.2
cube/4.3.5-icc-14.0.2-mpich-3.2.1
cufflinks/2.2.1-gcc-4.8.5
curl/7.49.0-gcc-4.4.7
curl/7.58.0-gcc-4.8.5
curl/7.58.0-icc-13.0.1
curl/7.58.0-icc-14.0.2
curl/7.62.0-gcc-5.4.0
curl/7.62.0-gcc-7.2.0
cutadapt/1.15-gcc-4.8.5
damageproto/1.2.1-gcc-4.8.5
deal.II/8.4.1-icc-15.0.1
deeptools/3.2.0
DFTB+/19.1-icc-17.0.6
diamond/0.9.17
DL_POLY/4.08-icc-14.0.2-mpi
eclipse/3.8.5
ELPA/2016.05.004-intel-15
emacs/24.3-gcc-4.8.5
emboss/6.6.0-gcc-4.8.5
esmf/6.3.0rp1-icc-14.0.2
esmf/7.0.0-icc-13.0.1-NC
esmf/7.0.0-icc-14.0.2
esmf/7.0.0-icc-14.0.2-debug
esmf/7.0.0-icc-14.0.2-NC
EXCAVATOR/2.2
exciting/2018.8-icc-14-mkl
ExomeCNVTest/0.51
expat/2.0.0
expat/2.2.2-gcc-4.8.5
fastqc/0.11.7-gcc-4.8.5
fastx_toolkit/0.0.14-gcc-4.8.5
fds/6.3.0-gcc-4.8.5
ffmpeg/3.4-gcc-4.8.5
fftw/2.1.5-icc-14.0.2
fftw/3.3.4-gcc-4.8.5-mpi
fftw/3.3.4-icc-14-double-avx
fftw/3.3.4-icc-14-double-avx-sse2
fftw/3.3.4-icc-14-float
fftw/3.3.4-icc-14-single
fftw/3.3.4-icc-14-single-avx
fftw/3.3.4-icc-14-single-avx-sse2
fftw/3.3.4-icc-14.0.2-mpi-fPIC
fftw/3.3.4-single-avx-sse2
fftw/3.3.5-icc-14-double
fftw/3.3.7-icc-14.0.2
fftw/3.3.7-icc-15-fma
fftw/3.3.8-icc-15-mpi
fish/2.1.1-gcc-4.8.5
fish/2.2.0-gcc-4.8.5
fish/2.6.0-gcc-4.8.5
FishingCNV/1.5.3-gcc-4.8.5
fixesproto/5.0-gcc-4.8.5
flex/2.5.39-gcc-4.8.5
flex/2.6.4-gcc-4.8.5
Flye/2.3.2-gcc-4.8.5
font-util/1.3.1-gcc-4.8.5
font-util/1.3.1-icc-14.0.2
fontconfig/2.12.3-gcc-4.8.5
freesurfer/5.3.0
freetype/2.6-gcc-4.8.5
freetype/2.7.1-gcc-4.8.5
fsl/5.0.9
FVCOM-lib/4.1-icc-14.0.2
FVCOM/4.1-icc-14.0.2
Gamess_USA/may2-2013
GATK/3.7-gcc-4.8.5
GATK/3.8-gcc-4.8.5
GATK/4.0.2.1
gcc/4.4.7
gcc/4.6.3
gcc/4.7.4
gcc/4.8.5
gcc/4.9.2
gcc/5.2.0
gcc/5.3.0
gcc/5.4.0
gcc/6.4.0
gcc/7.2.0
gcta/1.26.0
gdal/2.1.0-icc-14.0.2
gdb-server/7.6.1
gdbm/1.14.1-gcc-4.8.5
gdk-pixbuf/2.31.2-gcc-4.8.5
gemma/0.96
genewise/2.4.1
gengetopt/2.22.6-gcc-4.8.5
geos/3.5.0-icc-14.0.2
get_homologues/20170302
gettext/0.19.8.1-gcc-4.8.5
gflags/2.1.2-gcc-4.8.5
gflags/2.2.1-gcc-4.8.5
ghostscript/9.21-gcc-4.8.5
glib/2.44.1-gcc-4.8.5
glib/2.55.1-gcc-4.8.5
glibc/2.17-gcc-4.6.3
glibc/2.17-gcc-4.8.5
globus/6.0.0-icc-14.0.2
glog/0.3.3-gcc-4.8.5
glog/0.3.5-gcc-4.8.5
glproto/1.4.17-gcc-4.8.5
glue/1.46-python2.7.9
gmap/20160404
gmp/4.2.4
gmp/6.1.2-gcc-4.8.5
gmt/5.2.1-icc-14.0.2
gnuplot/5.0.5-gcc-4.8.5
gobject-introspection/1.49.2-gcc-4.8.5
gperf/3.0.4-gcc-4.8.5
grads/2.0.2
grads/2.2.0
grass/7.0.4-icc-14.0.2
gri/2.12.23-icc-14.0.2
groff/1.22.1-gcc-4.8.5
gromacs/4.5.5-icc-14.0.2-double
gromacs/5.0.4-icc-14-single
gromacs/5.0.4-icc-14-single-serial
gromacs/5.0.5-icc-14-double
gromacs/5.0.5-icc-14-double-avx-256
gromacs/5.0.5-icc-14-single
gromacs/5.1.4-icc-14-double-avx-256
gromacs/2018.3-icc-14-single-GPU-cuda8
gromacs/2018.3-icc-15-double
gromacs/2018.3-icc-15-single-gpu
gromacs/2018.3-icc-17-single
gsl/1.16-icc-13.0.1
gsl/1.16-icc-14.0.2
gsl/2.1-gcc-4.8.5
gsl/2.4-gcc-4.8.5
gsl/2.5-icc-14.0.2
guile/2.2.0-gcc-4.8.5
harfbuzz/1.4.6-gcc-4.8.5
hdf5/1.8.9-icc-14.0.2
hdf5/1.8.11-icc-13.0.1
hdf5/1.8.11-icc-14.0.2
hdf5/1.8.11-icc-15.0.1
hdf5/1.8.12-icc-14.0.2
hdf5/1.8.12-icc-14.0.2-serial
hdf5/1.8.12-icc-15.0.1-parallel
hdf5/1.8.13-02-icc-14.0.2
hdf5/1.8.13-gcc-4.8.5
hdf5/1.8.13-gcc-4.8.5-parallel
hdf5/1.8.13-icc-14.0.2
hdf5/1.8.13-icc-14.0.2-serial
hdf5/1.8.13-icc-15.0.1
hdf5/1.8.17-icc-14.0.2
hdf5/1.8.17-icc-15.0.1-parallel
hdf5/1.8.18-icc-14.0.2
hdf5/1.8.20-gcc-4.8.5
hdf5/1.8.20-icc-13.0.1
hdf5/1.8.20-icc-14.0.2
hdf5/1.8.20-icc-14.0.2-with-cxx
hdf5/1.8.21-gcc-5.4.0-parallel
hdf5/1.8.21-gcc-7.2.0-parallel
hdf5/1.8.21-icc-18.0.0-par
hdf5/1.10.1-gcc-4.8.5
hdf5/1.10.1-icc-15.0.1
hdf5/1.10.4-gcc-5.2.0
hdf5/1.10.4-icc-14.0.2-parallel
help2man/1.47.4-gcc-4.8.5
hisat2/2.1.0-gcc-4.8.5
hmmer/3.1b2-icc-14.0.2
htslib/1.9-gcc-4.8.5
hypre/2.10.1-icc-14.0.2
hypre/2.10.1-icc-15.0.1
icu4c/60.1-gcc-4.8.5
ilmbase/2.2.0-gcc-4.8.5
ilmbase/2.2.0-icc-14.0.2
ImageMagick/6.9.2-5
ImageMagick/7.0.7-5
impute2/2.3.2_static
inputproto/2.3.2-gcc-4.8.5
intel-tbb/2018.2-gcc-4.8.5
intelcompiler/13.0.1
intelcompiler/14.0.2
intelcompiler/15.0.1
intelcompiler/17.0.6
intelcompiler/18.0.0
intelcompiler/mkl-14
intelcompiler/mkl-15
isl/0.16.1-gcc-4.8.5
jasmin/3.2.5
jasper/1.900.1
jasper/1.900.1-gcc-4.8.5-withpng
jasper/1.900.1-icc-14.0.2-withpng
jasper/1.900.1-icc-15.0.1-withpng
jasper/1.900.1-with-libpng
java/1.8_jdk8u141
jdk/7u15-gcc-4.8.5
jdk/7u71-gcc-4.8.5
jdk/8u141-gcc-4.8.5
jpeg/9c-icc-14.0.2
jpeg/9c-icc-15.0.1
julia/0.4.5-icc-14.0.2
kbproto/1.0.7-gcc-4.8.5
kggseq/20170318
kggseq/20170328
kim-api/1.7.1-icc-14.0.2
lammps/1Dec12-icc-14.0.2
lammps/7Dec15-icc-14.0.2
lammps/12Dec18-icc-15.0.1
lammps/31Mar17-icc-14.0.2
lapack/3.5.0-gcc-4.8.5
lapack/3.5.0-icc-14.0.2
lapack/3.5.0-icc-15.0.1
lapack/3.8.0-gcc-7.2.0
lastz/1.02.00-gcc-4.8.5
lastz/1.04.00-gcc-4.8.5
lcms/2.8-gcc-4.8.5
lefse/1.0.8
leveldb/1.15-gcc-4.8.5
leveldb/1.20-gcc-4.8.5
libarchive/3.3.3-icc-14.0.2
libatomic-ops/7.4.4-gcc-4.8.5
libatomic-ops/7.4.4-icc-14.0.2
libbsd/0.8.6-gcc-4.8.5
libconfig/1.5-gcc-4.8.5
libedit/3.1-gcc-4.8.5
libevent/2.0.22-icc-14.0.2
libffi/3.2.1-gcc-4.8.5
libffi/3.2.1-icc-14.0.2
libgcrypt/1.8.1-gcc-4.8.5
libgd/2.1.0-icc-14.0.2
libgd/2.2.4-gcc-4.8.5
libgpg-error/1.27-gcc-4.8.5
libgphoto2/2.5.8-gcc-4.8.5
libiconv/1.15-gcc-4.8.5
libint/1.1.5-icc-14.0.2
libint/2.0.3-icc-14.0.2
libint/2.4.2-icc-15
libjpeg-turbo/1.5.0-gcc-4.8.5
libjpeg/6b-gcc-4.8.5
libjpeg/9b-gcc-4.8.5
libjpeg/9b-icc-14.0.2
libmng/2.0.3-gcc-4.8.5
libpng/1.2.56-icc-14.0.2
libpng/1.6.29-gcc-4.8.5
libpng/1.6.34-gcc-4.8.5
libpthread-stubs/0.4-gcc-4.8.5
libsigsegv/2.11-gcc-4.8.5
libsigsegv/2.11-icc-14.0.2
libsvm/322-gcc-4.8.5
libtiff/4.0.8-gcc-4.8.5
libtool/2.4.6-gcc-4.8.5
libunistring/0.9.7-gcc-4.8.5
libunwind/1.1-gcc-4.8.5
libx11/1.6.4-gcc-4.8.5
libx11/1.6.5-gcc-4.8.5
libxau/1.0.8-gcc-4.8.5
libxaw/1.0.13
libxc/1.1.0-icc-14.0.2
libxc/2.0.0-icc-14.0.2
libxc/2.0.1-icc-14.0.2
libxc/2.2.2-icc-14.0.2
libxc/2.2.3-icc-14.0.2
libxc/4.2.3-icc-15.0.1
libxcb/1.12-gcc-4.8.5
libxcursor/1.1.14-gcc-4.8.5
libxdamage/1.1.4-gcc-4.8.5
libxdmcp/1.1.2-gcc-4.8.5
libxext/1.3.3-gcc-4.8.5
libxfixes/5.0.2-gcc-4.8.5
libxml2/2.8.0-gcc-4.8.5
libxml2/2.9.4-gcc-4.8.5
libXmu/1.1.1
libxpm/3.5.0
libXpm/3.5.5-gcc-4.8.5
libxrender/0.9.10-gcc-4.8.5
libxshmfence/1.2-gcc-4.8.5
libxv/1.0.10-gcc-4.8.5
libxvmc/1.0.9-gcc-4.8.5
libzip/1.2.0-gcc-4.8.5
likwid/4.3.0-gcc-4.8.5
llvm/5.0.1-gcc-4.8.5
lmdb/0.9.21-gcc-4.8.5
lua/5.3.4-gcc-4.8.5
lzlib/1.10-gcc-4.8.5
m4/1.4.18-gcc-4.8.5
m4/1.4.18-icc-14.0.2
makedepend/1.0.5-gcc-4.8.5
makedepf90/2.8.8-gcc-4.8.5
mcl/14-137-gcc-4.8.5
med/3.0.8-icc-14.0.2
meerkat/0.185-gcc-4.8.5
meme/4.12.0-gcc-4.8.5
mesa/17.2.3-gcc-4.8.5
METIS/5.1.0-icc-14.0.2
METIS/5.1.0-icc-14.0.2-32bit
METIS/5.1.0-icc-15.0.1
mgltools/1.5.6-gcc-4.8.5
mongodb-api/c/1.9.2-gcc-4.8.5
mongodb/3.7.2
mpc/0.8.1
mpfr/2.4.2
mpfr/3.0.1-gcc-4.8.5
MPI/impi/4.1.3.048
MPI/impi/5.0.2.044
MPI/impi/2017.4.256
MPI/impi/2018.0.128
MPI/mpich/3.2.1-gcc-4.8.5-dynamic
MPI/mpich/3.2.1-gcc-4.9.2-dynamic
MPI/mpich/3.2.1-gcc-5.2.0-dynamic
MPI/mpich/3.2.1-gcc-5.4.0-dynamic
MPI/mpich/3.2.1-gcc-7.2.0-dynamic
MPI/mpich/3.2.1-icc-13.0.1-dynamic
MPI/mpich/3.2.1-icc-14.0.2-dynamic
MPI/mpich/3.2.1-icc-15.0.1-dynamic
MPI/mpich/3.2.1-icc-17.0.6-dynamic
MPI/mpich/3.2.1-icc-18.0.0-dynamic
MPI/openmpi/1.8.8-gcc-4.8.5-dynamic
MPI/openmpi/1.8.8-icc-14.0.2-dynamic
MPI/openmpi/1.10.7-gcc-4.8.5-dynamic
MPI/openmpi/1.10.7-icc-14.0.2-dynamic
MPI/openmpi/2.1.2-gcc-4.8.5-dynamic
MPI/openmpi/2.1.2-icc-14.0.2-dynamic
MPI/openmpi/3.0.0-gcc-4.8.5-dynamic
MPI/openmpi/3.0.0-icc-14.0.2-dynamic
MPI/openmpi/3.1.3-gcc-4.8.5-dynamic
mpiP/3.4.1-icc-14.0.2-mpich3.2.1
mplayer/1.3.0-gcc-4.8.5
MSTmap/20161226-gcc-4.8.5
multiz-tba/20090121-gcc-4.8.5
MUMPS/5.1.1-gcc-4.8.5
muparser/2.2.5-icc-14.0.2
muparser/2.2.5-icc-15.0.1
NAMD/2.12
nasm/2.11.06-gcc-4.8.5
ncbi-vdb/2.6.4-gcc-4.8.5
ncbi-vdb/2.7.0-gcc-4.8.5
NCL/6.1.0
NCL/6.3.0
NCL/6.4.0
NCL/6.6.2
nco/4.6.0-icc-13.0.1
nco/4.6.0-icc-14.0.2
ncurses/5.9-gcc-4.8.5
ncurses/6.0-gcc-4.8.5
ncview/2.1.5-icc-14.0.2
netcdf/3.6.3-icc-14.0.2
netcdf/4.1.3-icc-13.0.1
netcdf/4.3.2-icc-13.0.1
netcdf/4.3.2-icc-14.0.2
netcdf/4.3.2-icc-15.0.1-par
netcdf/4.3.3.1-icc-14.0.2
netcdf/4.3.3.1-icc-15.0.1
netcdf/4.4.1-icc-14.0.2-par
netcdf/4.4.1-icc-15.0.1-parallel
netcdf/4.4.1-icc-17.0.6-par
netcdf/4.5.0-icc-13.0.1
netcdf/4.5.0-icc-14.0.2
netcdf/4.5.0-icc-15.0.1-parallel
netcdf/4.6.2-gcc-4.8.5
netcdf/4.6.2-gcc-7.2.0-par
netcdf/4.6.2-gcc-7.2.0-parallel
netcdf/4.6.2-icc-18.0.0-par
netcdf/4.6.3-gcc-5.2.0
ngs/1.2.5-gcc-4.8.5
ngsqctoolkit/2.3.3-gcc-4.8.5
numactl/2.0.11-gcc-4.8.5
numdiff/5.8.1-icc-15.0.1
nwchem/6.5-icc-14.0.2
ocaml/4.02.3-gcc-4.8.5
octave/4.2.1-gcc-4.8.5
octave/4.4.1-gcc-4.8.5
octopus/5.0.1-icc-14.0.2
octopus/9.0-icc-15.0.1
opam/1.2.2-gcc-4.8.5
opari2/2.0.2-icc-14.0.2
openbabel/2.3.2-gcc-4.8.5
openbabel/2.4.0-gcc-4.8.5
openblas/0.2.20-gcc-4.8.5
opencv/1.0.0-gcc-4.4.7
opencv/1.0.0-gcc-4.8.5
opencv/2.4.9-gcc-4.8.5
opencv/2.4.11-gcc-4.8.5
opencv/3.0.0-gcc-4.8.5
opencv/3.4.0-gcc-4.8.5
openexr/2.2.0-gcc-4.8.5
OpenFOAM/2.1.1-gcc-4.8.5
OpenFOAM/2.2.0-icc-14.0.2
OpenFOAM/2.3.1-gcc-4.8.5
OpenFOAM/2.3.1-icc-17.0.6
OpenFOAM/2.4.0-icc-17.0.6
OpenFOAM/3.0.0-icc-17.0.6
OpenFOAM/4.0-icc-17.0.6
OpenFOAM/extend40-icc-17.0.6
OpenFOAM/SOWFA-2.0.x
OpenFOAM/SOWFA-2.2.0
OpenFOAM/v1712-icc-17.0.6
openjpeg/2.1.2-icc-14.0.2
openjpeg/2.3.0-icc-14.0.2
openssl/1.0.2g-gcc-4.8.5
openssl/1.0.2n-gcc-4.8.5
otf2/1.5.1-icc-14.0.2
p4est/1.1-icc-14.0.2-mpich-3.2.1
p4est/1.1-icc-15.0.1
p7zip/16.02-gcc-4.8.5
packmol/18.013-gcc-4.8.5
pango/1.41.0-gcc-4.8.5
panoply/4.8.8-gcc-4.8.5
PAPI/5.7.0-icc-14.0.2
PAPI/5.7.0-icc-15.0.1
parallel/20180222-gcc-4.8.5
parmetis/4.0.3-icc-14.0.2
parmetis/4.0.3-icc-14.0.2-mpich3.2.1-32bit
parmetis/4.0.3-icc-14.0.2-mpich3.2.1-64bit
pb-assembly/falcon-0.4.0
pb-assembly/falcon-0.4.0-py3.7
pcre/8.41-gcc-4.8.5
pegasus/4.6.2-gcc-4.8.5
perl/5.10.1-gcc-4.8.5
perl/5.24.1-gcc-4.8.5
perl/5.26.1-gcc-4.8.5
perl/5.30.0-gcc-4.8.5
petsc/2.3.3-icc-14.0.2
petsc/3.1-p8
petsc/3.2-p7
petsc/3.2-p7-with-mumps
petsc/3.4.4-t
petsc/3.5.1-icc-14.0.2
petsc/3.5.4-gcc-4.8.5
petsc/3.5.4-icc-14.0.2
petsc/3.6.2-icc-14.0.2
petsc/3.6.2-icc-15.0.1
petsc/3.6.3-complex
petsc/3.6.3-complex-debug
petsc/3.6.3-gcc-4.8.5
petsc/3.6.3-icc-14.0.2
petsc/3.6.3-icc-15.0.1
petsc/3.6.3-icc14-4.9.2
petsc/3.6.4-gcc-4.9.2
petsc/3.6.4-icc-14.0.2
petsc/3.6.4-icc-15.0.1
petsc/3.6.4-icc-15.0.1-gcc-4.9.2
petsc/3.7.3
petsc/3.7.3-icc-14.0.2
petsc/3.7.5-icc-14.0.2
petsc/3.7.5-icc-15.0.1
petsc/3.7.6-gcc-4.8.5
petsc/3.7.6-icc-14.0.2
petsc/3.8.0-gcc-4.8.5
petsc/3.8.0-icc-14.0.2
pfft/1.0.8-icc-15.0.1
phast/1.4-gcc-4.8.5
phenglei/v2d0
phenglei/v3d0
phono3py/1.18.1
PhyloCSF/20180227-gcc-4.8.5
picard/1.141-gcc-4.8.5
picard/2.17.8-gcc-4.8.5
pigz/2.3.4
pigz/2.4-gcc-4.8.5
PIO/1.9.23-gcc-4.8.5
PIO/1.9.23-gcc-5.2.0
PIO/1.9.23-gcc-7.2.0
PIO/1.9.23-icc-15.0.1
PIO/1.10.0-icc-15.0.1
PIO/2.3.1-icc-15.0.1
PIO/2.4.0-icc-14.0.2
PIO/20190314-master
pitchfork/20180224-gcc-7.2.0
pixman/0.34.0-gcc-4.8.5
pkgconf/1.4.0-gcc-4.8.5
pkgconf/1.4.0-icc-14.0.2
plink/1.07-gcc-4.8.5
plink/1.90-gcc-4.8.5
plinkseq/0.10-gcc-4.8.5
plumed/2.1.2-icc-14.0.2
plumed/2.2.3-icc-14.0.2
PNCL/1.0
pnetcdf/1.6.0-gcc-4.8.5
pnetcdf/1.6.0-icc-14.0.2
pnetcdf/1.6.1-icc-13.0.1
pnetcdf/1.6.1-icc-14.0.2
pnetcdf/1.6.1-icc-15.0.1
pnetcdf/1.8.1-icc-15.0.1
pnetcdf/1.9.0-gcc-5.4.0
pnetcdf/1.9.0-gcc-7.2.0
pnetcdf/1.9.0-icc-13.0.1
pnetcdf/1.9.0-icc-14.0.2
pnetcdf/1.9.0-icc-15.0.1
pnetcdf/1.9.0-icc-17.0.6
pnetcdf/1.11.0-gcc-5.2.0
postgresql/10.1-gcc-4.8.5
presentproto/1.0-gcc-4.8.5
prodigal/2.6.3-gcc-4.8.5
proj/4.9.2-icc-13.0.1
proj/4.9.2-icc-14.0.2
protobuf/3.1.0-icc-14.0.2
PSI/4.1-icc-14.0.2
py-appdirs/1.4.3-gcc-4.8.5
py-lit/0.5.0-gcc-4.8.5
py-mako/1.0.4-gcc-4.8.5
py-markupsafe/1.0-gcc-4.8.5
py-packaging/16.8-gcc-4.8.5
py-pyparsing/2.2.0-gcc-4.8.5
py-setuptools/35.0.2-gcc-4.8.5
py-six/1.10.0-gcc-4.8.5
pyngl/1.6.1
python/2.7.9-gcc-4.8.5
python/2.7.10rc1-gcc-5.4.0
python/2.7.12-gcc-4.8.5-anaconda
python/2.7.14-gcc-4.8.5
python/2.7.14-gcc-4.8.5-anaconda
python/2.7.15-gcc-7.2.0
python/2.7.16-gcc-5.4.0
python/3.5.2-gcc-4.8.5-anaconda
python/3.5.5-gcc-4.8.5
python/3.6.3-gcc-4.8.5-anaconda
python/3.6.4-gcc-4.8.5
qiime2/2018.11
qt/4.8.6-gcc-4.8.5
qt/5.5.0-gcc-4.8.5
Quantum_Espresso/6.0-icc-15.0.1-MPI
Quantum_Espresso/6.1-icc-15.0.1-MPI
Quantum_Espresso/6.4-icc-15.0.1-MPI
R/3.2.5-gcc-4.8.5
R/3.3.3-gcc-4.8.5
R/3.4.3-gcc-4.8.5
R/3.5.0-gcc-4.8.5
R/3.5.1-gcc-4.8.5
R/3.5.2-gcc-7.2.0
rar/5.3.b4-gcc-4.8.5
raxml/8.2.9-gcc-4.8.5
readline/6.3-gcc-4.8.5
readline/7.0-gcc-4.8.5
redis/3.2.4-gcc-4.8.5
renderproto/0.11.1-gcc-4.8.5
rings/1.3.3-gcc-4.8.5
rmats/3.2.1
root/6.03.04-gcc-4.8.5
root/6.12.04-gcc-4.8.5
rsem/1.2.29-gcc-4.8.5
sambamba/0.6.3-gcc-4.8.5
sambamba/0.6.7-gcc-4.8.5
samtools/0.1.19-gcc-4.8.5
samtools/1.2-gcc-4.8.5
samtools/1.7-gcc-4.8.5
scala/2.11.8-gcc-4.8.5
scalapack/1.8.0-gcc-4.8.5
scalapack/2.0.2-icc-14.0.2-mkl
scalasca/2.2-gcc-4.8.5
scons/2.5.1-gcc-4.8.5
scorep/3.1-icc-14.0.2-mpich-3.2.1
screen/4.6.2-gcc-4.8.5
sentieon/201704
sentieon/201711
serf/1.3.9-gcc-4.8.5
shapeit/2.837-gcc-4.8.5
siesta/4.1b3-gcc-4.8.5
silo/4.10.2-icc-14.0.2-mpich-3.2.1
singularity/2.5.2-icc-14.0.2
SLEPc/3.6.1-icc-14.0.2
SLEPc/3.6.1-icc-15.0.1
SLEPc/3.6.1-icc-15.0.1-gcc-4.9.2
SLEPc/3.6.3-icc-14.0.2
SLEPc/3.6.3-icc-15.0.1
snappy/1.1.6-gcc-4.8.5
snptest/2.5-gcc-4.8.5
spades/3.10.1-gcc-4.8.5
spark/2.3.0_hadoop-2.7
sqlite/3.21.0-gcc-4.8.5
sqlite/3.22.0-gcc-4.8.5
sra-toolkit/2.7.0-gcc-4.8.5
sra-toolkit/2.8.2-1-gcc-4.8.5
STAR/2.5.0.b-gcc-4.9.2
STAR/2.5.1-gcc-4.9.2
stringtie/1.3.1c-gcc-4.8.5
stringtie/1.3.5-gcc-4.8.5
SU2/6.1.0
swift-t/1.3-gcc-4.8.5
swift/0.96.2
swig/3.0.10-gcc-4.8.5
swig/3.0.10-icc-14.0.2
swig/3.0.10-icc-15.0.1
swig/3.0.12-gcc-4.8.5
szip/2.1-gcc-4.9.2
szip/2.1-icc-14
szip/2.1-icc-14-Gnu
szip/2.1-icc-15
szip/2.1.1-gcc-4.8.5
szip/2.1.1-icc-13.0.1
szip/2.1.1-icc-14.0.2
szip/2.1.1-icc-15.0.1
tar/1.29-gcc-4.8.5
TAU/2.28.1-icc-14.0.2
tcl/8.5.18-gcc-4.8.5
tcl/8.6.6-gcc-4.8.5
tcl/8.6.8-gcc-4.8.5
tcltk/8.6.8-gcc-4.8.5
telemac-mascaret/v7p1-icc-14.0.2
tk/8.6.6-gcc-4.8.5
tmux/2.0
tophat/2.0.14-gcc-4.8.5
training/1.0-icc-14.0.2
trilinos/12.6.3-icc-14.0.2
trilinos/12.10.1-icc-15.0.1
trim_galore/0.4.5-gcc-4.8.5
trimmomatic/0.35-gcc-4.8.5
udunits/1.12.11-icc-14.0.2
udunits/2.2.26-icc-13.0.1
udunits/2.2.26-icc-14.0.2
udunits/2.2.26-icc-15.0.1
UnderWorld/1.7.0-icc-14.0.2
UnderWorld/1.7.0-icc-14.0.2-with-mumps
UnderWorld2/2.4.0b-gcc-4.8.5
util-macros/1.19.1-gcc-4.8.5
util-macros/1.19.1-icc-14.0.2
vc/1.3.3-gcc-4.9.2
vcftools/0.1.13-gcc-4.8.5
velvet/1.2.10-gcc-4.8.5
videoproto/2.3.3-gcc-4.8.5
ViennaRNA/2.4.3-gcc-4.8.5
vim/8.1
visit/2.12.3
visit/2.13.0
vmd/1.9.2-gcc-4.8.5
VTK/7.1.1-gcc-4.8.5
Wannier90/2.0.0-icc-14-mpich3.2.1
Wannier90/2.0.1-icc-14-mpich3.2.1
WPS/3.6.1-icc-14.0.2
WPS/3.7.1-icc-14.0.2
WPS/3.9.0.1-icc-15.0.1
WPS/3.9.1-icc-15.0.1
WRF/3.6.1-icc-14.0.2
WRF/3.7.1-icc-14.0.2
WRF/3.9-icc-15.0.1
WRF/3.9.1-icc-15.0.1
xcb-proto/1.12-gcc-4.8.5
xextproto/7.3.0-gcc-4.8.5
xhmm/20180226-gcc-4.8.5
xmgrace/5.1.25-gcc-4.8.5
xproto/7.0.31-gcc-4.8.5
xtrans/1.3.5-gcc-4.8.5
xz/5.2.2-gcc-4.8.5
xz/5.2.2-icc-14.0.2
xz/5.2.3-gcc-4.8.5
xz/5.2.3-icc-13.0.1
xz/5.2.3-icc-14.0.2
Yambo/4.2.1
Yambo/4.3.2
yaml/0.1.5-icc-14
yasm/1.3.0-gcc-4.8.5
zlib/1.2.7-gcc-4.8.5
zlib/1.2.8-icc-14.0.2
zlib/1.2.8-icc-15.0.1
zlib/1.2.11-gcc-4.8.5
zlib/1.2.11-gcc-5.2.0
zlib/1.2.11-icc-13.0.1
zlib/1.2.11-icc-14.0.2
zlib/1.2.11-icc-15.0.1
zsh/5.0.7-gcc-4.8.5
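
Instead of scrolling through the listing above, a small filter can pick out the versions of a single package. This is only a sketch: `module_versions` is a helper name I made up, and it reads a listing from stdin so it works on a saved copy of the output as well as on a live cluster. The match is on the part before the first `/`, case-insensitive, so `gcc` does not also match `py-lit/0.5.0-gcc-4.8.5`.

```shell
#!/bin/bash
# Print the distinct entries whose package name (text before the first "/")
# equals the requested name, ignoring case.
module_versions() {   # usage: module_versions PACKAGE < listing
  awk -F/ -v pkg="$1" 'tolower($1) == tolower(pkg)' | LC_ALL=C sort -u
}

# A few lines taken from the listing above as sample input.
sample='gcc/4.8.5
gcc/7.2.0
intelcompiler/18.0.0
intelcompiler/18.0.0
py-lit/0.5.0-gcc-4.8.5
MPI/impi/2018.0.128'

printf '%s\n' "$sample" | module_versions gcc
printf '%s\n' "$sample" | module_versions intelcompiler

# On the cluster this would probably be used as follows (assuming `module av`
# writes its listing to stderr, as classic Environment Modules does):
#   module av 2>&1 | module_versions intelcompiler
```

Because `intelcompiler` appears under both modulefile directories, `sort -u` collapses the duplicate entries into one line.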