== FLEXPART Performance ==

This is the place to document performance tests with different hardware, compilers, compiler options, and run configurations so that we can learn from each other and avoid reinventing the wheel. Please document all relevant parameters.

||= '''Hardware''' =||= '''Version''' =||= '''Compilation''' =||= '''Setup''' =||= '''Runtime (mm:ss)''' =||= '''Note''' =||
|| Xeon E5-2690 (A) || Fp8.2.3fr || if13 (O2) || AL-500-300 || '''21:29''' || ||
|| Xeon E5-2690 (A) || Fp8.2.3fr || if13 (O3a) || AL-500-300 || '''21:05''' || ||
|| Xeon E5-2697 (B) || Fp8.2.3fr || if13 (O3a) || AL-500-300 || '''20:50''' || ||
|| Xeon E5-2697 (B) || Fp8.2.3fr || if13 (O3a) || AL-250-300 || '''13:18''' || /1/ ||

=== Hardware ===

 (A):: Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz, microcode 1808, cache size: 20480 KB. 2 CPUs with 8 cores each.
 (B):: Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz, microcode 57, cache size: 35840 KB. 2 CPUs with 14 cores each.

=== Compiler ===

'''if13''' ifort 13.1.2

=== Compiler Options ===

'''(O2)''' `-O2 -mcmodel=medium`. grib_api-1.12.3 (compiled with ifort, `FCFLAGS = -g -O1 -fp-model precise`) [[BR]]
'''(O3a)''' `-ipo -O3 -mcmodel=medium -no-prec-div -opt-prefetch3`. grib_api-1.12.3

=== Fp Setup ===

 AL-500-300:: 500k particles, 300 s `lsynctime`. Output size 17M. `COMMAND`:
 {{{
-1 LDIRECT
20170418 000000
20170423 000000
3600 OUTPUT EVERY
3600 TIME AVERAGE OF OUTPUT
300 SAMPLING RATE OF OUTPUT
999999999 TIME CONSTANT FOR PARTICLE SPLITTING
300 SYNCHRONISATION INTERVAL
3.0 CTL
4 IFINE
1 IOUT
0 IPOUT
1 LSUBGRID
0 LCONVECTION
0 LAGESPECTRA
0 IPIN
1 IOUTPUTFOREACHREL
0 IFLUX
0 MDOMAINFILL
1 IND_SOURCE
2 IND_RECEPTOR
0 MQUASILAG
0 NESTED_OUTPUT
0 LIMIT_COND
 }}}
 `OUTGRID` dimensions: 450 x 300 x 2. Meteorological input dimensions: 161 x 81 x 91.
 AL-250-300:: 250k particles, otherwise as AL-500-300.
 AL-250-120:: as AL-250-300 except:
 {{{
240 SAMPLING RATE OF OUTPUT
120 SYNCHRONISATION INTERVAL
1.0 CTL
2 IFINE
 }}}

=== Notes ===

 /1/:: Compared with the AL-500-300 run on the same hardware and compiler options, this implies runtime ≈ 13.3 min + 0.03 min * (npart(k) - 250), where npart(k) is the number of particles in thousands. That corresponds to about 5.8 min of fixed cost plus 1.8 ms per particle.
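
The estimate in note /1/ can be reproduced from the two (B)/(O3a) runs in the table (AL-250-300 and AL-500-300). The following is a minimal sketch, not part of FLEXPART: the helper names `to_seconds` and `linear_fit` are made up for illustration, and it assumes that runtime grows linearly with particle count, which two data points cannot confirm.

```python
def to_seconds(mmss: str) -> int:
    """Convert a runtime given as 'mm:ss' (as in the table) to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def linear_fit(p1, p2):
    """Return (intercept, slope) of the straight line through two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return intercept, slope


# Runtimes on hardware (B) with options (O3a), taken from the table above.
runs = {250_000: "13:18", 500_000: "20:50"}
points = [(npart, to_seconds(t)) for npart, t in sorted(runs.items())]

intercept_s, slope_s = linear_fit(points[0], points[1])

print(f"per particle: {slope_s * 1e3:.2f} ms")   # per particle: 1.81 ms
print(f"fixed cost:   {intercept_s / 60:.1f} min")  # fixed cost:   5.8 min
```

The fixed cost presumably covers work that does not scale with the number of particles, such as reading the meteorological input. More runs at other particle counts would be needed before using the fit for extrapolation.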