J. Brioude, Sept 19 2013
**************************************************************
To compile flexwrf, choose your compiler in makefile.mom (line 23), set the
path to the NetCDF library, and then type

make -f makefile.mom mpi      for an MPI+OpenMP hybrid run
make -f makefile.mom omp      for an OpenMP parallel run
make -f makefile.mom serial   for a serial run
********************************************************************
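Putting the steps above together, a complete build session might look like the
sketch below (the hybrid target is chosen as an example; the executable name is
taken from the run example further down and may differ in your version):

```shell
# 1. Edit makefile.mom by hand: pick your compiler (around line 23)
#    and point it at your NetCDF installation.
# 2. Build the MPI+OpenMP hybrid executable:
make -f makefile.mom mpi
# 3. The resulting binary (name may vary) is then run as, e.g.:
#    ./flexwrf31_mpi /home/jbrioude/inputfile.txt
```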
To run flexwrf, you can pass the name of the input file as an argument to the
executable, for instance

./flexwrf31_mpi /home/jbrioude/inputfile.txt

Otherwise, the file flexwrf.input in the current directory is read by default.

Examples of forward and backward runs are available in the examples directory.

*****************************************************************
Versions timeline

version 3.1: - bug fix on the sign of sshf in readwind.f90
             - modifications of advance.f90 to limit the vertical velocity
               from the CBL scheme
             - bug fix in write_ncconc.f90
             - modifications of the interpol*.f90 routines to avoid crashes
               when using tke_partition_hanna.f90 and tke_partition_my.f90

version 3.0: first public version

version 2.4.1: new modifications of the wet deposition scheme from Petra Seibert

version 2.3.1: a NetCDF output format is implemented.

version 2.2.7: the CBL scheme is implemented. A new random number generator is implemented.

version 2.0.6:
 - map factors are used in advance.f90 when converting the calculated distance
   into a WRF grid distance
 - fix of the divergence-based vertical wind

version 2.0.5:
 the time over which the kernel is not used has been reduced from 10800 seconds
 to 7200 seconds. Those numbers depend on the horizontal resolution, and a more
 flexible solution may come in a future version.

version 2.0.4:
 - bug fix for the regular output grid
 - I/O problems in ASCII have been fixed
 - added the option of running flexpart with an argument that gives the name
   of the input file instead of flexwrf.input

version 2.0.3:
 - bug fix when flexpart is restarted
 - bug fix in coordtrafo.f90
 - a new option that lets the user decide whether the time for the
   time-averaged fields from WRF has to be corrected or not

version 2.0.2:
 - bug fix in sendint2_mpi_old.f90
 - all the *mpi*.f90 routines have been changed to handle memory more properly
 - timemanager_mpi.f90 has changed accordingly, with some bug fixes too
 - bug fix in writeheader
 - parallelization of calcpar.f90 and verttransform.f90, and the same for the nests

version 2.0.1:
 - an option added in flexwrf.input to define the output grid with dxout and dyout
 - fix in readinput.f90 to calculate maxpart more accurately

version 2.0: first OpenMP/MPI version

version 1.0:
This is a Fortran 90 version of FLEXPART.
Compared to PILT, the version from Jerome Fast available on the NILU flexpart
website, several bug fixes and improvements have been made (not necessarily
commented) in the subroutines. A non-exhaustive list:
 1) optimization of the Kain-Fritsch convective scheme (expensive)
 2) possibility to output the flexpart run on a regular lat/lon output grid.
    flexwrf.input has 2 options to let the model know which coordinates are
    used for the output domain and the release boxes.
 3) differences in the earth radius between WRF and WRF-Chem are handled.
 4) time-averaged wind, instantaneous omega, or a vertical velocity internally
    calculated in FLEXPART can now be used.
 5) a bug fix in pbl_profile.f due to the variable kappa.

Turb options 2 and 3 from Jerome Fast's version lose mass in the model. Those
options are not recommended.

***********************************************************************
General comments on the hybrid version of flexpart-wrf:
This version includes a parallelized hybrid version of FLEXPART that can be
used with:
 - 1 node (1 computer) with multiple threads in shared memory (using OpenMP),
 - or several nodes (computers) in distributed memory (using MPI) and several
   threads in shared memory (using OpenMP).
If an MPI library is not available with your compiler, use makefile.nompi to
compile flexwrf.

The environment variable OMP_NUM_THREADS has to be set before running the
model to define the number of threads used. It can also be fixed in
timemanager*.f90. If it is not set, flexwrf20_mpi will use 1 thread.

When submitting a job to several nodes, mpiexec or mpirun needs to know that
1 task has to be allocated per node, so that OpenMP can do the work within
each node in shared memory. See submit.sh as an example.

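submit.sh itself is not reproduced here, but a minimal batch script along the
following lines illustrates the idea. The node and thread counts and the
`-npernode 1` flag are assumptions (flag names differ between MPI
implementations); adapt them to your cluster and scheduler:

```shell
#!/bin/bash
# Hypothetical submit script: 4 nodes, 1 MPI task per node,
# 8 OpenMP threads filling each node's cores.
export OMP_NUM_THREADS=8
# -npernode 1 (Open MPI syntax; other MPI stacks use different flags)
# keeps one task per node so OpenMP can work in shared memory within it.
mpirun -np 4 -npernode 1 ./flexwrf31_mpi /home/jbrioude/inputfile.txt
```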
Compared to the single-node version, this version includes modifications of:

 - flexwrf.f90, which is renamed flexwrf_mpi.f90
 - timemanager.f90, which is renamed timemanager_mpi.f90
 - the interpol*.f90 and hanna* routines, which have been modified
 - the *mpi*.f90 routines, which are used to send or receive data between nodes

The most important modifications are in timemanager_mpi.f90, initialize.f90
and advance.f90. Search for JB in timemanager_mpi.f90 for additional comments.
In advance.f90, I modified the way the random number is picked (line 187): I
use a simple counter and the id of the thread instead of the random pick that
uses ran3. If the series of random numbers is output for a given release box
(uncomment lines 195 to 198), the distribution is quite good, and I don't see
any bigger bias than the one in the single-thread version. Of course, the
distribution is less and less random as you increase the number of nodes or
threads.

*********************************************************
Performance:
This is the performance of the loop at line 581 in timemanager_mpi.f90 that
calculates the trajectories. I use version v74 as the reference (single
thread, Fortran 77). There is a loss in performance between v74 and v90
because of the temporary variables th_* that have to be used as private
variables in timemanager_mpi.f90. Speedup relative to v74:
  v74 (reference)   1.00
  v90, 1 thread     0.96
  v90, 2 threads    1.86
  v90, 4 threads    3.57
  v90, 8 threads    6.22

Performance of the communication between nodes:
This depends on the system. The supercomputer that I use can transfer about
1 Gb in 1 second. In timemanager_mpi.f90, the output at lines 540 and 885
gives the time needed by the system to communicate between nodes. Using 100
million particles and, say, 4 nodes, it takes about 1 second.