# checkGRIB

This utility was created primarily for CTBTO operations to facilitate a check
of incoming ECMWF, NCEP and NCEPFV3 GRIB2 files before they are staged into the ATM
pipeline.  The idea is to catch problems in the files long before they are
actually used, rather than have them surface mysteriously when an application
that relies on the files terminates.

## Author information

Don Morton
Boreal Scientific Computing
Fairbanks, Alaska, USA
Don.Morton@borealscicomp.com

I'm horrible at legalese but, as far as I'm concerned, this utility is totally
free and open to the public for use and modification, and is included with
the FLEXPART distribution with the same permissions.

----

## Installation

Code in this directory is self-contained, but relies on an installation of
the GRIB-API library.  Additionally, a test package is provided that depends
on Python v2.7 (it also works with Python 3.6) and the GRIB-API command line tools.

Compilation simply requires setting the `GRIB_API` variable in the *Makefile*
to the local installation location, and then invoking *make*.

----

## Usage

Usage is pretty straightforward, requiring the flags `--source` and `--levels`,
where the source is expected to be `ECMWF`, `NCEP` or `NCEPFV3`.
Originally, we assumed that every message in a file was GRIB2, but we've
learned that this does not hold for EF files, so `--checkgrib2` is now an
optional test, not performed by default.  The number of levels should be the
expected number of levels for 3D variables.

The utility currently checks that all variables required for FLEXPART
(including dry and wet deposition) are in the GRIB file, that all expected
levels are included, and that the source is indeed the expected source.
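
The three checks described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the Fortran code in *cgutils.F90*. The function names are hypothetical, the centre code 98 for ECMWF is taken from the example output in this README, and the code 7 for NCEP is an assumption based on the WMO originating-centre table. The message wording only approximates what the utility prints.

```python
# Illustrative sketch only; checkGRIB itself is written in Fortran.
# 98 is the centre reported for the ECMWF test file in this README;
# 7 for NCEP is an assumption taken from the WMO centre table.
EXPECTED_CENTRE = {"ECMWF": 98, "NCEP": 7, "NCEPFV3": 7}


def check_source(source, grib_centre):
    """Return an error string if the centre does not match, else None."""
    if source not in EXPECTED_CENTRE:
        return "ERROR: Unknown source: %s" % source
    if grib_centre != EXPECTED_CENTRE[source]:
        return "ERROR: Unexpected grib_centre: %d" % grib_centre
    return None


def check_level_count(found_levels, expected_count):
    """Compare the number of distinct levels found with --levels."""
    num_found = len(set(found_levels))
    if num_found > expected_count:
        return "ERROR: Found more than %d levels" % expected_count
    if num_found < expected_count:
        return "ERROR: Only found %d levels" % num_found
    return None


def missing_vars(found_short_names, required_short_names):
    """Return the sorted list of required variables that are absent."""
    return sorted(set(required_short_names) - set(found_short_names))
```

A file passes only if all three functions report nothing: `check_source` and `check_level_count` return `None` and `missing_vars` returns an empty list.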

There are many more ambitious (and time-consuming) checks that could be
added in the future.  For example, the GRIB messages have fields for max,
min and mean values, and we could actually read in each field and verify these.

The utility prints a simple message to *stdout* if all is well with a file.
If an error is found, a message is printed to *stdout* and the utility
aborts with a system return code of 1 to indicate a problem with the GRIB file,
or 2 to indicate a problem with the utility.

Some examples follow:

### A normal, successful test

```
$ ./checkGRIB --source ECMWF --levels 137 test/gribfiles/ecmwf0p5/EN19011112
 Passed: test/gribfiles/ecmwf0p5/EN19011112

$ echo $?
0
```

### Checking with an incorrect source parameter

```
$ ./checkGRIB --source NCEP --levels 137 test/gribfiles/ecmwf0p5/EN19011112
 ERROR: Unexpected grib_centre:           98
 Failed on: test/gribfiles/ecmwf0p5/EN19011112

$ echo $?
1
```

### Non-existent file

```
$ ./checkGRIB --source NCEP --levels 137 Huh???
GRIB_API ERROR   :  IO ERROR: No such file or directory: Huh??? (No such file or directory)
 ERROR: problem opening GRIB file: Huh???
 Failed on: Huh???

$ echo $?
2
```

### Finding more levels than expected

```
$ ./checkGRIB --source NCEP --levels 27 test/gribfiles/ncep0p5/GD19010818
ERROR: Found more than  27 levels
 Failed on: test/gribfiles/ncep0p5/GD19010818

$ echo $?
1
```

### Finding fewer levels than expected

```
$ ./checkGRIB --source NCEP --levels 48 test/gribfiles/ncep0p5/GD19010818
ERROR: Only found   31 levels
 Failed on: test/gribfiles/ncep0p5/GD19010818

$ echo $?
1
```
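
For staging scripts that call the utility, the return codes shown in the examples above can be interpreted with a small wrapper. The following is a minimal sketch, assuming the executable is invoked as a list of command-line strings; `run_check` and `RETURN_CODE_MEANING` are hypothetical names, not part of this package.

```python
import subprocess

# Meaning of the return codes as described in this README.
RETURN_CODE_MEANING = {
    0: "passed",
    1: "problem with the GRIB file",
    2: "problem with the utility",
}


def run_check(command):
    """Run a checkGRIB command line (a list of strings).

    Returns (return_code, meaning, captured_stdout).
    """
    proc = subprocess.Popen(command, stdout=subprocess.PIPE,
                            universal_newlines=True)
    out, _ = proc.communicate()
    meaning = RETURN_CODE_MEANING.get(proc.returncode,
                                      "unexpected return code")
    return proc.returncode, meaning, out
```

A call might look like `run_check(["./checkGRIB", "--source", "ECMWF", "--levels", "137", "test/gribfiles/ecmwf0p5/EN19011112"])`. Because the utility writes its error messages to *stdout*, the captured text can be logged alongside the return code.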
----

## Structure

All of the GRIB-related code is in the module *cgutils.F90*.  There is a
lot of "file-specific" code in here for ECMWF, NCEP and NCEP FV3 GRIB files.
The main program, *checkGRIB.F90*, merely collects and parses command
line arguments, and invokes the appropriate routines in the *cgutils* module.
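
To make the command-line contract concrete, here is a hypothetical Python equivalent of the argument handling, based only on the usage message printed by the utility. It is a sketch, not a translation of *checkGRIB.F90*. That `--checkfcst` and `--checkanl` are mutually exclusive is an assumption inferred from the test named `test_checkanl_and_checkfcst_detected`.

```python
import argparse


def build_parser():
    """Hypothetical Python equivalent of the checkGRIB command line."""
    parser = argparse.ArgumentParser(prog="checkgrib")
    parser.add_argument("--source", required=True,
                        choices=["ECMWF", "NCEP", "NCEPFV3"])
    parser.add_argument("--levels", required=True, type=int)
    # Assumption: --checkfcst and --checkanl cannot be combined.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--checkfcst", action="store_true")
    mode.add_argument("--checkanl", action="store_true")
    parser.add_argument("--checkgrib2", action="store_true")
    parser.add_argument("paths", nargs="+")
    return parser


def parse_cli(argv):
    """Parse argv; argparse exits with status 2 on bad arguments."""
    return build_parser().parse_args(argv)
```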

----

## Testing

A comprehensive testing package, located in the *test* subdirectory,
was developed to test this utility under a wide variety of conditions,
including successful execution, bad command line arguments, missing levels
and missing variables.  Obviously, not every potential problem is tested,
but a broad assortment of "spot-checking" leaves me confident that this
is robust.

The testing package requires the default Python v2.7 found on *devlan*
(it also runs with Python v3.6), and the command line grib tools from
GRIB-API.  The grib tools are required because some of the tests require
taking a good GRIB file and making it bad.  I decided it would be better
to do this programmatically rather than requiring a large number of GRIB
files for testing.
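
As an illustration of "making a good file bad", the sketch below builds a `grib_copy` command that copies every message except those with a given *shortName* into a scratch file. The helper names are hypothetical and the real *checkgrib_test.py* may do this differently. It assumes `grib_copy` from GRIB-API is on the `PATH` and uses its `-w` where-clause with the `!=` operator.

```python
import os
import tempfile
import uuid


def temp_grib_path(tmpdir=None):
    """Return a unique scratch file name such as <tmpdir>/<uuid>.gr2."""
    tmpdir = tmpdir or tempfile.gettempdir()
    return os.path.join(tmpdir, str(uuid.uuid4()) + ".gr2")


def drop_variable_cmd(good_file, bad_file, short_name):
    """Build a grib_copy command that omits all short_name messages.

    Assumes the GRIB-API grib_copy tool is on PATH; the returned list
    can be run with subprocess.check_call(cmd).
    """
    return ["grib_copy", "-w", "shortName!=" + short_name,
            good_file, bad_file]
```

For example, `drop_variable_cmd("EN19011112", temp_grib_path(), "lsp")` yields a command whose output file should then fail a missing-variable check for *lsp*.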

The current tests include:

```
--------test_missing_source_cli_arg_detected--------
--------test_invalid_source_cli_arg_detected-------
--------test_missing_levels_cli_arg_detected--------
--------test_checkanl_and_checkfcst_detected--------
--------test_reports_no_cli_args--------
--------test_reports_no_path_args--------
--------test_successful_ecmwfgrib2_check--------
--------test_successful_ecmwfgrib_ef_check--------
--------test_reports_ecmwfgrib_ef_fails_with_checkgrib2--------
--------test_reports_unexpected_ecmwf_grib_center--------
--------test_reports_unexpected_ecmwf_level--------
--------test_reports_missing_ecmwf_levels--------
--------test_reports_failed_ecmwf_anl_check--------
--------test_reports_failed_ecmwf_fcst_check--------
--------test_ecmwf_detects_missing_level_14--------
--------test_ecmwf_detects_missing_level_14_etadot--------
--------test_ecmwf_detects_missing_var_lsp--------
--------test_ecmwf_detects_grib1_nsss--------
--------test_reports_unexpected_ncep_grib_center--------
--------test_successful_ncepgrib2_check--------
--------test_detects_ncep_more_pressure_levels_than_expected------
--------test_detects_ncep_fewer_pressure_levels_than_expected------
--------test_ncep_detects_missing_level_850------
--------test_ncep_detects_missing_level_100_r------
--------test_ncep_detects_missing_var_10u------
--------test_ncep_detects_missing_var_tsig1------
--------test_ncep_detects_grib1_2t------
--------test_detects_ncepfv3_more_pressure_levels_than_expected------
--------test_detects_ncepfv3_fewer_pressure_levels_than_expected------
--------test_ncepfv3_detects_missing_level_850------
--------test_ncepfv3_detects_missing_level_100_r------
--------test_ncepfv3_detects_missing_var_10u------
--------test_ncepfv3_detects_missing_var_tsig1------
--------test_ncepfv3_detects_grib1_2t------
```

Successful execution of the tests looks like

```
$ ./checkgrib_test.py
Compiling checkGRIB in dir: /home/morton/git/MyFlexpartTools/CTBTO_SWEATM/WO02/checkgrib
compile_passed: True
--------test_missing_source_cli_arg_detected--------
 args missing or out of sync

 Usage:

 checkgrib --source [ECMWF | NCEP | NCEPFV3] --levels <int>
           [ --checkfcst | --checkanl | --checkgrib2 ]
           path1 path2 ...

passed: True
-----------------------------------------------
.
.
.
--------test_ecmwf_detects_missing_var_lsp--------
 ERROR: lsp: not found
 Failed on: /tmp/1fbd642c-316e-4fc6-910a-24870fc2611b.gr2
passed: True
-----------------------------------------------
.
.
.
--------test_ncep_detects_grib1_2t------
 Grib message not GRIB2
 ERROR: Bad gribmsg_shortname: 2t, gribmsg_level:            2
 Failed on: /tmp/badeefa8-2ab8-4474-849c-45ac2ec4ac44.gr2
passed: True
-----------------------------------------------
.
.
.
--------test_ncepfv3_detects_grib1_2t------
 Grib message not GRIB2
 ERROR: Bad gribmsg_shortname: 2t, gribmsg_level:            2
 Failed on: /tmp/da9dd1fc-d843-4b9e-ab74-82ea8769fbe3.gr2
passed: True
-----------------------------------------------

************************
Passed tests: 35
Failed tests: 0
************************

```

## Testing files

The testing package requires a set of GRIB files.  I have this set up
so that it can use either a prepared package of files or files in place at
CTBTO.

### Prepared test files

These are available at http://borealscicomp.com/CTBTO_SWEATM/checkgrib_testfiles/gribfiles/.
Unfortunately, because they contain ECMWF files and ECMWF is
very strict about posting of such files, I have to make these non-readable.
If somebody wants to retrieve them, they should notify me and I can make them
temporarily readable.  Once readable, they can be placed in the test
package by going to the *checkgrib/test/* directory and running

```
$ wget --recursive --no-parent --cut-dirs=2 -nH -R "index.html*" --execute robots=off http://borealscicomp.com/CTBTO_SWEATM/checkgrib_testfiles/gribfiles
```

Then, in *checkgrib_test.py*, be sure to set the following:

```
ECMWF_PREFIX = 'gribfiles/ecmwf0p5'
NCEP_PREFIX = 'gribfiles/ncep0p5'
NCEPFV3_PREFIX = 'gribfiles/ncepfv30p5'
```

There is a *gribfiles* entry in *test/.gitignore* so that these large files
won't be committed to the repo.

### Using CTBTO files

WARNING: this section is not relevant right now.

The ncep subdir structure now has subdirectories *0.5* and
*0.5.fv3*, which would require some recoding to make this work
correctly.

In *checkgrib_test.py* one would want to set (for example)

```
CTBTO_PREFIX = '/ops/data/atm'
ECMWF_PREFIX = CTBTO_PREFIX + '/ecmwf/2019/01/11/0.5'
NCEP_PREFIX = CTBTO_PREFIX + '/ncep/2019/01/08/0.5'
```

Of course, you would want to make sure that the paths are actually valid,
as they may change over time.

----

## Notes

### ECMWF notes

* Depending on the version of *GRIB-API* used, snow depth in ECMWF files may
have a *shortName* of *sd* or *sde*.  The utility will handle both cases.

* EF files seem to have the old standard of GRIB2 for model level variables
and GRIB1 for the surface variables, and tests have been adjusted to
account for this.
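
The snow depth case amounts to accepting either name when looking for the variable. A minimal Python sketch of that idea follows. The names are hypothetical and this is not how *cgutils.F90* is written.

```python
# Hypothetical alias table: either shortName counts as snow depth.
SNOW_DEPTH_NAMES = ("sd", "sde")


def has_snow_depth(short_names):
    """True if any accepted snow-depth shortName occurs in the file."""
    return any(name in SNOW_DEPTH_NAMES for name in short_names)
```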

### NCEP notes

* Vertical wind, *w*, seems to be available only at pressure levels of 100mb
and lower in altitude, so that's all I'm checking for, and it's done in its
own function, *ncep_all_expected_w_levels_present()*.  I have verified
through my own operations with other NCEP files that *w* is not available
for levels higher in altitude than 100mb.

* It doesn't seem like total precipitation, *tc*, convective precipitation,
*acpcp*, or total cloud cover, *tcc*, are available in the NCEP files being
downloaded to CTBTO.  They used to be available.  So, I'm not checking for
those right now (I've commented out the checks in *ncep_all_2dvars_present()*).

* For 2-meter RH, one needs to look for leveltype *heightAboveGround* and
then, for older GRIB-API, look for *r* specifically at *level* 2, because
there are other messages with *shortName* of *r*.  But, for newer GRIB-API
the *shortName* to look for is a unique *2r* (also at *level* 2, but that
doesn't really matter once you have the *2r*).  The utility handles both
situations.
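
The 2-meter RH rule and the *w* level restriction described above can be expressed compactly. This Python sketch is illustrative only: messages are represented as plain dictionaries, the key name `typeOfLevel` is an assumption for what the note calls leveltype, and both function names are hypothetical.

```python
def has_2m_rh(messages):
    """messages: iterable of dicts with shortName, typeOfLevel and level."""
    for msg in messages:
        if msg["typeOfLevel"] != "heightAboveGround":
            continue
        if msg["shortName"] == "2r":
            return True   # newer GRIB-API: the name alone is unique
        if msg["shortName"] == "r" and msg["level"] == 2:
            return True   # older GRIB-API: must also match level 2
    return False


def expected_w_levels(pressure_levels_mb):
    """Keep only the levels (in mb) at which w is expected.

    "100mb and lower in altitude" means a pressure of 100 mb or more.
    """
    return [p for p in pressure_levels_mb if p >= 100]
```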

### NCEP FV3 notes

* The notes concerning NCEP, above, apply.

* NCEP has added 15 mb and 40 mb pressure levels to the GRIB files, but only for
the temperature variable and some other variables we don't care about.  This
ends up breaking the NCEP-checking code, so FV3-specific code has been added.

### Forecast vs analysis messages

This got complicated and confusing and is not complete, pending a more
complete future understanding of how we really want to categorise a GRIB
file as being *analysis* or *forecast*.

* GRIB messages are checked for analysis or forecast by looking at the
*dataType* field for values of *an* or *fc*.  I'm a little confused about
these values, and I have some suspicion that they may not always be correct.
There is also a *forecastTime* field that "seems" to be zero for what might
be analysis messages and nonzero (number of hours since last analysis)
for forecast messages.

* For ECMWF, it appears that the analysis files (as determined by a
*dataType* of *an*) are 12Z.  Most of the messages are *an* in the 12Z files,
yet there are six messages in these files that are *fc*:

```
2            ecmf         20190111     fc           regular_ll   surface      0            0            lsp          grid_jpeg
2            ecmf         20190111     fc           regular_ll   surface      0            0            acpcp        grid_jpeg
2            ecmf         20190111     fc           regular_ll   surface      0            0            sshf         grid_jpeg
2            ecmf         20190111     fc           regular_ll   surface      0            0            ewss         grid_jpeg
2            ecmf         20190111     fc           regular_ll   surface      0            0            nsss         grid_jpeg
2            ecmf         20190111     fc           regular_ll   surface      0            0            ssr          grid_jpeg
```

* Meanwhile, the non-12Z ECMWF files seem to have *dataType* of *fc* for all
messages except for four:

```
2            ecmf         20190111     an           regular_ll   surface      0            0            sdor         grid_jpeg
2            ecmf         20190111     an           regular_ll   surface      0            0            cvl          grid_jpeg
2            ecmf         20190111     an           regular_ll   surface      0            0            cvh          grid_jpeg
2            ecmf         20190111     an           regular_ll   surface      0            0            sr           grid_jpeg
```

The problem with all of this is that a test for a successful *fcst* or *anl*
file will always fail, since they are currently mixed.

* For NCEP files, it appears that all messages in all files are *fc*.

Meanwhile, correspondence with both Leo and Henrik suggests that people
"generally" assume that 00/06/12/18 files are *analysis* (though some
messages will still be *forecast*).  So, the problems are:

* For ECMWF files, I "think" we will always have a mixture of *analysis*
and *forecast* messages.
* The use of *dataType* values of *fc* or *an* doesn't seem to correlate
with the expected 00/06/12/18 *analyses* files.

So, in short, we need to better define what we want to look for.
The code is generally in the utility, but would need to be modified for
specific definitions.
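
To see why a strict *fcst* or *anl* test fails on mixed files, it helps to tally the *dataType* values per file. The Python sketch below does this for whitespace-separated listing rows in the layout shown in this section, assuming *dataType* is the fourth column. The function names and the "mixed" category are hypothetical and only illustrate one possible definition.

```python
from collections import Counter


def tally_data_types(listing_rows):
    """Count dataType values in whitespace-separated listing rows.

    Assumption: dataType is the fourth column of each row.
    """
    counts = Counter()
    for row in listing_rows:
        fields = row.split()
        if len(fields) >= 4:
            counts[fields[3]] += 1
    return counts


def classify_file(counts):
    """Return 'analysis', 'forecast', 'mixed' or 'unknown'."""
    has_an = counts.get("an", 0) > 0
    has_fc = counts.get("fc", 0) > 0
    if has_an and has_fc:
        return "mixed"
    if has_an:
        return "analysis"
    if has_fc:
        return "forecast"
    return "unknown"
```

Under this definition the ECMWF files described above would all classify as "mixed", which is exactly the situation that still needs a decision.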