Another problem. Apparently I should have derived TMN and TMX from DTR and TMP, as that's
what v2.10 did and that's what people expect. I disagree with publishing datasets that are
simple arithmetic derivations of other datasets published at the same time, when the real
data could be published instead.. but no.
This introduces the problem of derivation. TMN = TMP -DTR/2 and TMX = TMP + DTR/2, but this
does not tell us what to do when either or both values (TMP, DTR) are missing. One thing to
check is the climatologies. Here are the first two cell normals for all four parameters:
Well, making allowances for rounding errors, they do seem to hold to the relationship.
Wrote maketmnx.for to derive TMN and TMX from TMP and DTR grids. Works with the output files
from glo2abs.for. Ran makegrids to produce .dat and .nc files ( still pre-station count
inclusion).
On to precip problems. Tim O ran some comparisons between 2.10 and 3.00, in general things are
much improved but there are a few hair raisers (asterisked for special concern):
Cape Verde Isl, MAM & DJF, P
Galapagos, All, T&P
Guinea, MAM 1901, P
Bangladesh, All 1991-2000, P **
Bhutan, DJF 1939 & 1945, P
Laos/Vietnam, DJF 1991, P **
Looked at Bangladesh first. Here, the 1990s show a sudden drop that really can only be some
stations having data a factor of 10 too low. This ties in with the WWR station data that DL
added for 1991-2000, which aprently was prone to scaling issues. Wrote stnx10.for to scale
a file of WWR Bangladesh records, then manually C&P'd the decade over the erroneous ones in
the database. Also fixed country name from 'BNGLADESH'!
Then Laos/Vietnam. Here we have an anomalously high peak for 1991 DJF. Used getllstations.for
to extract all stations in a box around Laos & Vietnam (8 to 25N, 100 to 110E), a total of 96
stations from Thailand, Vietnam, Laos, Kampuchea, and China. Eeeek. Tim O's program only worked
with boxes though. Also, I'm not 100% sure which year DJF belongs to in Tim's world.. hopefully
it's the December year (as it was the fourth column in his plot table). However.. plotted *all*
the data as overlapping years, and there is no trace of a spike in DJF. Uh-oh.
I'm not actually convinced that the 'country box' approach is much cop. Better to examine each
land cell and automagically mark any with excessions? Say 5 SD to begin with. Could then be
extra clever and pull the relevant stations and find the source of the excession? Of course, this
shouldn't happen, since there is a 4SD limit imposed by anomdtb.f90 for precip (3SD for others).
Wrote vietlaos.for to run through the lists of Vietnam and Laos cells (provided by Tim O) and
extract the DJF precip values for each (from the 1901-2006 gridded file). It then calculates the
standard deviation of each series, normalises, and notes any values over 6.0 SDs (1991 onwards).
Index 273 can be related to time as follows. The series begins in 1901 and we take three values
per year (J,F,D). So 1990 would be the 90th year and the 268th-270th values. Thus 273 = Dec 1991.
The cells are all contiguous, implying a single station's influence via the gridding process:
'n/a' means the cell isn't in the Laos or Vietnam areas.
The 'epicentre' of the anomaly looks to be cell (213,571), which is in the Laos file:
Box Column Row Lon Lat
205773 571 213 105.75 16.75
So we're looking for stations in the vicinity of 105.75E, 16.75N. Well the precip database has a
total of EIGHT Laos stations, so that should be straightforward:
Summary: LUANG PRABANG shows a significant anomaly of 1372 for Dec 1992. Unfortunately, this
finds echoes both temporal (1994 has 816) and spatial (SAYABOURY's 1992 is 879). So, if these
values are causing the spike, it's genuine (if exaggerated in a way yet to be determined).
Wrote vietlaos2, to gather data from the cells AND stations. It also gets the climatology. Initially
it only gathered 13 stations with data in 1991/2, using 'VIETNAM' and 'LAOS' to select on country
name. However, taking the cell [214,574] in December 1991 as the peak incident, we can use those
coordinates (17.25N, 107.25E) to centre a bounding box for station selection. A box 10degs square
yields only 17 stations, none of which have anything remotely spikey in Dec 1991. A box 20degs
square (some would say unfeasibly large) yields 98 stations, one of which does have a bit of a spike
in Dec 91.. not impressively so though, and it's a long way away:
Over 10.5 degrees South and over 7 degrees West of the target cell. Not very convincing, especially
as closer stations are bound to have masked it.
One FINAL try with vietlaos3.for. Just looking at December, now, and getting the original station
normals as well as the climatological ones. The whole chain. This proves to be surprisingly
complicated.
On a parallel track (this would really have been better as a blog), Tim O has found that the binary
grids of primary vars (used in synthetic production of secondary parameters) should be produced with
'binfac' set to 10 for TMP and DTR. This may explain the poor performance and coverage of VAP in
particular.
Back to VietLaos.. the station output from vietlaos3.for had a couple of stations with missing
anomaly values:
LAT LON ALT NORM VAL ANOM
17.15 104.13 171.00 29.00 62.00 -9999.00
15.80 102.03 182.00 45.00 40.70 -9999.00
I eventually worked out that I hadn't collapsed a universal probability, it was just the 4 standard
deviation screen in anomdtb (4 for precip, 3 for temp). To confirm, I did a short anomdtb run (just
for 1991) with the sd limit set to 10, and sure enough:
They both look high enough to trigger the 4sd cap. However, since the spike we're investigating is
from a regular process run, where that limit was in place, we can't use those values. Program is thus
amended to omit any stations without anomalies (for Dec 1991)
Next issue is to make sense of the output. The first line from the station file is (headings added):
LAT LON ALT NORM VAL ANOM
22.60 114.10 25.00 29.00 21.60 -25.50
Remembering it's percentage anomalies! So 25.5% of 29 is 29*.255 = 7.395. Add that to 21.6? 29.0 :-)
By contract, the cell file looks like this:
ROW COL LAT LON VAL NORM
220 561 20.25 100.75 12.90 15.00
There are 63 stations and 204 cells (196 when missing values (sea) eliminated). I guess one approach
would be to grid the anomalies, to see if a peak is visible. I did. It is. The simple interpolation
in Matlab puts the peak at 17.25N, 105.25E - matches the grid peak for lat and a little west for lon.
The nearest high station anomaly is 2369.2, that's from:
Note that the Dec 1991 value is anomalous, but not as extreme as the 1945 datum,
which would get the same treatment with normals and climatologies, so should
produce an even bigger spike for 1945 DJF! Unless of course it's screened out by
the 4SD rule.. which it is! Well - no value in pre.1945.12.txt for this location.
Anyway.. this is the highest value in the Vietnam/Laos cells for Dec 1991:
ROW COL LAT LON VAL NORM
198 571 9.25 105.75 63.50 130.00
With a normal of 130, that makes the anomaly -48.85. Now I'm confused. How can
an anomalously high value be well below the 61-90 mean? Aaarrgghhhh. Perhaps I
should look at the highest anomaly. That turns out to be 80, from here:
216 563 18.25 101.75 1.80 1.00
Not exactly a show stopper. Time to look at the .glo files, which glo2abs processes
into absolutes. Here's a Far-Eastern region with a spike:
The spike is at [213,569]. Yes, I know, it's the n-th set of coordinates. You should see the
plots! But looking at the anomalies is the closest we'll get to what Tim's program was doing,
ie, calculating DJF standard deviations. Or something. Now, the coordinates are 16.75N, 104.75E.
And wouldn't you know it, our prime suspect (see above) is on top of it:
So OK, here we go with the full run-down for December 1991, in the 16.75N,105.75E region:
TYPE VALUE COMMENT
Raw data: 321 Highest unscreened December for this station (67 years)
Normal: 13 Looks right - of course, very low for the target data!
Anomaly: 2369.2 Correctly calculated
Gridded anomaly: 2252.6 Believable interpolation
Gridded actual: er... Strangely, it seems to be 0.
Ah well - had enough. It looks like it's an extreme but believable event in a Thai station, let's
leave it like that. Re-running precip, with the new updated database pre.0803271802.dtb:
> ***** AnomDTB: converts .dtb to anom .txt for gridding *****
> Enter the suffix of the variable required:
.pre
> Will calculate percentage anomalies.
> Select the .cts or .dtb file to load:
pre.0803271802.dtb
> Specify the start,end of the normals period:
1961,1990
> Specify the missing percentage permitted:
25
> Data required for a normal: 23
> Specify the no. of stdevs at which to reject data:
4
> Select outputs (1=.cts,2=.ann,3=.txt,4=.stn):
3
> Check for duplicate stns after anomalising? (0=no,>0=km range)
0
> Select the generic .txt file to save (yy.mm=auto):
pre.txt
> Select the first,last years AD to save:
1901,2006
> Operating...
/tmp_mnt/cru-auto/cruts/version_3_0/primaries/precip/pre.0803271802.dtb
> NORMALS MEAN percent STDEV percent
> .dtb 7315040 73.8
> .cts 299359 3.0 7613600 76.8
> PROCESS DECISION percent %of-chk
> no lat/lon 17911 0.2 0.2
> no normal 2355275 23.8 23.8
> out-of-range 13249 0.1 0.2
> accepted 7521017 75.9
> Dumping years 1901-2006 to .txt files...
crua6[/cru/cruts/version_3_0/primaries/precip] ./glo2abs
Welcome! This is the GLO2ABS program.
I will create a set of absolute grids from
a set of anomaly grids (in .glo format), also
a gridded version of the climatology.
Enter the path and name of the normals file: clim.6190.lan.pre
Enter a name for the gridded climatology file: clim.6190.lan.pre.grid4
Enter the path and stem of the .glo files: preglo/pregrid.
Enter the starting year: 1901
Enter the ending year: 2006
Enter the path (if any) for the output files: preabs/
Now, CONCENTRATE. Addition or Percentage (A/P)? P
Do you wish to limit the output values? (Y/N): Y
1. Set minimum to zero
2. Set single minimum and maximum values
3. Set minima and maxima based on days in month
4. Set integer values >=1, (ie, positive)
5. Changed my mind, no limits
Choose: 1
Right, erm.. off I jolly well go!
pregrid.01.1901.glo
(etc)
pregrid.12.2006.glo
uealogin1[/cru/cruts/version_3_0/primaries/precip] ./makegrids
Welcome! This is the MAKEGRIDS program.
I will create decadal and full gridded files,
in both ASCII text and NetCDF formats, from
the output files of (eg) glo2abs.for.
Enter a gridfile with YYYY for year and MM for month: preabs/pregrid.YYYY.MM.glo.abs
Enter Start Year: 1901
Enter Start Month: 01
Enter End Year: 2006
Enter End Month: 12
Please enter a sample OUTPUT filename, replacing
start year with SSSS and end year with EEEE, and
ending with '.dat', eg: cru_ts_3_00.SSSS.EEEE.tmp.dat : cru_ts_3_00.SSSS.EEEE.pre.dat
Now please enter the 3-ch parameter code: pre
Enter a generic title for this dataset, eg:
CRU TS 3.00 Mean Temperature : CRU TS 3.00 Precipitation
Writing: cru_ts_3_00.1901.1910.pre.dat
cru_ts_3_00.1901.1910.pre.nc
Writing: cru_ts_3_00.1911.1920.pre.dat
cru_ts_3_00.1911.1920.pre.nc
Writing: cru_ts_3_00.1921.1930.pre.dat
cru_ts_3_00.1921.1930.pre.nc
Writing: cru_ts_3_00.1931.1940.pre.dat
cru_ts_3_00.1931.1940.pre.nc
Writing: cru_ts_3_00.1941.1950.pre.dat
cru_ts_3_00.1941.1950.pre.nc
Writing: cru_ts_3_00.1951.1960.pre.dat
cru_ts_3_00.1951.1960.pre.nc
Writing: cru_ts_3_00.1961.1970.pre.dat
cru_ts_3_00.1961.1970.pre.nc
Writing: cru_ts_3_00.1971.1980.pre.dat
cru_ts_3_00.1971.1980.pre.nc
Writing: cru_ts_3_00.1981.1990.pre.dat
cru_ts_3_00.1981.1990.pre.nc
Writing: cru_ts_3_00.1991.2000.pre.dat
cru_ts_3_00.1991.2000.pre.nc
Writing: cru_ts_3_00.2001.2006.pre.dat
cru_ts_3_00.2001.2006.pre.nc
<END_QUOTE>
On to the reproduction of binaries for TMP and DTR, and subsequent regeneration of VAP and FRS.
crua6[/cru/cruts/version_3_0/secondaries/vap] ./glo2abs
Welcome! This is the GLO2ABS program.
I will create a set of absolute grids from
a set of anomaly grids (in .glo format), also
a gridded version of the climatology.
Enter the path and name of the normals file: clim.6190.lan.vap
Enter a name for the gridded climatology file: clim.6190.lan.vap.grid4
Enter the path and stem of the .glo files: vapglo/vap.
Enter the starting year: 1901
Enter the ending year: 2006
Enter the path (if any) for the output files: vapabs/
Now, CONCENTRATE. Addition or Percentage (A/P)? A
Do you wish to limit the output values? (Y/N): Y
1. Set minimum to zero
2. Set single minimum and maximum values
3. Set minima and maxima based on days in month
4. Set integer values >=1, (ie, positive)
5. Changed my mind, no limits
Choose: 4
Right, erm.. off I jolly well go!
vap.01.1901.glo
(etc)
vap.12.2006.glo
VAP Output Files:
uealogin1[/cru/cruts/version_3_0/secondaries/vap] ./makegrids
Welcome! This is the MAKEGRIDS program.
I will create decadal and full gridded files,
in both ASCII text and NetCDF formats, from
the output files of (eg) glo2abs.for.
Enter a gridfile with YYYY for year and MM for month: vapabs/vap.YYYY.MM.glo.abs
Enter Start Year: 1901
Enter Start Month: 01
Enter End Year: 2006
Enter End Month: 12
Please enter a sample OUTPUT filename, replacing
start year with SSSS and end year with EEEE, and
ending with '.dat', eg: cru_ts_3_00.SSSS.EEEE.tmp.dat : cru_ts_3_00.SSSS.EEEE.vap.dat
Now please enter the 3-ch parameter code: vap
Enter a generic title for this dataset, eg:
CRU TS 3.00 Mean Temperature : CRU TS 3.00 Vapour Pressure
Writing: cru_ts_3_00.1901.1910.vap.dat
cru_ts_3_00.1901.1910.vap.nc
Writing: cru_ts_3_00.1911.1920.vap.dat
cru_ts_3_00.1911.1920.vap.nc
Writing: cru_ts_3_00.1921.1930.vap.dat
cru_ts_3_00.1921.1930.vap.nc
Writing: cru_ts_3_00.1931.1940.vap.dat
cru_ts_3_00.1931.1940.vap.nc
Writing: cru_ts_3_00.1941.1950.vap.dat
cru_ts_3_00.1941.1950.vap.nc
Writing: cru_ts_3_00.1951.1960.vap.dat
cru_ts_3_00.1951.1960.vap.nc
Writing: cru_ts_3_00.1961.1970.vap.dat
cru_ts_3_00.1961.1970.vap.nc
Writing: cru_ts_3_00.1971.1980.vap.dat
cru_ts_3_00.1971.1980.vap.nc
Writing: cru_ts_3_00.1981.1990.vap.dat
cru_ts_3_00.1981.1990.vap.nc
Writing: cru_ts_3_00.1991.2000.vap.dat
cru_ts_3_00.1991.2000.vap.nc
Writing: cru_ts_3_00.2001.2006.vap.dat
cru_ts_3_00.2001.2006.vap.nc