The HARRY_READ_ME.txt file

Part 35u

Onto whole runs of the update program. With a lot of debugging!

Discovered that WMO codes are still a pain in the arse. And that I'd forgotten to match Australian
updates by BOM code (last field in header) instead of WMO code - so I had to modify newmergedbauto.
Also found that running fixwmos.for was less than successful on VAP, because it's already screwed:

uealogin1[/cru/cruts/version_3_0/update_top/db/vap] grep -i 'jan mayen' vap.0804231150.dtb
0100100 7093 -867 9 JAN MAYEN(NOR-NAVY) NORWAY 2003 2007 -999 0
1001000 7093 -866 9 JAN MAYEN(NOR NAVY) NORWAY 1971 2003 -999 -999
uealogin1[/cru/cruts/version_3_0/update_top/db/vap]

Started work on fixdupes.for, to cleanse a given database of obvious duplicate stations, but self-
diverted back onto getting the whole update process compiled and running end to end. Almost
immediately found that match rates in the merging were mixed. Added a section to newmergedbauto
that did a quick matchmaking exercise on any update stations that failed the code matching - just
lat/lon and the character fields, really. Didn't seem to make a lot of difference. A sketch of the
idea is below.
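The matchmaking check amounts to something like this (newmergedbauto itself is Fortran; this IDL
fragment is only the shape of the test, and the tolerances are invented):

function stations_match, lat1, lon1, name1, lat2, lon2, name2
  ; coordinates in the database's integer form; tolerance is invented
  if (abs(lat1-lat2) gt 5) or (abs(lon1-lon2) gt 5) then return, 0
  ; normalise the names and compare a leading substring
  n1 = strupcase(strcompress(name1, /remove_all))
  n2 = strupcase(strcompress(name2, /remove_all))
  len = min([strlen(n1), strlen(n2), 8])
  return, strmid(n1,0,len) eq strmid(n2,0,len)
end

It would happily pair the two Jan Mayen headers above, for what that's worth. Anyway, here are the
merge results for all updates and parameters, in the order they would have happened: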


uealogin1[/cru/cruts/version_3_0/update_top/logs/logs.0903091631] tail merg.mcdw.tmp.0903091631.log
OUTPUT(S) WRITTEN

New master database: updates/MCDW/db/db.0903091631/int1.tmp.0903091631.dtb

Update database stations: 2802
> Matched with Master stations: 1759
(automatically: 1759)
(by operator: 0)
> Added as new Master stations: 1043
> Rejected: 0
uealogin1[/cru/cruts/version_3_0/update_top/logs/logs.0903091631] tail merg.mcdw.pre.0903091631.log
OUTPUT(S) WRITTEN

New master database: updates/MCDW/db/db.0903091631/int1.pre.0903091631.dtb

Update database stations: 2807
> Matched with Master stations: 2783
(automatically: 2783)
(by operator: 0)
> Added as new Master stations: 24
> Rejected: 0
uealogin1[/cru/cruts/version_3_0/update_top/logs/logs.0903091631] tail merg.mcdw.vap.0903091631.log

New master database: updates/MCDW/db/db.0903091631/int1.vap.0903091631.dtb

Update database stations: 2804
> Matched with Master stations: 2677
(automatically: 2677)
(by operator: 0)
> Added as new Master stations: 124
> Rejected: 3
Rejects file: updates/MCDW/db/db.0903091631/mcdw.vap.0903091631.dtb.rejected
uealogin1[/cru/cruts/version_3_0/update_top/logs/logs.0903091631] tail merg.mcdw.wet.0903091631.log

New master database: updates/MCDW/db/db.0903091631/int1.wet.0903091631.dtb

Update database stations: 2801
> Matched with Master stations: 2634
(automatically: 2634)
(by operator: 0)
> Added as new Master stations: 163
> Rejected: 4
Rejects file: updates/MCDW/db/db.0903091631/mcdw.wet.0903091631.dtb.rejected
uealogin1[/cru/cruts/version_3_0/update_top/logs/logs.0903091631] tail merg.mcdw.cld.0903091631.log

New master database: updates/MCDW/db/db.0903091631/int1.cld.0903091631.dtb

Update database stations: 2204
> Matched with Master stations: 2199
(automatically: 2199)
(by operator: 0)
> Added as new Master stations: 0
> Rejected: 5
Rejects file: updates/MCDW/db/db.0903091631/mcdw.cld.0903091631.dtb.rejected
uealogin1[/cru/cruts/version_3_0/update_top/logs/logs.0903091631] tail merg.climat.tmp.0903091631.log

New master database: updates/CLIMAT/db/db.0903091631/int2.tmp.0903091631.dtb

Update database stations: 3065
> Matched with Master stations: 2629
(automatically: 2629)
(by operator: 0)
> Added as new Master stations: 345
> Rejected: 91
Rejects file: updates/CLIMAT/db/db.0903091631/climat.tmp.0903091631.dtb.rejected
uealogin1[/cru/cruts/version_3_0/update_top/logs/logs.0903091631] tail merg.climat.vap.0903091631.log

New master database: updates/CLIMAT/db/db.0903091631/int2.vap.0903091631.dtb

Update database stations: 3039
> Matched with Master stations: 2912
(automatically: 2912)
(by operator: 0)
> Added as new Master stations: 38
> Rejected: 89
Rejects file: updates/CLIMAT/db/db.0903091631/climat.vap.0903091631.dtb.rejected
uealogin1[/cru/cruts/version_3_0/update_top/logs/logs.0903091631] tail merg.climat.wet.0903091631.log

New master database: updates/CLIMAT/db/db.0903091631/int2.wet.0903091631.dtb

Update database stations: 3047
> Matched with Master stations: 2718
(automatically: 2718)
(by operator: 0)
> Added as new Master stations: 232
> Rejected: 97
Rejects file: updates/CLIMAT/db/db.0903091631/climat.wet.0903091631.dtb.rejected
uealogin1[/cru/cruts/version_3_0/update_top/logs/logs.0903091631] tail merg.climat.pre.0903091631.log

New master database: updates/CLIMAT/db/db.0903091631/int2.pre.0903091631.dtb

Update database stations: 3054
> Matched with Master stations: 2801
(automatically: 2801)
(by operator: 0)
> Added as new Master stations: 229
> Rejected: 24
Rejects file: updates/CLIMAT/db/db.0903091631/climat.pre.0903091631.dtb.rejected
uealogin1[/cru/cruts/version_3_0/update_top/logs/logs.0903091631] tail merg.climat.cld.0903091631.log

New master database: updates/CLIMAT/db/db.0903091631/int2.cld.0903091631.dtb

Update database stations: 2038
> Matched with Master stations: 1964
(automatically: 1964)
(by operator: 0)
> Added as new Master stations: 71
> Rejected: 3
Rejects file: updates/CLIMAT/db/db.0903091631/climat.cld.0903091631.dtb.rejected
uealogin1[/cru/cruts/version_3_0/update_top/logs/logs.0903091631] tail merg.climat.tmn.0903091631.log

New master database: updates/CLIMAT/db/db.0903091631/int2.tmn.0903091631.dtb

Update database stations: 2921
> Matched with Master stations: 2406
(automatically: 2406)
(by operator: 0)
> Added as new Master stations: 387
> Rejected: 128
Rejects file: updates/CLIMAT/db/db.0903091631/climat.tmn.0903091631.dtb.rejected
uealogin1[/cru/cruts/version_3_0/update_top/logs/logs.0903091631] tail merg.climat.tmx.0903091631.log

New master database: updates/CLIMAT/db/db.0903091631/int2.tmx.0903091631.dtb

Update database stations: 2921
> Matched with Master stations: 2406
(automatically: 2406)
(by operator: 0)
> Added as new Master stations: 387
> Rejected: 128
Rejects file: updates/CLIMAT/db/db.0903091631/climat.tmx.0903091631.dtb.rejected
uealogin1[/cru/cruts/version_3_0/update_top/logs/logs.0903091631] tail merg.bom.tmn.0903091631.log

New master database: updates/BOM/db/db.0903091631/int3.tmn.0903091631.dtb

Update database stations: 906
> Matched with Master stations: 783
(automatically: 783)
(by operator: 0)
> Added as new Master stations: 120
> Rejected: 3
Rejects file: updates/BOM/db/db.0903091631/bom.tmn.0903091631.dtb.rejected
uealogin1[/cru/cruts/version_3_0/update_top/logs/logs.0903091631] tail merg.bom.tmx.0903091631.log

New master database: updates/BOM/db/db.0903091631/int3.tmx.0903091631.dtb

Update database stations: 906
> Matched with Master stations: 783
(automatically: 783)
(by operator: 0)
> Added as new Master stations: 120
> Rejected: 3
Rejects file: updates/BOM/db/db.0903091631/bom.tmx.0903091631.dtb.rejected
uealogin1[/cru/cruts/version_3_0/update_top/logs/logs.0903091631]


Probably the worst story is temperature, particularly for MCDW. Over 1000 new stations! Highly
unlikely. I am tempted to blame the different lat/lon scale, but for now it will have to rest.

Still hitting the problem with TMP lats and lons being a mix of deg*10 and deg*100, which is
screwing up the station counts work (of course). Unfortunately, I did some tests and the 'original'
TMP database has the trouble - it's not my update suite :-(((

Then.. I worked it out. Sample headers from the 'original' TMP db tmp.0705101334.dtb:

10010 709 -87 10 Jan Mayen NORWAY 1921 2006 341921 -999.00
10050 780 142 9 ISFJORD RADIO NORWAY 1912 1979 101912 -999.00
10080 783 155 28 Svalbard Lufthavn NORWAY 1911 2006 341911 -999.00
10100 693 162 -999 ANDENES 1868 1955 101868 -999.00
10250 697 189 10 TROMSO/LANGNES NORWAY 1949 2006 101949 -999.00
10260 697 189 100 Tromsoe NORWAY 1890 2006 341890 -999.00
10280 745 190 16 Bjoernoeya NORWAY 1920 2006 341920 -999.00

And from the fixwmos-wmo-fixed version tmp.0903081416.dtb:

0100100 7090 -870 10 Jan Mayen NORWAY 1921 2006 341921 -999.00
0100500 7800 1420 9 ISFJORD RADIO NORWAY 1912 1979 101912 -999.00
0100800 7830 1550 28 Svalbard Lufthavn NORWAY 1911 2006 341911 -999.00
0101000 6930 1620 -999 ANDENES 1868 1955 101868 -999.00
0102500 6970 1890 10 TROMSO/LANGNES NORWAY 1949 2006 101949 -999.00
0102600 6970 1890 100 Tromsoe NORWAY 1890 2006 341890 -999.00
0102800 7450 1900 16 Bjoernoeya NORWAY 1920 2006 341920 -999.00

FM! fixwmos fixed coordinates as well! Here's the snippet:

      locfac = 10    ! init location factor, assuming lat & lon are in degs*10
      do i=1,10000   ! scan the station headers through the whole database
        read(10,'(a86)',end=12)buffy
        read(buffy,fmt)wmo,lat,lon,alt,sname,ctry,sy,ey,flag,extref
        if (lat.gt.maxlat) maxlat = lat
        if (lat.gt.900) goto 12   ! >90.0 in degs*10 terms, so must be degs*100
        do j=sy,ey+norml          ! skip over this station's data lines
          read(10,'()')
        enddo
      enddo
   12 if (maxlat.gt.900) locfac = 1 ! lat & lon are in degs*100

So it was written with TMP in mind! Oh, for a memory. So we don't need to fret about TMP
lat/lon being *10 any more!!

And it's taken me until NOW to realise that the IDL synthetic generators (vap_gts_anom,
frs_gts_tdm, rd0_gts_anom) all need to calculate 1961-1990 normals! So they will need
TMP, DTR and/or PRE binary normals for 1961 to 1990. Which means anomalies will have to
be automatically generated for that period regardless of the requested period!!! *Cries*

Introduced suitable conditionals to ensure that 61-90 anomalies and gridded binaries are
automatically produced if the relevant secondary parameters are requested.

More run-time issues: VAP is still giving an apparently non-fatal error:

% Program caused arithmetic error: Floating underflow
% Program caused arithmetic error: Floating overflow
% Program caused arithmetic error: Floating illegal operand

(this only appears once the vap_gts_anom.pro program has finished, so the offending line can't be
identified)
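Noting a possible way to localise these for next time: IDL's default (!except=1) defers arithmetic-
error reports until the program returns. Setting !except=2 makes it report each error as it occurs,
which pins down the offending statement; check_math() can also be called between suspect blocks to
bisect. A minimal sketch:

!except = 2           ; report math errors at the statement, not on exit
; ... run the suspect code, e.g. vap_gts_anom ...
status = check_math() ; returns and clears the accumulated error flags
if status ne 0 then print, 'math error flags:', status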

Stuck on WET production, getting an error from rd0_gts_anom_05.pro, from the same bit of
code that works fine in rd0_gts_anom.pro! The error:

% Attempt to subscript RD0SYN with NSEA is out of range.
% Execution halted at: RD0_GTS_ANOM_05 34 /cru-auto/cruts/version_3_0/BADC_AREA/programs/idl/rd0_gts_anom_05.pro
% $MAIN$

Line 34:

rd0syn=float(prenorm)*0.0 & rd0syn(nsea)=-9999

Do you know, I actually worked this one out by myself. Preen. It turned out that nsea was -1,
which meant that it was finding no hits here:

nsea =where(rd0norm eq -9999 or prenorm eq -9999)

When I looked - the min and max of rd0norm and prenorm were:

-32768 32514

..and I thought, what a coincidence, that's the 16-bit integer limit (-32768 = -2^15). Aha! Must be
an endian problem. Looked it up on the web, and in the IDL ref manual, and found that adding:

,/swap_if_big_endian

..to the end of the openr statements in rdbin.pro, it all worked! :-)))
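For the record, the shape of the fix (a sketch with assumed names and grid size, not the actual
rdbin code): the binaries hold 16-bit integers, so a byte-order mismatch turns sensible values into
+/-32768-ish garbage; and since where() returns -1 when there are no hits, the count argument
should be checked before subscripting.

fname = 'rd0norm.bin'         ; hypothetical filename
igrid = intarr(720, 360)      ; 0.5-degree grid of 16-bit integers (assumed)
openr, lun, fname, /get_lun, /swap_if_big_endian   ; the crucial keyword
readu, lun, igrid
free_lun, lun
rd0norm = float(igrid)        ; promote to float before any scaling
nsea = where(rd0norm eq -9999, nhits)
rd0syn = rd0norm*0.0
if nhits gt 0 then rd0syn(nsea) = -9999   ; never subscript with a -1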

And then, of course, another problem I should have anticipated: half-degree gridding of synthetics
needs half-degree primary binaries. So the precip binaries must be half-degree for WET (after 1989)
and the usual 2.5 degrees earlier. More modifications to update.for!! And it took me a further 24
hours to cotton on that I'd need half-degree TMP and DTR binaries for FRS. VAP won't mind as it's
using the synthetics as an adjunct to the observations - the exceptions are those secondaries where
no observations can be used: WET after 1989, FRS, and CLD after 2002 (but CLD considerately works
from DTR anomalies, ungridded).

So I'm going to have to produce half-degree gridded binary TMP and DTR anomalies, adding HALF AN
HOUR to the run time. Bollocks. Though I could be clever and save it.. then I'd have to monitor
when 1961-1990 databases were altered, and compare, and.. wibble.

Got that done. Then got it all working (though outputs not tested). Whooo. Now for BADC.

...

Actually, BADC wasn't too bad. Took a day or so to get everything to compile, mainly having to
shift to gfortran rather than f77, and also to use -w to suppress warnings. Discovered that
the IDL there didn't look at IDL_STARTUP, bah, but then found a way to specify a startup file
on the command line, i.e.:

      call system('idl -e @runs/runs.'//dtstr//'/'//
     *            'idl.syn1.wet.txt.'//dtstr//'.dat -quiet') ! idl no startup

..so that's all right then. Got it all running without errors at BADC. Well, I say that, I'm
still getting this for VAP:

gridding PRE anomalies at 0.5 for synthetics
Producing secondary: VAP
% Program caused arithmetic error: Floating overflow
% Program caused arithmetic error: Floating illegal operand
gridding VAP anomalies and synthetics
Producing secondary: WET

I haven't been able to identify what's causing that. Um.

Anyway, the next items are the tricky saving of 2.5 and 0.5 binaries for 1961-1990, only
regenerating them if the dbs have been altered. Requires multi-process cooperation, since we
can't tell from the database timestamps which years were potentially changed. Admittedly, with
this system that only accepts MCDW/CLIMAT/BOM updates, a pre-1991 change is all but impossible,
but build for the case you can't anticipate..! Also up next is the deconstruction of the early
cloud data (i.e. to 2002) so we can generate NetCDF files for the whole shebang. degroupcld.for
will do the honours.

Did CLD first. Having reverse-engineered gabs ('gridded absolute') files for 1901-2002, I
then modified update (extensively) to skip anything to do with CLD (including station counts)
before 2003. Then, at the anoms-to-absolutes stage, unzipped and copied over any pre-2003
CLD gabs files from the reference repository.

I suppose I'll have to do CLD station counts (just 'n' obviously) at some stage, too.

Ran update, just for CLD, just for 1901-06/2006. Realised halfway through that I'd really have
to do station counts as well because update does 'em for DTR anyway! That ought to cut out but
doesn't at the moment.

It's getting faster.. implementing the 'saved binaries' was easier than I thought as well. Lots
to change but straightforward. Now the IDL synthetics generators will always look in the
reference area for 1961-1990 gridded binaries, whether 2.5-degree or 0.5-degree. And those
datasets *should* be regenerated if flags indicate that 1961-1990 data has been changed in the
databases.
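The regeneration decision itself is no more than a stamp comparison - a sketch (the flag-file
contents here are assumptions, though the 'last6190' stamp does appear in the run logs):

; fixed-width yymmddhhmm stamps compare correctly as strings
binstamp = '0812150930'   ; stamp stored with the reference binaries (assumed)
last6190 = '0901010001'   ; stamp of the last 1961-90 database change
if last6190 gt binstamp then print, 'regenerate the 1961-1990 reference binaries'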

Then, a big problem. Lots of stars ('*********') in the PET gridded absolutes. Wrote sidebyside.m
to display the five input parameters; VAP looks like being the culprit, with unfeasibly large
values (up to 10034 in fact). And that's after the standard /10. So, erm.. a drains-up on VAP
is now required. Oh, joy. And CLD also looks unacceptable, despite all that work - big patches
of 100% and 0% dominate, no doubt a result of clipping by glo2absauto. The clipping is necessary,
but shouldn't be needed so often!

Reassuringly, the 3_00 VAP and CLD that are published look fine, so it's something I've done in
the automation process. Mis-scaling is most likely.

Started chaining back through (initially) VAP. The gabs files were identical to the finals (now,
if that had failed it would have been a problem!). The gridded anomaly files were a lot more
interesting, because although they looked just as bad, their max values were exactly 9999. That
ain't no coincidence!

Trailing further back.. VAP anoms are OK, so suspicion falls on the synthetics. And lo and behold,
re-running the TMP and DTR 2.5-grid binary productions with quick_interp_tdm2 gives:

IDL> quick_interp_tdm2,2006,2006,'interim_data/gbins/gbins.0903201540/tmp/tmp.',1200,gs=2.5,pts_prefix='interim_data/anoms/anoms.0903201540/tmp/tmp.',dumpbin='dumpbin',startm=07,endm=12
Defaults set
2006
grid 2006 non-zero-8341.4424 8341.4424 3712.6726 cells= 74509
IDL> quick_interp_tdm2,2006,2006,'interim_data/gbins/gbins.0903201540/dtr/dtr.', 750,gs=2.5,pts_prefix='interim_data/anoms/anoms.0903201540/dtr/dtr.',dumpbin='dumpbin',startm=07,endm=12
Defaults set
2006
grid 2006 non-zero-9116.9639 9116.9639 2825.0928 cells= 68171
IDL>

Those strings of numbers? They're supposed to be mean, average magnitude, and std dev! Should
look something like this:

IDL> quick_interp_tdm2,2006,2006,'testdtrglo/dtr.',750,gs=0.5,pts_prefix='dtrtxt/dtr.',dumpglo='dumpglo',dumpbin='dumpbin'
Defaults set
2006
grid 2006 non-zero 0.1125 2.1122 2.9219 cells= 202010

If I run with info='info', I get:

MEAN AV MAG STD DEV
2006
data 2006 month 7 0.0981 1.1909 1.7665
data 2006 month 8 0.3129 1.3504 1.9677
data 2006 month 9 0.2413 1.2774 2.0954
data 2006 month 10 -0.0024 1.3375 2.0739
data 2006 month 11 0.2594 1.1632 2.0542
data 2006 month 12 0.0874 1.3236 2.2353

..confirming that the DTR (in this case) incoming anomalies are all within expected tolerances.

Ooh! Just found this a few thousand lines back, which may be relevant:

<QUOTE>
On a parallel track (this would really have been better as a blog), Tim O has found that the binary
grids of primary vars (used in synthetic production of secondary parameters) should be produced with
'binfac' set to 10 for TMP and DTR. This may explain the poor performance and coverage of VAP in
particular.
<END_QUOTE>

Did that help? Not much at all, unfortunately. This is frustrating, I can't see what's different. I
even enabled a commented-out line that prints the ranges of pts2(*,2) and r, and they look OK:

IDL> quick_interp_tdm2,2006,2006,'interim_data/gbins/gbins.0903201540/dtr/dtr.', 750,gs=2.5,pts_prefix='interim_data/anoms/anoms.0903201540/dtr/dtr.',dumpbin='dumpbin',startm=07,endm=12,info='info'
MEAN AV MAG STD DEV
2006
data 2006 month 7 0.0981 1.1909 1.7665
-10.4367 17.5867
-7.15403 14.5088
data 2006 month 8 0.3129 1.3504 1.9677
-7.00800 25.6929
-4.89867 15.7781
data 2006 month 9 0.2413 1.2774 2.0954
-18.4621 26.2400
-15.1905 22.7162
data 2006 month 10 -0.0024 1.3375 2.0739
-8.61333 18.5400
-6.05684 15.7678
data 2006 month 11 0.2594 1.1632 2.0542
-6.91852 33.3200
-5.10848 28.5915
data 2006 month 12 0.0874 1.3236 2.2353
-8.76667 24.4500
-6.03609 21.9419
grid 2006 non-zero-9124.9951 9124.9951 2813.1790 cells= 68111
IDL>

OK, I *think* I've got it. It's the fact that we're writing a yearly binary file but only have data
for the second half of that year:

minmax Jan-Jun: -9999.00 -9999.00
minmax Jul-Dec: -15.1905 28.5915

Now, I don't see how we get:

grid 2006 non-zero-9125.1289 9125.1289 2813.0332 cells= 68110

..but I do see how vap_gts_anom might just read the *first* six months, which would all be -9999.

So, we need to be able to write six-month binaries. Oh, my giddy aunt. What a crap crap system. We'll
have to switch to monthly binaries, it's the only unambiguous way. Meaning major modifications to
numerous IDL proglets. Fuck. Everything from the main progs (vap_gts_anom, quick_interp_tdm2, etc) to
the supporting ones (rdbin for one).

After HOURS, I think I've sussed it, at least for VAP. The incoming *integer* binaries had to be
constructed with binfac=10, because otherwise the integer bit renders most anomalies 0. Then, in
the vap_gts_anom script, the values have to be divided by 100, to give degrees. Any other combination
of scaling factors throws the sat vap pressure calculations into the weeds. Of course, monthly binaries
are still required. Ho hum.

Modified quick_interp_tdm2 to take an additional command-line option: dumpmobin, which if set
will see that binaries are saved monthly, not yearly. Of course, the 2.5-degree TMP and DTR binary
grids are only used by VAP - FRS uses 0.5-degree.
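The monthly dump itself is simple enough - something along these lines (a sketch: names, sizes and
the file-naming scheme are assumptions, not the real quick_interp_tdm2 code):

; assume a monthly-resolved grid held as fltarr(nlon,nlat,12), already x binfac
year = 2006  &  startm = 7  &  endm = 12
grid = fltarr(144, 72, 12)             ; 2.5-degree grid (assumed)
outprefix = 'gbins/tmp.'               ; hypothetical prefix
for im = startm, endm do begin
  fname = outprefix + string(year, im, format='(i4.4,".",i2.2,".bin")')
  openw, lun, fname, /get_lun
  writeu, lun, fix(round(grid(*,*,im-1)))   ; one unambiguous month per file
  free_lun, lun
endfor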

So, rather than carry on with mods, I thought I'd mod update enough to fix VAP, then run it all again.

Well, it ran. Until PET production, where it crashed with the same (understandable) read error as
before ('********' not being an integer). However, when I invoked the Matlab sidebyside proglet to
examine the VAP, it was much improved on the previous VAP. The max was still 10000, just a shade too
high, but the actual spatial pollution was much reduced. There's hope! I think this all stems from
the sensitivity of the saturated vapour pressure calculations, where a factor of 10 error in an
input can make a factor of 1000 difference to the output.
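That sensitivity is easy to demonstrate. I don't know offhand which saturation formula the suite
uses, but take the standard Magnus approximation for illustration - the temperature sits inside an
exponential, so a factor-of-10 input error is catastrophic rather than proportional:

; esat(T) = 6.112 * exp(17.67*T / (T + 243.5)) hPa, T in deg C (Magnus form)
t = [25.0, 250.0]     ; a sane input vs the same input with a missed /10
esat = 6.112 * exp(17.67 * t / (t + 243.5))
print, esat           ; ~31.7 hPa vs ~47000 hPa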

Had to briefly divert to trick makegridsauto into thinking it was in the middle of a full 1901-2006
update, to get CLD NetCDF files produced for the whole period to June '06. Kept some important users
in Bristol happy.

So, back to VAP. Tried dividing the incoming TMP & DTR binaries by 1000! Still no joy. Then had the
bright idea of imposing a threshold on the 3.00 vap in the Matlab program. The result was that
quite a lot of data was lost from 3.00, but what remained was a very good match for the 2.10 data
(on which the thresholds were based).

I think I've got it! Hey - I might be home by 11. I got quick_interp_tdm2 to dump a min/max
for the synthetic grids. Guess what? Our old friend 32767 is here again, otherwise known as big-endian
trauma. And sure enough, the 0.5 and 2.5 binary normals (which I inherited; I've never produced
them) both need to be opened for reading with:

openr,lun,fname,/swap_if_big_endian

..so I added that as an argument to rdbin, and used it wherever rdbin is called to open these normals.

So, I went through all the IDL routines. I added an integer-to-float conversion on all binary reads,
and generally spruced things up. Also went through the parameters one by one and fixed (hopefully)
their scaling factors at each stage. What a minefield!

The PET problem of unwriteable numbers was solved by this tightening of secondaries, particularly
VAP, and also by putting in a clause to abs() any negative values from the wind climatology. I
really don't think there should be any, but there are!
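In sketch form, the two defensive changes (the filename and the /10 scaling are assumptions):

iwind = intarr(720, 360)                ; raw 16-bit climatology values
openr, lun, 'wind.clim.bin', /get_lun   ; hypothetical filename
readu, lun, iwind
free_lun, lun
wind = abs(float(iwind)/10.0)   ; float before scaling; clamp the negatives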

Finally I'm able to get a run of all ten parameters. The results, compared to 2.10 with sidebyside3col.m,
are pretty good on the whole. Not really happy with FRS (range OK but mysterious banding in Southern
Hemisphere), or PET:

pet
range210 = 0 573
range300 = 0 17.5000

So I've ended up with a range that doesn't scale simply to the 2.10 range. I also have no idea what
the actual range ought to be. And they said PET would be easy. Next step has to be a comparison of
max/min values of PET precursors vs. PET actuals for the two sources. Did that. No significant
differences, except that of course the 2.10 PET was produced with uncorrected wind. When I took
out the correction for 3.00, it shot up to even higher levels, so we'll just have to ignore 2.10
comparisons with PET.

Still, a top whack of 17.5 isn't too good for PET. Printed out the ranges of the precursors:

PET precursor parameters: ranges

tm -49.40 39.20
tn -52.80 39.50
tx -45.10 59.80
vp 0.00 36.60
wn 0.00 29.00
cl 0.00 1.00

So the temps are in degs C, vapour pressure's in hPa, wind's in m/s and cloud's fractional.

Then I thought about it. 17.5mm/day is pretty good - especially as it looks to be Eastern Sahara.

As for FRS.. with those odd longitudinal stripes - I just tidied the IDL prog up and it, er..

..went away. How very comforting.

Did a complete run for 7/06 to 12/06, ran the Matlab visuals, all params looked OK (if not special).

FTP'd the program suite and reference tree to BADC, replacing the existing ones, and tried the
same full run there.

Well, the first thing I noticed was how slow it was! Ooops. Maybe 3x slower than uealogin1. Then,
lots of error messages (see below). I had wondered whether the big-endian scene was going to show -
maybe this is it. Anyway, it finished! Here's the screen dump:

<QUOTE>
date25: 0903270742
date05: 0903270742
last6190: 0901010001
Producing anomalies
Producing station counts
Gridding primary parameters
Producing gridded binaries for synthetics
gridding TMP binary anomalies for secondary support
% Program caused arithmetic error: Floating illegal operand
gridding DTR binary anomalies for secondary support
% Program caused arithmetic error: Floating illegal operand
gridding TMP anomalies at 0.5 for synthetics
gridding DTR anomalies at 0.5 for synthetics
gridding PRE anomalies at 0.5 for synthetics
% Program caused arithmetic error: Floating illegal operand
Producing secondary: VAP
% Program caused arithmetic error: Floating divide by 0
% Program caused arithmetic error: Floating underflow
% Program caused arithmetic error: Floating overflow
% Program caused arithmetic error: Floating illegal operand
gridding VAP anomalies and synthetics
Producing secondary: WET
Producing secondary: CLD
Making synthetic CLD from DTR anomalies
gridding CLD anomalies and synthetics
Producing secondary: FRS
Converting anomalies to absolutes
Deriving PET
Creating output data and station files
creating final n-station tmpdtr files
creating final 0-station tmpdtr files

All work completed satisfactorarily
see: logs/completion/infolog.0904010108.dat
and: logs/logs.0904010108/update.0904010108.log


-bash-3.00$
<END_QUOTE>

Pulled back the output files and ran the sidebyside3col Matlab script to compare
with ours. Interesting. Here are the ranges (m/m = min/max):

tmp: BADC 300 m/m: -49.4 39.2, CRU 300 m/m: -49.4 39.2
tmn: BADC 300 m/m: -52.8 39.5, CRU 300 m/m: -52.8 39.5
tmx: BADC 300 m/m: -45.1 59.8, CRU 300 m/m: -45.1 59.8
dtr: BADC 300 m/m: 1 39.2, CRU 300 m/m: 1 39.2
pre: BADC 300 m/m: 0 4573, CRU 300 m/m: 0 4573
vap: BADC 300 m/m: 0 47.9, CRU 300 m/m: 0 36.3
wet: BADC 300 m/m: 0 310, CRU 300 m/m: 0 309.5
cld: BADC 300 m/m: 0 99.9, CRU 300 m/m: 0 100
frs: BADC 300 m/m: 0 310, CRU 300 m/m: 0 310
pet: BADC 300 m/m: 0 17.1, CRU 300 m/m: 0 17.5

I don't know which is more worrying - the VAP discrepancy or the fact that the
minimum DTR is 1 degree (for both!), the maximum BADC CLD is 99.9%, and the
maximum CRU WET is 30.95 days! Well I guess the VAP issue is the show-stopper, and
must be related to those errors:

Producing secondary: VAP
% Program caused arithmetic error: Floating divide by 0
% Program caused arithmetic error: Floating underflow
% Program caused arithmetic error: Floating overflow
% Program caused arithmetic error: Floating illegal operand

Now, these are IDL errors, and probably from our old pal vap_gts_anom_m.pro. So,
the established procedure is to re-run just that program, with all the info
turned on. Oh, my:

IDL> !path = 'programs/idl:' + !path
IDL> vap_gts_anom_m,2006,2006,dtr_prefix='interim_data/gbins/gbins.0904010108/dtr/dtr.',tmp_prefix='interim_data/gbins/gbins.0904010108/tmp/tmp.',outprefix='interim_data/syns/syns.0904010108/vap/vap.syn.',dumpbin=1,startm=07,endm=12
% Compiled module: VAP_GTS_ANOM_M.
Land,sea: 56016 68400
Calculating tmn normal
Calculating synthetic vap normal
Calculating synthetic anomalies
200607 vap (x,s2,<<,>>): -Inf -NaN -Inf 12.1306
200608 vap (x,s2,<<,>>): -Inf -NaN -Inf 15.4191
200609 vap (x,s2,<<,>>): -Inf -NaN -Inf 23.2317
200610 vap (x,s2,<<,>>): -Inf -NaN -Inf 22.4792
200611 vap (x,s2,<<,>>): -Inf -NaN -Inf 15.7444
200612 vap (x,s2,<<,>>): -Inf -NaN -Inf 11.2271
% Program caused arithmetic error: Floating divide by 0
% Program caused arithmetic error: Floating underflow
% Program caused arithmetic error: Floating overflow
% Program caused arithmetic error: Floating illegal operand
IDL>

Yes, it's back. Right back where we started with VAP at CRU, all those, er, days ago.
Well last time it was big endian stuff, wasn't it? And presumably the little Linux
box at BADC is big endian. So I might try changing those rdbin calls, just to see..

..that didn't seem to help. Here's a dump of key array ranges, just before the main
loop kicks in:

norgrd min/max: -68.9000 36.6000
tadj min/max: -715.433 445.602
tmpgrd min/max: -3165.30 3153.60
dtrgrd min/max: -1868.90 1612.80
vapsyn min/max: 0.01000000 Inf
v min/max: 0.00000 Inf

So tmpgrd and dtrgrd look waaay too high, though could just be *100. v and vapsyn are shot.

This does look like scaling. Boo hoo. I *fixed* that!! These are the ranges on UEALOGIN1:

norgrd min/max: -68.9000 36.6000
tadj min/max: -46.7671 25.8468
tmpgrd min/max: -68.9000 37.6000
dtrgrd min/max: -7.40000 6.30000
vapsyn min/max: 0.0100000 33.4119
v min/max: 0.00521439 41.9604

I wonder if I need to reverse the rdbin logic? /swap_if_little_endian on the non-clim calls?
Since normals are reading OK without it (CRU version has bigend=1, BADC version doesn't). Let's
try with just the tmpgrd & dtrgrd reads.. ooh:

norgrd min/max: -68.9000 36.6000
tadj min/max: -715.433 445.602
tmpgrd min/max: -68.9000 37.6000
dtrgrd min/max: -7.40000 6.30000
vapsyn min/max: 0.01000000 59.3142
v min/max: 0.00521439 61.0593
200607 vap (x,s2,<<,>>): 0.614392 1.57785 -5.33517 10.2892
200608 vap (x,s2,<<,>>): 0.593988 1.69958 -3.55033 12.3374
200609 vap (x,s2,<<,>>): 0.448958 0.793516 -5.20586 10.4787
200610 vap (x,s2,<<,>>): 0.525755 1.15223 -2.83614 8.92979
200611 vap (x,s2,<<,>>): 0.243011 0.939122 -4.66185 21.2776
200612 vap (x,s2,<<,>>): 0.302154 0.628504 -5.05943 5.84549
% Program caused arithmetic error: Floating underflow
% Program caused arithmetic error: Floating overflow
IDL>

So, just tadj to 'fix', then? Though surely I should read the 2006 tmp & dtr the same way.
Or is it that I copied the 61-90 over from here, but generated the 2006 there? Ah. Should
probably regenerate the 61-90 binaries at BADC? Yes. Anyway, found the 'other' tmp/dtr
reads and adjusted those, and behold:

norgrd min/max: -68.9000 36.6000
tadj min/max: -46.7671 25.8468
tmpgrd min/max: -68.9000 37.6000
dtrgrd min/max: -7.40000 6.30000
vapsyn min/max: 0.01000000 33.4119
v min/max: 0.00521439 41.9604
200607 vap (x,s2,<<,>>): 0.493936 1.15553 -6.14130 5.82037
200608 vap (x,s2,<<,>>): 0.460588 0.994776 -4.42765 5.69217
200609 vap (x,s2,<<,>>): 0.381708 0.674991 -4.56722 8.27954
200610 vap (x,s2,<<,>>): 0.506931 0.895855 -4.37382 5.13692
200611 vap (x,s2,<<,>>): 0.263565 0.656560 -3.84704 11.5171
200612 vap (x,s2,<<,>>): 0.374605 0.791715 -3.15873 6.41023

The CRU version has the same ranges, but some month stats differ:

norgrd min/max: -68.9000 36.6000
tadj min/max: -46.7671 25.8468
tmpgrd min/max: -68.9000 37.6000
dtrgrd min/max: -7.40000 6.30000
vapsyn min/max: 0.0100000 33.4119
v min/max: 0.00521439 41.9604
200607 vap (x,s2,<<,>>): 0.462426 1.11003 -6.14130 4.69486
200608 vap (x,s2,<<,>>): 0.428764 0.999707 -4.42765 6.48673
200609 vap (x,s2,<<,>>): 0.342277 0.642287 -4.56722 8.27954
200610 vap (x,s2,<<,>>): 0.485457 0.858445 -4.37382 5.13692
200611 vap (x,s2,<<,>>): 0.276767 0.685921 -3.72504 11.5171
200612 vap (x,s2,<<,>>): 0.373327 0.862642 -3.15873 15.3975

December in particular has quite a drift! No idea why, since the data going in
should be the same.

So, another full run, with regeneration of binary reference grids enforced:

tmp: BADC 300 m/m: -49.4 39.2, CRU 300 m/m: -49.4 39.2
tmn: BADC 300 m/m: -52.8 39.5, CRU 300 m/m: -52.8 39.5
tmx: BADC 300 m/m: -45.1 59.8, CRU 300 m/m: -45.1 59.8
dtr: BADC 300 m/m: 1 39.2, CRU 300 m/m: 1 39.2
pre: BADC 300 m/m: 0 4573, CRU 300 m/m: 0 4573
vap: BADC 300 m/m: 0 36.5, CRU 300 m/m: 0 36.3
wet: BADC 300 m/m: 0 309.3, CRU 300 m/m: 0 309.5
cld: BADC 300 m/m: 0 99.9, CRU 300 m/m: 0 100
frs: BADC 300 m/m: 0 310, CRU 300 m/m: 0 310
pet: BADC 300 m/m: 0 17.5, CRU 300 m/m: 0 17.5

I honestly don't think it'll get closer. So, I guess I'll clear out and reset
the BADC process, and let Kevin loose on it.

Well, BADC have had it for a good while, without actually doing anything. What
a surprise. It's lucky actually, as I've ironed out a few bugs (including PET
being garbage). One bug is eluding me, however - I can't get a full 1901-2008
run to complete! It gets stuck after producing the final TMP files (data plus
stations); it just seems to sit there indefinitely. So I tried different periods.

1901-2008 failed
1901 only worked
2008 only worked
1901-1910 worked
1901-1950 worked
1951-2008 worked
1901-2008 failed

**sigh** WHAT THE HELL'S GOING ON?! Well, time to ask the compiler. So I recompiled
as follows:

g77 -o update -Wall -Wsurprising -fbounds-check programs/fortran/update.for

Then, I re-ran. This time I got an error almost immediately:

Producing anomalies
Subscript out of range on file line 1011, procedure programs/fortran/update.for/MAIN.
Attempt to access the 6-th element of variable dobin25[subscript-1-of-2].
Abort (core dumped)

Hurrah! In a way.. that bug was easy enough - I'd just forgotten to put an extra
test (ipar.le.5) in the test for binary production, so as it was in a 1..8 loop,
there was bound (ho ho) to be trouble. There was a second, identical, instance.

After all that - final success:

date25: 0903270742
date05: 0903270742
last6190: 0901010001
Producing anomalies
Producing station counts
Gridding primary parameters
Producing gridded binaries for synthetics
gridding TMP binary anomalies for secondary support
gridding DTR binary anomalies for secondary support
gridding PRE binary anomalies for secondary support
gridding TMP anomalies at 0.5 for synthetics
gridding DTR anomalies at 0.5 for synthetics
gridding PRE anomalies at 0.5 for synthetics
Producing secondary: VAP
gridding VAP anomalies and synthetics
Producing secondary: WET
gridding WET anomalies and synthetics
Producing secondary: CLD
Making synthetic CLD from DTR anomalies
gridding CLD anomalies and synthetics
Producing secondary: FRS
Converting anomalies to absolutes
Deriving PET
Creating output data and station files
creating final n-station tmpdtr files
creating final 0-station tmpdtr files

All work completed satisfactorarily
see: logs/completion/infolog.0905070939.dat
and: logs/logs.0905070939/update.0905070939.log


uealogin1[/esdata/cru/f098/update_top]

..and in terms of disk usage (um, remember it's not *that* reliable):

uealogin1[/esdata/cru/f098/update_top] du -ks *
64 anomauto
32 batchdel
64 bom2cruauto
64 climat2cruauto
32 compile_all
629856 db
32 dtr2cldauto
64 glo2absauto
16108896 gridded_finals
13822176 interim_data
18368 logs
416 makegridsauto
64 makepetauto
64 mcdw2cruauto
32 movenormsauto
32 newdata.latest.date
288 newmergedbauto
2368 programs
1101088 reference
2848 results
3008 runs
32 saved_timings_090420_1716
64 stncountsauto
64 stncountsauto_safe
704 timings
32 tmnx2dtrauto
32 tmpdtrstnsauto
352 update
638432 updates
uealogin1[/esdata/cru/f098/update_top]

Meaning that a complete 1901-2008 run will need about 14GB of working data and the resulting
files will need approximately 16GB (interim_data and gridded_finals above, at roughly 13.2GB
and 15.4GB respectively). All gzipped!!

Then, of course (or 'at last', depending on your perspective), Tim O had a look at the
data with that analytical brain thingy he's got. Oooops. Lots of wild values, even for
TMP and PRE - and that's compared to the previous output!! Yes, this is comparing the
automated 1901-2008 files with the 1901-June2006 files, not with CRU TS 2.1. So, you
guessed it, bonnet up again.

First investigation was WET, where variance was far too low - usually indicative of a
scaling issue, and thus it was. Despite having had a drains-up on scaling, WET seems
to have escaped completely. The initial gridding (to binary) outputs at x10, which is
absolutely fine. But the PRE-to-WET converters are not so simple. The 2.5-degree
converter (rd0_gts_anom_m.pro) has reasonable output values (five monthly examples):

minmax rd0 2.5 binaries: -100.000 357.327
minmax rd0 2.5 binaries: -94.2232 250.621
minmax rd0 2.5 binaries: -93.0808 512.557
minmax rd0 2.5 binaries: -100.000 623.526
minmax rd0 2.5 binaries: -95.1105 521.668

The trouble is, when written to binary, these will be rounded to integer and a degree
of accuracy will be lost. They should be x10. Then there's the 0.5-degree converter
(rd0_gts_anom_05_m.pro), which has indescribably awful output values:

minmax rd0 0.5 binaries: -1.00000 8.33519
minmax rd0 0.5 binaries: -0.970328 8.13772
minmax rd0 0.5 binaries: -0.951749 4.33032
minmax rd0 0.5 binaries: -1.00000 9.26219
minmax rd0 0.5 binaries: -0.960226 3.80590

These are basically 1000 times too small!!! How did this happen when I specifically
had a complete review of scaling factors?! FFS.

Aha. Not so silly. The 0.5 grids are saved as .glo files (because after 1989 it's all
synthetic). So they're not rounded. On the other hand, they are still 100x too low for
percentage anomalies. And the 2.5 grids are sent to the gridder as 'synthfac=100'!!
When currently it's 1! So.. some changes to make :P

The 2.5-degree PRE/WET path is now at x10 all the way to the final gridding. The
0.5-degree PRE/WET path is at x10 until the production of the synthetic WET, at which
point it has to be x1 to line up with the pre-1990 output from the gridder (the gridder
outputs .glo files as x1 only, we haven't used the 'actfac' parameter yet and we're
not going to start!!).

Got all that fixed. Then onto the excessions Tim found - quite a lot that really should
have triggered the 3/4 sd cutoff in anomauto.for. Wrote 'retrace.for', a proglet I've
been looking for an excuse to write. It takes a country or individual cell, along with
dates and a run ID, and performs a reverse trace from final output files to database. It's
not complete yet but it already gives extremely helpful information - I was able to look
at the first problem (Guatemala in Autumn 1995 has a massive spike) and find that a
station in Mexico has a temperature of 78 degrees in November 1995! This gave a local
anomaly of 53.23 (which would have been 'lost' amongst the rest of Mexico as Tim just
did country averages) and an anomaly in Guatemala of 24.08 (which gave us the spike):

7674100 1808 -9425 22 COATZACOALCOS, VER. MEXICO 1951 2009 101951 -999.00

1994 188-9999 244-9999-9999 286 281 275 274 274 262-9999
1995 237-9999-9999-9999 300-9999 281 283 272-9999 780 239
1996 219 232 235 256 285 276 280 226 285 260 247 235

Now, this is a clear indication that the standard deviation limits are not being applied.
Which is extremely bad news. So I had a drains-up on anomauto.for.. and.. yup, my awful
programming strikes again. Because I copied the anomdtb.f90 process, I failed to notice
an extra section where the limit was applied to the whole station - I was only applying
it to the normals period (1961-90)! So I fixed that and re-ran. Here are the before and
after outputs from retrace.for:

Trace on Mexico, 11/1995, run #0905070939:
crua6[/cru/cruts/version_3_0/fixing_tmp_and_pre] cat retrace.Mexico.tmp.1995.11.stat
1995 11 21.41 9.80 239 145 72.80 216 173
1995 11 21.41 9.80 239 145 72.80 216 173
1995 11 18.33 8.70 239 145 23.80 216 173
1995 11 3.08 1.13 239 145 49.04 216 173
1995 11 3.23 -0.57 238 148 2232 53.23 217 172 2244
1995 11 22.39 12.80 243 148 7227000 78.00 217 172 7674100

Trace on Mexico, 11/1995, run #0907031504:
crua6[/cru/cruts/version_3_0/fixing_tmp_and_pre] cat retrace.Mexico.tmp.1995.11.stat
1995 11 19.51 9.80 239 145 28.90 216 156
1995 11 19.51 9.80 239 145 28.90 216 156
1995 11 18.33 8.70 239 145 28.50 216 156
1995 11 1.18 1.13 239 145 0.36 216 156
1995 11 0.73 -0.57 238 148 2227 1.82 227 148 2231
1995 11 22.39 12.80 243 148 7227000 78.00 217 172 7674100

The column to be looking at is this one ---------------------^

Because it's a traceability program, it works backwards in time:
Row 1 Final gridded output
Row 2 Gridded absolutes (should == Row 1)
Row 3 Climatology
Row 4 Gridded anomalies
Row 5 Anomalies
Row 6 Actual station values

Columns are as follows:
Cols 1,2 Year, Month
Col 3 Mean
Cols 4-6 Min with cell indices
Cols 7-9 Max with cell indices

Row 5:
Cols 4-7 Min with cell indices and line # in anoms file
Cols 8-11 Max with cell indices and line # in anoms file

Row 6:
Cols 4-7 Min with cell indices and WMO code in database
Cols 8-11 Max with cell indices and WMO code in database

In this case, the erroneous value of 78 degrees has been counted in the earlier run,
giving an anomaly of 53.23. In the later run, it hasn't - the anomaly of 1.82 is from a
different cell (227,148 instead of 217,172).

So, re-running improved matters. The extremes have vanished. But the means are still out,
sometimes significantly.

Stand by, we're about to go down the rabbit-hole again!

I took the twelve 1990 anomaly files from the original 1901-2006 run (that was done with
some flavour of anomdtb.f90). They were here:

/cru/cruts/version_3_0/primaries/tmp/tmptxt/*1990*

Then I modified the update 'latest databases' file to say that tmp.0705101334.dtb was the
current database, and made a limited run of the update program for tmp only, killing it
once it had produced the anomaly files. The run was #0908181048.

So, under /cru/cruts/version_3_0/fixing_tmp_and_pre/custom_anom_comparisons, we have a
'manual' directory and an 'automatic' directory, each with twelve 1990 anomaly files. And
how do they compare? NOT AT ALL!!!!!!!!!

Example from January:

crua6[/cru/cruts/version_3_0/fixing_tmp_and_pre/custom_anom_comparisons] head manual/tmp.1990.01.txt
70.90 -8.70 10.0 4.20000 10010
78.30 15.50 28.0 9.10000 10080
69.70 18.90 10.0 0.90000 10250
69.70 18.90 100.0 1.30000 10260
74.50 19.00 16.0 5.40000 10280
69.50 25.50 129.0 0.30000 10650
70.40 31.10 14.0 -0.40000 10980
66.00 2.00 0.0 1.70000 11000
67.30 14.40 13.0 0.80000 11520
66.80 14.00 39.0 2.50000 11530
crua6[/cru/cruts/version_3_0/fixing_tmp_and_pre/custom_anom_comparisons] head automatic/tmp.1990.01.txt
7.09 -0.87 10.0 2.97558 10010
7.83 1.55 28.0 7.87895 10080
6.97 1.89 10.0 0.50690 10250
6.97 1.89 100.0 0.49478 10260
7.45 1.90 16.0 3.88554 10280
6.95 2.55 129.0 -1.48960 10650
7.04 3.11 14.0 -0.46391 10980
6.60 0.20 0.0 1.63333 11000
6.73 1.44 13.0 0.12662 11520
6.68 1.40 39.0 2.03333 11530

The numbers of values in each pair are not always identical, but are within 2 or 3 so that's
not too worrying. Worrying, yes, just not fatal.

There are a number of things going on. The lats and lons are the same, just scaled (because
originally the TMP coordinates were x10 not x100). We can ignore that problem. The real
problem is the completely different results from the automated system - I don't understand
this because I painstakingly checked the anomauto.for file to ensure it was doing the right
job!! The overall pattern of anomalies is roughly the same - it's just that the actual values
differ. Not always lower - sometimes higher. Could be a rounding error..

Got anomauto to dump the first month of the first station (10010). The clue to the problem is
in the first lines - we're only getting the full-length mean (used for SD calculations) and
not the 61-90 mean. nv should be <= 30.

WMO = 10010, im = 01
(nv.gt.5) sums = -384.90, nv = 86, ave = -4.48
onestn(0490,01) = -1.50
d( 1,0490,01) = 2.98

Aaaaand.. FOUND IT! What happened was this: in the original anomdtb.f90 program, there's a
test for existing normals (in the header of each station). If they are present, then SD is
calculated (to allow excessions to be screened out). If not, SD is calculated and then
used to screen excessions, then a 61-90 normal is built provided there are enough values
after the screening. However, in my version, I followed the same process - but crucially,
I wasn't using the same variable to store the existing normals and the calculated ones!!
So we were ending up with the 'full length' normal (nv=86 in the above example) instead.
All I had to add was a single line:

ave = xstnrms(im)*fac ! NEW - actually *use* existing normals!!

We then get:

onestn(0490,01) = -1.50
d( 1,0490,01) = 4.20

Which is what we want. So, a complete re-run (just tmp) for 1990, still using the old db.

Tadaa:
uealogin1[/cru/cruts/version_3_0/fixing_tmp_and_pre/custom_anom_comparisons/new_automatic] head tmp.1990.01.txt
7.09 -0.87 10.0 4.20000 10010
7.83 1.55 28.0 9.10000 10080
6.97 1.89 10.0 0.90000 10250
6.97 1.89 100.0 1.30000 10260
7.45 1.90 16.0 5.40000 10280
6.95 2.55 129.0 0.30000 10650
7.04 3.11 14.0 -0.40000 10980
6.60 0.20 0.0 1.70000 11000
6.73 1.44 13.0 0.80000 11520
6.68 1.40 39.0 2.50000 11530

Spot on! A 100% match with the previously-generated anomalies. Phew!

