Onto whole runs of the update program. With a lot of debugging!
Discovered that WMO codes are still a pain in the arse. And that I'd forgotten to match Australian
updates by BOM code (last field in header) instead of WMO code - so I had to modify newmergedbauto.
Also found that running fixwmos.for was less than successful on VAP, because it's already screwed:
Started work on fixdupes.for, to cleanse a given database of obvious duplicate stations, but self-
diverted back onto getting the whole update process compiled and running end to end. Almost
immediately found that match rates in the merging were mixed. Added a section to newmergedbauto
that did a quick matchmaking exercise on any update stations that failed the code matching. Just
lat/lon and character fields really. Didn't seem to make a lot of difference. Here are the merge
results for all updates and parameters, in the order they would have happened:
uealogin1[/cru/cruts/version_3_0/update_top/logs/logs.0903091631] tail merg.mcdw.tmp.0903091631.log
OUTPUT(S) WRITTEN
New master database: updates/MCDW/db/db.0903091631/int1.tmp.0903091631.dtb
Update database stations: 2802
> Matched with Master stations: 1759
(automatically: 1759)
(by operator: 0)
> Added as new Master stations: 1043
> Rejected: 0
uealogin1[/cru/cruts/version_3_0/update_top/logs/logs.0903091631] tail merg.mcdw.pre.0903091631.log
OUTPUT(S) WRITTEN
New master database: updates/MCDW/db/db.0903091631/int1.pre.0903091631.dtb
Update database stations: 2807
> Matched with Master stations: 2783
(automatically: 2783)
(by operator: 0)
> Added as new Master stations: 24
> Rejected: 0
uealogin1[/cru/cruts/version_3_0/update_top/logs/logs.0903091631] tail merg.mcdw.vap.0903091631.log
New master database: updates/MCDW/db/db.0903091631/int1.vap.0903091631.dtb
New master database: updates/BOM/db/db.0903091631/int3.tmx.0903091631.dtb
Update database stations: 906
> Matched with Master stations: 783
(automatically: 783)
(by operator: 0)
> Added as new Master stations: 120
> Rejected: 3
Rejects file: updates/BOM/db/db.0903091631/bom.tmx.0903091631.dtb.rejected
uealogin1[/cru/cruts/version_3_0/update_top/logs/logs.0903091631]
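For reference, the code-match-then-fallback idea described above can be sketched like this (a hypothetical illustration — the field names and tolerances are invented, and the real logic lives in newmergedbauto):

```python
def fallback_match(update_stn, master_stns, latlon_tol=0.1, name_len=4):
    """Hypothetical sketch: when code matching fails, try to pair an
    update station with a master station on lat/lon proximity plus a
    crude character-field comparison (tolerances are illustrative)."""
    for master in master_stns:
        close = (abs(update_stn["lat"] - master["lat"]) <= latlon_tol and
                 abs(update_stn["lon"] - master["lon"]) <= latlon_tol)
        same_name = (update_stn["name"][:name_len].strip().upper() ==
                     master["name"][:name_len].strip().upper())
        if close and same_name:
            return master
    return None  # no match: the station gets added as a new Master station
```

Stations that fall through both stages end up in the "Added as new Master stations" count above — which is why a weak fallback shows up as implausibly many "new" stations.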
Probably the worst story is temperature, particularly for MCDW. Over 1000 new stations! Highly
unlikely. I am tempted to blame the different lat/lon scale, but for now it will have to rest.
Still hitting the problem with TMP lats and lons being a mix of deg*10 and deg*100, it's screwing
up the station counts work (of course). Unfortunately, I did some tests and the 'original' TMP
database has the trouble, it's not my update suite :-(((
Then.. I worked it out. Sample headers from the 'original' TMP db tmp.0705101334.dtb:
FM! fixwmos fixed coordinates as well! Here's the snippet:
locfac = 10 ! init location factor assuming lat & lon are in degs*10
do i=1,10000
read(10,'(a86)',end=12)buffy
read(buffy,fmt)wmo,lat,lon,alt,sname,ctry,sy,ey,flag,extref
if (lat.gt.maxlat) maxlat = lat
if (lat.gt.900) goto 12
do j=sy,ey+norml
read(10,'()')
enddo
enddo
12 if (maxlat.gt.900) locfac = 1 ! lat & lon are in degs *100
So it was written with TMP in mind! Oh, for a memory. So we don't need to fret about TMP
lat/lon being *10 any more!!
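Stripped of the Fortran scaffolding, the heuristic fixwmos uses is just this (a sketch; the real code also has to skip over each station's data block while scanning):

```python
def guess_location_factor(lats):
    """Sketch of the fixwmos heuristic: integer header latitudes are
    either degrees*10 (so |lat| never exceeds 900) or degrees*100.
    Returns 10 for degs*10 files and 1 for degs*100, mirroring locfac."""
    return 1 if max(abs(l) for l in lats) > 900 else 10
```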
And it's taken me until NOW to realise that the IDL synthetic generators (vap_gts_anom,
frs_gts_tdm, rd0_gts_anom) all need to calculate 1961-1990 normals! So they will need
TMP, DTR and/or PRE binary normals for 1961 to 1990. Which means anomalies will have to
be automatically generated for that period regardless of the requested period!!! *Cries*
Introduced suitable conditionals to ensure that 61-90 anomalies and gridded binaries are
automatically produced if the relevant secondary parameters are requested.
More run-time issues, VAP is still giving an apparently non-fatal error:
% Program caused arithmetic error: Floating underflow
% Program caused arithmetic error: Floating overflow
% Program caused arithmetic error: Floating illegal operand
(this only appears once the vap_gts_anom.pro program has finished, so the offending line can't be identified)
Stuck on WET production, getting an error from rd0_gts_anom_05.pro, from the same bit of
code that works fine in rd0_gts_anom.pro! The error:
% Attempt to subscript RD0SYN with NSEA is out of range.
% Execution halted at: RD0_GTS_ANOM_05 34 /cru-auto/cruts/version_3_0/BADC_AREA/programs/idl/rd0_gts_anom_05.pro
% $MAIN$
Line 34:
rd0syn=float(prenorm)*0.0 & rd0syn(nsea)=-9999
Do you know, I actually worked this one out by myself. Preen. It turned out that nsea was -1,
which meant that it was finding no hits here:
nsea =where(rd0norm eq -9999 or prenorm eq -9999)
When I looked - the min and max of rd0norm and prenorm were:
-32768 32514
..and I thought, what a coincidence, that's the signed 16-bit integer limit. Aha! Must be an Endian problem. Looked it up
on the web, and in the IDL ref manual, and found that adding:
,/swap_if_big_endian
..to the end of the openr statements in rdbin.pro made it all work! :-)))
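The symptom generalises: a 16-bit integer written with one byte order and read with the other comes back as large-magnitude garbage, which is why the min/max slam into the ±32768 region. A quick sketch with Python's struct module:

```python
import struct

def read_wrong_endian(value):
    """Pack a 16-bit signed integer little-endian, then unpack it as
    big-endian -- mimicking a binary grid read with the wrong byte
    order. Small values come back wildly inflated or sign-flipped."""
    return struct.unpack(">h", struct.pack("<h", value))[0]
```

For example, a stored 1 reads back as 256, and a stored 200 reads back as -14336 — hence ranges pinned near the 16-bit limits.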
and then, of course, another problem I should have anticipated: half-degree gridding of synthetics
needs half-degree primary binaries. So the precip binaries must be half-degree for WET (after 1989)
and the usual 2.5-degrees earlier. More modifications to update.for!! And it took me a further 24
hours to cotton on that I'd need half-degree TMP and DTR binaries for FRS. VAP won't mind as it's
using the synthetics as an adjunct to the observations - the exceptions are those secondaries where
no observations can be used: WET after 1989, FRS, and CLD after 2002 (but CLD considerately works
from DTR anomalies, ungridded).
So I'm going to have to produce half-degree gridded binary TMP and DTR anomalies, adding HALF AN
HOUR to the run time. Bollocks. Though I could be clever and save it.. then I'd have to monitor
when 1961-1990 databases were altered, and compare, and.. wibble.
Got that done. Then got it all working (though outputs not tested). Whooo. Now for BADC.
...
Actually, BADC wasn't too bad. Took a day or so to get everything to compile, mainly having to
shift to gfortran rather than f77, and also to use -w to suppress warnings. Discovered that
the IDL there didn't look at IDL_STARTUP, bah, but then found a way to specify a startup file
on the command line, ie:
..so that's all right then. Got it all running without errors at BADC. Well, I say that, I'm
still getting this for VAP:
gridding PRE anomalies at 0.5 for synthetics
Producing secondary: VAP
% Program caused arithmetic error: Floating overflow
% Program caused arithmetic error: Floating illegal operand
gridding VAP anomalies and synthetics
Producing secondary: WET
I haven't been able to identify what's causing that. Um.
Anyway the next items are the tricky saving of 2.5 and 0.5 binaries for 1961-1990, only
regenerating them if the dbs have been altered. Requires multi-process cooperation, since we
can't tell from the database timestamps which years were potentially changed. Admittedly, with
this system that only accepts MCDW/CLIMAT/BOM updates, a pre-1991 change is all but impossible,
but build for the case you can't anticipate..! Also up next is the deconstruction of the early
cloud data (ie to 2002) so we can generate NetCDF files for the whole shebang. degroupcld.for
will do the honours.
Did CLD first. Having reverse-engineered gabs ('gridded absolute') files for 1901-2002, I
then modified update (extensively) to skip anything to do with CLD (including station counts)
before 2003. Then, at the anoms-to-absolutes stage, unzipped and copied over any pre-2003
CLD gabs files from the reference repository.
I suppose I'll have to do CLD station counts (just 'n' obviously) at some stage, too.
Ran update, just for CLD, just for 1901-06/2006. Realised halfway through that I'd really have
to do station counts as well because update does 'em for DTR anyway! That ought to cut out but
doesn't at the moment.
It's getting faster.. implementing the 'saved binaries' was easier than I thought as well. Lots
to change but straightforward. Now the IDL synthetics generators will always look in the
reference area for 1961-1990 gridded binaries, whether 2.5-degree or 0.5-degree. And those
datasets *should* be regenerated if flags are set that 1961-1990 data has been changed in the
databases.
Then, a big problem. Lots of stars ('*********') in the PET gridded absolutes. Wrote sidebyside.m
to display the five input parameters; VAP looks like being the culprit, with unfeasibly large
values (up to 10034 in fact). And that's after the standard /10. So, erm.. a drains-up on VAP
is now required. Oh, joy. And CLD also looks unacceptable, despite all that work - big patches
of 100% and 0% dominate, no doubt a result of clipping by glo2absauto. The clipping is necessary,
but shouldn't be needed so often!
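For the record, the clip itself is one line — the point is that it should be a safety net for rare overshoots, not something that flattens whole regions (a sketch; glo2absauto's actual bounds are per-parameter):

```python
def clip_percent(values, lo=0.0, hi=100.0):
    """Clamp absolute cloud-percentage values into their physical
    range. If large patches end up pinned at 0 or 100, the inputs
    were mis-scaled upstream -- the clip is masking the real bug."""
    return [min(max(v, lo), hi) for v in values]
```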
Reassuringly, the 3_00 VAP and CLD that are published look fine, so it's something I've done in
the automation process. Mis-scaling is most likely.
Started chaining back through (initially) VAP. The gabs files were identical to the finals (now,
if that had failed it would have been a problem!). The gridded anomaly files were a lot more
interesting, because although they looked just as bad, their max values were exactly 9999. That
ain't no coincidence!
Trailing further back.. VAP anoms are OK, so suspicion falls on the synthetics. And lo and behold,
re-running the TMP and DTR 2.5-grid binary productions with quick_interp_tdm2 gives:
MEAN AV MAG STD DEV
2006
data 2006 month 7 0.0981 1.1909 1.7665
data 2006 month 8 0.3129 1.3504 1.9677
data 2006 month 9 0.2413 1.2774 2.0954
data 2006 month 10 -0.0024 1.3375 2.0739
data 2006 month 11 0.2594 1.1632 2.0542
data 2006 month 12 0.0874 1.3236 2.2353
..confirming that the DTR (in this case) incoming anomalies are all within expected tolerances.
Ooh! Just found this a few thousand lines back, which may be relevant:
<QUOTE>
On a parallel track (this would really have been better as a blog), Tim O has found that the binary
grids of primary vars (used in synthetic production of secondary parameters) should be produced with
'binfac' set to 10 for TMP and DTR. This may explain the poor performance and coverage of VAP in
particular.
<END_QUOTE>
Did that help? Not much at all, unfortunately. This is frustrating, I can't see what's different. I
even enabled a commented-out line that prints the ranges of pts2(*,2) and r, and they look OK:
IDL> quick_interp_tdm2,2006,2006,'interim_data/gbins/gbins.0903201540/dtr/dtr.', 750,gs=2.5,pts_prefix='interim_data/anoms/anoms.0903201540/dtr/dtr.',dumpbin='dumpbin',startm=07,endm=12,info='info'
MEAN AV MAG STD DEV
2006
data 2006 month 7 0.0981 1.1909 1.7665
-10.4367 17.5867
-7.15403 14.5088
data 2006 month 8 0.3129 1.3504 1.9677
-7.00800 25.6929
-4.89867 15.7781
data 2006 month 9 0.2413 1.2774 2.0954
-18.4621 26.2400
-15.1905 22.7162
data 2006 month 10 -0.0024 1.3375 2.0739
-8.61333 18.5400
-6.05684 15.7678
data 2006 month 11 0.2594 1.1632 2.0542
-6.91852 33.3200
-5.10848 28.5915
data 2006 month 12 0.0874 1.3236 2.2353
-8.76667 24.4500
-6.03609 21.9419
grid 2006 non-zero-9124.9951 9124.9951 2813.1790 cells= 68111
IDL>
OK, I *think* I've got it. It's the fact that we're writing a yearly binary file but only have data
for the second half of that year:
..but I do see how vap_gts_anom might just read the *first* six months, which would all be -9999.
So, we need to be able to write six-month binaries. Oh, my giddy aunt. What a crap crap system. We'll
have to switch to monthly binaries, it's the only unambiguous way. Meaning major modifications to
numerous IDL proglets. Fuck. Everything from the main progs (vap_gts_anom, quick_interp_tdm2, etc) to
the supporting ones (rdbin for one).
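The underlying point is simply that a monthly file can never be misread as a different part of the year. A sketch of the split (file naming and structures are hypothetical):

```python
MISSING = -9999

def monthly_binary_name(prefix, year, month):
    """Sketch: one grid file per month, e.g. 'dtr.2006.07.bin',
    instead of a yearly file that may silently hold only six real
    months padded with -9999."""
    return "%s.%04d.%02d.bin" % (prefix, year, month)

def months_with_data(grids_by_month):
    """Keep only the months containing at least one non-missing
    value, so a half-year run writes six monthly files and a reader
    can never pick up six months of -9999 by mistake."""
    return {m: g for m, g in grids_by_month.items()
            if any(v != MISSING for v in g)}
```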
After HOURS, I think I've sussed it, at least for VAP. The incoming *integer* binaries had to be
constructed with binfac=10, because otherwise the integer bit renders most anomalies 0. Then, in
the vap_gts_anom script, the values have to be divided by 100, to give degrees. Any other combination
of scaling factors throws the sat vap pressure calculations into the weeds. Of course, monthly binaries
are still required. Ho hum.
Modified quick_interp_tdm2 to take another additional command-line option: dumpmobin, which if set
will see that binaries are saved monthly, not yearly. Of course, the 2.5-degree TMP and DTR binary
grids are only used by VAP - FRS uses 0.5-degree.
So, rather than carry on with mods, I thought I'd mod update enough to fix VAP, then run it all again.
Well, it ran. Until PET production, where it crashed with the same (understandable) read error as
before ('********' not being an integer). However, when I invoked the Matlab sidebyside proglet to
examine the VAP, it was much improved on the previous VAP. The max was still 10000, just a shade too
high, but the actual spatial pollution was much reduced. There's hope! I think this all stems from
the sensitivity of the saturated vapour pressure calculations, where a factor of 10 error in an
input can make a factor of 1000 difference to the output.
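That sensitivity is real: saturation vapour pressure is exponential in temperature, so a factor-of-10 mis-scaling of the input doesn't give a factor-of-10 error in the output — it explodes. A sketch using the Magnus approximation (the production code may use a different formulation):

```python
import math

def sat_vap_pressure(t_celsius):
    """Magnus approximation to saturation vapour pressure in hPa.
    Used here only to illustrate sensitivity to a mis-scaled input;
    the CRU suite's actual formula may differ."""
    return 6.112 * math.exp(17.62 * t_celsius / (243.12 + t_celsius))
```

Feed it a true 2.5 degC and you get roughly 7.3 hPa; feed it a x10 mis-scaled 25.0 and you get roughly 31.6 hPa — and the blow-up grows rapidly with temperature.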
Had to briefly divert to trick makegridsauto into thinking it was in the middle of a full 1901-2006
update, to get CLD NetCDF files produced for the whole period to June '06. Kept some important users
in Bristol happy.
So, back to VAP. Tried dividing the incoming TMP & DTR binaries by 1000! Still no joy. Then had the
bright idea of imposing a threshold on the 3.00 vap in the Matlab program. The result was that
quite a lot of data was lost from 3.00, but what remained was a very good match for the 2.10 data
(on which the thresholds were based).
I think I've got it! Hey - I might be home by 11. I got quick_interp_tdm2 to dump a min/max
for the synthetic grids. Guess what? Our old friend 32767 is here again, otherwise known as big-endian
trauma. And sure enough, the 0.5 and 2.5 binary normals (which I inherited, I've never produced them),
both need to be opened for reading with:
openr,lun,fname,/swap_if_big_endian
..so I added that as an argument to rdbin, and used it wherever rdbin is called to open these normals.
So, I went through all the IDL routines. I added an integer-to-float conversion on all binary reads,
and generally spruced things up. Also went through the parameters one by one and fixed (hopefully)
their scaling factors at each stage. What a minefield!
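The shape of that tidy-up, sketched with an illustrative factor table (the real factors differ per parameter and per stage):

```python
# Hypothetical per-parameter scale factors: stored integer -> natural units.
SCALE = {"tmp": 10.0, "dtr": 10.0, "pre": 10.0, "vap": 10.0}
MISSING = -9999

def ints_to_floats(param, raw_ints):
    """Sketch: convert integers read from a binary grid to floats in
    natural units, applying one declared factor per parameter and
    preserving the missing-value sentinel untouched."""
    fac = SCALE[param]
    return [v if v == MISSING else v / fac for v in raw_ints]
```

Declaring the factor once per parameter, instead of scattering magic /10s and /100s through each proglet, is what makes the minefield navigable.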
The PET problem, or unwriteable numbers, was solved by this tightening of secondaries, particularly
VAP, and also putting in a clause to abs() any negative values from the wind climatology. I really
don't think there should be any, but there are!
Finally I'm able to get a run of all ten parameters. The results, compared to 2.10 with sidebyside3col.m,
are pretty good on the whole. Not really happy with FRS (range OK but mysterious banding in Southern
Hemisphere), or PET:
pet
range210 = 0 573
range300 = 0 17.5000
So I've ended up with a range that doesn't scale simply to the 2.10 range. I also have no idea what
the actual range ought to be. And they said PET would be easy. Next step has to be a comparison of
max/min values of PET precursors vs. PET actuals for the two sources. Did that. No significant
differences, except that of course the 2.10 PET was produced with uncorrected wind. When I took
out the correction for 3.00, it shot up to even higher levels, so we'll just have to ignore 2.10
comparisons with PET.
Still, a top whack of 17.5 isn't too good for PET. Printed out the ranges of the precursors:
So the temps are in degs C, vapour pressure's in hPa, wind's in m/s and cloud's fractional.
Then I thought about it. 17.5mm/day is pretty good - especially as it looks to be Eastern Sahara.
As for FRS.. with those odd longitudinal stripes - I just tidied the IDL prog up and it, er..
..went away. How very comforting.
Did a complete run for 7/06 to 12/06, ran the Matlab visuals, all params looked OK (if not special).
FTP'd the program suite and reference tree to BADC, replacing the existing ones, and tried the
same full run there.
Well the first thing I noticed was how slow it was! Ooops. Maybe 3x slower than uealogin1. Then,
lots of error messages (see below). I had wondered whether the big endian scene was going to show,
maybe this is it. Anyway, it finished! Here's the screen dump:
<QUOTE>
date25: 0903270742
date05: 0903270742
last6190: 0901010001
Producing anomalies
Producing station counts
Gridding primary parameters
Producing gridded binaries for synthetics
gridding TMP binary anomalies for secondary support
% Program caused arithmetic error: Floating illegal operand
gridding DTR binary anomalies for secondary support
% Program caused arithmetic error: Floating illegal operand
gridding TMP anomalies at 0.5 for synthetics
gridding DTR anomalies at 0.5 for synthetics
gridding PRE anomalies at 0.5 for synthetics
% Program caused arithmetic error: Floating illegal operand
Producing secondary: VAP
% Program caused arithmetic error: Floating divide by 0
% Program caused arithmetic error: Floating underflow
% Program caused arithmetic error: Floating overflow
% Program caused arithmetic error: Floating illegal operand
gridding VAP anomalies and synthetics
Producing secondary: WET
Producing secondary: CLD
Making synthetic CLD from DTR anomalies
gridding CLD anomalies and synthetics
Producing secondary: FRS
Converting anomalies to absolutes
Deriving PET
Creating output data and station files
creating final n-station tmpdtr files
creating final 0-station tmpdtr files
All work completed satisfactorarily
see: logs/completion/infolog.0904010108.dat
and: logs/logs.0904010108/update.0904010108.log
-bash-3.00$
<END_QUOTE>
Pulled back the output files and ran the sidebyside3col Matlab script to compare
with ours. Interesting. Here are the ranges:
I don't know which is more worrying - the VAP discrepancy or the fact that the
minimum DTR is 1 degree (for both!), the maximum BADC CLD is 99.9%, and the
maximum CRU WET is 30.95 days! Well I guess the VAP issue is the show-stopper, and
must be related to those errors:
Producing secondary: VAP
% Program caused arithmetic error: Floating divide by 0
% Program caused arithmetic error: Floating underflow
% Program caused arithmetic error: Floating overflow
% Program caused arithmetic error: Floating illegal operand
Now, these are IDL errors, and probably from our old pal vap_gts_anom_m.pro. So,
the established procedure is to re-run just that program, with all the info
turned on. Oh, my:
Yes, it's back. Right back where we started with VAP at CRU, all those, er, days ago.
Well last time it was big endian stuff, wasn't it? And presumably the little Linux
box at BADC is big endian. So I might try changing those rdbin calls, just to see..
..that didn't seem to help. Here's a dump of key array ranges, just before the main
loop kicks in:
I wonder if I need to reverse the rdbin logic? reverse_if_little_endian on the non-clim calls?
Since normals are reading OK without it (CRU version has bigend=1, BADC version doesn't). Let's
try with just the tmpgrd & dtrgrd reads.. ooh:
So, just tadj to 'fix', then? Though surely I should read the 2006 tmp & dtr the same way.
Or is it that I copied the 61-90 over from here, but generated the 2006 there. Ah. Should
probably regenerate the 61-90 binaries at BADC? Yes. Anyway, found the 'other' tmp/dtr
reads and adjusted those, and behold:
I honestly don't think it'll get closer. So, I guess I'll clear out and reset
the BADC process, and let Kevin loose on it.
Well, BADC have had it for a good while, without actually doing anything. What
a surprise. It's lucky actually, as I've ironed out a few bugs (including PET
being garbage). One bug is eluding me, however - I can't get a full 1901-2008
run to complete! It gets stuck after producing the final TMP files (data plus
stations), it just seems to sit there indefinitely. So I tried different periods.
1901-2008 failed
1901 only worked
2008 only worked
1901-1910 worked
1901-1950 worked
1951-2008 worked
1901-2008 failed
**sigh** WHAT THE HELL'S GOING ON?! Well, time to ask the compiler. So I recompiled
as follows:
Then, I re-ran. This time I got an error almost immediately:
Producing anomalies
Subscript out of range on file line 1011, procedure programs/fortran/update.for/MAIN.
Attempt to access the 6-th element of variable dobin25[subscript-1-of-2].
Abort (core dumped)
Hurrah! In a way.. that bug was easy enough, I'd just forgotten to put an extra
test (ipar.le.5) in the test for binary production, so as it was in a 1..8 loop,
there was bound (ho ho) to be trouble. There was a second, identical, instance.
After all that - final success:
date25: 0903270742
date05: 0903270742
last6190: 0901010001
Producing anomalies
Producing station counts
Gridding primary parameters
Producing gridded binaries for synthetics
gridding TMP binary anomalies for secondary support
gridding DTR binary anomalies for secondary support
gridding PRE binary anomalies for secondary support
gridding TMP anomalies at 0.5 for synthetics
gridding DTR anomalies at 0.5 for synthetics
gridding PRE anomalies at 0.5 for synthetics
Producing secondary: VAP
gridding VAP anomalies and synthetics
Producing secondary: WET
gridding WET anomalies and synthetics
Producing secondary: CLD
Making synthetic CLD from DTR anomalies
gridding CLD anomalies and synthetics
Producing secondary: FRS
Converting anomalies to absolutes
Deriving PET
Creating output data and station files
creating final n-station tmpdtr files
creating final 0-station tmpdtr files
All work completed satisfactorarily
see: logs/completion/infolog.0905070939.dat
and: logs/logs.0905070939/update.0905070939.log
uealogin1[/esdata/cru/f098/update_top]
..and in terms of disk usage (um, remember it's not *that* reliable):
Meaning that a complete 1901-2008 run will need about 14gb of working data and the
resulting files will need approximately 16gb. All gzipped!!
Then, of course (or 'at last', depending on your perspective), Tim O had a look at the
data with that analytical brain thingy he's got. Oooops. Lots of wild values, even for
TMP and PRE - and that's compared to the previous output!! Yes, this is comparing the
automated 1901-2008 files with the 1901-June2006 files, not with CRU TS 2.1. So, you
guessed it, bonnet up again.
First investigation was WET, where variance was far too low - usually indicative of a
scaling issue, and thus it was. Despite having had a drains-up on scaling, WET seems
to have escaped completely. The initial gridding (to binary) outputs at x10, which is
absolutely fine. But the PRE-to-WET converters are not so simple. The 2.5-degree
converter (rd0_gts_anom_m.pro) has reasonable output values (five monthly examples):
The trouble is, when written to binary, these will be rounded to integer and a degree
of accuracy will be lost. They should be x10. Then there's the 0.5-degree converter
(rd0_gts_anom_05_m.pro), which has indescribably awful output values:
These are basically 1000 times too small!!! How did this happen when I specifically
had a complete review of scaling factors?! FFS.
Aha. Not so silly. The 0.5 grids are saved as .glo files (because after 1989 it's all
synthetic). So they're not rounded. On the other hand, they are still 100x too low for
percentage anomalies. And the 2.5 grids are sent to the gridder as 'synthfac=100'!!
When currently it's 1! So.. some changes to make :P
The 2.5-degree PRE/WET path is now at x10 all the way to the final gridding. The
0.5-degree PRE/WET path is at x10 until the production of the synthetic WET, at which
point it has to be x1 to line up with the pre-1990 output from the gridder (the gridder
outputs .glo files as x1 only, we haven't used the 'actfac' parameter yet and we're
not going to start!!).
Got all that fixed. Then onto the excessions Tim found - quite a lot that really should
have triggered the 3/4 sd cutoff in anomauto.for. Wrote 'retrace.for', a proglet I've
been looking for an excuse to write. It takes a country or individual cell, along with
dates and a run ID, and performs a reverse trace from final output files to database. It's
not complete yet but it already gives extremely helpful information - I was able to look
at the first problem (Guatemala in Autumn 1995 has a massive spike) and find that a
station in Mexico has a temperature of 78 degrees in November 1995! This gave a local
anomaly of 53.23 (which would have been 'lost' amongst the rest of Mexico as Tim just
did country averages) and an anomaly in Guatemala of 24.08 (which gave us the spike):
Now, this is a clear indication that the standard deviation limits are not being applied.
Which is extremely bad news. So I had a drains-up on anomauto.for.. and.. yup, my awful
programming strikes again. Because I copied the anomdtb.f90 process, I failed to notice
an extra section where the limit was applied to the whole station - I was only applying
it to the normals period (1961-90)! So I fixed that and re-ran. Here are the before and
after outputs from retrace.for:
The column to be looking at is this one ---------------------^
Because it's a traceability program, it works backwards in time:
Row 1 Final gridded output
Row 2 Gridded absolutes (should == Row 1)
Row 3 Climatology
Row 4 Gridded anomalies
Row 5 Anomalies
Row 6 Actual station values
Columns are as follows:
Cols 1,2 Year, Month
Col 3 Mean
Cols 4-6 Min with cell indices
Cols 7-9 Max with cell indices
Row 5:
Cols 4-7 Min with cell indices and line # in anoms file
Cols 8-11 Max with cell indices and line # in anoms file
Row 6:
Cols 4-7 Min with cell indices and WMO code in database
Cols 8-11 Max with cell indices and WMO code in database
In this case, the erroneous value of 78 degrees has been counted in the earlier run,
giving an anomaly of 53.23. In the later run, it hasn't - the anomaly of 1.82 is from a
different cell (227,148 instead of 217,172).
So, re-running improved matters. The extremes have vanished. But the means are still out,
sometimes significantly.
Stand by, we're about to go down the rabbit-hole again!
I took the twelve 1990 anomaly files from the original 1901-2006 run (that was done with
some flavour of anomdtb.f90). They were here:
Then I modified the update 'latest databases' file to say that tmp.0705101334.dtb was the
current database, and made a limited run of the update program for tmp only, killing it
once it had produced the anomaly files. The run was #0908181048.
So, under /cru/cruts/version_3_0/fixing_tmp_and_pre/custom_anom_comparisons, we have a
'manual' directory and an 'automatic' directory, each with twelve 1990 anomaly files. And
how do they compare? NOT AT ALL!!!!!!!!!
The numbers of values in each pair are not always identical, but are within 2 or 3 so that's
not too worrying. Worrying, yes, just not fatal.
There are a number of things going on. The lats and lons are the same, just scaled (because
originally the TMP coordinates were x10 not x100). We can ignore that problem. The real
problem is the completely different results from the automated system - I don't understand
this because I painstakingly checked the anomauto.for file to ensure it was doing the right
job!! The overall pattern of anomalies is roughly the same - it's just that the actual values
differ. Not always lower - sometimes higher. Could be a rounding error..
Got anomauto to dump the first month of the first station (10010). The clue to the problem is
in the first lines - we're only getting the full-length mean (used for SD calculations) and
not the 61-90 mean. nv should be <= 30.
Aaaaand.. FOUND IT! What happened was this: in the original anomdtb.f90 program, there's a
test for existing normals (in the header of each station). If they are present, then SD is
calculated (to allow excessions to be screened out). If not, SD is calculated and then
used to screen excessions, then a 61-90 normal is built provided there are enough values
after the screening. However, in my version, I followed the same process - but crucially,
I wasn't using the same variable to store the existing normals and the calculated ones!!
So we were ending up with the 'full length' normal (n=86 in the above example) instead.
All I had to add was a single line:
ave = xstnrms(im)*fac ! NEW - actually *use* existing normals!!
We then get:
onestn(0490,01) = -1.50
d( 1,0490,01) = 4.20
Which is what we want. So, a complete re-run (just tmp) for 1990, still using the old db.
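Putting the corrected flow together as a sketch (variable names hypothetical; the real code is anomauto.for): screen excessions with the full-length SD, then subtract the existing header normal when one is present, falling back to a normal built from the screened 1961-90 values only when it isn't.

```python
MISSING = -9999

def station_anomalies(values_by_year, header_normal=None,
                      base=(1961, 1990), sd_limit=3.0):
    """Sketch of the corrected anomaly logic for one station-month:
    1) the full-length mean and SD are used only to screen excessions;
    2) the screen applies to the WHOLE series, not just 1961-90;
    3) the anomaly base is the existing header normal if present,
       otherwise a normal built from the screened 1961-90 values."""
    vals = {y: v for y, v in values_by_year.items() if v != MISSING}
    mean = sum(vals.values()) / len(vals)
    sd = (sum((v - mean) ** 2 for v in vals.values()) / len(vals)) ** 0.5
    screened = {y: v for y, v in vals.items()
                if sd == 0 or abs(v - mean) <= sd_limit * sd}
    if header_normal is not None:
        normal = header_normal  # the fix: actually *use* the existing normal
    else:
        in_base = [v for y, v in screened.items() if base[0] <= y <= base[1]]
        normal = sum(in_base) / len(in_base)
    return {y: round(v - normal, 2) for y, v in screened.items()}
```

With this in place a 78-degree November no longer sails through as a 53.23 anomaly: it is screened out before the normal is ever applied.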