After an email enquiry from Wladimir J. Alonso (alonsow@xxxxx), in which unusual behaviour of CRU TS 2.10
Vapour Pressure data was observed, I discovered that some of the Wet Days and Vapour Pressure datasets had been
swapped!!
The files I was looking at were decadal, 1981-1990.
Vapour Pressure, January:   Min 0   Max 310
Vapour Pressure, February:  Min 0   Max 280
Wet Days, January:          Min 0   Max 3220
Wet Days, February:         Min 0   Max 3240
So I wrote crutsstats.for, which returns monthly and annual minima, maxima and means for any gridded output file.
It looks like a consistent problem: all the decadal VAP and WET files should be discarded, and only the 'full run' 1901-2002
files used. But my theory that the error occurred when the 1901-2002 files were converted to decadal doesn't sound true now,
because why would the precision levels change? Surely, if the decadal files are derived from the 1901-2002 files, it's just
a case of copying data across?
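For illustration only, here is a minimal Python sketch of the kind of summary crutsstats.for produces. It is not the Fortran program itself. It assumes the gridded file is plain text with 360 rows per month of whitespace-separated integers, and that missing cells carry a sentinel value (-999 here, which is a guess). The names monthly_stats and annual_stats are hypothetical.

```python
def monthly_stats(lines, rows_per_month=360, missing=-999):
    """Return a list of (min, max, mean) tuples, one per month.

    Assumptions (not taken from crutsstats.for): each month occupies
    rows_per_month consecutive text rows of whitespace-separated integers,
    and cells equal to `missing` are excluded from the statistics.
    """
    stats = []
    for start in range(0, len(lines), rows_per_month):
        block = lines[start:start + rows_per_month]
        values = [int(tok) for row in block for tok in row.split()
                  if int(tok) != missing]
        if values:
            stats.append((min(values), max(values), sum(values) / len(values)))
        else:
            stats.append((None, None, None))  # month with no valid cells
    return stats


def annual_stats(month_stats):
    """Collapse monthly (min, max, mean) tuples into one annual tuple.

    The annual mean here is the unweighted mean of the monthly means.
    """
    valid = [s for s in month_stats if s[0] is not None]
    if not valid:
        return (None, None, None)
    return (min(s[0] for s in valid),
            max(s[1] for s in valid),
            sum(s[2] for s in valid) / len(valid))
```

Run over a decadal file and the matching slice of the full-run file, a summary like this should make any difference in range or precision between the two obvious.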
Let's look at *just* 1981, to try and assess this issue:
Now, where were we.. ah yes, Vapour Pressure. So far:
Original:      vap.0311181410.dtb
               +
MCDW:          vap.0709111032.dtb
               v
               v
Intermediate:  vap.0710241541.dtb
               +
CLIMAT:        vap.0710151817.dtb
               v
               v
Final:         vap.0710241549.dtb
> ***** AnomDTB: converts .dtb to anom .txt for gridding *****
> Enter the suffix of the variable required:
.vap
> Select the .cts or .dtb file to load:
vap.0710241549.dtb
> Specify the start,end of the normals period:
1961,1990
> Specify the missing percentage permitted:
25
> Data required for a normal: 23
> Specify the no. of stdevs at which to reject data:
3
> Select outputs (1=.cts,2=.ann,3=.txt,4=.stn):
3
> Check for duplicate stns after anomalising? (0=no,>0=km range)
0
> Select the generic .txt file to save (yy.mm=auto):
vap.txt
> Select the first,last years AD to save:
1901,2006
> Operating...
> NORMALS            MEAN percent      STDEV percent
> .dtb             908812    45.2
> .cts              35390     1.8     944202    47.0
> PROCESS        DECISION percent %of-chk
> no lat/lon          105     0.0     0.0
> no normal       1064261    53.0    53.0
> out-of-range         49     0.0     0.0
> accepted         944153    47.0
> Dumping years 1901-2006 to .txt files...
Well.. 47% accepted, 53% no normals.. pretty much as expected, and unlikely to improve no matter how many new CLIMAT
and MCDW updates there are. We need back data for 1961-1990.
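The 'Data required for a normal: 23' line above follows from the 30-year normals period and the 25% missing allowance. The Python sketch below reproduces that arithmetic and shows one plausible form of the 'no normal' decision. It is a guess at the logic, not the anomdtb.f90 source: the function names are hypothetical, the -9999 missing code is an assumption, and the standard-deviation rejection step is left out.

```python
import math


def years_required(start, end, missing_pct):
    """Minimum number of valid years needed to form a normal."""
    n_years = end - start + 1
    # Integer arithmetic before the division avoids float surprises.
    return math.ceil(n_years * (100 - missing_pct) / 100)


def anomalies(series, start, end, missing_pct, first_year, missing=-9999):
    """Anomalies of one station-month series against its own normal.

    series[i] is the value for year first_year + i. Returns None when
    there are too few valid years in the normals period ("no normal").
    """
    window = series[start - first_year:end - first_year + 1]
    valid = [v for v in window if v != missing]
    if len(valid) < years_required(start, end, missing_pct):
        return None
    normal = sum(valid) / len(valid)
    return [None if v == missing else v - normal for v in series]
```

A station that only starts reporting in, say, 1970 has 21 usable years in 1961-1990 and so falls into the 'no normal' bin under these settings, however many recent updates it receives.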
<BEGIN_QUOTE>
crua6[/cru/cruts/version_3_0/secondaries/vap] ./glo2abs
Welcome! This is the GLO2ABS program.
I will create a set of absolute grids from
a set of anomaly grids (in .glo format), also
a gridded version of the climatology.
Enter the path and name of the normals file: clim.6190.lan.vap
Enter a name for the gridded climatology file: clim.6190.lan.vap.grid
Enter the path and stem of the .glo files: vapglo/vap.
Enter the starting year: 1901
Enter the ending year: 2006
Enter the path (if any) for the output files: vapabs/
Now, CONCENTRATE. Addition or Percentage (A/P)? A
Do you wish to limit the output values? (Y/N): Y
1. Set minimum to zero
2. Set a single minimum and maximum
3. Set monthly minima and maxima (for wet/rd0)
Choose: 1
Right, erm.. off I jolly well go!
vap.01.1901.glo
vap.02.1901.glo
(etc)
<END_QUOTE>
and finally, create the output files:
<BEGIN_QUOTE>
crua6[/cru/cruts/version_3_0/secondaries/vap] ./mergegrids
Welcome! This is the MERGEGRIDS program.
I will create decadal and full gridded files
from the output files of (eg) glo2abs.for.
Enter a gridfile with YYYY for year and MM for month: vapabs/vap.MM.YYYY.glo.abs
Enter Start Year: 1901
Enter Start Month: 01
Enter End Year: 2006
Enter End Month: 12
Please enter a sample OUTPUT filename, replacing
start year with SSSS and end year with EEEE: cru_ts_3_00.SSSS.YYYY.vap.dat
Try again.. read instructions this time?
Please enter a sample OUTPUT filename, replacing
start year with SSSS and end year with EEEE: cru_ts_3_00.SSSS.EEEE.vap.dat
Writing cru_ts_3_00.1901.1910.vap.dat
Writing cru_ts_3_00.1911.1920.vap.dat
Writing cru_ts_3_00.1921.1930.vap.dat
Writing cru_ts_3_00.1931.1940.vap.dat
Writing cru_ts_3_00.1941.1950.vap.dat
Writing cru_ts_3_00.1951.1960.vap.dat
Writing cru_ts_3_00.1961.1970.vap.dat
Writing cru_ts_3_00.1971.1980.vap.dat
Writing cru_ts_3_00.1981.1990.vap.dat
Writing cru_ts_3_00.1991.2000.vap.dat
Writing cru_ts_3_00.2001.2006.vap.dat
<END_QUOTE>
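The decadal file names written above follow a simple pattern. As a sketch only (Python, not mergegrids itself; the function names are hypothetical), the SSSS/EEEE substitution and the split into decades could look like this, including a check that would catch the SSSS.YYYY slip made at the first prompt:

```python
def decadal_ranges(start_year, end_year):
    """Split [start_year, end_year] into spans ending on years divisible by 10."""
    ranges = []
    first = start_year
    while first <= end_year:
        # Last year of the decade containing `first` (e.g. 1901 -> 1910).
        last = min(first + 9 - (first - 1) % 10, end_year)
        ranges.append((first, last))
        first = last + 1
    return ranges


def output_names(template, start_year, end_year):
    """Fill SSSS and EEEE in the template for each decadal span."""
    if "SSSS" not in template or "EEEE" not in template:
        raise ValueError("template must contain both SSSS and EEEE")
    return [template.replace("SSSS", f"{s:04d}").replace("EEEE", f"{e:04d}")
            for s, e in decadal_ranges(start_year, end_year)]
```

For 1901 to 2006 this gives eleven names, the last covering only 2001-2006.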
Ah - and I was really hoping this time that it would just WORK. But of course not - nothing works first
time in this project. I ran crutsstats on cru_ts_3_00.1901.2006.vap.dat, and:
What?! Every year has the same min (fine, VAP of 0 is probably impossible), max (I can just about believe,
if there's a cell with no stations inside the cdd and the normal for it happens to be the highest value), and
MEAN (oh no, NO WAY!). What's odder - the .glo files are different:
Admittedly, 56 lines different out of 360 isn't hugely different. And looking, they are only slight and
infrequent differences. But the monthly stats are all cloned as well:
Well the first thing to do, after the inevitable wailing and gnashing of teeth, is to re-run glo2abs
without the 'zero minimum' flag (just in case I coded that badly, I was in a hurry):
<BEGIN_QUOTE>
crua6[/cru/cruts/version_3_0/secondaries/vap] ./glo2abs
Welcome! This is the GLO2ABS program.
I will create a set of absolute grids from
a set of anomaly grids (in .glo format), also
a gridded version of the climatology.
Enter the path and name of the normals file: clim.6190.lan.vap
Enter a name for the gridded climatology file: clim.6190.lan.vap.grid2
Enter the path and stem of the .glo files: vapglo/vap.
Enter the starting year: 1901
Enter the ending year: 2006
Enter the path (if any) for the output files: vapabs/
Now, CONCENTRATE. Addition or Percentage (A/P)? A
Do you wish to limit the output values? (Y/N): N
Right, erm.. off I jolly well go!
vap.01.1901.glo
vap.02.1901.glo
(etc)
<END_QUOTE>
<BEGIN_QUOTE>
crua6[/cru/cruts/version_3_0/secondaries/vap] ./mergegrids
Welcome! This is the MERGEGRIDS program.
I will create decadal and full gridded files
from the output files of (eg) glo2abs.for.
Enter a gridfile with YYYY for year and MM for month: vapabs/vap.MM.YYYY.glo.abs
Enter Start Year: 1901
Enter Start Month: 01
Enter End Year: 2006
Enter End Month: 12
Please enter a sample OUTPUT filename, replacing
start year with SSSS and end year with EEEE: cru_ts_3_00.SSSS.EEEE.vap.dat
Writing cru_ts_3_00.1901.1910.vap.dat
Writing cru_ts_3_00.1911.1920.vap.dat
Writing cru_ts_3_00.1921.1930.vap.dat
Writing cru_ts_3_00.1931.1940.vap.dat
Writing cru_ts_3_00.1941.1950.vap.dat
Writing cru_ts_3_00.1951.1960.vap.dat
Writing cru_ts_3_00.1961.1970.vap.dat
Writing cru_ts_3_00.1971.1980.vap.dat
Writing cru_ts_3_00.1981.1990.vap.dat
Writing cru_ts_3_00.1991.2000.vap.dat
Writing cru_ts_3_00.2001.2006.vap.dat
<END_QUOTE>
Sadly, that gave the same result. So what of the published (v2.10) VAP dataset? That looks ~ok:
Not good at all. Or, rather, good that it must be a solvable problem. Except that it's 10 to 5 on a Sunday
afternoon and it's me that's got to solve it.
Where to start? Well, retrace your steps, that's how you get out of a minefield. So first up, to compare
similar months in the anomaly files. Though I already know what I'm going to find, don't I? Because glo2abs
isn't going to do anything unusual, it just adds the normal and there you go. So if the absolutes are very
similar, the anomalies will be, too.. hmm. Well, I *suppose* I could try producing two more copies of the
output files - one with just synthetic data and one with just observed data? It's only a couple of re-runs
of the quick_interp_tdm2.pro IDL routine..
<BEGIN_QUOTE>
crua6[/cru/cruts/version_3_0/secondaries/vap/syn_only] ./glo2abs
Welcome! This is the GLO2ABS program.
I will create a set of absolute grids from
a set of anomaly grids (in .glo format), also
a gridded version of the climatology.
Enter the path and name of the normals file: ../clim.6190.lan.vap
Enter a name for the gridded climatology file: clim.6190.lan.vap.grid
Enter the path and stem of the .glo files: vapsynglo/vapsyn.
Enter the starting year: 1901
Enter the ending year: 2006
Enter the path (if any) for the output files: vapsynabs/
Now, CONCENTRATE. Addition or Percentage (A/P)? A
Do you wish to limit the output values? (Y/N): N
Right, erm.. off I jolly well go!
vapsyn.01.1901.glo
vapsyn.02.1901.glo
(etc)
crua6[/cru/cruts/version_3_0/secondaries/vap/syn_only] ./mergegrids
Welcome! This is the MERGEGRIDS program.
I will create decadal and full gridded files
from the output files of (eg) glo2abs.for.
Enter a gridfile with YYYY for year and MM for month: vapsynabs/vapsyn.MM.YYYY.glo.abs
Enter Start Year: 1901
Enter Start Month: 01
Enter End Year: 2006
Enter End Month: 12
Please enter a sample OUTPUT filename, replacing
start year with SSSS and end year with EEEE: cru_ts_3_00.SSSS.EEEE.vap.syn.dat
Writing cru_ts_3_00.1901.1910.vap.syn.dat
Writing cru_ts_3_00.1911.1920.vap.syn.dat
Writing cru_ts_3_00.1921.1930.vap.syn.dat
Writing cru_ts_3_00.1931.1940.vap.syn.dat
Writing cru_ts_3_00.1941.1950.vap.syn.dat
Writing cru_ts_3_00.1951.1960.vap.syn.dat
Writing cru_ts_3_00.1961.1970.vap.syn.dat
Writing cru_ts_3_00.1971.1980.vap.syn.dat
Writing cru_ts_3_00.1981.1990.vap.syn.dat
Writing cru_ts_3_00.1991.2000.vap.syn.dat
Writing cru_ts_3_00.2001.2006.vap.syn.dat
<END_QUOTE>
<BEGIN_QUOTE>
crua6[/cru/cruts/version_3_0/secondaries/vap/obs_only] ./glo2abs
Welcome! This is the GLO2ABS program.
I will create a set of absolute grids from
a set of anomaly grids (in .glo format), also
a gridded version of the climatology.
Enter the path and name of the normals file: ../clim.6190.lan.vap
Enter a name for the gridded climatology file: clim.6190.lan.vap.grid
Enter the path and stem of the .glo files: vapobsglo/vapobs.
Enter the starting year: 1901
Enter the ending year: 2006
Enter the path (if any) for the output files: vapobsabs/
Now, CONCENTRATE. Addition or Percentage (A/P)? A
Do you wish to limit the output values? (Y/N): N
Right, erm.. off I jolly well go!
vapobs.01.1901.glo
vapobs.02.1901.glo
(etc)
crua6[/cru/cruts/version_3_0/secondaries/vap/obs_only] ./mergegrids
Welcome! This is the MERGEGRIDS program.
I will create decadal and full gridded files
from the output files of (eg) glo2abs.for.
Enter a gridfile with YYYY for year and MM for month: vapobsabs/vapobs.MM.YYYY.glo.abs
Enter Start Year: 1901
Enter Start Month: 01
Enter End Year: 2006
Enter End Month: 12
Please enter a sample OUTPUT filename, replacing
start year with SSSS and end year with EEEE: cru_ts_3_00.SSSS.EEEE.vap.obs.dat
Writing cru_ts_3_00.1901.1910.vap.obs.dat
Writing cru_ts_3_00.1911.1920.vap.obs.dat
Writing cru_ts_3_00.1921.1930.vap.obs.dat
Writing cru_ts_3_00.1931.1940.vap.obs.dat
Writing cru_ts_3_00.1941.1950.vap.obs.dat
Writing cru_ts_3_00.1951.1960.vap.obs.dat
Writing cru_ts_3_00.1961.1970.vap.obs.dat
Writing cru_ts_3_00.1971.1980.vap.obs.dat
Writing cru_ts_3_00.1981.1990.vap.obs.dat
Writing cru_ts_3_00.1991.2000.vap.obs.dat
Writing cru_ts_3_00.2001.2006.vap.obs.dat
<END_QUOTE>
So.. how do the stats look for these two datasets?
Oh, GOD. What is going on? Are we data sparse and just looking at the climatology? How can a synthetic
dataset derived from tmp and dtr produce the same statistics as a 'real' dataset derived from observations?
Let's be logical. Here are the two 'separated' gridding runs:
Well they look fine. The synthetic run has no other data inputs ('nostn=1'), and the observed run has no references to
the synthetic data. So.. either quick_interp_tdm2.pro is doing something 'unusual', or, or.. hang on, let's try the
climatology for stats:
Ah, Bingo was his name-o! As I was hoping (well OK it's a bad kind of hope), the reason it's all the same is that it is
by and large defaulting to the climatology. Which means that not much (any?) data is getting through, no matter if we
use synthetic, observed, or both together. What's odd about that conclusion is that the synthetic data is derived from
TMP and DTR - two very well-populated datasets! So synthetics alone should pretty much fill the.. hang on, just thought
of something horrendous.. oh, okay, probably not that. I was wondering if glo2abs.for was factoring the normals so that
the anomalies were insignificant, but the equation is:
..so the anomaly is getting the weight! But still - - not a wise thing to leave to automatics. So glo2abs should prompt
the user.. but with what? Just one anomaly and normal? Several? The same one from different timesteps? Eeek. Let's look
at this actual case.
January 1961, lines 11103, 11104 in the glo file (11099, 11100 without header, putting it on about 33.5 degs N)
0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 4.7173E-04 4.7224E-03
5.4273E-03 6.1323E-03 6.8372E-03 7.5422E-03 8.2472E-03 1.9677E-03 0.0000E+00 0.0000E+00
Those anomalies are mighty tiny, given that the absolutes are three-digit integers! Hardly surprising they're not really
appearing on the radar when added to normals typically two orders of magnitude higher! Even with the *10 in the glo2abs
prog, we're still looking at values around 0.06.
Looked at the observed anomalies (output from anomdtb.f90) - here the anomalies are larger! Between -5 and +5, roughly,
which is what I'm used to seeing in .txt files.
To investigate the synthetics, I needed to look at, and re-run, vap_gts_tdm.pro. It says,
; Note that anomalies are in hPa*10 (bin) or hPa (glo)
So the binary file anomaly units - the ones we're using - are in hPa*10. Let's get one o' them synthetic glo files:
For Jan 1961 (may as well stick with it), -999 is the missing value code. The range is -0.0149 to +0.0222 (remember this is
an anomaly in hPa according to the program comment). So if it's telling the truth, the binary anomalies presented to
quick_interp_tdm2.pro will range from roughly -0.3 to +0.3. Still not going to impinge on normals between 1 and 358, is it?
So, what are the normals in? Well according to clim.6190.lan.vap:
crua6[/cru/cruts/version_3_0/secondaries/vap] head -11 clim.6190.lan.vap
Tyndall Centre grim file created on 12.01.2004 at 11:47 by Dr. Tim Mitchell
.vap = vapour pressure (hPa)
0.5deg lan clim:1961-90 MarkNew
[Long=-180.00, 180.00] [Lati= -90.00, 90.00] [Grid X,Y= 720, 360]
[Boxes= 67420] [Years=1975-1975] [Multi= 0.1000] [Missing=-999]
Grid-ref= 1, 148
291 294 296 293 287 279 265 262 271 279 286 287
Grid-ref= 1, 311
14 11 13 21 44 69 92 90 65 37 22 14
Grid-ref= 1, 312
13 10 12 20 43 67 90 87 63 35 21 13
That's what I've been missing! D'oh. That '[Multi= 0.1000]'. That would still only give a range of 0.1 to 35.8 hPa, and
my anomalies are still around 0.006 (or 0.3 for synthetics).
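A sketch of pulling the '[Key= value]' fields out of a header line like the one above, so that the Multi factor need not be remembered by hand. This is Python for illustration, not part of glo2abs.for, and the function name is hypothetical; only the bracketed layout shown in the listing is assumed.

```python
import re


def parse_grim_header_line(line):
    """Extract [Key= value] pairs from one header line, values as strings."""
    return {key.strip(): value.strip()
            for key, value in re.findall(r"\[([^=\]]+)=\s*([^\]]*)\]", line)}


fields = parse_grim_header_line(
    "[Boxes= 67420] [Years=1975-1975] [Multi= 0.1000] [Missing=-999]")
multi = float(fields["Multi"])
missing = int(fields["Missing"])

# Stored normals of 1..358 scaled into hPa.
low, high = round(1 * multi, 1), round(358 * multi, 1)
```

With the factor applied, the stored range of 1 to 358 becomes 0.1 to 35.8 hPa, as noted above.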
Two things, then. Firstly to get glo2abs to read the multiplicative factor from the climatology header and impose it on the
output. Secondly to work out why all the anomalies have different magnitudes! Or is vapour pressure really so teeny?
Working on glo2abs. Well my theory for additive anomalies is this: I read in the normals, and apply the multiplicative factor
in the header (for VAP it's 0.1). I assume the anomalies are already in the relevant units (ie require no factoring). This
looks to be the case for .txt files anyway. So I can add the anomaly to the adjusted normal. Then (because I need integer
output) I can DIVIDE by the factor (because that got us from integer to real before). Fine in theory but it all depends on
the anomalies being in regular 'units' (why wouldn't they be? They're reals!). OK, check from the beginning, obs first:
Database: hPa*10 (typically 3-digit integers)
anomdtb.for calls subroutine CheckVariSuffix, which contains:
<BEGIN_QUOTE>
else if (Suffix.EQ.".vap") then
Variable="vapour pressure (hPa)"
Factor = 0.1
<END_QUOTE>
And how does anomdtb.f90 use the Factor? well in the original version:
I *think* the factor is being used multiplicatively. I don't understand why it's being used as a divisor though.. I must
have understood last December because I managed to rewrite the 'standard deviation' section, also using it as a divisor!
One obvious thing to try is to use the revised glo2abs. That should now be working in 'units' (but saving in whatever
range the normals are in). After that I could try comparing the old and 'new' (ie modded by me) versions of anomdtb.f90
to ensure I didn't break something (sure I didn't, but still..)
So, I revised glo2abs. It now reads the 'Multi' factor from the climatology header, and applies it to the normals before
they're used.
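The additive scheme described above can be sketched as follows. This is a Python illustration of the stated theory (scale the normal into units, add the anomaly, divide by the factor to get back to integers), not the revised Fortran; the function name and the rounding rule are assumptions.

```python
def absolute_value(normal_int, anomaly, multi):
    """Integer absolute from a stored normal and an anomaly in real units.

    normal_int: normal as stored in the climatology (e.g. hPa*10 for VAP)
    anomaly:    anomaly already in units (hPa), needing no factoring
    multi:      the 'Multi' factor from the climatology header (0.1 for VAP)
    """
    normal_units = normal_int * multi          # e.g. 291 * 0.1 = 29.1 hPa
    absolute_units = normal_units + anomaly    # still in hPa
    return int(round(absolute_units / multi))  # back to the stored scale
```

For example, a normal of 291 with an anomaly of -2.3006 hPa comes out as 268, while a small normal such as 14 with the same anomaly goes negative, which is where the output limits come in.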
A sample of the outputs, vap.12.1962.glo, had a range of values from -2.3006 to +1.8388, with the majority being 0. A total
of 56387 cells were nonzero, which given that there are 67420 land cells, isn't too bad. It's a pretty gaussian distribution,
too. It still seems like a small variation (typically +/- 0.5). For the cell where I live (Norwich, 363,286), the normals are:
Well our sample December 1962 range of anomalies was -2.3006 to +1.8388, and the January range is -3.3640 to +2.1250. So, I
have to admit, that's the same order of magnitude for our particular cell, year and month(s).
So, assuming these .glo files are OK, we'll try glo2abs again:
<BEGIN_QUOTE>
crua6[/cru/cruts/version_3_0/secondaries/vap] ./glo2abs
Welcome! This is the GLO2ABS program.
I will create a set of absolute grids from
a set of anomaly grids (in .glo format), also
a gridded version of the climatology.
Enter the path and name of the normals file: clim.6190.lan.vap
Enter a name for the gridded climatology file: deleteme1
Enter the path and stem of the .glo files: vapglo/vap.
Enter the starting year: 1901
Enter the ending year: 2006
Enter the path (if any) for the output files: vapabs/
Now, CONCENTRATE. Addition or Percentage (A/P)? A
Do you wish to limit the output values? (Y/N): Y
1. Set minimum to zero
2. Set a single minimum and maximum
3. Set monthly minima and maxima (for wet/rd0)
Choose: 1
Right, erm.. off I jolly well go!
vap.01.1901.glo
vap.02.1901.glo
(etc)
<END_QUOTE>
..and the results.. look good! For (again) December 1962:
Min 0 (well I did set that, see above)
Max 315
Number of zeros: 1078, perfectly respectable although I do wonder if VAP=0 is illegal.. hmm.. OK, added an option in glo2abs:
<BEGIN_QUOTE>
crua6[/cru/cruts/version_3_0/secondaries/vap] ./glo2abs
Welcome! This is the GLO2ABS program.
I will create a set of absolute grids from
a set of anomaly grids (in .glo format), also
a gridded version of the climatology.
Enter the path and name of the normals file: clim.6190.lan.vap
Enter a name for the gridded climatology file: deleteme3
Enter the path and stem of the .glo files: vapglo/vap.
Enter the starting year: 1901
Enter the ending year: 2006
Enter the path (if any) for the output files: vapabs/
Now, CONCENTRATE. Addition or Percentage (A/P)? A
Do you wish to limit the output values? (Y/N): Y
1. Set minimum to zero
2. Set a single minimum and maximum
3. Set monthly minima and maxima (for wet/rd0)
4. Set all values >0, (ie, positive)
Choose: 4
Right, erm.. off I jolly well go!
vap.01.1901.glo
vap.02.1901.glo
(etc)
<END_QUOTE>
Result for December 1962: Min 1, Max 315. A good spread of values, without a disproportionate number of '1's, I'm pleased
to say.
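A sketch of what the limiting options in the menu might amount to, in Python rather than the Fortran of glo2abs. Option 3 (monthly minima and maxima) is omitted, and reading option 4 as 'raise anything below 1 to 1' is my interpretation of the Min 1 result, assuming integer output.

```python
def limit_value(value, option, vmin=None, vmax=None):
    """Apply one of the output limits to a single integer grid value."""
    if option == 1:   # 1. Set minimum to zero
        return max(value, 0)
    if option == 2:   # 2. Set a single minimum and maximum
        return min(max(value, vmin), vmax)
    if option == 4:   # 4. Set all values >0: anything below 1 becomes 1
        return max(value, 1)
    raise ValueError("option not covered by this sketch")
```

Under option 1 a negative result becomes 0; under option 4 both negatives and zeros become 1, so the minimum of the output moves from 0 to 1.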
So, to generate the output files. Again.
<BEGIN_QUOTE>
crua6[/cru/cruts/version_3_0/secondaries/vap] ./mergegrids
Welcome! This is the MERGEGRIDS program.
I will create decadal and full gridded files
from the output files of (eg) glo2abs.for.
Enter a gridfile with YYYY for year and MM for month: vapabs/vap.MM.YYYY.glo.abs
Enter Start Year: 1901
Enter Start Month: 01
Enter End Year: 2006
Enter End Month: 12
Please enter a sample OUTPUT filename, replacing
start year with SSSS and end year with EEEE: cru_ts_3_00.SSSS.EEEE.vap.dat
Writing cru_ts_3_00.1901.1910.vap.dat
Writing cru_ts_3_00.1911.1920.vap.dat
Writing cru_ts_3_00.1921.1930.vap.dat
Writing cru_ts_3_00.1931.1940.vap.dat
Writing cru_ts_3_00.1941.1950.vap.dat
Writing cru_ts_3_00.1951.1960.vap.dat
Writing cru_ts_3_00.1961.1970.vap.dat
Writing cru_ts_3_00.1971.1980.vap.dat
Writing cru_ts_3_00.1981.1990.vap.dat
Writing cru_ts_3_00.1991.2000.vap.dat
Writing cru_ts_3_00.2001.2006.vap.dat
<END_QUOTE>
And what of the statistics? Well by now I've realised that we don't have complete coverage! So the normals are
bound to poke through quite a bit. In fact, the story is as it was in the beginning! *cries*
Now admittedly, the 106 mean does vary.. it hits the dizzying heights of 107 on occasion! With a couple of 105s
thrown in to balance the books. Had a look at the stats in detail, compared to those for CRU TS 2.10. And guess
what? Yes.. the old stats are better! Here's the first decade:
Well, OK - I see that a VAP of zero is acceptable. Though as it's a pressure, I don't believe it! I'll stick with 1.
The issue is that the earlier dataset has a variability (in the maximum) that we just don't have in the new one. And
I feel that I've been through every bloody phase of the process and checked we're doing it right!!!
~~~
Right. Let's look at the distributions of values in each dataset. We'll take Jan 1910 and Jun 2000. And as this is
a textual document, I'll have to describe the results.
Offsets. Well each month has 360 lines, so each year has 4320 lines. So for Jan 1910 we need to skip nine years,
or 38880 lines, then take the next 360. For Jun 2000 we need to skip 99 years, or 427680 lines, then another five
months, or 1800 lines, then take the next 360. So:
head -39240 cru_ts_2.10.1901-2002.vap.dat |tail -360 > cru_ts_2.10.Jan.1910.vap.dat
head -39240 cru_ts_3.00.1901.2006.vap.dat |tail -360 > cru_ts_3.00.Jan.1910.vap.dat
head -428040 cru_ts_2.10.1901-2002.vap.dat |tail -360 > cru_ts_2.10.Jun.2000.vap.dat
head -428040 cru_ts_3_00.1901.2006.vap.dat |tail -360 > cru_ts_3_00.Jun.2000.vap.dat
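The offset arithmetic is easy to get wrong by hand, so here is a Python sketch of it. It assumes 360 rows per month, no header rows in the file, and files starting in January of their first year; the function names are hypothetical.

```python
ROWS_PER_MONTH = 360
ROWS_PER_YEAR = 12 * ROWS_PER_MONTH  # 4320


def last_row(year, month, first_year):
    """1-based number of the last row of the given month (month is 1..12)."""
    return (year - first_year) * ROWS_PER_YEAR + month * ROWS_PER_MONTH


def head_tail_command(year, month, first_year, filename):
    """Shell pipeline that extracts one month's rows from a gridded file."""
    n = last_row(year, month, first_year)
    return f"head -{n} {filename} | tail -{ROWS_PER_MONTH}"
```

By this reckoning January 1910 ends at row 39240, as used above. June 2000 ends at row 429840 (427680 + 1800 + 360), whereas 428040 is the last row of January 2000.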
I loaded the resultant monthly files into Matlab, and played with them mercilessly.
Well to start with, they all look the same. Truly. I've got a 4-plot page with TS 2.10 in the left-hand column,
and TS 3.00 on the right. January 1910 on the top, June 2000 on the bottom. And they look pretty much inseparable,
though if I had to Spot The Difference, the TS 2.10 June 2000 distribution is a little flatter (that is, the
massive spike at the low end is a little shorter, and the rest of the entourage are a little taller).
What are particularly worthy of note are the maximums. Because they don't match those produced by crutsstats.for.
Month     Model    Max (Matlab)  Max (crutsstats)
Jan 1910  TS 2.10  312           312
Jan 1910  TS 3.00  311           311
Jun 2000  TS 2.10  319           476
Jun 2000  TS 3.00  317           358
Not entirely sure why the latter ones would be wrong. But I suspect crutsstats - because otherwise I miscounted
the line numbers to extract June 2000 with! Actually, OK, that does seem more likely.
Let's try it from the 1991-2000 files. The offset will be 9*4320 + 5*360 + 360 = 41040.
Well - looks like I did miscount, because the new files are different! And so are the Maxima:
Month     Model    Max (Matlab)  Max (crutsstats)
Jun 2000  TS 2.10  300           476
Jun 2000  TS 3.00  358           358
..so almost perfect. At least the stats for the file I'm creating match.
And now the June 2000 histograms are much more interesting! And of course (for this is THIS project), much
more worrying. The June 2000 plot for the new data (3.00) shows a fall at VAP ->0. This is in contrast to the
other three, which show a more exponential decline from a high near 0 (though admittedly the 2.10 version does have a second
peak at around 120). In fact, the June 2000 3.00 series has peaks at ~90 and ~300! Oh, help.
The big question must be, why does it have so little representation in the low numbers? Especially given that I'm rounding
erroneous negatives up to 1!!
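For anyone wanting to repeat the comparison without Matlab, a rough Python sketch of the binning follows. The bin width of 10 and the -999 missing code are assumptions, and 'peak' here just means a bin fuller than both of its neighbours.

```python
from collections import Counter


def histogram(values, bin_width=10, missing=-999):
    """Count values per bin; each bin is labelled by its lower edge."""
    counts = Counter((v // bin_width) * bin_width
                     for v in values if v != missing)
    return dict(sorted(counts.items()))


def peaks(hist, bin_width=10):
    """Lower edges of bins that are fuller than both neighbouring bins."""
    return [b for b, n in hist.items()
            if n > hist.get(b - bin_width, 0)
            and n > hist.get(b + bin_width, 0)]
```

Comparing the count in the lowest bin between the 2.10 and 3.00 extracts for the same month would put a number on how much of the low end has gone missing.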
Oh, sod it. It'll do. I don't think I can justify spending any longer on a dataset, the previous version of which was
completely wrong (misnamed) and nobody noticed for five years.