The Netherlands Hydrological Society (NHV) has several working groups. One of them works on Time Series Analysis and organises several public meetings throughout the year. On 28 January 2016, hosted by TNO in Utrecht, almost 100 experts gathered to discuss the latest findings and developments in time series analysis, bridging statistical expertise with practical applications. Presentations from business practitioners alternated with presentations from scientific experts. Michael van der Valk reports.
It is not that the statistical techniques themselves are state-of-the-art: many of them have been in use since Box and Jenkins opened them up to a general audience in 1970. What is relatively innovative is the way these statistical techniques are being used in Dutch water management practice: in drinking water supply, in land management, in groundwater management. Much experience has been gained with uncertainties, but there are no clear definitions of when time series models can be used, nor of how exact their outcomes are. In addition, the benefits of different techniques are not always well known. For example, Box and Jenkins assume a discrete dataset, whereas newer statistical software implementations offer analysis with continuous response functions. Is one technique better than the other? If so, why, and in which cases?
Statistical analysis always assumes that certain conditions are met for the analysis to be valid. Do hydrologists always adhere to these preconditions, and, if not, how wrong is that? Can something still be said about a water system even if not all criteria for a proper model are met? One of the conditions for a good model, for example, is that the residuals are white noise: successive noise values should be independent, i.e. their autocorrelation should be zero. If this is not the case, the uncertainties derived from the model are not correct.
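As a minimal sketch of such a whiteness check (assuming Python with NumPy; the residual series below are entirely synthetic), one can compare the lag-1 autocorrelation of a residual series against the approximate 95% bound of ±1.96/√n that holds under independence:

```python
import numpy as np

def lag1_autocorrelation(residuals):
    """Lag-1 autocorrelation of a (mean-centred) residual series."""
    r = np.asarray(residuals, dtype=float)
    r = r - r.mean()
    return np.dot(r[:-1], r[1:]) / np.dot(r, r)

rng = np.random.default_rng(42)
n = 500

# Truly independent (white) noise.
white = rng.normal(size=n)
rho_white = lag1_autocorrelation(white)

# AR(1) noise: each value partly carries over the previous one,
# so successive values are clearly dependent.
correlated = np.empty(n)
correlated[0] = rng.normal()
for t in range(1, n):
    correlated[t] = 0.8 * correlated[t - 1] + rng.normal()
rho_corr = lag1_autocorrelation(correlated)

bound = 1.96 / np.sqrt(n)  # approximate 95% bound if the noise is white
print(f"white noise:      rho1 = {rho_white:+.3f}, bound = +/-{bound:.3f}")
print(f"correlated noise: rho1 = {rho_corr:+.3f}")
```

A residual series whose lag-1 autocorrelation falls well outside the bound, as the AR(1) example does, signals that the model's uncertainty estimates should not be trusted.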
White noise can also be used in multiple time series analysis to simulate missing values in data gaps of piezometric head series and to better detect measurement errors. Another topic discussed was what to do with time series that do not have equal time steps: interpolation is then needed, so the best approach is to measure at equidistant time intervals. Adding parameters often leads to a better fit of the model to the data, but not necessarily to a better model: chances are that the model will attempt to explain the white noise.
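A minimal sketch of that interpolation step, assuming Python with NumPy and entirely hypothetical head observations, is linear resampling onto an equidistant daily grid:

```python
import numpy as np

# Irregularly timed groundwater-head observations (hypothetical values):
# times in days since the start of monitoring, heads in metres.
t_obs = np.array([0.0, 1.0, 3.5, 4.0, 7.0, 11.0])
h_obs = np.array([2.10, 2.05, 1.90, 1.88, 1.95, 2.00])

# Resample onto an equidistant daily grid by linear interpolation.
t_grid = np.arange(0.0, 11.0 + 1.0, 1.0)
h_grid = np.interp(t_grid, t_obs, h_obs)

for t, h in zip(t_grid, h_grid):
    print(f"day {t:4.1f}: head {h:.3f} m")
```

Linear interpolation is only one choice; whichever method is used, the interpolated values carry extra uncertainty that the subsequent model fit does not see.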
There was some discussion on the correct (or shortest) measurement period needed in order to say something sensible about a groundwater system. Rather than just shouting numbers ("6 months", "30 years"): the required measurement frequency depends not simply on the frequency of the process, but, or so I would say, on the frequency of the process that one is interested in. If a system is stable, one measurement value could be sufficient to define the required system characteristic. If a system is highly volatile, we might need many data in a short time frame, but perhaps also only a short measurement period. The real point is that one does not know beforehand what the system looks like, nor which system characteristics one might need in, say, 30 years from now.
A solution presented for the problem of overparameterisation in the case of high-frequency measurements was the deletion of data. This sounds a bit odd: how does one choose which data should be deleted? Following this discussion, a measurement frequency of once every 5 minutes was presented for groundwater and discharge values, enabling water managers to better estimate the capacity of, and seasonal effects on, a water source. The gain here was that well discharge no longer needed to be measured manually, and that fewer wells were needed.
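The "deletion of data" can be as simple as keeping every k-th sample. A sketch with synthetic 5-minute head data (all values hypothetical), thinned to hourly resolution:

```python
import numpy as np

rng = np.random.default_rng(0)

# One day of hypothetical 5-minute groundwater heads: a slow daily cycle
# plus small measurement noise (288 samples per day).
t = np.arange(288) * 5.0 / 1440.0  # time in days
heads = 2.0 + 0.05 * np.sin(2 * np.pi * t) + rng.normal(scale=0.002, size=t.size)

# Thin to hourly values by keeping every 12th sample (12 x 5 min = 1 h).
step = 12
t_hourly = t[::step]
heads_hourly = heads[::step]

print(f"{t.size} samples thinned to {t_hourly.size}")
```

Such regular thinning at least makes the choice of which data to drop reproducible, though it still discards information about the high-frequency behaviour.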
Although the topic of the day was time series analysis, the discussions made clear that a thorough understanding of the subsurface is required to use the available statistical tools. A mathematically good model is not necessarily a good model in real life. But, interestingly, a model with a less than ideal statistical fit can still give a proper representation of a geohydrological system (?), and be effective for the aim of the research.
It was shown that in one case the effect of pumping (drawdown) was much larger on one side of the pumping station than on the other. This could be explained with geological information. But how does one separate the groundwater head signal into a withdrawal signal and a groundwater recharge signal? According to one speaker this is only possible when the variance of one signal is about 10 times that of the other. This, of course, is also an arbitrary measure.
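The quoted rule of thumb can be illustrated with two synthetic signal components; the standard deviations below are chosen purely for illustration and do not come from the presented case:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

# Two hypothetical head components: a strong withdrawal signal and a
# much weaker recharge signal (illustrative standard deviations).
withdrawal = rng.normal(scale=1.0, size=n)
recharge = rng.normal(scale=0.3, size=n)

ratio = withdrawal.var() / recharge.var()
print(f"variance ratio: {ratio:.1f}")
# By the rule of thumb quoted at the meeting, a ratio of about 10
# would be needed before the two signals can be told apart.
```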
On 8 June 2004 the NHV working group on Time Series Analysis held its 'kick-off' with a full day of discussions. It can be concluded that 12 years later we have learned a lot, but also that the questions have remained the same. There is still a gap in understanding between those who manage water and those who measure water – a common language is still missing. Both groups agree that time series analysis is a valuable tool, while at the same time theoretical knowledge of statistics and geohydrology is a necessity in order to properly use the available tools.
In this respect the pattern (the time series of the time series analysis discussion, so to speak), I am inclined to conclude, is similar to that for the use of isotopes in hydrology: very powerful tools, for those who can deal with them.
Michael van der Valk