
Observed Runoff Data May Include NaN Values

Overview

Unlike other input data, the observed runoff values presented to VELMA may contain NaN values. The code within VELMA that computes Nash-Sutcliffe Efficiency ("NSE") and Root Mean Square Error ("RMSE") statistics for runoff treats NaN as a "no data" value. Any (simulated, observed) data-pair where one or both values equal NaN is "rejected" (i.e. ignored) by the NSE and RMSE calculators.
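The rejection rule can be sketched in a few lines of Python. This is an illustration only, not VELMA's code: the function name paired_stats is hypothetical, and the sketch assumes the textbook NSE and RMSE formulas, with the observed mean taken over the retained pairs only.

```python
import math

def paired_stats(simulated, observed):
    """Return (nse, rmse, used, rejected), skipping any pair that contains NaN.

    Hypothetical helper for illustration; not part of VELMA.
    """
    if len(simulated) != len(observed):
        raise ValueError("simulated and observed must have the same length")

    # Keep only the pairs where both values are real numbers.
    pairs = [(s, o) for s, o in zip(simulated, observed)
             if not (math.isnan(s) or math.isnan(o))]
    rejected = len(simulated) - len(pairs)
    if not pairs:
        # Nothing to compare: both statistics are undefined.
        return math.nan, math.nan, 0, rejected

    obs_mean = sum(o for _, o in pairs) / len(pairs)
    sq_err = sum((s - o) ** 2 for s, o in pairs)
    obs_var = sum((o - obs_mean) ** 2 for _, o in pairs)

    # NSE is undefined when the retained observations are all identical.
    nse = 1.0 - sq_err / obs_var if obs_var > 0 else math.nan
    rmse = math.sqrt(sq_err / len(pairs))
    return nse, rmse, len(pairs), rejected


nse, rmse, used, rejected = paired_stats(
    [1.0, 2.0, 3.0, 4.0],
    [1.0, math.nan, 3.0, 5.0],
)
print(f"NSE={nse:.3f} RMSE={rmse:.3f} used={used} rejected={rejected}")
```

Returning the used and rejected counts alongside the statistics mirrors the idea that a score should always be read together with the number of pairs behind it.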

Like other driver data files, any observed runoff file presented to a VELMA simulation must have a data value for each Julian day from the first day of the simulation's forcing_start year through the last day of its forcing_end year, inclusive.

Example
Suppose a simulation configuration has forcing_start=1991 and forcing_end=1993.
Driver data files for that simulation must contain (1993 - 1991 + 1) * 365 = 1095 values.
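The same arithmetic can be written as a small check to run against a file before starting a simulation. This is a sketch under stated assumptions: expected_value_count is a hypothetical helper, and the 365-day year is taken directly from the example above rather than from VELMA's source.

```python
def expected_value_count(forcing_start, forcing_end, days_per_year=365):
    """Number of daily values a driver file needs for the forcing span.

    Hypothetical helper; days_per_year=365 follows the worked example above.
    """
    if forcing_end < forcing_start:
        raise ValueError("forcing_end must not precede forcing_start")
    return (forcing_end - forcing_start + 1) * days_per_year


count = expected_value_count(1991, 1993)
print(count)  # 1095, matching the example above
```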

Unlike other driver data files, observed runoff files are allowed to have NaN as the value for a given step.
Explicitly marking an observed runoff value as missing (via NaN) allows VELMA to calculate NSE and RMSE values for your simulation run's runoff even when you don't have observed data for every step of the [forcing_start, forcing_end] span. The NSE and RMSE values are calculated with however many data-pairs are available. VELMA reports how many observed values were used, but not how they are distributed across the simulated values. It is up to you to know that distribution and to decide whether it is a problem.

Example
Suppose the observed runoff file for our forcing_start=1991 and forcing_end=1993 simulation configuration has NaN values for all of the steps (days) in year 1992.
Further suppose that the simulation is run for year 1992: the NSE for this run will be NaN, and VELMA results will report it as such along with the fact that 365 of 365 elements (obs,sim data-pairs) were rejected.
Now suppose the same simulation configuration is run again, but for 1991 through 1993, and that the NSE for this run computes to 0.75. You must be aware -- apart from anything VELMA includes in its results -- that the 730 of 1095 data-pairs used to compute the NSE (and RMSE) completely exclude the middle year of the simulation run.
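One way to see the distribution for yourself is to count the usable data-pairs per year before interpreting the statistics. The sketch below is not part of VELMA: usable_pairs_by_year is a hypothetical helper, and it assumes 365 values per year, as in the examples above.

```python
import math

def usable_pairs_by_year(observed, simulated, forcing_start, days_per_year=365):
    """Count (obs, sim) pairs without NaN for each year of the span.

    Hypothetical helper; assumes the first value belongs to day 1 of
    forcing_start and that every year holds days_per_year values.
    """
    counts = {}
    for index, (obs, sim) in enumerate(zip(observed, simulated)):
        year = forcing_start + index // days_per_year
        counts.setdefault(year, 0)  # list the year even if nothing is usable
        if not (math.isnan(obs) or math.isnan(sim)):
            counts[year] += 1
    return counts


# Observations exist for 1991 and 1993 but are all NaN for 1992.
observed = [1.0] * 365 + [math.nan] * 365 + [1.0] * 365
simulated = [1.0] * 1095
for year, used in usable_pairs_by_year(observed, simulated, 1991).items():
    print(year, used)
```

A year that shows zero usable pairs contributes nothing to the NSE or RMSE, no matter how good the overall score looks.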

Specific Notes

Here are a few lines from a single-field observed runoff file:

11.5121
NaN
24.80
26.2279

When the file has a single field, that field contains the observed runoff value for one simulation step. Implicitly, the first value in the file is the value for the first day of the forcing_start year, the second value is for the second day, and so on.

Here are a few lines from a triple-field version of the same data:

1991,1,11.5121
1991,2,NaN
1991,3,24.80
1991,4,26.2279

When the file has three fields, the first two fields are interpreted as a year and jday.
However, the first value in the file is still the value for the first day of the forcing_start year!
The year and jday are permitted, but VELMA ignores them. They are permitted to make the file easier for humans to read and verify.
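The one-field and three-field layouts can both be read with the same positional logic. The sketch below is not VELMA's reader: read_observed_runoff is a hypothetical helper, the comma delimiter is taken from the sample lines above, and skipping blank lines is an assumption made for convenience.

```python
import math

def read_observed_runoff(lines):
    """Return runoff values in file order from one- or three-field lines.

    Hypothetical helper. In the three-field form the year and jday are
    skipped, so only a value's position in the file determines its day.
    """
    values = []
    for line_number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # assumption: blank lines carry no data
        fields = [field.strip() for field in text.split(",")]
        if len(fields) not in (1, 3):
            raise ValueError(
                f"line {line_number}: expected 1 or 3 fields, got {len(fields)}"
            )
        # float() accepts the text "NaN" and returns a NaN value.
        values.append(float(fields[-1]))
    return values


single = ["11.5121", "NaN", "24.80", "26.2279"]
triple = ["1991,1,11.5121", "1991,2,NaN", "1991,3,24.80", "1991,4,26.2279"]

from_single = read_observed_runoff(single)
from_triple = read_observed_runoff(triple)
print(from_single[0] == from_triple[0], math.isnan(from_triple[1]))
```

Because the year and jday are never consulted, a three-field file whose labels are wrong or out of order would still be read positionally, which is why those columns are best treated as an aid for human checking only.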