August POTM: time series decomposition with SWMPr

11 months 2 weeks ago - 11 months 2 weeks ago #103 by Marcus Beck
Time series decomposition sounds like a daunting topic, but most methods are conceptually simple and readily available in R. This plot of the month gives an overview of two functions in SWMPr for time series decomposition: decomp and decomp_cj.

Why do we want to decompose time series?

Any measurement collected over time is a series that can be separated, or decomposed, into different components. The observations in a raw time series are affected by processes that occur at different temporal scales. Depending on our question, we might be interested in variation that occurs at one time scale but not another. Interpreting the effect of one component from a raw time series that represents the sum or aggregate effects of multiple processes can be challenging. For example, maybe we want to evaluate whether algal blooms are increasing or decreasing each year at a reserve. We know that phytoplankton can follow distinct seasonal patterns that might be independent of any annual trends. Annual variation could be affected by differences in nutrient inputs into the system (rainfall, management actions, etc.), whereas seasonal variation is more affected by cyclical temperature and light changes throughout each year. How can we evaluate the patterns we care about while reducing or isolating the effect of variation from another process?

The SWMPr package provides two functions to separate raw time series into unique components. Both methods use an additive or multiplicative decomposition to describe raw data as separate components for the trend, annual, seasonal, and 'unexplained' variation. More about this later.

The decomp function:

This function is a simple wrapper to the decompose function in the stats package. It is provided in SWMPr to work with 'swmpr' class objects that are returned by the data import functions. The decompose function separates a time series into additive or multiplicative components describing a trend, cyclical variation (e.g., daily or annual), and the remainder. The additive decomposition assumes that the cyclical component of the time series is stationary (i.e., the variance is constant and the seasonal pattern does not change depending on the year), whereas a multiplicative decomposition accounts for non-stationarity.

The decompose function requires a ts object with a specified frequency. This conversion is handled internally by decomp. An explicit input is still required that defines the frequency, i.e., the number of observations in the time series needed to complete one full period of the parameter. For example, the frequency of a parameter with diurnal periodicity would be 96 if the time step is 15 minutes (24 hours * 60 minutes / 15 minutes). The frequency of a parameter with annual periodicity at a 15 minute time step would be 35040 (365 days * 24 hours * 60 minutes / 15 minutes). For simplicity, the character strings 'daily' or 'annual' can be supplied in place of numeric values.
# load SWMPr
library(SWMPr)
# get data
data(apadbwq)
dat <- apadbwq
# subset for daily decomposition (one month of 15 minute data)
dat <- subset(dat, subset = c('2013-07-01 00:00', '2013-07-31 00:00'))
# daily decomposition of DO and plot
dc_dat <- decomp(dat, param = 'do_mgl', frequency = 'daily')
plot(dc_dat)
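To make the frequency arithmetic concrete, here is a minimal sketch in base R that does by hand what decomp is described as doing internally: build a ts object with the right frequency and pass it to decompose. The series is made-up synthetic data, not SWMP data, and the object names are only for illustration.

```r
# observations needed to complete one period at a 15 minute time step
step_min <- 15
freq_daily <- 24 * 60 / step_min
freq_annual <- 365 * 24 * 60 / step_min

# synthetic series (assumption): 10 days of a diurnal cycle plus noise
set.seed(1)
n <- freq_daily * 10
x <- 6 + 2 * sin(2 * pi * seq_len(n) / freq_daily) + rnorm(n, sd = 0.3)

# decompose() needs a ts object with the frequency set
x_ts <- ts(x, frequency = freq_daily)
dc <- decompose(x_ts, type = "additive")

# one estimate per position in the daily cycle
length(dc$figure)
plot(dc)
```

The figure element holds the estimated cyclical component for a single period, so its length matches the frequency supplied.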

The decomp_cj function:

This function uses the decompTs function in the wq package and is similar to decomp with a few key differences. First, the decomp function decomposes the time series into trend, seasonal, and random components, whereas decomp_cj decomposes it into grandmean, annual, seasonal, and events components. For both functions, the random or events components, respectively, can be considered anomalies that don't follow the trends in the remaining categories.

Second, the decomp_cj function provides only a monthly decomposition, which is appropriate for characterizing relatively long-term trends. This approach works well for nutrient data that are typically sampled on a monthly cycle. The function will also work with continuous water quality or weather data but note that the data are first aggregated on the monthly scale within the function before decomposition. Additional arguments passed to decompTs can be used with decomp_cj, such as startyr, endyr, and type. Values passed to type are mult (default) or add, referring to multiplicative or additive decomposition.
## get data
dat <- apacpnut
dat <- qaqc(dat, qaqc_keep = NULL)
## decomposition of chl, ggplot
decomp_cj(dat, param = 'chla_n')
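For continuous data, the monthly aggregation step can be pictured with a short base R sketch. This is not the code inside decomp_cj: the data are synthetic, the variable names are made up, and the use of a monthly mean is an assumption for illustration only.

```r
# synthetic 15 minute record spanning two years (assumption, not SWMP data)
set.seed(7)
datetimestamp <- seq(as.POSIXct("2012-01-01 00:00", tz = "UTC"),
                     as.POSIXct("2013-12-31 23:45", tz = "UTC"),
                     by = "15 min")
doy <- as.numeric(format(datetimestamp, "%j"))
temp <- 20 + 8 * sin(2 * pi * (doy - 100) / 365) + rnorm(length(datetimestamp))

# collapse to one value per calendar month (mean chosen for illustration)
yrmo <- format(datetimestamp, "%Y-%m")
monthly <- tapply(temp, yrmo, mean)

# monthly ts object of the kind a monthly decomposition works on
monthly_ts <- ts(as.numeric(monthly), start = c(2012, 1), frequency = 12)
monthly_ts
```

After aggregation, two years of 15 minute observations reduce to 24 values, which is why this approach suits long-term questions rather than daily ones.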

So what do additive and multiplicative mean? Additive means that the original time series can be recreated as the sum of the decomposed components. Similarly, multiplicative means that the original can be recreated as the product of the components. When would you use one over the other? As noted above, the multiplicative option for decomp should be used if you expect the seasonal pattern to vary between years (i.e., it is non-stationary). Further, the documentation for the decompTs function states that multiplicative decomposition works well for data in log-space, whereas additive decomposition works well for data with a more normal distribution.
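The sum and product relationships can be checked directly with the base R decompose function. The sketch below uses a made-up monthly series whose seasonal swing grows with the overall level. It illustrates the idea only and does not call decomp or decomp_cj.

```r
# synthetic monthly series (assumption): rising level, seasonal cycle that scales with it
set.seed(42)
n <- 72
level <- seq(10, 20, length.out = n)
season <- 1 + 0.3 * sin(2 * pi * seq_len(n) / 12)
x <- ts(level * season * exp(rnorm(n, sd = 0.05)), frequency = 12)

add <- decompose(x, type = "additive")
mult <- decompose(x, type = "multiplicative")

# rebuild the series from its components
x_add <- as.numeric(add$trend + add$seasonal + add$random)
x_mult <- as.numeric(mult$trend * mult$seasonal * mult$random)

# the moving-average trend is NA at both ends, so compare the rest
ok <- !is.na(x_add)
all.equal(x_add[ok], as.numeric(x)[ok])
all.equal(x_mult[ok], as.numeric(x)[ok])
```

Both reconstructions match the original wherever the trend is defined. The choice between the two is about which set of components is easier to interpret, not about which one fits.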

A word of caution: just because a time series can be decomposed into nominally different components does not mean that the results are sensible. One could easily estimate a seasonal component from a time series of random noise. Although an empirical estimate is possible, we know that a seasonal component is not actually present in random noise. As with most analysis methods, appropriate use depends on the extent to which the data follow prior assumptions. In other words, don't estimate a seasonal component if there is no reason to believe a seasonal component exists. Fortunately, the decomposition functions in SWMPr were developed with water quality data in mind, so application to most of the data collected by SWMP is appropriate.
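The random noise case is easy to try. This sketch uses base R decompose on simulated white noise that is treated, purely for illustration, as ten years of monthly data.

```r
# white noise labelled as 10 years of monthly data (simulated)
set.seed(123)
noise <- ts(rnorm(120), frequency = 12)

dc_noise <- decompose(noise)

# a seasonal 'pattern' is returned even though none exists
round(dc_noise$figure, 2)

# share of the raw variance attributed to that spurious seasonal estimate
var(as.numeric(dc_noise$seasonal)) / var(as.numeric(noise))
```

The function returns twelve monthly estimates without complaint, so it is up to the analyst to decide whether a seasonal component is plausible for the data.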


Last Edit: 11 months 2 weeks ago by Marcus Beck.
