This Plot of the Month will show you how to make a grouped and stacked bar chart using 11 years of precipitation data from the CBNERR site at Jug Bay, Maryland.
But first, let’s talk about getting the data ready for our plots.
Part of my work involves looking at how many days each year had a certain amount of precipitation fall. For example, how many days in 2005 did more than 20 mm of liquid precipitation fall?
We can use SWMPr and a simple loop to get our data from 15 minute millimeter data to days per year.
#Step 1: Load in your data
library(SWMPr)
mypath <- 'C:/Users/Kari/Documents/RData/OPC'
#Import Meteorological Data previously downloaded from Jug Bay
mydataMET <- import_local(mypath, 'CBMJBMET', trace = T)
#Step 2: Aggregate total precipitation into daily sum
Daily_PRCP<-aggreswmp(mydataMET, 'days', function(x) sum(x, na.rm = T),
params = 'totprcp')
For our plots, we will look at 3 types of operationally defined rainy days.
Wet Days are when more than 10 mm of liquid precipitation fell.
Very Wet Days is more than 20 mm.
Extremely Wet Days is more than 30 mm.
The code below is a very simple loop and filter.
#Step 3: Define our rainy days
#creates a new empty vector for our new data
Wet_Days = NULL
Very_Wet_Days = NULL
Extremely_Wet_Days = NULL
#format date
DATE<-as.Date(Daily_PRCP$datetimestamp, format="%Y-%m-%d %H:%M:%S", tz="UTC")
tmpyear <- as.numeric(format((DATE),"%Y"))
#begin loop
#note: change the iyear range to suit your time series
i=1
for (iyear in 2004:2014 ) {
#sets start of each year
Wet_Days[i] <- length(Daily_PRCP$totprcp[tmpyear==iyear & Daily_PRCP$totprcp > 10])
Very_Wet_Days[i] <- length(Daily_PRCP$totprcp[tmpyear==iyear & Daily_PRCP$totprcp > 20])
Extremely_Wet_Days[i] <- length(Daily_PRCP$totprcp[tmpyear==iyear & Daily_PRCP$totprcp > 30])
i = i+1
}
You can use this script to define your own thresholds of daily precipitation or another variable. Maybe you only care about huge 100 mm rain events. Or maybe you only care about water temperatures above an organism's thermal tolerance. Endless possibilities!
#Step 4: Finally the Plot(s)
#First create a matrix
#we want it to have 3 columns, each with 11 rows (11 years of data)
Rainy_Days <-matrix(c(Wet_Days, Very_Wet_Days,Extremely_Wet_Days),
nrow=11, ncol=3)
#creates labels for your x-axis
Years<-c(2004:2014)
#Grouped Multi-bar chart
barplot(t(Rainy_Days), #this transposes the matrix
main="Rainy Days", ylab= "Days", #label you title and y-axis
beside=TRUE, col=heat.colors(3), #pick your colors!
names.arg=(Years)) #label your x-axis
legend("topright", c("10mm","20mm","30mm"),
cex=1.5, bty="n", fill=heat.colors(3))
#Gray Scale Version (Cheaper for Printing!)
#In R, 0 is black and 1 is white, so fractions are different grays
barplot(t(Rainy_Days), main="Rainy Days", ylab= "Days",
beside=TRUE, col=gray.colors(3, start = 0.2, end = 0.8),
names.arg=(Years))
legend("topright", c("10mm","20mm","30mm"), cex=1.5,
bty="n", fill=gray.colors(3, start = 0.2, end = 0.8))
#Stacked Bar Chart
barplot((Rainy_Days), #no longer a transposed matrix
main="Rainy Days", ylab= "Days",
names.arg=c("10mm", "20mm", "30mm"), #labels each stacked histogram
col=topo.colors(11), #the color palette of 11 repeating colors
ylim=c(0,325)) #sets bounds of y-axis
legend("topright", c("2004","2005","2006","2007","2008","2009",
"2010","2011","2012","2013","2014"), #your legend
bty="n", fill=topo.colors(11),horiz=F, ncol=3,cex=1.25)
I picked these color palettes as examples since I find them pretty, but also to avoid using red/green color combinations, which make data interpretation difficult for anyone with colorblindness.
Lastly, as a technical note, those pesky NA’s can affect our analysis. In this example, all NA’s will be interpreted as days without rainfall, which could bias the results. For my work, I only use years with <10% of days missing (standard for this analysis).
The code below will allow you to check how many days each year are missing and whether you should consider throwing out a year!
#Step 5: A note about NAs
#This time when we aggregate data, do not pass NA's (na.rm=F)
Daily_PRCP<-aggreswmp(mydataMET, 'days', function(x) sum(x, na.rm = F),
params = 'totprcp')
#Are there any NA's. Spoiler, yes!
sum(is.na(Daily_PRCP))
#where are these NAs?
Daily_PRCP$datetimestamp[is.na(Daily_PRCP$totprcp)]
#format our date
DATE<-as.Date(Daily_PRCP$datetimestamp, format="%Y-%m-%d %H:%M:%S", tz="UTC")
tmpyear <- as.numeric(format((DATE),"%Y"))
#loop through each year to determine what percentage of the year is missing
#to make it easy, I used 325.25 as a way to "avoid" leap years
#create a new empty vector for our missing days
Missing_Days = NULL
#begin loop
i=1
for (iyear in 2004:2014 ) {
#sets start of each year
Missing_Days[i] <- sum(is.na(Daily_PRCP)[tmpyear==iyear])/365.25
i = i+1
}
print(Missing_Days)
Missing<-zoo((Missing_Days*100),(c(2004:2014)))
> Missing
2004 2005 2006 2007 2008 2009 2010
25.4620123 14.2368241 2.4640657 7.6659822 1.9164956 0.0000000 0.2737851
2011 2012 2013 2014
4.1067762 0.0000000 0.0000000 0.2737851
It looks like 2004 and 2005 had some missing data, which could affect our analysis! I would either throw these years out or fill them by extrapolation.
This may not be the best or most efficient way to look at precipitation exceedances but practice makes perfect! These bar charts can be adapted to your own liking!
Hope it helps!
-Kari