Having fun with SWMPr

This post is meant to introduce you to a wonderful new R package called SWMPr, specifically designed to work with System-Wide Monitoring Program (SWMP) data.

SWMPr has a function for that!

Young Osprey at Jug Bay, MD. Credit

The Centralized Data Management Center is home to a multitude of atmospheric, water quality, and nutrient data sets from the 28 National Estuarine Research Reserve (NERR) sites, including those in the Chesapeake Bay. But anyone who has ever worked with high-frequency time series data will tell you that a lot of organization is needed to get the most out of these resources!

Lucky for me (and for you), Dr. Marcus Beck created an R package designed specifically to work with these data! SWMPr has many functions allowing for easy importing, aggregating, plotting, and so much more! It really is a great friend to all current and future users of the SWMP data!

Jug Bay is teeming with wildlife! Credit

And for anyone new to R, SWMPr has a dedicated forum called SWMPrats to help with questions. It also has an exciting Plot of the Month feature where some of the SWMPrats moderators demonstrate scripts to create fun plots! (This is a slightly shameless plug, since I am one of those moderators!)

Let’s show off a few features of the SWMPr package!

For our extreme climate work, we are looking beyond mean values! So we are interested in determining the hottest and coldest measured temperature for, let’s say, each month. We can apply the aggreswmp function to easily determine what these values were for a 5-year time series from Jug Bay (2010-2014).

#This is the outline of the aggreswmp function that you can easily customize to your needs

aggreswmp(YOUR_Timeseries, 'Unit_of_time_to_aggregate', function(x) FUNCTION_TO_APPLY(x, na.rm = TRUE), params = 'MEASUREMENT')

#Specific example with the maximum air temperature measured each month

max <- aggreswmp(mydataMET, 'months', function(x) max(x, na.rm = TRUE), params = 'atemp')

Customizations

Unit_of_time_to_aggregate could be by hours, days, weeks, months, quarters, or years

FUNCTION_TO_APPLY could include max, min, mean, sum, variance (var), etc.

MEASUREMENT includes any of the SWMP parameters including water temperature, dissolved oxygen, salinity, nutrients, wind, etc.

So, you can take these instantaneous measurements, recorded at 15-minute intervals, and easily calculate a suite of different values. Maybe you want to know the hourly mean dissolved oxygen concentration, or maybe you want to determine the annual sum of precipitation! You decide!
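To make the idea concrete without needing the package or any station data, here is a minimal base-R sketch of the same kind of monthly aggregation. It is not how aggreswmp is implemented; the data frame `toy`, its synthetic `atemp` values, and the `month` column are all assumptions made up for illustration.

```r
# Synthetic 15-minute air temperatures for two months (made-up values, not SWMP data)
set.seed(1)
stamps <- seq(as.POSIXct("2010-01-01 00:00", tz = "UTC"),
              as.POSIXct("2010-02-28 23:45", tz = "UTC"),
              by = "15 min")
toy <- data.frame(datetimestamp = stamps,
                  atemp = rnorm(length(stamps), mean = 2, sd = 5))

# Sprinkle in some missing values, as real sensor records often have
toy$atemp[sample(nrow(toy), 50)] <- NA

# Label each record with its year and month, then take the max within each month
toy$month <- format(toy$datetimestamp, "%Y-%m")
monthly_max <- aggregate(atemp ~ month, data = toy,
                         FUN = function(x) max(x, na.rm = TRUE))
print(monthly_max)
```

Swapping `max` for `min`, `mean`, or `sum` gives the other summaries, which is roughly the role the function argument plays in the aggreswmp call above.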

Sample Plot: A novice R user’s guide to pretty plots

Now for the fun: the plotting! I am the type of person who loves to visualize data rather than look at tables of numbers. These overlapping histograms are my ‘go-to’ for interpreting data.

Plot 1: An overlapping histogram of the monthly max, min, and mean temperature at Jug Bay, MD.

Below I will show you a few ways to plot an overlaying histogram. (Need the code to get to this point? Check out the July Plot of the Month in SWMPrats!)

Customization #1: Monthly Max, Min, and Mean temperature at Jug Bay, MD

#This script will get you Plot 1.

hist(max, col=rgb(1,0,0,0.5), xlim=c(-20,45), ylim=c(0,20), main="Air Temperature", breaks=10, xlab=as.expression(bquote(""~degree~"C")))

hist(min, col=rgb(0,0,1,0.5), add=TRUE, breaks=10)

hist(mean, col=rgb(.5,0,.5,0.5), add=TRUE, axes=FALSE, breaks=10)

box()

legend("topleft", c("Max","Min","Mean"), cex=1.0, bty="n", col=c(rgb(200,0,0, 100, maxColorValue=255), rgb(0,0,200, 100, maxColorValue=255), rgb(125,0,175, 100, maxColorValue=255)), pch=19)

 

This plot shows how many months (frequency) fell into each 5°C temperature bin. For example, we can see that the most frequent monthly maximum temperature was between 30 and 35°C (86-95°F), but a monthly maximum temperature >40°C (104°F) was measured twice in this 5-year time series.

Those two greatest temperatures occurred in July 2010 (106.9°F) and July 2011 (105.3°F). It is fun to look back in the records and news to see that these two months were particularly hot, breaking numerous records across Maryland!

Customization #2: Those colors just aren’t for me.

I love the rgb color specifier in R. ‘rgb’ stands for red, green, blue…the primary colors of the electronic world.

With this basic format, you can customize your color palette to your liking.

col=rgb(RED,GREEN,BLUE,TRANSPARENCY)

Plot 2: Changing up the color palette.

Each of the red, green, and blue values is a fraction between 0 and 1. So having a 1 in the RED slot and 0 in the others creates a 100% red color, .5 in red and .5 in blue gives a purple, etc.

The TRANSPARENCY, or alpha, controls how “see-through” the colors are: 1 is fully opaque and 0 is fully invisible. For an overlapping plot, we need this to be <1, or else the top histogram will hide the ones beneath it!
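As a quick way to see what rgb() actually produces, the sketch below prints the color strings for a few settings and draws two overlapping half-transparent rectangles. The variable names are mine, chosen just for this illustration.

```r
# rgb() returns a hex color string; the fourth argument is alpha
# (1 = fully opaque, 0 = fully transparent)
opaque_red <- rgb(1, 0, 0, 1)
half_red   <- rgb(1, 0, 0, 0.5)
half_blue  <- rgb(0, 0, 1, 0.5)

# With all three channels at 0 we get black, and at 1 we get white
black <- rgb(0, 0, 0)
white <- rgb(1, 1, 1)

print(c(opaque_red = opaque_red, half_red = half_red, half_blue = half_blue,
        black = black, white = white))

# Two overlapping rectangles: the overlap region shows a blend of both colors
plot.new()
plot.window(xlim = c(0, 3), ylim = c(0, 2))
rect(0, 0, 2, 2, col = half_red, border = NA)
rect(1, 0, 3, 2, col = half_blue, border = NA)
```

Changing the alpha of `half_red` and `half_blue` to 1 and re-running the rectangles shows why the overlap disappears with fully opaque colors.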

#This new plot is a green mix, with an alpha of 0.75, so the colors are less see-through (Plot 2).

hist(max, col=rgb(.5,.5,0,0.75), xlim=c(-20,45), ylim=c(0,20), main="Air Temperature", breaks=10, xlab=as.expression(bquote(""~degree~"C")))

hist(min, col=rgb(0,.5,.5,0.75), add=TRUE, breaks=10)

hist(mean, col=rgb(.25,.5,.25,0.75), add=TRUE, axes=FALSE, breaks=10)

box()

legend("topleft", c("Max","Min","Mean"), cex=1.0, bty="n", col=c(rgb(100,100,0, 100, maxColorValue=255), rgb(0,100,100, 100, maxColorValue=255), rgb(50,100,50, 100, maxColorValue=255)), pch=19)

 

Plot 3: Gray scale version.

Customization #3: I don’t have a color printer…and a 5°C range is too big!

As much as I love color, anyone who has published knows colors = $$$. So it is often desirable to have a black & white version of the image. In the rgb color specifier, setting each color to 0 results in BLACK while setting each to 1 results in WHITE.

Below is the adapted script to create a gray scale histogram with a smaller “break” size. Now instead of a 5°C temperature range, we have a 2.5°C range. Read the specifics on histograms here.
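One thing worth knowing: when `breaks` is a single number, hist() treats it only as a suggestion and picks "pretty" break points itself. If you want to be sure every histogram uses exactly 2.5°C bins, you can pass the break points as a vector instead. The sketch below does this with made-up temperatures; `monthly_max_toy`, `monthly_min_toy`, and `bins` are hypothetical names, not objects from the script in this post.

```r
# Made-up monthly temperatures standing in for the real aggregated values
set.seed(42)
monthly_max_toy <- runif(60, min = 15, max = 42)
monthly_min_toy <- runif(60, min = -18, max = 20)

# Explicit break points every 2.5 degrees, covering the full plotting range
bins <- seq(-20, 45, by = 2.5)

# Both histograms share the same bins, so the bars line up exactly
h_max <- hist(monthly_max_toy, breaks = bins, col = rgb(.3, .3, .3, 0.5),
              xlim = c(-20, 45), main = "Air Temperature (synthetic)",
              xlab = expression(degree * "C"))
h_min <- hist(monthly_min_toy, breaks = bins, col = rgb(.6, .6, .6, 0.5),
              add = TRUE)
box()
```

The data must fall inside the range of `bins`, otherwise hist() stops with an error, so pick the start and end of the sequence to cover your coldest and hottest values.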

 

hist(max, col=rgb(.3,.3,.3,0.5), xlim=c(-20,45), ylim=c(0,10), main="Air Temperature", breaks=15, xlab=as.expression(bquote(""~degree~"C")))

hist(min, col=rgb(.6,.6,.6,0.5), add=TRUE, breaks=15)

hist(mean, col=rgb(0,0,0,0.5), add=TRUE, axes=FALSE, breaks=15)

box()

legend("topleft", c("Max","Min","Mean"), cex=1.0, bty="n", col=c(rgb(30,30,30, 100, maxColorValue=255), rgb(60,60,60, 100, maxColorValue=255), rgb(90,90,90, 100, maxColorValue=255)), pch=19)

 

Plot 3 shows the same data as the two plots above, just with a finer temperature breakdown. Looking at the minimum monthly temperature, we can see that 1 month in this time series was colder than -15°C! This turns out to be from January 2014, when Jug Bay, MD recorded a low of 1.8°F, which even made it to the Washington Post as that dreaded “polar vortex.”

Plot 4: The max, min, and mean temperature for July 2010-2014. Note, I kept the break size the same for all, but since the mean had a smaller range, the bar size is smaller.

Customization #4: I only care about the extremes in July!

#subset for only July

#Go ahead and laugh! I'm a newbie to R…. but this approach will subset any data frame!

max.7 <- max$atemp[which(max$datetimestamp=="2010-07-01" | max$datetimestamp=="2011-07-01" | max$datetimestamp=="2012-07-01" | max$datetimestamp=="2013-07-01" | max$datetimestamp=="2014-07-01")]

#Now repeat with your min and mean

hist(max.7, col=rgb(1,0,0,0.5), xlim=c(10,45), ylim=c(0,3), main="July Air Temperature", xlab=as.expression(bquote(""~degree~"C")))

hist(min.7, col=rgb(0,0,1,0.5), add=TRUE)

hist(mean.7, col=rgb(.5,0,.5,0.5), add=TRUE, axes=FALSE)

box()

legend("topleft", c("Max","Min","Mean"), cex=1.0, bty="n", col=c(rgb(200,0,0, 100, maxColorValue=255), rgb(0,0,200, 100, maxColorValue=255), rgb(125,0,175, 100, maxColorValue=255)), pch=19)
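If typing out every July date gets tedious, one possible shortcut is to pull the month out of each time stamp and compare that instead. The sketch below uses a made-up data frame, `toy_max`, with one row per month as a stand-in for the aggregated output; I am assuming its `datetimestamp` column holds dates, and the `atemp` values are invented.

```r
# Hypothetical stand-in for the monthly aggregated output: one row per month
toy_max <- data.frame(
  datetimestamp = seq(as.Date("2010-01-01"), as.Date("2014-12-01"), by = "month"),
  atemp = seq(20, 39.5, length.out = 60)  # made-up values
)

# format(..., "%m") gives the two-digit month, so "07" picks out every July
is_july <- format(toy_max$datetimestamp, "%m") == "07"
max.7_toy <- toy_max$atemp[is_july]

print(toy_max[is_july, ])
```

The same line works for any month and any number of years, so the subset does not need editing when the time series grows.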

 

A correlation matrix of the annual-based extreme climate indices for the HadEX2. Tune in next week for the discussion!

Are you still with me?

So there you go! SWMPr is a fun and helpful way to visualize and work with time series data! I encourage anyone interested to check out the SWMPrats.net site!

But for next week... TEASER

We’ll get into the correlations between our extreme climate indices and the multidecadal and decadal oscillation data sets!

Kari Pohl

About Kari Pohl

I am a post-doctoral researcher at NOAA and the University of Maryland (Center for Environmental Science at Horn Point Laboratory). My work investigates how climate variability and extremes affect the diverse ecosystems in Chesapeake Bay. I received a Ph.D. in oceanography from the University of Rhode Island (2014) and received a B.S. in Environmental Science and a B.A. in Chemistry from Roger Williams University (2009). When I am not busy being a scientist, my hobbies include running, watching (and often yelling at) the Boston Bruins, and taking photos of my cat.

One Response to Having fun with SWMPr

  1. Victoria Coles says:

    Hey! I never knew how to do transparency! Glad to have learned this!
