Pre-claimer: This post is different! Victoria and I have decided to start an open conversation with you, the reader and hopeful future user of these climate data! Our goal is to get you thinking and talking about how we could best serve and organize all the data we have compiled for this project so far!
Daaaata… Daaaata… Data… Data… Data…Data…Data…Dataaaaaaaa
(apologies to Pink Panther)
Victoria: One of the first things you learn in graduate school is that the word data is plural (singular=datum). It’s the hallmark of a newbie to say “the data is noisy”, when we all know that “the data are noisy”. Which doesn’t mean that the data speak loudly. The second thing you learn is that the data are always noisy!
In our project, we are not generating any new data. We’re corralling data, wrassling it into submission, branding it, saddling it, crossbreeding it, and now… we want to set it free. That is to say, even though we haven’t collected these data ourselves, we want the work we’ve done to collate and analyze our climate data sets to be available to other Chesapeake Bay researchers. In the language of marketing, this is an ancillary value-added product.
Kari’s hard work has resulted in a bunch of timeseries specific to the Chesapeake region: HadEx2, GHCNDex, and 18 local weather stations (20 sources in all), each with 26 climate extreme indices (20×26 = 520 timeseries). Plus she’s collected timeseries from the NOAA NERR SWMP dataset (multiple locations) and ancillary data from river discharge (USGS) and tide gauges (NOAA). Plus I’ve assembled extreme event indices from 8 climate models (8×26 = 208 timeseries). So we have a lot of timeseries that we’d like to make available to other researchers who might find them useful.
But it’s not just the extreme climate timeseries…
Kari: A major part of this project has been working with the data. It seems intuitive, but a lot of time went into calculating these extreme climate indices, as well as applying certain statistical tests and smoothing techniques.
All of these calculations and data analyses were conducted using R, a free and open source computing language! So, I have generated a lot of helpful R code which could be useful to anyone who wants to understand how we got our extreme climate values or maybe to repeat this analysis in another region. Although I am a novice at R scripting, no one should have to repeat what I did! (Improve it definitely, but redo, definitely not!)
A master online repository, then, gets even more added value by not only storing and archiving our gathered data, but also compiling this R code!
But, I have to admit, I have been finding it hard to visualize how to store, display, and highlight this vast wealth of data in a way that would be useful for all future users! The amount of data is overwhelming!
Victoria: I agree, it’s a lot of data. And it’s challenging to decide how to make it available in the easiest possible format for other scientists and managers to use. Almost everyone can work in Excel, or can read Excel files into another program, so Excel-readable files sound like a good starting point.
Fortunately, even for the timeseries that are available monthly, the amount of data is manageable in a .csv file that Excel can open. I’m hoping that we can develop a website that would provide an overview figure of each index and also the files for download.
Kari: At the moment (yesterday that is), I have converted all the annual extreme climate indices (calculated from huge daily time series) into .csv files. But as I started to do the same for the monthly indices, I ran into an organizational dilemma!
Each monthly index has 12 time series (one for each month) for each of our 20 data sources. So, these data could be archived as 12 separate .csv files, one per month (with 20 columns each), or as one master .csv file with 240 columns! Which format is more useful?
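To make the two options concrete, here is a minimal sketch of both layouts for a single monthly index. It is written in Python purely for illustration (our own analysis was done in R), and everything in it is an assumption: the station labels, the `Jan_HadEx2`-style column names, the three placeholder years, and the values themselves, which are made-up numbers and not real index values.

```python
import csv
import io

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Hypothetical labels: 2 gridded products + 18 stations = 20 sources.
SOURCES = ["HadEx2", "GHCNDex"] + [f"station{i:02d}" for i in range(1, 19)]

YEARS = [2001, 2002, 2003]  # placeholder years, not the real record length


def placeholder(year, month, source):
    """Return a made-up number standing in for a real index value."""
    return float(year % 100 + MONTHS.index(month) + SOURCES.index(source))


def per_month_tables():
    """Option A: 12 tables, each with a year column plus 20 source columns."""
    tables = {}
    for month in MONTHS:
        header = ["year"] + SOURCES
        rows = [[year] + [placeholder(year, month, s) for s in SOURCES]
                for year in YEARS]
        tables[month] = [header] + rows
    return tables


def master_table(tables):
    """Option B: one table with a year column plus 12 x 20 = 240 columns."""
    header = ["year"] + [f"{m}_{s}" for m in MONTHS for s in SOURCES]
    rows = []
    for i, year in enumerate(YEARS, start=1):  # row 0 of each table is its header
        row = [year]
        for month in MONTHS:
            row.extend(tables[month][i][1:])  # skip the repeated year column
        rows.append(row)
    return [header] + rows


def to_csv_text(table):
    """Serialize a list-of-rows table to CSV text."""
    buf = io.StringIO()
    csv.writer(buf).writerows(table)
    return buf.getvalue()


tables = per_month_tables()
master = master_table(tables)
print(len(tables), "monthly tables with", len(tables["Jan"][0]) - 1, "data columns each")
print("1 master table with", len(master[0]) - 1, "data columns")
```

Both layouts hold exactly the same numbers, and converting from one to the other takes only a few lines, so the choice mostly comes down to which is easier for a user to open and browse.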
In our first Think Tank, our partners expressed the desire to have these time series on their own NERR websites. We are looking for your input on what that could look like!
Comment on this post or feel free to email me at email@example.com to give us input and ideas!