Curating Weather Simulation Data: EarthCube Workshop in North Dakota

“Simulation outputs are important but that does not mean we save them forever” – Gretchen Mullendore

This week I have been attending a workshop on data curation (a key part of open science), focused specifically on developing guidelines for the data produced by weather and climate simulations. Open science is better science! But a blanket “you must save and provide all data” is not only onerous (especially for underserved institutions) but also not what is needed for reproducibility and reusability.

So many great minds focused on open science.

First, this post reflects my own thoughts and does not necessarily reflect the views of the attendees and organizers. There will be a report. A lot has been written about measurement data, and measurements cannot be recreated. Model data, to a degree, can be regenerated. When workflows are shared, those with the appropriate resources can rerun the models using the provided initialization and configuration data. Furthermore, sharing workflows makes it possible to explore how robust conclusions are to assumptions (sensitivity) and to reuse the workflow to address new science questions.

Gretchen kicking off the meeting

I really enjoyed the discussions and applaud the team’s focus on designing rubrics, as it brings the conversation up a level and enables clear measurement of the efficacy of solutions. It was also great to see a huge diversity in the career stages and “flavors” of participants. We had data creators, curators, representatives from three publishers (AGU, AMS and PLOS), data scientists and more!

Susan from the University of Michigan on data curation.

Also, fittingly, there were lots of discussions around equity. Journals are increasingly requiring data to be made available (even FAIR), which can create a burden for institutions without the physical infrastructure and/or workforce to meet these requirements. There have been discussions of carving out exceptions for underserved communities. My perception is that the community here at the workshop pushed back hard against that idea because, as mentioned above, open science is better science. Rather, we need to equip those institutions to meet the open science requirements.

There were lots of discussions on just how much data should be required to be made available to count as open, and how long it should be curated for. Again, the focus was on designing rubrics to guide the process. The rubrics should center on the goal and be flexible enough to aid scientists in achieving open science and reproducibility, while also allowing the society-driven journals to meet the aspirations of their members.

A nice atmosphere and a nice atmosphere!

It was great to be back in Grand Forks. The University of North Dakota is a great institution that, in atmospheric science, punches way above its weight. Two of our three recent hires had a background at UND, and I very much enjoy my collaborations with the team there. It was also very nice to be there during a dry, cool air outbreak in summer rather than a frigid cold air outbreak in October!
