for Research
Optional: cc http://www.flickr.com/photos/quinnanya/
Agenda
• Introductions
• Background
• Definitions
• Upfront Decisions
• Data Sharing Impacts
• Fundamental Practices
• File Organization
• Data Documentation
• Reliable Backup
• Data Lifecycle Strategy
Why are we here?
But why are we really here?
An Impetus: NSF recently released a mandate that all grant applications submitted after January 18th, 2011 must include a supplemental “Data Management Plan”
An Effect: This mandate from NSF has had a domino effect, and many funders now require or state guidelines for data management of grant-funded research
A Challenge: Data management (and oftentimes research methods in general) is an area that has not traditionally received a full treatment in most graduate and doctoral curricula
What is meant by “data management”?
File Organization Practices: Overview
1. Create a file plan for your research project
2. Design a file naming convention that works for your project
3. Agree on a version control method to assist with file synchronization
4. Carefully choose file formats to maximize usefulness
“When I was a freshmen I named my assignments Paper Paperr Paperrr Paperrrr”
-Undergrad
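A file naming convention like the one in step 2 could be enforced with a small helper. The following is only a sketch under assumed conventions: the function name `build_filename`, the field order (project, description, date, version), and the underscore separator are illustrative choices, not rules stated in these slides.

```python
import re
from datetime import date


def build_filename(project, description, when, version, ext):
    """Return a name like 'soil-study_field-notes_2011-01-18_v02.csv'.

    The field order and separators are an assumed convention for illustration.
    """
    def slug(text):
        # Lowercase the text and collapse every run of characters that is
        # not a letter or digit into a single hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    # ISO dates (YYYY-MM-DD) sort chronologically; zero-padded versions
    # (v01, v02, ...) sort in order as well.
    return (
        f"{slug(project)}_{slug(description)}_"
        f"{when.isoformat()}_v{version:02d}.{ext.lstrip('.')}"
    )


print(build_filename("Soil Study", "Field Notes", date(2011, 1, 18), 2, "csv"))
```

Names built this way sort predictably in a directory listing, which avoids the "Paper, Paperr, Paperrr" pattern quoted above.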
1. Create a file plan for your research project
1. At minimum create a README file that you can use to document your project
2. Utilize standards for describing data including Metadata Standards
3. If applicable, use in-line code commentary to explain code
(cc) Will Scullin
1. At minimum create a README file that you can use to document your project
At minimum, store documentation in readme.txt file or equivalent, with data
Resource: http://libraries.mit.edu/guides/subjects/data-management/metadata.html
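As one way to make a readme.txt routine, its text could be generated from a few fields. This is a minimal sketch: the function `render_readme` and the fields it records (title, creator, description, file list) are assumptions chosen for illustration, not a required template.

```python
def render_readme(title, creator, description, files):
    """Build the text of a plain readme.txt from a few descriptive fields.

    `files` is a list of (file name, short note) pairs. The field set is an
    assumed minimum for illustration.
    """
    lines = [
        f"Title: {title}",
        f"Creator: {creator}",
        f"Description: {description}",
        "",
        "Files:",
    ]
    for name, note in files:
        lines.append(f"  {name} - {note}")
    return "\n".join(lines) + "\n"


text = render_readme(
    "Example project",                      # hypothetical values
    "A. Researcher",
    "Survey responses collected for a pilot study.",
    [("responses.csv", "raw survey data"), ("clean.py", "cleaning script")],
)
print(text)
```

The resulting text can be written to readme.txt in the same folder as the data, so the documentation travels with the files it describes.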
2. Utilize standards for describing data including Metadata Standards
Documentation Practices: Example Metadata Standards
• Dublin Core: Easy-to-create-and-maintain descriptive format to facilitate cross-domain resource discovery on the Web
• Darwin Core: Facilitates reference and sharing of biological diversity datasets
• Data Documentation Initiative (DDI): Methodology for content, presentation, transport, and preservation of metadata about datasets in the social and behavioral sciences
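To give a feel for what a metadata standard looks like in practice, the sketch below writes a few Dublin Core elements as XML using Python's standard library. The wrapper element `metadata`, the function name `dublin_core_record`, and the sample values are assumptions for illustration. The element names used (`title`, `creator`, `date`, `description`) belong to the Dublin Core element set.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def dublin_core_record(fields):
    """Serialize a dict of Dublin Core element names and values to XML."""
    ET.register_namespace("dc", DC_NS)
    # "metadata" is a hypothetical wrapper element, not part of Dublin Core.
    root = ET.Element("metadata")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


record = dublin_core_record({
    "title": "Example survey dataset",      # hypothetical values
    "creator": "A. Researcher",
    "date": "2011-01-18",
    "description": "Pilot survey responses.",
})
print(record)
```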
• Flash Drives
• Internal Hard Drives
• External Hard Drives
• Server and Web Storage
• Managed Networked Storage
• Cloud Storage
3. Ensure data redundancy
Backup Do’s:
Make 3 copies
E.g. original + external/local + external/remote
E.g. original + 2 formats on 2 drives in 2 locations
Geographically distribute and secure
Local vs. remote, depending on needed recovery time
Personal computer, external hard drives, departmental, or university servers may be used
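The "original + 2 copies" rule above could be scripted roughly as follows. This is a sketch under assumptions: the function names `sha256_of` and `back_up` are hypothetical, and the destination folders stand in for wherever the local and remote copies actually live (for example a mounted external drive and a mounted network share). Each copy is verified against the original with a SHA-256 checksum.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(original, destinations):
    """Copy `original` into each destination folder and verify each copy.

    Raises OSError if a copy's checksum differs from the original's.
    """
    original = Path(original)
    expected = sha256_of(original)
    copies = []
    for folder in destinations:
        folder = Path(folder)
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / original.name
        shutil.copy2(original, target)  # copy2 also preserves timestamps
        if sha256_of(target) != expected:
            raise OSError(f"Checksum mismatch for {target}")
        copies.append(target)
    return copies
```

Calling `back_up("data.csv", ["/path/to/external", "/path/to/remote"])` with two destinations yields the three copies recommended above. Both paths here are placeholders.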
3. Ensure data redundancy (cont.)
Backup Don’ts:
Do not rely on one copy
Do not use CDs and DVDs
Do not rely on ANGEL
Backup Maybe:
Cloud storage
Amazon S3
Google
MS Azure
DuraCloud
Rackspace
Note that many enterprise cloud storage services include a charge for in/out of data transfers
$$$
Research is…
Define a question
Gather information
Form a hypothesis
Test the hypothesis
Analyze the data
Interpret the data
Publish results
Retest
[Diagram: the same research steps shown out of sequence, with a question mark]
The scientific method “is often misrepresented as a fixed sequence of steps,” rather than being seen for what it truly is, “a highly variable and creative process” (AAAS 2000:18).
Gauch, Hugh G. Scientific Method in Practice. New York: Cambridge University Press, 2010. Print. (Emphasis added)
The Research Depth Chart
More Generic
Scientific Method
Research Method
Research Design
Research Tasks
More Specific
The Data Management Depth Chart
Research Data Lifecycle Model
Source: DDI Structural Reform Group. “Overview of the DDI Version 3.0 Conceptual Model.” DDI Alliance. 2004. http://opendatafoundation.org/ddi/srg/Papers/DDIModel_v_4.pdf
The Data Management Depth Chart
Research Data Lifecycle Model
???
???
???
Study Concept: Data are brainstormed
Data Collection: Data are collected
Data Archiving
Repurposing: Data can be used and reused
Aaron Collie
Digital Curation Librarian
MSU Libraries
collie@msu.edu