SUMMARY
Linux-loving data wrangler with a passion for reproducible research, text & data mining, and innovation in
academic publishing. I'm most at home collaborating on projects in distributed, version-controlled
environments such as GitHub & Bitbucket. I enjoy using social media as a powerful tool to bring about change.
EDUCATION
University of Bath
EMPLOYMENT
University of Bath
Postdoctoral Research Associate
2014 to Current
PLUTo: Phyloinformatic Literature Unlocking Tools. Software for making published phyloinformatic data
discoverable, open, and reusable
While there are well-established and excellent repositories for molecular sequence data (e.g. NCBI), there are no
comparable comprehensive resources for alignments or morphological data, still less for trees or other
metadata (measures of tree support, indices of homoplasy, etc.). These data remain locked in PDFs and are
currently not machine-readable. This is hugely detrimental to many biological disciplines. We will develop and
perfect tools (PLUTo) enabling researchers to unlock phyloinformatic data from published PDFs. These will
generate Newick/NeXML tree files (with branch lengths and support metrics) by interpreting SVG and other
graphics, and parsing the text/legends for other data. We will use AMI2 extraction technology, based on
PDFBox, JUMBO and AMI-code. This is presently in prototype.
PhD Student
Exploring the use and impact of fossils in phylogenetic analyses, particularly with regard to homoplasy and
character congruence. As a direct consequence of my primary research and the problems I have faced while doing
it, I now also research data sharing, data re-use and synthesis in academia, where there is much room for
improvement.
Working as part of a team to galvanise and focus the Open Science community to bring real change for the
better. Within Open Science my particular interests and expertise are legal issues, text & data mining, citizen
science & microtasking, Open Data and Open Access.
Panton Fellow
Many scientists believe in the benefits of open data. Many have an idea of what could be done to make open
data in science more feasible, ubiquitous and routine. What is often lacking is the time and resources to
bring these ideas to fruition. The idea behind the Panton Fellowship came from Jonathan Gray and Peter
Murray-Rust, who saw an opportunity to assist innovative graduate students and career scientists to promote
open science. Thanks to Open Society Foundations, the Open Knowledge Foundation were able to announce
the Panton Fellowship scheme in January 2012.
SKILLS
Science, Evolutionary Biology, Paleontology, Cladistics, Informatics, R, Perl, Bash, JavaScript, TNT, Blogging,
Social Media, Open Data, Science Policy, Science Communication, Publications, Academic Publishing, Biology,
Scientific Writing, Research, Genomics, Bioinformatics, Linux
AWARDS
Apr 2012