Sunday, December 11, 2011

7th International Data Curation Conference

Last week I attended the 7th International Data Curation Conference (IDCC, Twitter #idcc11) in Bristol, UK. The theme of this year's meeting was "Public? Private? Personal? navigating the open data landscape".

In addition to a large number of very stimulating presentations and discussions throughout, the meeting was bracketed by two interesting keynotes.

The opening keynote was delivered by Ewan McIntosh of NoTosh. He talked about moving away from the often dry messages of research, research data, and the data management that supports reuse, and toward stories that demonstrate the impact of the research and data. He described the work his company has done with some young students to develop TED-like talks. Some of the lessons learned from this activity, he feels, are more broadly applicable to improving data reuse: 1. tell a story 2. create curiosity 3. create wonder 4. solve pain 5. create reasons to trade data.

The closing keynote was given by Natasa Milic-Frayling of Microsoft Research Cambridge (MSRC). Natasa talked about the Microsoft team's interaction with a working research team. The big lessons from their work relate to the disconnect between how clean and precise we would like the data management processes to be and how messy they are in real life.  The lab team's pragmatism is not typically mirrored by those of us who work in the realm of data management.  A couple of examples:
  • Though the researchers used electronic notebooks and many steps in their process were captured therein, only more informal summaries of these processes fed into next steps in the data provenance chain.
  • New technologies were brought to bear on the research team's workflow.  These technologies were not perfectly aligned with the needs of the project and there were some compromises made to achieve workable integrations.  These compromises made it difficult to accurately track and verify the origins of some information that became part of later data products.
Being present in the researchers' lab was the only way for the MSRC team to detect these issues. Understanding the lessons of this work will inform the way we engage with researchers in our own institutions.

As I mentioned previously, the conference had many excellent presentations.  Links to presentation slides and other resources, where available, have been added to the online conference program.

1 comment:

  1. Folks: A storify'ed version of the conference is now available here...

    A nice way to view the meeting.