Saturday, December 21, 2013

DLF Forum November 2013

Following are summaries of the sessions I attended at the recent DLF Forum, including links to community Googledocs and abstracts for the papers. Videos of some of the sessions are available here:

Opening Keynote by David Lankes 

David focused his talk around the idea of the ‘mortal in the portal’. He addressed the common sense notion that ‘libraries are good and necessary things’ and unpacked this notion throughout his talk. He emphasized that when we say ‘libraries’, we really mean ‘librarians’, and that it is rather odd to think of a building or an institution as doing something when it is the inhabitants who provide the services and keep the institution functioning. He argued that a professional service is more than just a series of functions and that the provider is an essential part of the service.

At the heart of his message was the idea that librarians have a mission to improve society through facilitating knowledge creation in their communities. Being a librarian is more than helping people get to stuff, as this ‘stuff’ can take a variety of formats including conversations, arguments and training, and is not just books on shelves or websites. Further, training people and imparting knowledge is not just about how to use the library’s facilities, but teaching the norms of scholarly communication.

Turning from the first half of the sentence ‘libraries are good and necessary things’ to the second, he unpacked why libraries [librarians] can be considered good and necessary things. He questioned who gets the benefit and why. In his opinion the word ‘information’ means nothing – he prefers talking about knowledge since this is fundamentally a human phenomenon and refers to what’s in people’s heads. Libraries [librarians] are helping people to do something that they couldn’t do before.

In essence ‘libraries are good and necessary things’ under David’s examination mutates into ‘librarians improve society through facilitating knowledge creation in their communities’, or, put more simply ‘librarians make the world a better place’. One of the themes that ran alongside this unpacking of the initial thesis of the speech was the idea that we need to stop worrying about saving libraries and to concern ourselves, instead, with responding to the needs and dynamics of the community in which the library is immersed.

He gave the whole talk via Skype as he was at home recovering from chemo – what a fab dude!

Digital collections: if you build them, will anyone visit? 

In addition to being an excuse for a picture of Kevin Costner, this session looked at how visible digital collections of newspapers are in online searches. Newspapers were chosen because they are particularly heavily used, so you would expect them to have high visibility in a Google search or similar. However, this is rarely the case because, to quote the speakers, sites for digitized content often “suck”. 89% of college students start their research with Google rather than the library website, and are, therefore, in danger of missing resources. The session discussed how digital collections might be marketed more effectively, and it was suggested that too much focus and money goes on content and not enough on publicity, presentation and SEO. There are some further details in the community Googledoc.

Metadata First: Using Structured Data Markup and the Google Custom Search API to Outsource Your Digital Collection Search Index 

Community notes googledoc: same as previous googledoc link – the notes follow straight on from the Digital Collections discussion, and include links to the slides, a demo and a download.

This talk was about creating indexable content and how library resources need to be discoverable in other venues and systems (the community notes contain a link to a video of Lorcan Dempsey talking about this). The speakers noted that while Solr and Blacklight were flexible, faceted and had stable URLs, the development time was often prohibitive. It was suggested that Google Custom Search might be an alternative as it was already optimized for web search. One commenter noted that making a site more accessible for users with disabilities would improve visibility to search engines. There are further notes and links in the community document.
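As a rough illustration of what 'structured data markup' can look like, here is a minimal sketch that builds a schema.org JSON-LD description for a single digitized item. It is not taken from the speakers' slides or demo: the item, its URL and the `item_to_jsonld` helper are invented for illustration, and the only thing assumed is the general schema.org vocabulary (`CreativeWork`, `name`, `datePublished`, `url`).

```python
import json


def item_to_jsonld(title, date_published, url):
    """Return a schema.org JSON-LD string describing one digitized item."""
    record = {
        "@context": "https://schema.org",
        "@type": "CreativeWork",
        "name": title,
        "datePublished": date_published,
        "url": url,
    }
    return json.dumps(record, indent=2)


# Hypothetical item. A real page would embed this output in a
# <script type="application/ld+json"> element so crawlers can read it.
markup = item_to_jsonld(
    "Example Gazette, front page",
    "1913-11-04",
    "https://example.org/newspapers/gazette/1913-11-04/p1",
)
print(markup)
```

The point of the approach, as I understood it, is that once each item page carries machine-readable markup like this, an external search service can index the collection without the library building its own search stack.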

Hunting for Best Practices in Digital Library Assessment

This was a workshop session. It focused on the problem that while research and cultural heritage institutions are creating more and more digital resources, the funding for such institutions is being eroded. As such, we need to hone our skills in being able to measure the value and impact of these resources.

After breaking out into groups with different members of the team presenting, we discussed what the challenges are in assessing digital libraries. There are a large number of community notes available on the googledoc above, summarizing the thoughts from the various groups. However, one point that I felt was particularly important was the suggestion that assessment criteria can’t be an afterthought – they need to be built into a digital resource from the outset. That is, when creating any kind of web-based collection, we need to determine at the start what success would look like and how we might measure it (this is often a requirement of grant agencies). This was a huge topic and the conveners decided that we should continue the discussion, so I’ve signed up to the mailing list.

Big Archival Data: Designing Workflows and Access to Large-Scale Digitized Collections 

There was a very cool musicological section to this presentation. Tanya Clement discussed the need to think about how researchers will use digitized audio. This is very broad and can include things like psychoacoustics (understanding what sounds mean to people) and creating spectrograms to analyze audio files visually. In the latter case, spectrograms can be used to show the machine what is meaningful to you to search for. In one example she took a particular clip from a sound file and asked the computer to look for things that were spectrographically similar. This method turned up a bunch of examples that initially seemed useless. However, they later discovered that although the clips the computer had found contained different words and speakers, the speakers had the same accent and came from the same area. There were also examples of how a person speaking two different languages appeared spectrographically and how the computer could find the moments where the speaker changed language. These methods can also be used to look at how different speakers have approached the same content (combined with psychoacoustics this could have all sorts of implications for giving speeches, advertising, drama…).
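To give a flavour of what 'spectrographically similar' might mean, here is a toy sketch of my own, not the method used in the presentation. It compares the frequency content of short synthetic tones using a plain discrete Fourier transform and cosine similarity. A real spectrogram is a sequence of such spectra over time; this sketch uses a single frame per clip, and the function names, sample rate and test tones are all assumptions made for illustration.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Magnitude of the discrete Fourier transform, up to the Nyquist bin."""
    n = len(samples)
    return [
        abs(sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2)
    ]


def cosine_similarity(a, b):
    """Cosine of the angle between two spectra (1.0 means the same shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def tone(freq_hz, n=256, rate=8000, phase=0.0):
    """Synthetic stand-in for an audio clip: a pure sine wave."""
    return [math.sin(2 * math.pi * freq_hz * t / rate + phase) for t in range(n)]


# The "query" clip, and two candidate clips to rank against it.
query = magnitude_spectrum(tone(440))
candidates = {
    "same pitch, shifted": magnitude_spectrum(tone(440, phase=1.0)),
    "different pitch": magnitude_spectrum(tone(1500)),
}

# Rank the candidates from most to least similar to the query.
ranked = sorted(
    candidates,
    key=lambda name: cosine_similarity(query, candidates[name]),
    reverse=True,
)
print(ranked)
```

The clip with the same frequency content ranks first even though its waveform is shifted, which is loosely analogous to the computer matching accent rather than words.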

Pathways to Stimulating Experiential Learning and Technological Innovation in Academic Libraries

This presentation had some very nice examples from three institutions of how students were being recruited to hack apps and gadgets for their libraries. The institutions made their experiential learning programs a regular part of library life, and fought to maintain the budget for this when it was under threat. Students gained an insight into library workflows and policies, but at the same time were able to bring their experience as users to the planning table. In most cases the structure of the students’ working teams was non-hierarchical and seemed to allow for some dynamic and creative brainstorming. The library benefitted from a regular turnover of enthusiastic students working for them and giving their perspective, while the students gained a lot of skills for the workplace (in one example, the library provided a form on which students could record the work they had undertaken and the skills it developed – apparently the students found this very helpful for designing their CVs and explaining the usefulness of their experience in job interviews).

Determining Assessment Strategies for Digital Libraries and Institutional Repositories Using Statistics and Altmetrics

This session had some similarities with the session on the previous day that discussed assessment in digital libraries, although this one focused specifically on metrics. It also had a workshop element and I joined the group discussing qualitative vs quantitative metrics. The questions we discussed have been summarized here:

Influence of Academic Rank on Faculty Members’ Attitudes Toward Research Data Management

Presented by Katherine Akers, a current CLIR fellow in her second year. Katherine examined the ways in which humanities faculty differ from members of other faculties in their approach to data curation, and how different ranks within academia also differ within the humanities – see the community notes above for the questions she asked and the responses she collated. As a result of her findings, Katherine suggested some ways in which libraries can better support humanities faculty. For example, non-tenure staff tend to desire more outreach and training than senior faculty, while humanities academics in general need better cloud storage since they tend to travel more and some university systems can be very off-putting, slow, or difficult to use. All ranks expressed a desire for more digitized materials to be made available and easily accessible. See the community notes for more of her suggestions and results.

Humanities Data Curation in the Library: The Preservation of Digital Humanities Research Now and To Come

Harriet Green explored three case studies that she felt represented three tiers of data curation. The first was the Walt Whitman Archive, which she classified as ‘basic’ – it has XML, HTML, image and recording files stored on optical discs with some basic backup of older iterations; they are beginning to work with university archives. The second project, representing the mid-tier, was the Victorian Women Writers Project, which, in addition to XML files of texts, has annotations and biographical summaries, and an HTML files workflow with one Fedora repository for storage and another for creating and editing files [better detail on this is available in the community googledoc]. It was suggested that this project needed more staff to be sustainable. The third example was the Valley of the Shadow, which represented a high level of data curation. As well as having the kind of scope of the VWW project, this online resource had extensive documentation regarding its structure, workflows, programs used, etc. The subject librarians were part of the project, and costs for the project at each point were clearly calculated and known by both the library staff and faculty.

Based on her analysis of these three projects, Harriet recommended some principles for best practice and suggested that the UVa Sustaining Digital Scholarship was very useful as it identified criteria for defining levels of curation. She also suggested that libraries needed to provide further education and training on digital project curation, and that they needed an evaluation rubric and long term planning.

Her bibliography is available here:

Services for Research Data and Open Access: Strategies and Toolkits Being Implemented at Virginia Tech, the University of California, and Duke

This was a discussion of OA policies. UC recently moved to a policy of required deposit from which people have to choose to opt out (rather than choosing to opt in). One part that interested me was how they were attempting to make it easier for faculty to use their repositories – often in response to complaints from faculty about overly long and confusing deposit forms. This prompted UC to create a new interface which simplified the process (for example, if an article has not yet been published, clicking on that option removes the requests for publication details like ISBNs and issue numbers from the interface). They are also refining their data harvesting tools in order to make the process simpler. The harvesting program hunts down articles and collects all the metadata; it then adds these articles to a staff member’s profile, but in a pending status. When the staff member logs on to their repository profile, they can then just click a box to accept the article with its metadata if it is correct (they can adjust this if need be). The article can be uploaded through a drag and drop mechanism. Alongside making the process easier for staff, education and outreach were needed from the office of scholarly communication to dispel the enduring myths of OA! Generally they have found that publishers have not put up resistance to the green OA offered by institutional repositories.
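The harvest-then-confirm workflow described above could be modelled roughly as follows. This is purely my own sketch of the idea, not UC's actual system: the class names, field names and sample record are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class HarvestedArticle:
    title: str
    metadata: dict
    status: str = "pending"  # harvested records wait for the author's review


@dataclass
class Profile:
    owner: str
    articles: list = field(default_factory=list)

    def add_harvested(self, title, metadata):
        """Called by the (hypothetical) harvester when it finds an article."""
        article = HarvestedArticle(title, dict(metadata))
        self.articles.append(article)
        return article

    def accept(self, article, corrections=None):
        """Author confirms the record, optionally fixing the metadata first."""
        if corrections:
            article.metadata.update(corrections)
        article.status = "accepted"


# Invented example: the harvester gets the year wrong and the author fixes it.
profile = Profile("A. Researcher")
record = profile.add_harvested(
    "An Example Article", {"journal": "Example Journal", "year": 2012}
)
profile.accept(record, corrections={"year": 2013})
print(record.status, record.metadata["year"])
```

The design point is that the system does the tedious work up front and the author's only job is to confirm or correct, which is a much smaller ask than filling in a long deposit form.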

Closing Keynote by Char Booth

In some ways Char challenged (respectfully and graciously) some of David’s remarks in the opening keynote. She felt that libraries do need saving, and in many respects have always needed saving.

She expressed concern that those who are creating tools and those who are disseminating the tools don’t continue to communicate with each other, and that we need to maintain this dialogue in order for our tools and our libraries to be a success.

The idea that libraries don’t need saving is an important point for those in the grassroots who are dealing with budgets. Also for students whose local libraries have been closed down, their first experience of libraries and librarians is at university level – this makes our interaction with them and presentation of the institution to them a crucial thing. Furthermore, Char argued that we represent something that is fragile because our not-for-profit, OA ideals do not chime with those of publishers and other for-profit organizations. She believes we are an activist profession.

Char discussed the notion of information privilege. She feels that we need to confront the ‘dark side’ of information privilege, whereby society’s divides are exacerbated by access not being universal. She also looked at what motivates us to do what we do, and encouraged us to think about our narratives. There are narratives about libraries which are extremely negative – she mentioned a TechCrunch article, which asks why libraries are needed when you can just download things to your iPad (see above re ‘dark side’ of information privilege), and ‘libraries in crisis’ on Huffington Post.

In essence we need to work together and be involved in the process all along the way. So: we need to create great tools and resources, but then we need to communicate with those who disseminate and those who use them (which, from my perspective, is part of the mission of a CLIR fellow), and also those who unexpectedly come across our tools and resources. We need to follow through on the impact of those tools and understand when and how and if they are being used. She notes that making online resources disability friendly is a good way of ensuring more general good practice in websites and other resources (this point came up in some of the other sessions).

There’s lots more I could say, but the whole speech is online, and she says it better than I do, so do take a look; it will be an hour well spent.