Tuesday, March 8, 2011

Code4Lib 2011

In February, I attended the 2011 Code4Lib Conference.

"Code4Lib" isn't an organization or business or initiative. It's just a name a bunch of people use informally for a community. It has recently occurred to me that it could be described by the academic notion of "community of practice": "It is through the process of sharing information and experiences with the group that the members learn from each other, and have an opportunity to develop themselves personally and professionally."

The Code4Lib community is composed of library software developers and other technology workers. The participants are characterized by a strong desire to innovate and make things easier for our users -- but also by a focus on practical solutions that can be implemented quickly today, in our actually existing messy environment. This contrasts with other library organizations or communities that focus more on long-term or "blue sky" research. This priority on solutions with immediate 'bang for the buck' springs, in part, from most participants' positions in their library organizations: not in resource-rich grant-funded departments, but in departments working with legacy systems, without explicit R&D budgets, but still with a passion to make our often challenging systems work better for users.

This context makes the 'community of practice' all the more important, because such developers are often more or less working solo on projects at their own institutions, without local technical peers. But in order to do what we do well, it's absolutely vital to have peers in a 'community of practice' to exchange ideas with, get advice and tips from, or mentor and be mentored by.

The annual conference, of which this year's was the 6th, seems to me to actually be one of the most powerful ways this community forms itself as an actual community, with shared knowledge, experiences, and social networks. In-person meeting can develop professional collaborative and information-sharing relationships and trust quicker and better than many megabytes of online interaction. This always ends up being the most useful aspect of the conference for me, more than the concrete information in any particular presentation.

On reflection, some aspects of how the conference is run seem nearly ideal for community-building, through some combination of intention and accident. The conference is single-tracked, with all participants seeing the same presentations: This builds the shared knowledge upon which a community of practice is based; and putting us all on the same page gives us lots of things to talk more about while hanging out late at night in the conference hospitality suite, developing our shared understanding yet further. Presentations are only 20 minutes long, and supplemented by 5 minute "lightning talks" which can be given by anyone who wants to sign up during the conference itself: One goal is to maximize the number of audience members who are also up in front presenting, creating a kind of egalitarianism that hopefully makes it easier for newcomers to integrate themselves into the community.

One trend I noticed at this year's conference was attempts to build software packages by assembling pre-existing mature components, and more generally to build software from individual, separate components.

This is, I think, a reaction to many previous efforts to build homegrown monolithic solutions, which ended up revealing our lack of capacity to successfully build such things in a sustainable way. If we can re-use mature open source components from non-library communities, with their own community of developers, we can take advantage of that work, focusing on our own unique needs on top of that, maximizing our efforts. Likewise, if we can break up our own software projects into individual somewhat independent components, one of those individual components can meet the needs of a wider community, in a way that a more focused monolithic project can't. And by meeting the needs of a wider community, there are more potential collaborators to help on that component.

One example, from a presentation at the conference, is UPenn's efforts to build software to compile, manage, and analyze usage data from library software. A few years ago they were engaged in an effort they called the "data farm". We at Hopkins were at one point considering trying to use their software for our own needs, but ended up not having the time or resources to fully investigate. However, in a presentation at the conference, Tom Barker from UPenn revealed that they themselves had come to realize their first effort at this was not producing efficiently sustainable software: it was taking them too much development time to adapt to new data sources, and was ending up too customized for other institutions to easily adopt or collaborate on. They switched gears, and have developed new software they currently call MetriDoc, which is based on several pre-existing mature open source components for transforming and storing data.

There are of course downsides to trying to build from several pre-existing powerful components, just different downsides from trying to build your own thing mostly from scratch (or much smaller pre-existing components). More components means more things the developer has to learn how to use, and more places to debug when something isn't going right. A component written for a more general purpose might not do exactly what you need. If you were counting on an open source component being maintained by an external community of developers, and it gets abandoned, it can be disastrous for your project.

But all approaches have risks; by being aware of the risks you can choose your components (and your approaches in general) to minimize them. The more I write software, the more I realize building complicated software such that it is flexible, sustainable, and expandable is actually difficult, taking skill, experience, time, and a bit of luck. It's not in fact -- despite the self-confidence of youth -- something that is obvious if you are just smart enough and know a couple guidelines.

Hopefully the Code4Lib community can continue to help library developers build their personal skill and experience at developing quality software in a sustainable and efficient way, in order to increase, in aggregate, the library community's capacity for sustainable technological innovation. I know my participation in the community -- in listservs, at the conferences, on IRC, and through the Code4Lib Journal -- has increased my own capacity immeasurably.
