Opening up UK archives data (II)

This is the second post relating to the recent UKAD meeting, concentrating on the brainstorming that took place around digital and digitised archives.

The driving forces that were identified:

  • Crowd-sourcing – metadata generation
  • Attracts funding
  • Promotes access
  • Open up wealth of possibility
  • Remain relevant
  • Meet user expectations
  • Centres of excellence in digitisation – common approach
  • Collections already digitised are hidden – in silos – return on investment
  • Potential to capture richer information about users
  • Potential to draw people in
  • Increasing ‘digitisation on demand’ – needs to be harnessed effectively
  • Increasing amount of born-digital media needs to be made accessible online – drive to discoverability of digital materials
  • Changing profession – becoming more confident in this area as a result of above
  • Web makes it much easier

The group felt that it all added up to a resounding “we have to do this!”.

The resistors included:

  • Systems don’t talk to each other
  • Insufficient metadata of legacy digitised material – retroconversion – cost*
  • Copyright/IPR – complex, lots of local specificity
  • Work needed to marry user generated content and standard metadata
  • Community resistance to UGC
  • Vast amounts of content – prioritisation is intellectually challenging
  • Bulk digitisation is happening commercially – restricted rights
  • Clashes with business models – or perception that it does (e.g. models based on commercial digitisation assume increasing return on investment; the opposite may occur if the most commercially enticing material is digitised first)
  • Fears – grounded in truth – could affect funding: diminishes user/visitor numbers on site, diminishes value of on-site expertise
  • Challenges in bringing catalogue data and digital object systems together
  • Query: not ultimately cost effective
  • Cost
  • Web makes it easier – but it’s hard to keep up…

The group looked at actions that are required:

1. Accrue evidence of user demand and current behaviour

  • Identify user communities (family, academic, student researchers)
  • Secondary research of existing analysis
  • Market research
  • Produce cost-benefit analysis – impact on site visits?

2. Systems talking to each other

  • People talking to each other about systems!
  • Develop definitive list of systems in use – a picture of UK situation > crosswalks/maps between (see Library world)
  • Needs to cover both catalogue and digital object management systems
  • Discmap?

3. Copyright/IPR

  • Produce decision tree to help archivists make decisions – risk assessment but beware risk aversion
  • Encourage sharing of experience/lessons learned
  • Gathering what has already been done

4. Impact of digitised resources

  • Gather existing articles/research
  • Share practice in assessing impact in differing contexts

5. Metadata and costs

  • Establish costs of differing levels of metadata generation
  • Identify how much data needs to be converted into digital metadata (how much is not online?)

6. Identify quick wins!

  • Working together to create user cases and examples, sharing experience, getting involved in the Resource Discovery Task Force and linking projects to this

Of course, the gathering of such evidence can help us to see where we are and where we need to go, and also how to get there. But implementation is quite another thing. The UKAD Network is hoping to build upon this work to encourage collaborative initiatives and the sharing of expertise and experiences. We are considering events and training opportunities that might help. We do feel that it will be useful to create a stronger presence for UKAD, as a means to provide a focus for this work, and we are looking at low-cost options to do this.

Opening up UK archives data (I)

On 14th April the UK Archives Discovery Network (UKAD) met in Manchester to discuss challenges surrounding the opening up of archival data. We were looking to develop our understanding of the key issues driving or preventing these developments and to start pulling together an action plan. We also talked about digital and digitised archives, which I’ll blog about in a separate post.

We split into two groups to brainstorm driving and restraining factors. There was no chance of drying up – we all had plenty to say, and of course, the restraining influences grew rapidly, threatening to outstrip the drivers by quite some way. In the end, though, we had a good balance, and we felt that the day had been very positive. Summing up the position is one thing; implementing actions is quite another. Still, we hope to start putting some things into place that will help to take us along the road to promoting archival discovery.

We are looking to create a UKAD website, which will help us to promote UKAD to archivists and others, and we’ll let you know about that as soon as we can.

With thanks to Melinda Haunton from The National Archives, who, as the UKAD secretary, gallantly pulled together the large number of flip charts and made them into something coherent, here is a summary of the points.

Our driving forces included:

  • Perceived user demand
  • Time saving – easier to search, more effective customer service
  • Opportunities – for use of data and for benefiting from others’ use
  • Government policy drivers in this direction (data.gov.uk is evidence of Govt buy-in)
  • Rich data – think about opportunities to make the most of events, people, places, concepts within the finding aids
  • Serendipitous collaboration – working together is a big driver – a common way to hear about initiatives and experiences of others that could be of benefit to you
  • Potential to get new users – e.g. via GIS data connected to archives data – users who may not think of using archives
  • Standards exist to drive openness
  • Sustainability of resources – less tied to a single service if data is open
  • Enrichment and adding value – others can enrich our data
  • Archives making use of others’ open data – sector benefits from open data as well as contributing to it
  • Connecting archives – new narratives – data can coalesce around events, people, places, subjects
  • Exposure of holdings – especially for small repositories who have limited resources to promote themselves
  • Unlikely to be restrictions on opening up descriptive data (unlike digital/digitised archives)
  • Could glean evidence of impact – ways to gather usage statistics are increasingly effective – provide evidence of benefits
  • Opening up could reach out to excluded communities more effectively (different routes into archives)
  • Potential for wider impact – e.g. in demonstrating impact of academic research (RAE)

Our restraining forces included:

  • Lack of evidence of user demand – it may not be what we expect/assume
  • APIs – where they exist, are they used? (possibly not)
  • Users’ understanding of what they’ll get – you won’t normally get direct access to archives through descriptions
  • Proprietary software providers – may not ‘play ball’
  • Archivists’ understanding of open data issues – need understanding to get buy-in
  • Access to developer expertise – archivists frequently find getting IT or developer support very difficult
  • Machine to machine – not visual, not easy to sell – need to understand the potential
  • Messy data – all the issues we are so aware of with different data sources; the balkanisation of data
  • Backlogs – if it’s not catalogued, we can’t open it up
  • Sustainability of resources
  • Data becoming out of date as it gets further from the original source – end up reusing out-of-date data
  • Contractual embargoes – e.g. involving commercial partners such as software providers
  • Dependencies – potentially data may be dependent on other things – e.g. attached to schema, source code, IPR
  • Evidence of impact – can be difficult to get this and prove the worth of open data
  • Branding – or lack of it on reused open data – may affect funding if funders can’t see direct benefits
  • Loss of control causes fear – once it’s open anything can happen
  • Lack of ‘archival developers’ – very few developers with some understanding of archives and archival issues

Our actions included:

  • Working together – collaborative evidence gathering and sharing, not competing – use examples/evidence from others
  • Evidence – case studies, knowing what researchers are requesting, evidence for advantages of digitising
  • Understanding funders – shared understanding of funders can help with internal funding
  • Archives developer days – bringing developers together as has been done with Dev8D – collaborative approach to programming
  • Strategy for approaching software vendors to get buy-in – appeal to their commercial interests, a concerted approach from aggregators may be more effective
  • UK based evaluation of archival cataloguing systems – still know little about percentages using different systems and evaluation of systems
  • Conference/workshops to raise awareness of buy-in, including practical demonstrations – must be interactive and practical and encourage sharing of projects, experiences and ideas