When Archives Become Digital

"Working with digitised and digital archives in theory and practice"

Convenors: Helle Strandgaard Jensen, Andrew Cusworth, Adéla Sobotkova

Hashtags: #digitalarchives and #DHOxSS2019

Computers: please bring your own laptop (no tablets please)

Abstract

This strand helps participants understand and work with online archives of digitised and born-digital cultural materials. It introduces opportunities and challenges created by digital archives, providing hands-on sessions complemented by discussions of theoretical, ethical and political issues surrounding digital archival cultures.

Intended outcomes

Through practical sessions and theoretical discussions, participants will gain insight into the possibilities and obstacles that digital archives present and that arise when working with digital cultural heritage materials. Working with both digitised and born-digital material, participants will be introduced to a number of digital tools and workflows for collecting, cleaning and processing data from digital archives and archival sources, as well as to open-source software solutions that can be used for cataloguing, enriching, and publishing digital and digitised materials.

 

Hands-on sessions will be complemented by discussions of issues surrounding digital archival practice that will help participants to frame digital archives within their theoretical, political, ethical, cultural, and technical contexts.

 
Experience necessary
 

No prior technical knowledge necessary.

 

Convenor biographies

 

Helle Strandgaard Jensen is Associate Professor of Contemporary Cultural History at Aarhus University, Denmark, and co-director of the Center for Digital History Aarhus (CEDHAR). One part of her research takes media as its historical object of study; the other looks at how digital processes and, in particular, the infrastructures and affordances of digital archives influence the discipline of history.

Adéla Sobotkova is Associate Professor at the Department of History and Classical Studies, Aarhus University, Denmark. As a landscape archaeologist, Adéla applies digital methods to the long-term study of human settlement patterns. In her former role as co-director of the Federated Archaeological Information Management Systems (FAIMS) project, she researched the sociotechnical obstacles to the adoption of digital tools.

Andrew Cusworth is a postdoctoral research fellow at the Bodleian Libraries attached to the Prince Albert Digitisation Project. He has held positions at the National Library of Wales and the University of Exeter Special Collections, and his research interests centre on the intersections between digital research, the archive, and cultural history. He is also active as a musician and composer.

"I was extremely happy with the summer school on the whole - I both learned a lot in my workshop and also made useful contacts that I've already followed up on."
DHOxSS 2018 participant

TIMETABLE

Link to overview of the week's timetable including evening events.
 
 
Monday 22nd July

08:00-09:00

Registration (Sloane Robinson building)
Tea and coffee (ARCO building)
09:00-10:00

Opening Keynote (Sloane Robinson lecture theatre)
10:00-10:30

Refreshment break (ARCO building)
10:30-12:00

Introduction of convenors and participants, as well as the set-up for the week

The introduction is followed by a seminar-style talk outlining the major changes in the archival landscape in recent decades and their relationship to digital processes. It introduces 'Digital Archival Literacy' as a framework that enables users to critically examine how new policies and economic challenges influence system designs and content in digital archives from traditional archival institutions.

 

Speakers: Helle Strandgaard Jensen, Andrew Cusworth, Adela Sobotkova

12:00-13:30
LUNCH (Dining Hall)

13:30-15:30

Introduction to Data and Language Games: interactive

The session establishes common ground amongst participants. It is structured around a set of hands-on exercises in which participants work in groups to answer questions such as: what do ‘data’, ‘archives’ and ‘digital methods’ mean within the setting of the workshop? The exercises will build a shared and well-defined basis for our work throughout the week.

 

Speakers: Helle Strandgaard Jensen, Andrew Cusworth, Adela Sobotkova

15:30-16:00

Refreshment break (ARCO Building)
 
16:00-17:00
 
Web-scraping with browser extensions: hands-on session

 

Is the data or metadata you want online but not in a structured format ready for download? Web scraping is a technique for extracting information from websites. In this workshop we will manually convert non-tabular or poorly structured data into a usable, structured format, such as a .csv file or spreadsheet, using browser extensions.

 
Speakers: Adela Sobotkova
Tuesday 23rd July

09:00-10:30

Data Organisation in Spreadsheets

Data rarely comes in the form you require, yet tidy, well-organised data is the foundation of any research project. Most researchers have data in spreadsheets, so that is where we will start. In this workshop we will focus on the pitfalls of the 'human' ways of structuring and working with tabular data and contrast them with the way that computers require data to be organised. In order to use tools that make computation more efficient, such as the programming languages R or Python, we need to structure our data the way that computers need it.

 

Speakers: Adela Sobotkova
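
As a minimal sketch of the contrast the Data Organisation session describes, the Python below takes a 'human-friendly' wide table (one column per year) and reshapes it into one observation per row. The archive names, column names and figures are invented for illustration, and the helper wide_to_long is hypothetical; the session itself works in spreadsheets rather than in code.

```python
import csv
import io

# Hypothetical spreadsheet export: one row per archive, one column per year.
WIDE_CSV = """archive,2017,2018
Archive A,120,150
Archive B,80,95
"""


def wide_to_long(text, id_column, variable_name, value_name):
    """Return one dict per (id, variable) pair, i.e. one observation per row."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        for column, value in record.items():
            if column == id_column:
                continue  # the identifier is repeated on every output row instead
            rows.append({
                id_column: record[id_column],
                variable_name: column,
                value_name: value,
            })
    return rows


long_rows = wide_to_long(WIDE_CSV, "archive", "year", "items_digitised")
for row in long_rows:
    print(row)
```

In the long form, the year becomes a value in its own column rather than a column header, which is the layout that tools such as R or Python generally handle most easily.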

10:30-11:00

Refreshment break (ARCO building)
11:00-13:00

Data Cleaning in OpenRefine

Do you have messy data? Do you need to reconcile multiple different sources or improve the quality of your data by refining it? OpenRefine is your best friend. It is a powerful, free tool for exploring, normalising and cleaning datasets (and for extending data by accessing the internet through APIs). In this session we'll work through the various features of OpenRefine, including importing data, faceting, clustering, and transforming data using GREL (General Refine Expression Language).

 

Speakers: Adela Sobotkova
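
To give a sense of what clustering does in the data cleaning session, here is a minimal Python sketch of key-collision clustering, loosely modelled on the 'fingerprint' idea: values that normalise to the same key are grouped as likely variants of one another. This is an illustration under simplifying assumptions, not OpenRefine's own implementation; the function names and sample values are invented.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value):
    """Normalise a string so that near-duplicate spellings share one key."""
    # Fold accented characters to ASCII where possible, e.g. "é" -> "e".
    decomposed = unicodedata.normalize("NFKD", value)
    ascii_text = decomposed.encode("ascii", "ignore").decode("ascii")
    # Lowercase and strip punctuation.
    cleaned = ascii_text.lower().translate(str.maketrans("", "", string.punctuation))
    # Sort the unique tokens so that word order and repeats do not matter.
    return " ".join(sorted(set(cleaned.split())))


def cluster(values):
    """Group values by fingerprint; keep only groups with more than one member."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(group) > 1]


# Invented sample data: three spellings of one place name and one unrelated value.
names = ["Café Royal", "Royal, Cafe", "cafe  royal ", "Town Hall"]
clusters = cluster(names)
print(clusters)
```

A human still decides whether each suggested cluster really is one entity; the key only proposes candidates.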

13:00-14:30
LUNCH (Dining Hall)

14:30-15:30

Perfect Pragmatism: What does best practice mean in practice?
 

This session will focus on situating projects within the spectrum of digital archival practices in relation to establishing best possible practice within available means. Giving play to a shortlist of potential project paradigms from academic archival research through to digitisation by national institutions, it will address the thorny matters facing those involved in digital archival scholarship, including digitisation quality, metadata creation, and the presentation and management of digital archives.

 

Speakers: Andrew Cusworth

15:30-16:00

Refreshment break (ARCO Building)
 
16:00-17:00

Lectures (various venues)
Wednesday 24th July

09:00-10:30

Introduction to R

R is one of the most popular open-source tools for data wrangling, analysis and visualisation. This is an introduction to R designed for participants with no programming experience. We will start with some basic information about R syntax and the RStudio interface, then move on to importing CSV files, exploring and manipulating the structure of data frames, and a brief introduction to plotting.

 

Speakers: Adela Sobotkova

10:30-11:00

Refreshment break (ARCO building)
11:00-13:00

Text mining and visualisation in R (and/or Voyant)

This session builds on the introduction to R and uses the tidyverse and tidytext packages to apply methods for data wrangling and visualisation to text. We will also look at Voyant and other open virtual laboratories where you can do text analysis and visualisation.

 

Speakers: Adela Sobotkova
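
The text mining session works in R with the tidytext package. As a rough, language-neutral illustration of the underlying idea (split text into tokens, drop very common words, count what remains), here is a minimal Python sketch. The stopword list and sample sentence are invented, and the function word_counts is hypothetical rather than part of any package used in the session.

```python
import re
from collections import Counter

# A deliberately tiny stopword list, for illustration only.
STOPWORDS = {"the", "of", "and", "a", "to", "in"}


def word_counts(text):
    """Lowercase the text, split it into word tokens, and count non-stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(token for token in tokens if token not in STOPWORDS)


sample = "The archive of the web and the web of archives"
counts = word_counts(sample)
print(counts.most_common(2))
```

Note that 'archive' and 'archives' are counted separately here; whether to merge such forms is one of the decisions a text mining workflow has to make explicit.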

13:00-14:30
LUNCH (Dining Hall)

14:30-15:30

Web archives and the challenges of working with born-digital material
 

In this session, participants will be introduced to web archives, considering both the potential of this new kind of primary source and the challenges it poses for researchers. The archived web differs from other kinds of (digital) archives in significant ways. The talk will delineate these differences, and highlight the factors that researchers should consider when venturing to use the ‘past web’ as a source. 

 

Speaker: Jane Winters

15:30-16:00

Refreshment break (ARCO Building)
 
16:00-17:00

Lectures (various venues)

Thursday 25th July

An emerging technique in the humanities is to simulate historical systems in order to better understand the evidence we consider today: close reading by digital prototyping. We will use a software tool (NetLogo) to explore some simple simulations, focusing on early kinds of communications, and look at what-if scenarios. For those who wish, this tool can be used to develop new simulations.

09:00-10:30

Cultural capital and the digital environment

Looking critically at the machinic elements of both non-digital and digital archives, economies of data and knowledge, and the radical culture of open data, this seminar-structured session will pose questions about the processes of digitalisation as democratisation of knowledge.

 

Speaker: Andrew Cusworth

10:30-11:00

Refreshment break (ARCO building)
11:00-13:00

A peek behind the scenes: digital archives in practice

In this roundtable we get to hear from practitioners who work on making, improving and assembling digital archives on a day-to-day basis. The speakers will introduce their institutions, the structures that shape them and the work they do. Afterwards there will be ample time for them to answer questions from the workshop’s participants.

 

Speakers: Alex Green, Judith Siefring, Samantha Callaghan

13:00-14:30
LUNCH (Dining Hall)

14:30-15:30

Ethics, archives, and metadata

The problem of silenced voices and exclusion is well-known and well-described when it comes to traditional, analogue archives. But what do such problems look like in digital archives? This talk introduces questions and challenges related to Indigenous Digital Humanities.

Speaker: Samantha Callaghan

15:30-16:00

Refreshment break (ARCO Building)
 
16:00-17:00

Lectures (various venues)

Friday 26th July
09:00-10:30

Self-digitization and the ten thousand pictures problem

The session introduces participants to Tropy, a free, open-source tool for organising digital photos of archival material. It lets you relate your photos both to the provenance of the archive where they were taken and to any structures of your own, tailored to your purpose. The session is a mixture of a seminar-style talk and simple hands-on experience with the tool.

 

Speakers: Helle Strandgaard Jensen

10:30-11:00

Refreshment break (ARCO building)
11:00-13:00

Hello world: from digitisation to digital archive

This workshop will introduce and demonstrate some of the freely available and straightforward ways in which we can move from some of the previous practical topics (digitisation, metadata creation, data cleaning) to making a digital archive available online. Using Omeka and pre-prepared meta/data, the session will include the live creation of a simple online archive, as well as discussion of the differing affordances of off-the-shelf solutions and bespoke projects.

 

Speakers: Andrew Cusworth

13:00-14:30
LUNCH (Dining Hall)

14:30-15:30

Once archives have become digital

Round-up and thoughts about where this all leads

 

Speakers: Helle Strandgaard Jensen, Andrew Cusworth, Adela Sobotkova

15:30-16:00

Refreshment break (ARCO building)

16:00-17:00

Closing keynote (O'Reilly lecture theatre)
Speaker biographies

Jane Winters, University of London. Jane Winters is chair of digital humanities at the School of Advanced Study. She has led or co-directed a range of digital projects using the archived web. She is a Fellow and Councillor of the Royal Historical Society, and a member of RESAW (Research Infrastructure for the Study of the Archived Web).

Samantha Callaghan (Metadata Analyst, King's Digital Lab, London). Samantha Callaghan is the Metadata Analyst for the Georgian Papers Programme. Samantha has an MLIS and received her training in the field of digitisation through her work at the New Zealand Electronic Text Centre. She has worked on a large variety of digitisation projects, large and small, both in NZ and in the UK.

Alex Green, Digital Preservation Services Manager at The National Archives (London). Alex is a qualified archivist with extensive experience of both physical and digital archives, specifically in the fields of cataloguing and the preservation of digital records. She is also the Product Owner for The National Archives' digital archive. Her current interests are in new ideas around the contextualisation, use and reuse of digital records.

 

Judith Siefring, Bodleian Libraries (Oxford). Judith Siefring is Head of Digital Research at the Bodleian Libraries, University of Oxford. She manages a programme of initiatives focused on creating digital tools and services to enable research and teaching, with a particular focus on making special collections content discoverable online. Resources developed and managed within Bodleian Digital Research include Digital Bodleian, Digital Manuscripts Toolkit, the William Henry Fox Talbot Catalogue Raisonné project, and TEI manuscript catalogues.


© 2019 University of Oxford