An Introduction to Digital Humanities

Expert insights into our digital landscape

Academic Advisor: Professor David De Roure, University of Oxford

Coordinator: Judy Dendy (Department of Engineering Science, University of Oxford)

Hashtags: #introDH and #DHOxSS2019

Computers: Participants are not required to bring their own laptops for this workshop but may find it useful.


Broaden your understanding of the range of work the Digital Humanities encompasses and learn about the tools and techniques available for scholarly purposes.

This lecture-based survey course gives you a thorough overview of the theory and practice of Digital Humanities. Drawing on expertise from across the University of Oxford and our national and international collaborators, and on the University's library collections, it will appeal to anyone new to the field, or curious to broaden their understanding of the range of work the Digital Humanities encompasses.


Sessions include talks, presentations, demonstrations, and practical workshops. On completing this course, you will be conversant with the variety and potential of the various technologies used to collate, interrogate, and facilitate digital work in the Humanities, and will have gained insight and practice in methods relevant to your own research.


Intended outcomes

Attendees of this strand will gain:

- a broad overview of the Digital Humanities field
- insights into the state of the art in the practice of digital methods in the humanities

- an awareness of future directions in the field


Experience necessary

No prior technical knowledge is necessary for this course.


Computer and software requirements


Participants are not required to bring their own laptops but may find it useful.

Academic advisor

David De Roure is Professor of e-Research at the University of Oxford's e-Research Centre. Focused on advancing digital scholarship, David works closely with multiple disciplines including social sciences (studying social machines), humanities (computational musicology and experimental humanities), engineering (Internet of Things), and computer science (large scale distributed systems and social computing). He has extensive experience in hypertext, Web Science, Linked Data, and Internet of Things. Drawing on this broad interdisciplinary background he is a frequent speaker and writer on the future of digital scholarship and scholarly communications. Professor De Roure is also a Visiting Researcher at the Alan Turing Institute, working at the intersection of data science with libraries and GLAM (Gardens, Libraries and Museums at the University of Oxford), and a Visiting Professor at Goldsmiths, University of London.


Judy Dendy has been one of the key members of the DHOxSS Events Team since the 2017 Summer School. She assembles the nine-strand programme, as well as coordinating the Introduction to Digital Humanities strand. She handles speakers' travel and accommodation, the contents and production of the conference bags, and the design and production of the conference booklets. She works 'behind the scenes' in general support roles as part of the Events Team for DHOxSS.

"The strand was a perfect way to get introduced to the different existing technologies and to stimulate a reflection on how to enhance one's research through the use of digital tools".

DHOxSS 2019 participant


The Introduction to Digital Humanities workshop will be held in the Sloane Robinson O'Reilly lecture theatre.
Link to overview of the week's timetable including evening events.
Monday 22nd July
Registration (Sloane Robinson building)
Tea and coffee (ARCO building)

Opening Keynote (Sloane Robinson lecture theatre)

Refreshment break (ARCO building)

Digital scholarship: Intersection, Scale, and Social Machines (Sloane Robinson lecture theatre)

Today we are witnessing many shifts in scholarly practice, in and across multiple disciplines, as researchers embrace digital techniques to tackle established questions in new ways and new questions afforded by our increasingly digital society and digitized collections. These methods include computational techniques but also citizen science, the notion of Social Machines, "experimental humanities", and artificial intelligence. We take a broad look at Digital Humanities and set the scene for the week's discussions.

Speaker: David De Roure.


LUNCH (Dining Hall)


Bodleian Student Editions Workshops (Weston Library)

Bodleian Student Editions workshops bring students from across our University together in the Bodleian's Weston Library with items from Special Collections, curatorial, editorial, digital, and research expertise. Through working hands-on with early modern letters, participants are introduced to special collections handling, palaeography, transcription and editorial practices, metadata, and digital text at scale. The letters' transcriptions and metadata are added to Early Modern Letters Online as citable publications.


This session presents the development of the collaboration behind the workshops, shows some of the early modern letters that have been transcribed, discusses the pedagogical practice of teaching with collections, and reflects on participants' and workshop leaders' responses to the workshops.

Speakers: Helen Brown, Chris Fletcher, Miranda Lewis, Olivia Thompson, Mike Webb


Refreshment break

Foundations of Digital Preservation


Have you ever lost important research data, or found that digital files have become corrupted and unusable? Have you considered what you would do if you did? How would you stop that loss from occurring in the first place? Preventing loss and mitigating risks to your digital materials is a foundational aspect of digital preservation. In order to protect your files for long-term access and use, early intervention with digital preservation practices is necessary. This introductory session on digital preservation will provide background on the risks to digital materials and the techniques that can help prevent them from affecting you. Digital preservation is not just the responsibility of libraries and archives: researchers also have an important role.

Speaker: John Southall

Tuesday 23rd July



An Introduction to the International Image Interoperability Framework

This session will provide an overview of the community-driven standards and software of the International Image Interoperability Framework (IIIF), with demonstrations of several IIIF-based tools for comparing, annotating and remixing digitized images.

Speaker: Emma Stanford

Text Mining

I will talk us through an approach to "forensic stylometry", that is, identifying the author of a text, based on a corpus of documents. This field made headlines in 2013 when two professors of computational linguistics proved that JK Rowling was the author of a detective series which she had written under a pseudonym. Traditionally this would have been done with a hand-engineered sequence of components for removing stopwords, lemmatising words, and constructing a bag-of-words model. However, recent advances in deep learning software have made it simple to build text classifiers with almost no feature engineering. In a few hours we will build a classifier to identify authorship which can be trained in a few minutes and will run on a regular laptop.

Speaker: Tom Wood
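
As a rough illustration of the traditional pipeline mentioned in the Text Mining abstract, the sketch below removes stopwords, builds a bag-of-words count for each text, and attributes an unknown text to the author whose known writing is most similar. It is not the method taught in the session: the stopword list, the function names and the use of cosine similarity are all assumptions made for this example, and lemmatisation is left out to keep it within the Python standard library.

```python
from collections import Counter
import math
import re

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "was", "it"}


def bag_of_words(text):
    """Lowercase, tokenise, drop stopwords, and count the remaining words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_likely_author(unknown_text, known_texts):
    """Return the author whose known text is most similar to the unknown one.

    known_texts maps a (hypothetical) author label to a sample of their writing.
    """
    unknown = bag_of_words(unknown_text)
    return max(
        known_texts,
        key=lambda author: cosine_similarity(unknown, bag_of_words(known_texts[author])),
    )
```

Calling `most_likely_author(text, {"author_a": sample_a, "author_b": sample_b})` returns the label of the closest sample. With realistic corpora, the deep learning classifiers described in the abstract would replace the hand-built counting and similarity steps.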


Refreshment break (ARCO Building)


Digital Archive Literacy

In this talk I will argue that being able to question the economies, policies and systems design of digital archives benefits humanities scholars in both research and teaching. I offer a framework for supporting professional reflection on the “everyday use” of digital archives and the powerful forces that shape what content they make available.

Speaker: Helle Strandgaard Jensen


Text Encoding: TEI in a research context

In this talk we will give an overview of the many uses of the Text Encoding Initiative by looking at a range of projects and the different ways in which they create and publish TEI. We will touch on some technical aspects of TEI, but our main focus will be on TEI in a research context and how it can be used to address a variety of research questions.

Speakers: Yasmin Faghihi, Huw Jones


LUNCH (Dining Hall)

Introduction to Visualization for Digital Humanities


This session will explore how visualization can be used in digital humanities projects. We will cover basic concepts of visualization as well as examine existing visualization techniques and applications.


Speaker: Alfie Abdul-Rahman


Refreshment break (ARCO Building)

16:00-17:00

Lectures (various venues)

Wednesday 24th July

Reproducible Research in the Humanities


Reproducibility, documenting the process as well as the products of study, is an important part of digital research. Many researchers do not have the confidence or training to use some of the tools available to support reproducible research, or to write their own code for analysis. Writing code to automate a process can be one stage of this, and it then needs to be made available and shareable. Using the publicly available Early English Books Online Text Creation Partnership (EEBO-TCP) corpus, this session teaches participants to write some code that will extract data from the catalogue and create a figure based on that data. Participants will learn how to use tools and techniques to support its reproducibility through version control, licensing practices, and some basic Python coding using a pre-existing script. We will import code libraries, discover data using the index, and export the data. The session will include a practical exercise as well as discussion.

Speaker: Iain Emsley


Refreshment break (ARCO Building)

An introduction to computer vision tools for the digital humanities: How to Search, Compare, Classify and Annotate your images

Computer vision has made rapid progress in recent years: images are now as readily searchable as text is in web search engines. In this presentation, we will introduce software tools that enable researchers to organise and search large collections of images instantaneously - by allowing search queries based on images (such as a building or a book illustration) or categories (such as “gothic-architecture” or “birds”). We will demonstrate how these tools are being used in many projects within humanities disciplines such as art and book history; film studies; archaeology and literature. Attendees will leave the session knowing how to match, differentiate, classify and annotate many kinds of images. Since these tools are open-source, researchers can freely use them for any purpose. Attendees will have the opportunity to book an appointment to get these tools installed on their personal laptop computer, or will be provided with instructions for doing so themselves.

Speakers: Giles Bergel, Ernesto Coto, Abhishek Dutta


LUNCH (Dining Hall)

Music Information Retrieval: from SALAMI to AI


The field of Music Information Retrieval demonstrates many good practices in digital humanities and data science. This overview will focus on a previous project (Structural Analysis of Large Amounts of Music Information) and look at current and future directions, including artificial intelligence.

Speaker: David De Roure


Refreshment break (ARCO Building)


Lectures (various venues)

Thursday 25th July

The Zooniverse

The Zooniverse is the world's largest online platform for 'people-powered' research. Over the last decade it has grown from a single astronomy project to a platform hosting hundreds of different projects in diverse fields such as ecology, biomedical science, and the humanities, with more than 1.7 million registered volunteers. In this session, you will hear about this transformation from project to platform, the growth of Zooniverse humanities projects, and also about how the Zooniverse continues to evolve, incorporating machine learning and using internal research to ensure that projects continue to support research teams and volunteers alike. You will also find out how easy it is to create your very own crowdsourcing project using the Zooniverse Project Builder.

Speaker: Samantha Blickhan


Hyperspectral Imaging in the Humanities

Heritage Science within the Bodleian Libraries is a service which provides analytical and advanced imaging techniques to assist researchers. This can include material identification, such as the analysis of pigments using Raman spectroscopy, and the revealing of hidden texts using techniques such as hyperspectral imaging. This presentation will demonstrate how these digital technologies, especially hyperspectral imaging, have been applied to a number of Bodleian items. It will showcase some of the results obtained, and describe how the digital infrastructure has evolved to enable data storage and interpretation.

Speaker: David Howell


Refreshment break (ARCO Building)

Linked Data for Digital Humanities: Introducing the Semantic Web

The Semantic Web can be thought of as an extension of the World Wide Web in which sufficient meaning is captured and encoded such that computers can assist in matching, retrieving, and linking resources across the internet that are related to each other. In a scholarly context this offers significant opportunities for publishing, referencing, and re-using digital research output. In this session we introduce the principles and technologies behind this ‘Linked Data’, illustrated through examples from Digital Musicology.


Speaker: Kevin Page


LUNCH (Dining Hall)



Machine Learning and Music TBC

This session on machine learning uses a case study in mood analysis of music audio as a friendly way to introduce some basic machine learning concepts, including the Weka machine learning toolkit. Participants will also be briefly introduced to the world of music signal processing and analysis.

Speaker: Stephen Downie


Refreshment break (ARCO Building)


Lectures (various venues)

Friday 26th July


Challenges in Visualizing the Past

Cultural heritage represents a large portion of our collective legacy and must be recognized for having contributed to the shaping of society today. Study of these legacies can prove valuable to all sub-fields of the arts and sciences. Unfortunately, a considerable amount of knowledge about the past has been lost. This may detract from what might be learned about traditions, lifestyles and events in history. The correct documentation, rendition and interpretation of extant monuments are therefore vital in order for us to better our understanding of the past and preserve it for future generations. In this session, we will outline how visualization can be used to investigate the past. We will cover basic concepts of visualization as well as examining techniques, best practices and applications.

Speaker: Jassim Happa


Refreshment break (ARCO Building)

An Introduction to Relational Databases

This session looks at what a relational database is, and when and why it might be helpful to use one. It introduces some basic database concepts, and works through the process of designing one. We also look at some challenges posed by the sort of data often used in humanities projects, and how these might be addressed. Hands-on exercises give participants a chance to put what they’ve learnt into practice.

Speakers: Meriel Patrick and Duncan Young


LUNCH (Dining Hall)

Round up discussion

Questions and thoughts reflecting on the week and the ways ahead for Digital Humanities.

Speaker: David De Roure

Speakers: Edith Halvarsson, Sarah Mason

Refreshment break (ARCO Building)

Closing Keynote

Speaker Biographies

David De Roure is Professor of e-Research at the University of Oxford's e-Research Centre. Focused on advancing digital scholarship, David works closely with multiple disciplines including social sciences (studying social machines), humanities (computational musicology and experimental humanities), engineering (Internet of Things), and computer science (large scale distributed systems and social computing). He has extensive experience in hypertext, Web Science, Linked Data, and Internet of Things. Drawing on this broad interdisciplinary background he is a frequent speaker and writer on the future of digital scholarship and scholarly communications.


Professor De Roure is also a Visiting Researcher at the Alan Turing Institute, working at the intersection of data science with libraries and GLAM (Gardens, Libraries and Museums at the University of Oxford), and a Visiting Professor at Goldsmiths, University of London.

Chris Fletcher is Keeper of Special Collections at the Bodleian Libraries, a member of Oxford’s English faculty and a Fellow of Exeter College. Before coming to Oxford he was a curator of literary manuscripts at the British Library.

Mike Webb is the Curator of Early Modern Archives & Manuscripts, Bodleian Libraries. He has a degree in history and a diploma in Archive Studies, and has a particular interest in the Library’s 17th-century State Paper collections, and letters and diaries 1600-1900. He has curated three exhibitions ranging in subject from the Tudor and Stuart nobility to the First World War. He teaches early modern palaeography to History postgraduates.

Miranda Lewis is the Editor of Early Modern Letters Online [EMLO] and an Associate Member of the Faculty of History at the University of Oxford. With a background in early modern history, art history, and digital scholarship — including ten years on the research project Cultures of Knowledge [CofK] — her own research focusses at present on early modern collections and collecting.

Olivia Thompson is a DPhil candidate in Ancient History at Balliol College, Oxford. Her thesis focuses on changing notions of physical and intellectual property during and after the civil wars of the late Roman Republic. She is more broadly interested in the history of classical scholarship and ways in which digital research tools can be used to reconceptualize ancient sources (in particular, the correspondence of Cicero) and their editorial tradition.

Helen Brown is a third year DPhil candidate at the University of Oxford, based in the Faculty of English. Her research concerns the application of digital editorial and analytical methods to Alexander Pope’s correspondence. Alongside her studies, Helen is a Digital Editorial Assistant at Oxford University Press, working on projects such as Oxford Scholarly Editions Online and the Very Short Introductions series.

Emma Stanford is the Bodleian Libraries’ Digital Curator. She manages the digitization of new and legacy image content via Digital Bodleian, conducts training and outreach, and writes occasionally about digitization policy and public engagement. She holds a BA in Literary Studies from Middlebury College and an MSc in Library Science from City, University of London.

Alfie Abdul-Rahman completed her PhD in Computer Science at Swansea University, focusing on the physically-based rendering and algebraic manipulation of volume models. She was then a Research Associate at the University of Oxford e-Research Centre and joined King's College London in March 2018 as a Lecturer. Her projects include Quill, ViTA: Visualization for Text Alignment, and Poem Viewer. Before joining Oxford, she worked as a Research Engineer in HP Labs Bristol on document engineering, and then as a software developer in London, working on multi-format publishing. Her research interests include visualization, computer graphics, and human-computer interaction.

Jassim Happa is a Research Fellow in the Dept. of Computer Science at the University of Oxford. His research interests include: Computer Graphics, Cyber Security, Human Factors, Human Visual Perception, Resilience, Rendering, Virtual Archaeology and Visualization. He obtained his BSc (Hons) in Computing Science at the University of East Anglia in 2006. After a year of working as an Intrusion Detection System (IDS) analyst, he began his PhD in Engineering at the University of Warwick in October 2007 where he developed a number of novel computer graphics techniques to document and reconstruct real-world heritage sites based on empirical evidence available today. He defended his PhD in January 2012, and has worked at Oxford since December 2011. In more recent years he has focused his research efforts on cybersecurity analytics through visualization, covering topics such as threat modelling, situational awareness, risk propagation, resilience, decision support, privacy as well as cyber threat intelligence sharing. In his research projects he has been responsible for creating, implementing and assessing novel visualization techniques. On the teaching side, he tutors and lectures on the Computer Graphics and Physically-based Rendering courses. He also lectures doctoral students for the Centre for Doctoral Training (CDT) in Cyber Security on topics such as situational awareness, intrusion detection and security architecture. Finally, he supervises undergraduate, MSc and doctoral projects.

Huw Jones, Head of Digital Library Unit and Digital Humanities Coordinator, Cambridge University Library.

Tom Wood studied physics as his first degree and then became interested in natural language processing. He did a Masters at Cambridge University in Computer Speech, Text and Internet Technology, and since then he has worked in machine learning and AI in various companies, including computer vision and designing dialogue systems (think of Siri), in the UK, Spain and Germany. Most recently he has been working as a data scientist at CV-Library, one of the UK's largest job boards. This involves looking at the many years of job hunting data and CV documents that CV-Library has collected, and trying to find patterns and smart ways to use the information, for example recommending a job to a candidate based on past behaviour, or categorising candidates based on CVs.

Helle Strandgaard Jensen is Associate Professor in Contemporary Cultural History at Aarhus University and the co-director of Center for Digital History Aarhus (CEDHAR).

Giles Bergel is Digital Humanities Research Ambassador in the Visual Geometry Group at the University of Oxford, and Teaching Fellow in Digital Humanities at University College London. As well as computer vision, his interests include text encoding, Linked Data and the study of early printed books.

Ernesto Coto is a Research Software Engineer in the Visual Geometry Group (VGG) at the University of Oxford. He has several years of experience developing software in academic and industry environments. His current research interests are Computer Vision, Machine Learning and Scientific Visualization.

Abhishek Dutta is a Research Software Engineer in the Visual Geometry Group (VGG) of the Department of Engineering Science at the University of Oxford. He manages several interdisciplinary projects that use Computer Vision to address research questions in many disciplines such as the history of art, book history, zoology, plant sciences and anthropology. He is also the maintainer and developer of many open source software tools developed at the Visual Geometry Group.

Samantha Blickhan is the IMLS Postdoctoral Fellow in the Department of Citizen Science at the Adler Planetarium, working on transcription projects for the Zooniverse. She received her Ph.D. in Musicology from Royal Holloway, University of London, with a thesis on the paleography of British song notation in the 12th and 13th centuries. Her research interests include music and perception, and their relationships with writing systems, technology and pedagogy.

David Howell has been Head of Heritage Science since 2012, before which he was Head of Conservation and Collection Care at Bodleian Libraries. David has worked in heritage for over 35 years, having previously been a Conservation Scientist at Historic Royal Palaces where he set up a Conservation Science laboratory. David graduated in Chemistry but is also a graduate in English Mediaeval Studies and has set up a purpose-built laboratory within the Weston Library specific to researching library and museum materials.

Kevin Page is a senior researcher and associate member of faculty at the University of Oxford e-Research Centre, where he applies Linked Data to the Digital Humanities. He is investigator of the AHRC ‘Unlocking Musicology’ project, a co-investigator of ‘Digital Delius’, ‘Mapping Manuscript Migrations’ and ‘Workset Creation for Scholarly Analysis’, and runs the AHRC Linked Art research network. As Technical Director of Oxford Linked Open Data (OXLOD) he works with collections across the Gardens, Libraries, and Museums of the University, and has participated in W3C activities including the Linked Data Platform (LDP) working group. From 2012-15 he convened the Linked Data workshop at DHOxSS, where he now runs the Digital Musicology course.

J. Stephen Downie is a professor and the associate dean for research at the Graduate School of Library and Information Science, University of Illinois. Dr. Downie conducts research in music information retrieval. He was instrumental in founding both the International Society for Music Information Retrieval and the Music Information Retrieval Evaluation eXchange. Downie is also the Illinois co-director of the HathiTrust Research Center which provides analytic access to the HathiTrust's massive collections of digitized texts.

Iain Emsley is a PhD student in Digital Media at the University of Sussex. He worked for the Oxford e-Research Centre on various Digital Humanities projects, such as Fusing Audio and Semantic Technologies (FAST) and Workset Creation for Scholarly Analysis (WCSA), and the Square Kilometre Array. His research interests include sustainability and sonification.

Meriel Patrick is an Academic Research Technology Specialist in the Research Support team at IT Services. Much of her work focuses on helping researchers to work more effectively with data. She is also Lecturer in Theology and Philosophy for Wycliffe Hall's visiting student programme, SCIO.

John Southall is Bodleian Data Librarian and Subject Consultant for Economics, Sociology and Social Policy. His role includes work on developing research data management infrastructure and training for researchers, librarians, and support staff.

Yasmin Faghihi, Head of Near and Middle Eastern Department, Cambridge University Library.

Duncan Young is a teacher in the IT Learning Centre, providing courses and consultations in software skills aimed at both staff and researchers across all University disciplines. Since his debut with Excel 4 Introduction in October 1993 he has worked for a number of organisations including Sophos and Microsoft, where he was a co-author of their internal Office XP training programme that he then toured around their European offices. Since 2006 he has worked at the University of Oxford and specialises in spreadsheets, databases and programming.




© 2019 University of Oxford