Browsing YUL research and professional contributions by Author "607b102b48520c0aae200458524f0551"
The Archives Unleashed Notebook: Madlibs for Jumpstarting Scholarly Exploration
Deschamps, Ryan; Ruest, Nick; Lin, Jimmy; Fritz, Samantha; Milligan, Ian (2019) This paper introduces the Archives Unleashed Notebook, which is designed to work with derivative datasets from the Archives Unleashed Cloud, a platform for analyzing web archives. These datasets contain common starting ...
The Archives Unleashed Project: Technology, Process, and Community to Improve Scholarly Access to Web Archives
Ruest, Nick; Lin, Jimmy; Milligan, Ian; Fritz, Samantha (ACM/IEEE, 2020-08) The Archives Unleashed project aims to improve scholarly access to web archives through a multi-pronged strategy involving tool creation, process modeling, and community building -- all proceeding concurrently in mutually ...
Building Community and Tools for Analyzing Web Archives through Datathons
Milligan, Ian; Casemajor, Nathalie; Fritz, Samantha; Lin, Jimmy; Ruest, Nick; Weber, Matthew S.; Worby, Nicholas (2019) Starting in March 2016, the Archives Unleashed team and our collaborators have brought together social scientists, humanists, archivists, librarians, computer scientists, and other stakeholders to explore web archives as ...
Building Community at Distance: A Datathon during COVID-19
Fritz, Samantha; Milligan, Ian; Ruest, Nick; Lin, Jimmy (Digital Library Perspectives, 2020-08-04) This paper aims to use the experience of an in-person event that was forced to go virtual in the wake of COVID-19 as an entryway into a discussion on the broader implications around transitioning events online. It gives ...
Content Selection and Curation for Web Archiving: The Gatekeepers vs. the Masses
Milligan, Ian; Ruest, Nick; Lin, Jimmy (2016) Any preservation effort must begin with an assessment of what content to preserve, and web archiving is no different. There have historically been two answers to the question "what should we archive?" The Internet Archive's ...
Content-Based Exploration of Archival Images Using Neural Networks
Adewoye, Tobi; Han, Xiao; Ruest, Nick; Milligan, Ian; Fritz, Samantha; Lin, Jimmy (ACM/IEEE, 2020-08) We present DAIRE (Deep Archival Image Retrieval Engine), an image exploration tool based on latent representations derived from neural networks, which allows scholars to "query" using an image of interest to rapidly find ...
The Cost of a WARC: Analyzing Web Archives in the Cloud
Deschamps, Ryan; Fritz, Samantha; Lin, Jimmy; Milligan, Ian; Ruest, Nick (2019) The value of web archives to support scholarship in the humanities and social sciences is slowly being realized by the increasing availability of scalable tools and platforms. The cost of providing scholarly access is a ...
Desiderata for Exploratory Search Interfaces to Web Archives in Support of Scholarly Activities
Jackson, Andrew; Lin, Jimmy; Milligan, Ian; Ruest, Nick (2016) Web archiving initiatives around the world capture ephemeral web content to preserve our collective digital memory. In this paper, we describe initial experiences in providing an exploratory search interface to web archives ...
Scalable Content-Based Analysis of Images in Web Archives with TensorFlow and the Archives Unleashed Toolkit
Yang, Hsiu-Wei; Liu, Linqing; Milligan, Ian; Ruest, Nick; Lin, Jimmy (2019) We demonstrate the integration of the Archives Unleashed Toolkit, a scalable platform for exploring web archives, with Google's TensorFlow deep learning toolkit to provide scholars with content-based image analysis ...
Solr Integration in the Anserini Information Retrieval Toolkit
Clancy, Ryan; Eskildsen, Toke; Ruest, Nick; Lin, Jimmy (2019) Anserini is an open-source information retrieval toolkit built around Lucene to facilitate replicable research. In this demonstration, we examine different architectures for Solr integration in order to address two current ...
Warclight: A Rails Engine for Web Archive Discovery
Ruest, Nick; Milligan, Ian; Lin, Jimmy (2019) This paper describes the development of Warclight, a portmanteau of the open-source Blacklight platform and the ISO-standard Web ARChive file format. Warclight allows users to explore web archives that have been indexed ...
We Could, but Should We? Ethical Considerations for Providing Access to GeoCities and Other Historical Digital Collections
Lin, Jimmy; Milligan, Ian; Oard, Douglas W.; Ruest, Nick; Shilton, Katie (ACM, 2020-03) We live in an era in which the ways that we can make sense of our past are evolving as more artifacts from that past become digital. At the same time, the responsibilities of traditional gatekeepers who have negotiated the ...