24 x 7 (to 3:30)
Schedule info
Connecting the Networks
Examination of the role of different systems (repository, VIVO, etc), how the same data can be re-used in each system, and the benefits from creating links between these systems.
Attachment | Size |
---|---|
Connecting Networks.docx | 5.81 KB |
Deconstructing and Reconstructing a Repository – Strategies for Nimble Rebuilding
The University of Cincinnati Libraries has relied for 5 years on a DSpace institutional repository (http://drc.libraries.uc.edu), now at over 560,000 records and maintained by OH-Tech and OhioLINK.
We have experienced four migrations, involving changes of server home (shared infrastructure to the cloud and back again), DSpace version, and database (from PostgreSQL to Oracle). During the course of those migrations I came to understand DSpace export and import capabilities, how our handles worked, and what we could expect when large numbers of records were physically moved to different disk storage and when databases and collections were exported and imported.
The OhioLINK community determined in January of 2013 that maintaining over thirty institutional instances of DSpace was not sustainable for its future; at the same time the University of Cincinnati Libraries determined that we had larger goals for data management and e-science, for the digital humanities and for our university born-digital records, all of which we hoped a broader repository service could encompass. OhioLINK asked its member libraries to migrate content away from the existing DRCs by December 2013. In February and March of 2013 the University of Cincinnati Libraries began to put together a development team and a plan, and began to evaluate our choices for a next-generation open source repository. This 24x7 presentation will outline our process for determining the platform and the migration plan for the next-generation repository at the University of Cincinnati Libraries.
Attachment | Size |
---|---|
Deconstructing and Reconstructing a Repository.pdf | 130.64 KB |
Synchronize your resources with ResourceSync
In just 7 minutes you’ll know what ResourceSync is, why it is required, what you might use it for, and where to learn more about how to use it.
1. Many applications need up-to-date copies of collections of changing Web resources. Such synchronization is currently achieved using ad-hoc or proprietary solutions — there are no widely adopted, web-based approaches.
2. Use cases include replication and reuse of data and articles in scholarly repositories, maintenance of local copies of Linked Data for improved access and availability, or aggregation of data from multiple sources for indexing or preservation.
3. ResourceSync is a general Web resource synchronization framework that addresses these and other use cases. It provides a set of capabilities that can be combined in a modular manner to meet local or community requirements.
4. From the client perspective synchronization involves one or more of three tasks: baseline synchronization to make the initial copy, incremental synchronization to keep up-to-date, and audit for verification.
5. Servers may implement different combinations of ResourceSync capabilities to support these client tasks in different ways. All documents communicating information about resources and capabilities in ResourceSync are based on the widely used Sitemap XML document formats.
6. The framework is extensible and can support facilities such as the communication of references to mirror locations of synchronization resources, transferring only patches for changed resources, and offering historical data.
7. This is an open standard in beta as of January 2013, and will be finalized before the Open Repositories conference. Tools and libraries are being developed to ease implementation.
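The three client tasks in point 4 can be illustrated with a minimal sketch. It relies only on the core Sitemap elements (`urlset`, `url`, `loc`, `lastmod`) and leaves out any ResourceSync-specific extensions. The function names `parse_resource_list` and `plan_sync`, the example URIs, and the idea of comparing two full listings are assumptions made for illustration; they are not taken from the ResourceSync specification, which may offer more efficient ways to communicate changes.

```python
import xml.etree.ElementTree as ET

# Namespace of the Sitemap XML format that ResourceSync documents build on.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_resource_list(xml_text):
    """Return {resource URI: lastmod string} from a Sitemap-style document."""
    root = ET.fromstring(xml_text)
    resources = {}
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=SITEMAP_NS).strip()
        if loc:
            resources[loc] = lastmod
    return resources


def plan_sync(local, remote):
    """Compare the local copy's state with the server's listing.

    An empty `local` gives a baseline plan (everything is created), a
    populated `local` gives an incremental plan, and an all-empty plan
    serves as a simple audit result meaning "in sync".
    """
    created = sorted(u for u in remote if u not in local)
    updated = sorted(u for u in remote if u in local and remote[u] != local[u])
    deleted = sorted(u for u in local if u not in remote)
    return {"created": created, "updated": updated, "deleted": deleted}


# Hypothetical server listing used only for this example.
EXAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/res1</loc><lastmod>2013-01-02T13:00:00Z</lastmod></url>
  <url><loc>http://example.com/res2</loc><lastmod>2013-01-03T09:00:00Z</lastmod></url>
</urlset>"""

remote_state = parse_resource_list(EXAMPLE)
baseline_plan = plan_sync({}, remote_state)           # initial copy
audit_plan = plan_sync(remote_state, remote_state)    # verification
```

A client would fetch each URI in `created` and `updated`, remove those in `deleted`, and then store `remote_state` as its new local state.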
Attachment | Size |
---|---|
ResourceSync_OR13_24x7_proposal.pdf | 70.74 KB |
Revelation, Brand, and Quality: Content Acquisition and Curation for the World Bank’s Open Knowledge Repository
In a world of ever-expanding access to information growing at exponential rates, repositories are now not just about discoverability, but are becoming more about collections, filtering, quality, and brand. It is not enough to “have it”: it must be what the researcher needs, from a source the reader trusts, and organized in a logical, intuitive manner.
This presentation provides an overview of the World Bank’s Open Access initiatives and the challenges presented for acquisition, organization, and curation of the Bank’s highest quality research and knowledge materials for its Open Knowledge Repository.
It reviews some of the unique approaches to the use of DSpace software adopted by the Bank to meet its document submission workflow needs. It outlines the unique nature of content acquisition, evaluation, and ingestion, including the nature of content discovery in a diffuse organization like the Bank. Finally, the presentation summarizes the new mission-focused partnership initiatives which the Bank is pursuing in order to increase discoverability of its own and others’ research.
Attachment | Size |
---|---|
PresentationProp-World Bank OKR Curation.pdf | 174.59 KB |
The Virtual Skeleton Database: An open access repository for biomedical research and collaboration
Access to the largest source, the images stored by clinical institutions in their PACS, is rarely possible for scientific research due to technical and legal restrictions. Even after the time-consuming process of obtaining access, collections of datasets are often lost or mishandled, resulting in replication of work, frequently within the same institution or network. To solve these problems, we proposed a centralized storage system called the Virtual Skeleton Database (VSD). The VSD provides a system tailored to the needs of the medical image analysis community. It offers generic tools to store, exchange and collaborate on virtually any digital format. The hosted data is accessible to the community, while collaboration and access tools catalyze the community's productivity.
Attachment | Size |
---|---|
MichaelKistlerOR2013.pdf | 65.56 KB |
Using easyLOD to Expose Your Repositories as Linked Data
easyLOD (https://github.com/mjordan/easyLOD) is a simple, extensible open-source framework for exposing Linked Data. It uses a flexible plugin architecture to connect to a variety of sources such as relational databases, static files, and repository back ends. This presentation will demonstrate how easyLOD works, provide several examples of how it can be used to enhance common repository platforms, and speculate on opportunities in the LODLAM (Linked Open Data in Libraries, Archives, and Museums) community for increasing the value of our content by exposing it as Linked Data.
Attachment | Size |
---|---|
OR2013 24x7 proposal - MJordan - easyLOD.pdf | 78.48 KB |
WordShack: A Vocabulary Registry for Preservation Repositories
Digital library systems often require controlled vocabularies. They facilitate efficient collection management and improve discovery of content by end users by providing more comprehensive and more precise search results. Unfortunately, these vocabulary lists are often duplicated and managed individually in local databases or applications, leading to duplication of development efforts and the use of similar but different terms to represent the same concept. Recognizing these problems, in 2009, Harvard Library began designing a central vocabulary registry, “WordShack”, for use across the Library’s digital library applications and systems, including its digital preservation repository, the Digital Repository Service (DRS).
WordShack is a registry for controlled vocabulary terms used in the Harvard Library’s digital preservation suite of services. Currently integrated with the DRS and Email Archiving Service beta, WordShack provides
• Persistent identifiers for vocabulary concepts
• An authoritative source for the current preferred term for a concept
• References from alternate terms for a concept
• Single maintenance of vocabulary concepts shared across systems
• Persistent storage of information needed in Harvard’s preservation systems that would be dropped from external sources in the course of normal business (e.g., names of Harvard agents after they leave the university)
The need for controlled vocabulary is especially evident within the metadata managed within the DRS. For example, in the PREMIS schema for preservation metadata, significant events typically include the name of a person, organization or software program associated with the event. In order to eliminate ambiguity about the entity represented by a name, a vocabulary control mechanism such as WordShack is desirable. By representing these entities in the metadata as persistent identifiers resolving to authority-controlled names within WordShack, and not solely by the name strings themselves, the agents related to events can be unambiguously identified and managed centrally.
In addition to PREMIS event agents, WordShack-managed terms have been found to be useful for many other metadata fields stored in the DRS. A few examples:
• Email addresses and associated persons come into play as metadata associated with archived email collections. By controlling these terms, curators and archivists can unambiguously document the collection source or content creators, as well as tie the email collections to other content associated with the same people.
• Topics are subject terms that can be used to characterize collections of digital objects. Using controlled terms for these topics will make it easier for end users to discover the content from any of the Library’s applications.
The basic unit within WordShack is a “term”. In its simplest form, a term has a unique ID, a preferred value and zero or more variant values. The term types that are supported include topic, software, administrative category, administrative flag, email address, person and organization. Each term type has associated metadata stored and managed in WordShack. For example, a software term includes a name, version and genre. The schema permits one to indicate the authority source of the term, for example to indicate that the term was exported from a different vocabulary registry. Provision is made for including the “foreign” vocabulary’s URI or other formal identifier.
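The term structure described above can be pictured with a small data-model sketch. This is an illustration only, not WordShack's actual schema: the class name `Term`, its field names, the `matches` helper, and the sample values are all hypothetical, chosen to mirror the elements named in the text (unique ID, preferred value, variants, term type, type-specific metadata, authority source, and foreign identifier).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Term types listed in the abstract.
TERM_TYPES = {
    "topic", "software", "administrative category", "administrative flag",
    "email address", "person", "organization",
}


@dataclass
class Term:
    term_id: str                      # unique, persistent identifier
    term_type: str                    # one of TERM_TYPES
    preferred: str                    # current preferred value
    variants: List[str] = field(default_factory=list)       # zero or more
    metadata: Dict[str, str] = field(default_factory=dict)  # type-specific
    authority_source: Optional[str] = None  # where the term came from
    foreign_uri: Optional[str] = None       # identifier in that source

    def __post_init__(self):
        if self.term_type not in TERM_TYPES:
            raise ValueError(f"unsupported term type: {self.term_type}")

    def matches(self, text: str) -> bool:
        """True if text occurs, ignoring case, in the preferred value or a variant."""
        needle = text.casefold()
        return any(needle in v.casefold() for v in [self.preferred, *self.variants])


# Hypothetical software term carrying the name, version and genre metadata.
example_term = Term(
    term_id="term/0001",
    term_type="software",
    preferred="ExampleTool",
    variants=["Example Tool", "ExTool"],
    metadata={"name": "ExampleTool", "version": "1.0", "genre": "validation"},
)
```

Keeping variants on the same record as the preferred value is what lets a lookup by any alternate form lead back to one identifier.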
The WordShack implementation includes a database, an API, and user interface widgets. The API is a RESTful API for CRUD (create, read, update, and delete) operations. The API also supports CRUD operations on relationships between terms – for example, relating an email address to a person – as well as term deprecation, reactivation, and superseding operations. Throughout, the API supports variant as well as preferred names for terms. Underlying the API is a relational database modeled on the Library of Congress’ MADS schema.
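The operations named above (CRUD on terms, relationships between terms, deprecation, reactivation, and superseding) can be sketched as an in-memory model. This is not the WordShack API: the real service is RESTful over HTTP and its routes are not given here, so the class `VocabularyRegistry`, its method names, the identifiers, and the `belongs_to` predicate are all hypothetical stand-ins.

```python
class VocabularyRegistry:
    """In-memory stand-in for a term registry (illustrative only)."""

    def __init__(self):
        self.terms = {}         # term_id -> term record
        self.relations = set()  # (subject_id, predicate, object_id)

    def create(self, term_id, preferred, variants=()):
        if term_id in self.terms:
            raise KeyError(f"term already exists: {term_id}")
        self.terms[term_id] = {
            "preferred": preferred,
            "variants": list(variants),
            "status": "active",
            "superseded_by": None,
        }

    def read(self, term_id):
        return self.terms[term_id]

    def update(self, term_id, preferred=None, variants=None):
        term = self.terms[term_id]
        if preferred is not None:
            term["preferred"] = preferred
        if variants is not None:
            term["variants"] = list(variants)

    def delete(self, term_id):
        del self.terms[term_id]
        # Drop relationships that mention the deleted term.
        self.relations = {r for r in self.relations if term_id not in (r[0], r[2])}

    def relate(self, subject_id, predicate, object_id):
        self.read(subject_id)  # raises KeyError if either term is missing
        self.read(object_id)
        self.relations.add((subject_id, predicate, object_id))

    def deprecate(self, term_id):
        self.terms[term_id]["status"] = "deprecated"

    def reactivate(self, term_id):
        term = self.terms[term_id]
        term["status"] = "active"
        term["superseded_by"] = None

    def supersede(self, old_id, new_id):
        self.read(new_id)
        old = self.terms[old_id]
        old["status"] = "deprecated"
        old["superseded_by"] = new_id

    def resolve(self, term_id):
        """Follow superseding links to the identifier currently in force."""
        seen = set()
        while self.terms[term_id]["superseded_by"] is not None:
            if term_id in seen:
                raise ValueError("cycle in superseding chain")
            seen.add(term_id)
            term_id = self.terms[term_id]["superseded_by"]
        return term_id


registry = VocabularyRegistry()
registry.create("person/1", "Doe, Jane", variants=["J. Doe"])
registry.create("person/2", "Doe, Jane A.")
registry.create("email/1", "jdoe@example.org")
registry.relate("email/1", "belongs_to", "person/1")
registry.supersede("person/1", "person/2")
current_id = registry.resolve("person/1")  # "person/2"
```

Because the old identifier stays in the registry and points to its successor, metadata that stored `person/1` keeps resolving after the term is superseded.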
WordShack also includes a set of jQuery UI JavaScript user interface widgets for embedding controlled vocabulary within administrative and discovery web applications. “Select” widgets provide an auto-complete drop down user interface for locating terms by typing character strings embedded in any term variant. “Edit” widgets provide the ability to create or modify all of the metadata elements associated with a term. Use of the widgets is of course optional – new user interface access methods, such as pull down lists, checkboxes, etc. can be programmed directly using the WordShack API. The WordShack widgets and API have been used in creating an administrative interface for managing WordShack terms and for associating terms with DRS objects, as well as for specifying terms in the DRS ingest application and an email archiving curation interface. The presentation will include screen shots of this user interface.
If there is sufficient interest from the repository community, WordShack will be released as open source software on a public code repository. We welcome contributors who might extend the core code, for example by creating an RDF export/import facility or by enabling federation and interoperation with other authority control systems.
Attachment | Size |
---|---|
Open Repositories 13 proposal_final.docx | 20.33 KB |