Building a Hub for Linked Library Data on Culturegraph.org


This presentation describes how a hub for Linked Library Data is being built on Culturegraph.org: the challenges of linked data, the Culturegraph platform, resolving and lookup, RDF modelling, and the current state of the project. It also covers the paradigm shift in modelling knowledge and data, from isolated tables to a network beyond organizational boundaries.



Presentation Transcript


1. culturegraph.org: Building a Hub for Linked Library Data
   Markus M. Geipel, Adrian Pohl

2. Table of Contents
   1. The Linked Data Challenge
   2. Culturegraph Platform
      1. Resolving & Lookup
      2. Process & Technology
      3. RDF Modelling
   3. Current State

3. Paradigm shift in modeling knowledge/data: from isolated tables to a network beyond organizational boundaries

4. From Isolated Tables to a Semantic Network: A Naive Approach
   1. Transform from Marc21/Mab2/Pica to RDF
   2. Put everything into a triplestore
   3. SPARQL and a reasoner do the magic
   What is wrong with this approach?

5. Format is not content! If you pour water into a wine glass, does it change into wine? How can you expect old Marc21 data to turn into a semantically rich, reasoner-ready piece of information just by changing the data format to RDF?

6. Connections Don't Come for Free: Some Challenges
   1. No universally unique ID
   2. Often no references to entities, just character strings
   3. No controlled vocabulary (example: 1.3 million different values for the edition field)
   4. Changing cataloguing practices
   5. Mistakes, typos

7. Culturegraph as a Signpost: A Coherent Picture of Bibliographic Data
   Hidden duplicates, different services, different interfaces? Culturegraph!

8. Culturegraph as a Platform to Interlink Bibliographic Data
   1. Open tools
      - Open algorithms and code; reuse
   2. Integration into existing workflows
      - Synchronization of data
      - Integration of results into original data sources
   3. Publication of results
      - Connections and views, not the entire aggregated data
      - Linked Open Data/RDF
   4. Persistence of results
      - Integration into URN resolving infrastructure
   5. Tracking provenance

9. First Project: Resolving & Lookup (Universally Unique and Persistent IDs)
   Input: 6 main German bibliographic catalogues
   Objective: bundling of manifestations
   Service:
   - Publication of bundles
   - Minting of URNs for approved bundles
   - Search bundles using established identifiers
   Part of the DDB ecosystem
   - Support for data aggregation

10. The Process
   1. Translate into internal format
      1. Mapping of fields to properties
      2. Normalization, cleaning, regexp matching, etc., defined in XML
   2. Database ingest
      - More than 80 million records
      - More than one billion properties
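A minimal sketch of step 1 (field mapping plus normalization), under stated assumptions: the project defines these rules in XML, whereas a Python dict stands in for that definition here. The MARC-style field tags, the property names `isbn`, `title` and `year`, and the cleaning rules are all illustrative choices, not taken from the project.

```python
import re

# Hypothetical mapping from source-format fields to internal property names.
# The project defines this in XML; a dict is used here only for illustration.
FIELD_TO_PROPERTY = {
    "020a": "isbn",
    "245a": "title",
    "260c": "year",
}


def normalize(prop, value):
    """Clean one raw value; return None when nothing usable remains."""
    value = value.strip()
    if prop == "isbn":
        # Keep only digits and the check character X.
        digits = re.sub(r"[^0-9Xx]", "", value).upper()
        return digits or None
    if prop == "year":
        # Pull the first four-digit run out of strings such as "c2011.".
        match = re.search(r"\d{4}", value)
        return match.group(0) if match else None
    if prop == "title":
        # Collapse whitespace and lower-case for later comparison.
        return re.sub(r"\s+", " ", value).lower() or None
    return value or None


def translate(record):
    """Map a raw record (field -> string) to normalized internal properties."""
    props = {}
    for field, raw in record.items():
        prop = FIELD_TO_PROPERTY.get(field)
        if prop is None:
            continue  # unmapped fields are dropped
        cleaned = normalize(prop, raw)
        if cleaned is not None:
            props[prop] = cleaned
    return props


raw = {"020a": "ISBN 3-16-148410-0", "245a": "  Linked   Data ",
       "260c": "c2011.", "999x": "local note"}
print(translate(raw))
```

Keeping the mapping as data rather than code mirrors the slide's point that the rules live in a declarative definition and can be changed without touching the pipeline.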

11. The Process
   3. Generate unique properties (more than 50 million*)
      - Combinations of properties defined in XML
   4. Group by unique properties
   5. Merge equivalent groups (ca. 18 million records* in groups)
   * For a first simple matching algorithm
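Steps 3 to 5 can be sketched as follows. This is one possible reading, not the project's code: "equivalent groups" is interpreted here as groups that share at least one record, and they are merged with a union-find structure. The property names and the two combination rules are hypothetical.

```python
from collections import defaultdict


def unique_properties(record):
    """Step 3: build match keys from combinations of properties (hypothetical rules)."""
    keys = []
    if "isbn" in record and "year" in record:
        keys.append(("isbn+year", record["isbn"], record["year"]))
    if all(p in record for p in ("title", "author", "year")):
        keys.append(("title+author+year", record["title"],
                     record["author"], record["year"]))
    return keys


def bundle(records):
    """Steps 4 and 5: group record ids by key, then merge groups sharing a record."""
    parent = {rid: rid for rid in records}

    def find(x):
        # Follow parent links to the representative, halving paths on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # Step 4: group by unique property.
    groups = defaultdict(list)
    for rid, rec in records.items():
        for key in unique_properties(rec):
            groups[key].append(rid)

    # Step 5: records in one group are linked; overlapping groups merge transitively.
    for members in groups.values():
        for other in members[1:]:
            union(members[0], other)

    bundles = defaultdict(set)
    for rid in records:
        bundles[find(rid)].add(rid)
    return [sorted(b) for b in bundles.values() if len(b) > 1]


records = {
    "a": {"isbn": "1", "year": "2011", "title": "t", "author": "x"},
    "b": {"isbn": "1", "year": "2011"},
    "c": {"title": "t", "author": "x", "year": "2011"},
    "d": {"isbn": "2", "year": "2000"},
}
print(bundle(records))
```

In this toy input, `b` and `c` share no key with each other, but both share one with `a`, so all three end up in one bundle.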

12. The Process (next steps)
   6. Check quality and mint persistent IDs
   7. Publication as Linked Data

13. Representing bundles of bibliographic records in RDF

14. Namespaces for Internal Bibliographic Description
   rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
   bibo: <http://purl.org/ontology/bibo/>
   dcterms: <http://purl.org/dc/terms/>
   frbr: <http://purl.org/vocab/frbr/core#>
   foaf: <http://xmlns.com/foaf/0.1/>
   cg: <http://culturegraph.org/vocab#> (not established yet)
   ... and others
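To illustrate how such prefixes are used, the sketch below expands a prefixed name into a full IRI. The prefix table is copied from the slide; the `expand` helper is a hypothetical name introduced for this example, and the `cg` namespace is the one the slide marks as not yet established.

```python
# Prefix table as listed on the slide.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "bibo": "http://purl.org/ontology/bibo/",
    "dcterms": "http://purl.org/dc/terms/",
    "frbr": "http://purl.org/vocab/frbr/core#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "cg": "http://culturegraph.org/vocab#",  # marked "not established yet"
}


def expand(prefixed_name):
    """Turn 'prefix:local' into a full IRI; reject unknown prefixes."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in NAMESPACES:
        raise ValueError(f"unknown prefix in {prefixed_name!r}")
    return NAMESPACES[prefix] + local


print(expand("dcterms:title"))
print(expand("rdf:type"))
```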


16. Matching & Bundling
   - Different matching criteria to be discussed (example: sameness of ISBN and year)
   - Matching algorithms can be created and modified easily
   - Matched resources are bundled and the underlying algorithm is indicated
   - Bundle Ontology: http://purl.org/net/bundle
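The idea of exchangeable matching algorithms, with each bundle recording which algorithm produced it, might look like the sketch below. The registry `MATCHERS`, the algorithm name `isbn-year`, and the dict layout of a bundle are assumptions made for illustration; they do not reproduce the Bundle Ontology or the project's code.

```python
from collections import defaultdict


def isbn_and_year(record):
    """Match key for the slide's example criterion: same ISBN and same year."""
    isbn, year = record.get("isbn"), record.get("year")
    return (isbn, year) if isbn and year else None


# Hypothetical registry: adding a criterion means adding one key function.
MATCHERS = {"isbn-year": isbn_and_year}


def match(records, algorithm):
    """Bundle record ids with equal keys and note the algorithm used."""
    key_fn = MATCHERS[algorithm]
    by_key = defaultdict(list)
    for rid, rec in records.items():
        key = key_fn(rec)
        if key is not None:
            by_key[key].append(rid)
    return [
        {"algorithm": algorithm, "members": sorted(ids)}
        for ids in by_key.values()
        if len(ids) > 1
    ]


catalogue = {
    "r1": {"isbn": "123", "year": "2011"},
    "r2": {"isbn": "123", "year": "2011"},
    "r3": {"isbn": "123", "year": "2012"},
    "r4": {"year": "2011"},
}
print(match(catalogue, "isbn-year"))
```

Storing the algorithm name with each bundle lets a consumer decide how much to trust a given grouping, which is the purpose the slide gives for indicating the underlying algorithm.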


19. Minting Über-Identifiers
   - In the last step, IDs for bibliographic resources may be minted
     - urn:nbn:de:cg-12345678
     - http://culturegraph.org/urn:nbn:de:cg-12345678
   - Based on a reliable, agreed-upon algorithm
   - Record-resource linking by foaf:isPrimaryTopicOf
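The relationship between the two identifier forms on this slide, and the link from a resource to a record, can be sketched as below. Only the URN prefix, the resolver base URL and the `foaf:isPrimaryTopicOf` property come from the slide; the function names and the example record URL are hypothetical, and the output is a plain N-Triples line rather than the project's actual serialization.

```python
URN_PREFIX = "urn:nbn:de:cg-"
RESOLVER_BASE = "http://culturegraph.org/"
IS_PRIMARY_TOPIC_OF = "http://xmlns.com/foaf/0.1/isPrimaryTopicOf"


def resource_url(urn):
    """Build the HTTP form of a Culturegraph URN by prefixing the resolver base."""
    if not urn.startswith(URN_PREFIX) or urn == URN_PREFIX:
        raise ValueError(f"not a Culturegraph URN: {urn!r}")
    return RESOLVER_BASE + urn


def link_to_record(urn, record_url):
    """Emit one N-Triples line: resource foaf:isPrimaryTopicOf record."""
    return f"<{resource_url(urn)}> <{IS_PRIMARY_TOPIC_OF}> <{record_url}> ."


# The URN is the placeholder from the slide; the record URL is made up.
print(resource_url("urn:nbn:de:cg-12345678"))
print(link_to_record("urn:nbn:de:cg-12345678", "http://example.org/records/1"))
```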


21. Future Prospects
   Workflow integration
   - Share, enrich and reuse metadata right from the start
   New features/projects: from concrete to visionary
   1. Integration of GND references (from BEACON files and other sources)
   2. Computation of links to further resources (subject headings, geo coordinates, person names, Wikipedia)
   3. Authority file for works
   4. Crowdsourcing (enrich and correct descriptions of titles, works, persons, etc.)

22. Summary (Markus M. Geipel | culturegraph.org | 5 October 2011)
   Culturegraph will
   - Match the main German library catalogues
   - Give each bibliographic resource a persistent ID
   State
   - Basic infrastructure up and running with good performance (80 million records matched in one hour)
   - All source code published on SourceForge
   - First demonstrator web portal at www.culturegraph.org
   Soon to come
   - January:
     - Operational web portal
     - Publication of first matching results (HTML, RDF, etc.)
   - Next year:
     - Persistent IDs

23. Appendix: Project Staff
   - Daniel Schäfer (DNB): project lead
   - Katja Mecklinger (DNB): deputy project lead, A
   - Markus Geipel (DNB): head of architecture and development
   - Adrian Pohl (hbz): A, ontology
   - Pascal Christoph (hbz): architecture
   - Julia Hauser (DNB): ontology
   - Lars Svensson (DNB): ontology
   - Jürgen Kett (DNB): project steering, A
