Making and Exploiting a Web of Semantic Data

Recent Semantic Web trends. Examples: DBpedia, Wikitology. Conclusion
Making and Exploiting a Web of Semantic Data. Tim Finin, UMBC. Earth and Space Science Informatics Workshop, 05 August 2009.

Overview: Introduction. Semantic Web 101. Recent Semantic Web trends. Examples: DBpedia, Wikitology. Conclusion.

The Age of Big Data: Massive amounts of data are available today. Advances in many fields are driven by the availability of unstructured data, e.g., text, audio, images. Increasingly, large amounts of structured and semi-structured data are also online. Much of this is available in the Semantic Web language RDF, fostering integration and interoperability. Such structured data is especially important for the sciences.

Twenty years ago… Tim Berners-Lee's 1989 WWW proposal described a web of relationships among named objects, unifying many information management tasks. Capsule history: Guha's MCF (~94). XML+MCF=>RDF (~96). RDF+OO=>RDFS (~99). RDFS+KR=>DAML+OIL (00). W3C's SW activity (01). W3C's OWL (03). SPARQL, RDFa (08). Rules (09).

Ten years ago… The W3C started developing standards for the Semantic Web. The vision, technology and use cases are still evolving. Moving from a web of documents to a web of data.

Today: 4.5 billion integrated facts published on the Web as RDF Linked Open Data.

Tomorrow: Large collections of integrated facts published on the Web for many disciplines and domains.

W3C's Semantic Web Goal: "The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." - Berners-Lee, Hendler and Lassila, The Semantic Web, Scientific American, 2001

Contrast with a non-Web approach: The W3C Semantic Web approach is distributed, open, non-proprietary, and standards based.

How can we share data on the Web? POX, Plain Old XML, is one approach, but it has deficiencies. The Semantic Web languages RDF and OWL offer a simpler and more abstract data model (a graph) that is better for integration. Its well-defined semantics supports knowledge modeling and inference. It is supported by a stable, funded standards organization, the World Wide Web Consortium.

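Simple RDF Example. [Figure: an RDF graph. A document node has the title "Intelligent Information Systems on the Web and in the Aether" and a dc:creator arc to a blank node. The blank node has bib:Aff and bib:email arcs and the name "Tim Finin".]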

The RDF Data Model: An RDF document is an unordered collection of statements, each with a subject, predicate and object. Such a triple can be thought of as a labeled arc in a graph. Statements describe properties of resources. A resource is any object that can be referenced or denoted by a URI. Properties themselves are also resources (URIs). Dereferencing a URI produces useful additional information, e.g., a definition or additional facts.
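The data model above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an RDF library: the `Triple` type, the `arcs_from` helper, and the `ex:` and `_:b1` names are all hypothetical and chosen only for this example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: a labeled arc from subject to object."""
    subject: str
    predicate: str
    obj: str


# An RDF document is an unordered collection of statements, so a set fits.
# Every ex: name below is made up for illustration.
graph = {
    Triple("ex:talk", "dc:title",
           '"Intelligent Information Systems on the Web and in the Aether"'),
    Triple("ex:talk", "dc:creator", "_:b1"),  # _:b1 stands for a blank node
    Triple("_:b1", "bib:Name", '"Tim Finin"'),
}


def arcs_from(graph, subject):
    """Return the (predicate, object) arcs leaving a subject, sorted."""
    return sorted((t.predicate, t.obj) for t in graph if t.subject == subject)


# Adding a statement that is already present leaves the set unchanged.
graph.add(Triple("_:b1", "bib:Name", '"Tim Finin"'))

for predicate, obj in arcs_from(graph, "ex:talk"):
    print("ex:talk", predicate, obj)
```

A set is used rather than a list because statement order and repetition carry no meaning in the model.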

RDF is the first SW language: RDF is a simple language for graph-based representations. The RDF data model can be presented as a graph, an XML encoding, or triples. Graph: good for human viewing. XML encoding, <rdf:RDF …> <…> <…> </rdf:RDF>: good for machine processing. Triples: good for storage and reasoning.
    stmt(docInst, rdf_type, Document)
    stmt(personInst, rdf_type, Person)
    stmt(inroomInst, rdf_type, InRoom)
    stmt(personInst, holding, docInst)
    stmt(inroomInst, person, personInst)

XML encoding for RDF
    <rdf:RDF xmlns:rdf="…syntax-ns#" xmlns:dc="" xmlns:bib="">
      <description about="">
        <dc:title>Intelligent Information … and in the Aether</dc:title>
        <dc:creator>
          <description>
            <bib:Name>Tim Finin</bib:Name>
            <bib:Email></bib:Email>
            <bib:Aff resource=""/>
          </description>
        </dc:creator>
      </description>
    </rdf:RDF>

N3 is a friendlier encoding
    @prefix rdf: <…syntax-ns#> .
    @prefix dc: <> .
    @prefix bib: <…bib/> .
    <> dc:title "Intelligent ... and in the Aether" ;
       dc:creator [ bib:Name "Tim Finin" ; bib:Email "" ; bib:Aff "" ] .

RDFS supports simple inferences: RDF Schema adds vocabulary for classes, properties & constraints. An RDF ontology plus some RDF statements may imply additional RDF statements (not possible in XML). Note this is part of the data model and not of the accessing or processing code.
    @prefix rdfs: <http://www.....>.
    @prefix : <genesis.n3>.
    parent a rdf:Property; rdfs:domain person; rdfs:range person.
    mother rdfs:subPropertyOf parent; rdfs:domain woman; rdfs:range person.
    eve mother cain.
Statements implied by the above:
    person a class.
    woman subClass person.
    mother a property.
    eve a person; a woman; parent cain.
    cain a person.
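How a few schema statements plus one fact imply further facts can be sketched with a small forward-chaining loop. This is an illustrative sketch, not a conformant RDFS reasoner: it covers only the subproperty, domain, and range rules, and the function name `rdfs_closure` and the spelling of names such as `:parent` and `:woman` are assumptions made for this example, loosely following the Eve and Cain slide.

```python
def rdfs_closure(triples):
    """Apply three RDFS rules repeatedly until no new triples appear."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                if s2 != p:
                    continue  # only schema statements about property p matter
                if p2 == "rdfs:subPropertyOf":
                    new.add((s, o2, o))           # (s p o) => (s q o)
                elif p2 == "rdfs:domain":
                    new.add((s, "rdf:type", o2))  # subject belongs to the domain
                elif p2 == "rdfs:range":
                    new.add((o, "rdf:type", o2))  # object belongs to the range
        if new <= inferred:
            return inferred
        inferred |= new


# Schema plus a single fact; all names are hypothetical.
facts = {
    (":parent", "rdfs:domain", ":person"),
    (":parent", "rdfs:range", ":person"),
    (":mother", "rdfs:subPropertyOf", ":parent"),
    (":mother", "rdfs:domain", ":woman"),
    (":mother", "rdfs:range", ":person"),
    (":eve", ":mother", ":cain"),
}

for triple in sorted(rdfs_closure(facts) - facts):
    print(triple)
```

The loop runs to a fixed point because one inference can enable another: that Eve is a person follows only after the subproperty rule has produced the statement that Eve is a parent of Cain.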

OWL adds further richness: OWL adds richer representational vocabulary, e.g., parentOf is the inverse of childOf; every person has exactly one mother; every person is a man or a woman but not both; a man is the equivalent of a person with a sex property with value "male". OWL is based on "description logic", a logic subset with efficient reasoners that are complete. Good algorithms exist for reasoning about descriptions.
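The first example above, parentOf as the inverse of childOf, can be sketched as a single rule over triples. This is only an illustration of what an inverse declaration licenses, not a description logic reasoner; the function name `apply_inverses` and the `:eve` and `:cain` individuals are hypothetical.

```python
def apply_inverses(triples):
    """Add (o q s) for every (s p o) where p and q are declared inverses."""
    inverse = {}
    for s, p, o in triples:
        if p == "owl:inverseOf":
            inverse[s] = o
            inverse[o] = s  # an inverse declaration works in both directions
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
    return result


# One declaration and one fact; all names are made up for illustration.
kb = {
    (":parentOf", "owl:inverseOf", ":childOf"),
    (":eve", ":parentOf", ":cain"),
}

for triple in sorted(apply_inverses(kb) - kb):
    print(triple)
```

The other examples on the slide, such as exact cardinality and disjoint classes, need more machinery than a single rule, which is where description logic reasoners come in.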

That was then, this is now. 1996-2000: focus on RDF and data. 2000-2007: focus on OWL, developing ontologies, sophisticated reasoning. 2008-…: integrating and exploiting large RDF data collections backed by lightweight ontologies.

A Linked Data story. Wikipedia as a source of knowledge: wikis are a great way to collaborate on building up knowledge resources. Wikipedia as an ontology: every Wikipedia page is a concept or object. Wikipedia as RDF data: map this ontology into RDF. DBpedia as the lynchpin for Linked Data: exploit its breadth of coverage to integrate things.

Populating Freebase KB

Underlying Powerset's KB

Mined by TrueKnowledge

Wikipedia as an ontology. Using Wikipedia as an ontology: each article (~3M) is an ontology concept or instance; terms are linked via the category system (~200k), infobox template use, inter-article links, and infobox links; article history contains metadata for trust, provenance, etc. It's a consensus ontology with broad coverage. Created and maintained by a diverse community for free! Multilingual. Very current. Overall content quality is high.

Wikipedia as an ontology: Uncategorized and miscategorized articles. Many "administrative" categories: articles needing revision; useless ones: 1949 births. Multiple infobox templates for the same class. Multiple infobox attribute names for the same property. No datatypes or domains for infobox attribute values. Etc.

DBpedia: Wikipedia in RDF. A community effort to extract structured information from Wikipedia and publish it as RDF on the Web. The effort started in 2006 with EU funding. Data and software are open sourced. DBpedia doesn't extract information from Wikipedia's text, but from its structured information, e.g., links, categories, infoboxes.
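Extracting triples from an infobox can be sketched as follows. This is a deliberately naive illustration, not the DBpedia extraction code: the function `infobox_to_triples`, the one-attribute-per-line parsing, and the sample page "Example City" are all assumptions made for this example. Only the two namespace prefixes are taken from DBpedia's public URIs.

```python
import re

RESOURCE = "http://dbpedia.org/resource/"
PROPERTY = "http://dbpedia.org/property/"


def infobox_to_triples(page_title, wikitext):
    """Turn '| name = value' infobox lines into (subject, predicate, object)."""
    subject = RESOURCE + page_title.replace(" ", "_")
    triples = []
    for line in wikitext.splitlines():
        match = re.match(r"\s*\|\s*([^=|]+?)\s*=\s*(.*?)\s*$", line)
        if not match:
            continue  # not an attribute line, e.g. the template header
        name, value = match.groups()
        if not value:
            continue  # attribute present but left empty
        link = re.fullmatch(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]", value)
        if link:
            # A wiki link becomes a link to another resource.
            obj = RESOURCE + link.group(1).strip().replace(" ", "_")
        else:
            # Anything else is kept as an untyped literal.
            obj = '"' + value + '"'
        triples.append((subject, PROPERTY + name.replace(" ", "_"), obj))
    return triples


# Made-up infobox text for a made-up page.
sample = """{{Infobox settlement
| name = Example City
| country = [[Exampleland]]
| population = 12345
| mayor =
}}"""

for triple in infobox_to_triples("Example City", sample):
    print(triple)
```

Keeping every value as an untyped literal mirrors the problem that infobox attribute values carry no declared datatypes.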

DBpedia: Linked Data lynchpin


