Understanding Data Model and Ontology

In this article, Dr. Tatiana Malyuta, an Associate Professor at CUNY and a consultant for DoD, discusses the differences between Data Model and Ontology. Dr. Barry Smith from UB NCOR also shares his insights on the topic. The article highlights the purpose and benefits of data modeling, emphasizing its role in achieving business efficiency.

Presentation Transcript

1. Data Model vs. Ontology
  • Dr. Tatiana Malyuta, Associate Professor, CUNY; Consultant for DoD
  • Dr. Barry Smith, UB, NCOR

2. Data Model - Purpose
  • To provide a consistent and efficiently functioning data store for one or more particular business applications
  • Represents specific business concepts in a way that determines the organization of data in the store
  • Commonly used representations are relational and graph; they are supported by data management technologies, e.g. relational (Oracle, MySQL) and graph (Neo4j, RDF/OWL stores)
  • Efficiency requires:
      • Application-specific representations
      • Storing only the data needed by the application
  • An objective (shared) representation of the domain is not the purpose: multiple data models exist for the same domain to accommodate different business applications

3. Data Silos
  • Numerous partial, idiosyncratic representations of the domain in data models, and numerous versions of data in data stores
  • No re-usability
  • No single version of truth
  • Silos shown on the slide: Accounts Receivable, Accounts Payable, Budget

4. Ontology - Purpose
  • Objectivity of representation of reality
  • The commonly used representation is a graph, supported by RDF-based semantic technologies
  • Objective (shared) representation of the domain: one authoritative ontology for the domain of reality, meant for re-use
  • Storing vast volumes of data is not the purpose

5. Financial Ontology
  • A single domain ontology (or a collection of ontologies)
  • To be re-used in different applications
  • Single version of truth (as we know it today)
  • Note: we discuss ontologies built in accordance with the methodology and architecture pioneered by Dr. Smith.

6. Comparison
  • Although some technologies support a particular paradigm better than others, technology is not the defining factor in distinguishing between a data model and an ontology
  • We compare not technologies but paradigms: ontology and data model

7. Data Model - Types
  • Types are general or repeatable entities capable of being instantiated by indefinitely many particulars
  • Data model types and instances are abstractions embodying efficient ways of describing the data about reality that is needed by an application (efficient both for reasoning and for storage)
  • Different abstractions are used depending on the business need
  • The data model term "person" is used to define an efficient storage solution for data about persons needed by a particular application

8. Ontology - Types
  • Ontology types and instances are on the side of reality
  • They must provide one term, and one definition, for each salient type of entity in each domain of interest
  • The ontology term "person", when it is used to represent data about persons, is designed to establish a link between these data and persons in reality

9. Data Model - Organization
  • An arbitrary combination of selected types, suited for efficient data processing
  • The data model view of reality is flat and rigid
  • One of the models needs to be changed to accommodate multiple skills of a person. Such changes can be made only with significant effort, because of the relative rigidity of data representation languages and the need to rearrange the physical data store
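
The rigidity described in this slide can be illustrated with a small sketch. It assumes a relational Person(Name, Skill) table in which Name is the primary key (an assumption made for illustration; the slide does not give the schema details) and uses Python's built-in sqlite3 module. Recording a second skill for John is rejected by the flat model, and accommodating it means creating a new table and moving the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Flat model (assumed schema): one row, and therefore one skill, per person.
conn.execute("CREATE TABLE Person (Name TEXT PRIMARY KEY, Skill TEXT)")
conn.executemany(
    "INSERT INTO Person VALUES (?, ?)",
    [("John", "Java"), ("Mary", "C++")],
)


def add_skill_flat(name, skill):
    """Try to record another skill in the flat model; return False if rejected."""
    try:
        conn.execute("INSERT INTO Person VALUES (?, ?)", (name, skill))
        return True
    except sqlite3.IntegrityError:
        return False


second_skill_accepted = add_skill_flat("John", "SQL")

# Accommodating multiple skills requires restructuring the store:
# a new table keyed on (Name, Skill), plus a migration of the existing rows.
conn.execute(
    "CREATE TABLE PersonSkill (Name TEXT, Skill TEXT, PRIMARY KEY (Name, Skill))"
)
conn.execute("INSERT INTO PersonSkill SELECT Name, Skill FROM Person")
conn.execute("INSERT INTO PersonSkill VALUES ('John', 'SQL')")

john_skills = sorted(
    row[0]
    for row in conn.execute("SELECT Skill FROM PersonSkill WHERE Name = 'John'")
)
```

In a real deployment the migration would also involve every application query that reads the old table, which is the "significant effort" the slide refers to.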

10. Ontology - Organization
  • Each type appears only once in the ontology hierarchy
  • The ontology view of reality is synoptic: it represents, in non-redundant fashion, an entire hierarchy of types at different levels of generality
  • Each term is associated in an intelligible way with its subsuming and subsumed terms (and thus with the ancestor and descendant types) in the hierarchy of more and less general types
  • Representation is more flexible, changes are easier to make, and changes are not as disruptive
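
A minimal sketch of this organization, using the Skill terms that appear elsewhere in the presentation. The dictionary layout and the function names `ancestors` and `descendants` are illustrative assumptions, not part of any ontology tool. Each term is a dictionary key, so it can appear only once, and every term is linked to its subsuming term.

```python
# Each term appears exactly once (dictionary keys are unique) and points to
# its parent, i.e. the more general term that subsumes it.
PARENT = {
    "Skill": None,
    "ComputerSkill": "Skill",
    "ProgrammingSkill": "ComputerSkill",
    "Java": "ProgrammingSkill",
    "C++": "ProgrammingSkill",
}


def ancestors(term):
    """Return the subsuming terms of `term`, from most specific to most general."""
    chain = []
    parent = PARENT[term]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


def descendants(term):
    """Return every term subsumed by `term`, in alphabetical order."""
    return sorted(t for t in PARENT if term in ancestors(t))


# Adding a new type is a single, non-disruptive entry.
PARENT["SewingSkill"] = "Skill"
```

Adding `SewingSkill` leaves every existing term and link unchanged, which is the sense in which changes here are less disruptive than a schema change.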

11. Questions?

12. Data Model vs. Ontology - Types and Individuals
  • Person (Name, Skill): (John, Computer Skill), (Mary, Sewing Skill)
  • Skill terms: Skill, Computer Skill, Programming Skill, Java, C++
  • Person (Name, Skill): (John, Java), (Mary, C++)

13. Data Model - Labels
  • Are not as important, because databases are not directly exposed to users: they are presented via an application that exposes the database content using the specific vocabulary of a narrow community of users
  • Can be anything, e.g. PN, PName, PersName, PersonN, etc. for the person name
  • The meaning of a label is often derived from the context (e.g. Name for both the name of the Person and the name of the Skill in one of the examples)

14. Ontology - Labels
  • Are exposed to users
  • Are nouns and noun phrases from natural language; each type has a unique name that designates the type unambiguously, regardless of the context in which the type might be used, e.g. PersonName, SkillName

15. Closed and Open World Assumptions (impact of technologies)
  • Database reasoning is confined to search based on the closed world assumption. If we do not find something in the database, then this something does not exist in the world that is defined by the database.
  • Ontologies are based on the idea that we can never describe entities in the real world completely. This means that, from the absence in an ontology of a particular term A, we cannot infer that As do not exist. It also means that ontologies are constructed in a way that allows easy addition of new types and relations.
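
The difference between the two assumptions can be sketched as two ways of answering the same question over the same set of facts. The fact triples, the `Answer` enum, and both function names are hypothetical and chosen only for illustration; real database engines and ontology reasoners are considerably more involved.

```python
from enum import Enum


class Answer(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"


# Hypothetical recorded facts as (subject, relation, object) triples.
FACTS = {
    ("John", "hasSkill", "Java"),
    ("Mary", "hasSkill", "C++"),
}


def ask_closed_world(fact):
    """Closed world: anything not recorded is taken to be false."""
    return Answer.TRUE if fact in FACTS else Answer.FALSE


def ask_open_world(fact, known_false=frozenset()):
    """Open world: absence of a fact means only that we do not know."""
    if fact in FACTS:
        return Answer.TRUE
    if fact in known_false:
        return Answer.FALSE
    return Answer.UNKNOWN


question = ("John", "hasSkill", "SQL")
closed_answer = ask_closed_world(question)  # Answer.FALSE
open_answer = ask_open_world(question)      # Answer.UNKNOWN
```

Under the open world reading, a negative answer needs explicit negative knowledge (the `known_false` argument here); missing data alone never yields one.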

16. Life Span
  • Data models are created in ad hoc ways to capture a targeted selection of features; the data model usually is not reused, which results in numerous data silos for a domain
  • Ontologies will grow and expand as new knowledge is gained over time

17. Summary of Comparison (Traditional Data Model vs. Ontologies)
  • Closeness to reality
      • Data model: variable, application-specific
      • Ontology: reality is always the prime focus
  • Conceptualization of the domain
      • Data model: plain and partial (always at the level of detail needed for a particular implementation)
      • Ontology: hierarchical, simultaneously describing the same domain at different levels of detail
  • Vocabulary
      • Data model: application-specific, not intended for sharing
      • Ontology: application-independent, intended to support sharing and reuse
  • Structures or organization of types
      • Data model: groupings of types to accommodate data access patterns
      • Ontology: taxonomies (type hierarchies) are always used to describe/classify the domain
  • Combinability
      • Data model: can rarely be combined; even if possible, this will typically require significant manual effort
      • Ontology: if the ontology building methodology is followed, then the results will be combinable automatically
  • Flexibility
      • Data model: rigid, changes normally require significant effort
      • Ontology: flexible, changes can normally be effected very easily

18. Semantic Enhancement of Data Models by Ontology
  • Semantic Enhancement (SE) is realized with the help of ontologies that are used to explicate data models and annotate data instances
  • The vocabulary of the ontologies used for explications and annotations provides agile horizontal integration
  • Ontologies, by virtue of their nature and organization, provide semantic enhancement of data
  • Example table (PersonID, Name, Description): (111, Java, Programming), (222, SQL, Database)
  • Ontology terms shown in the figure: SQL, Java, C++, ProgrammingSkill, ComputerSkill, Skill, Education, Technical Education

19. The Meaning of Enhancement
  • Semantic enhancement/enrichment of data is an arm's-length approach (no change to data)
  • Through simple explication we associate an entire knowledge system with a database field
  • This enables analytics to process data, e.g. about computer skills, vertically along the Skill hierarchy, as well as horizontally via relations between Skill and Education, and further
  • While data in the database does not change, its analysis can become richer and richer as our understanding of reality changes
  • For this richness to be leveraged by different communities, persons, and applications, it needs to have the properties mentioned above and be constructed in accordance with the principles of SE (see References)
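
The arm's-length idea can be sketched as follows: the rows stay exactly as stored, an annotation maps a field value to an ontology term, and queries then run vertically along the Skill hierarchy. The rows and term names come from the example in the presentation. The placement of SQL directly under ComputerSkill, the dictionary layout, and the function names are assumptions made for illustration.

```python
# Database rows, left untouched by the enhancement.
ROWS = [
    {"PersonID": 111, "Name": "Java", "Description": "Programming"},
    {"PersonID": 222, "Name": "SQL", "Description": "Database"},
]

# Ontology fragment: term -> more general term.
# Placing SQL directly under ComputerSkill is an assumption.
SUBTYPE_OF = {
    "Java": "ProgrammingSkill",
    "C++": "ProgrammingSkill",
    "ProgrammingSkill": "ComputerSkill",
    "SQL": "ComputerSkill",
    "ComputerSkill": "Skill",
}

# Annotation kept outside the database: value of the Name field -> ontology term.
ANNOTATION = {"Java": "Java", "SQL": "SQL"}


def is_a(term, ancestor):
    """True if `term` equals `ancestor` or is subsumed by it in the hierarchy."""
    while term is not None:
        if term == ancestor:
            return True
        term = SUBTYPE_OF.get(term)
    return False


def persons_with(skill_type):
    """PersonIDs whose annotated skill falls under `skill_type`."""
    return [
        row["PersonID"]
        for row in ROWS
        if is_a(ANNOTATION.get(row["Name"]), skill_type)
    ]
```

Asking for `persons_with("ComputerSkill")` returns both persons even though neither row mentions that term; the answer comes from the ontology, and `ROWS` is never modified.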

20. SE and Data Integration
  • Traditional integration approaches involve creation of a new model used in:
      • A new physical store (data warehouse)
          • Expensive, resource- and time-consuming
          • Another data store: rigid (potential data silo), interoperable with other stores
      • Querying the data sources via it
          • Fragile
  • Both entail loss and/or distortion of data and semantics, and provide only local integration (do not lead to interoperability with other sources)
  • SE of a store:
      • Does not require data reorganization and creation of another store
      • Changes to it are non-intrusive
      • Leads to integration of the store with other stores, enhanced previously or in the future
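
A minimal sketch of integration through SE, assuming two hypothetical stores that use different local labels for the same things. Each store is explicated against the shared terms PersonName and SkillName, and a combined view is computed on demand without creating a new store. The store contents, the local labels, and the function names are all illustrative assumptions.

```python
# Two hypothetical stores with idiosyncratic local labels.
STORE_A = [{"PN": "John", "Sk": "Java"}]
STORE_B = [{"PersName": "Mary", "SkillLabel": "C++"}]

# Explications: local label -> shared ontology term.
EXPLICATION_A = {"PN": "PersonName", "Sk": "SkillName"}
EXPLICATION_B = {"PersName": "PersonName", "SkillLabel": "SkillName"}


def view(rows, explication):
    """Present rows under the shared vocabulary without modifying the store."""
    return [{explication[key]: value for key, value in row.items()} for row in rows]


def integrated_view():
    """Combine all enhanced stores; no new physical store is created."""
    return view(STORE_A, EXPLICATION_A) + view(STORE_B, EXPLICATION_B)
```

A store enhanced later would only need its own explication dictionary to join the combined view; the existing stores and explications stay as they are.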

21. References
  • Barry Smith, et al., "IAO-Intel: An Ontology of Information Artifacts in the Intelligence Domain", STIDS Conference, 2013.
  • Barry Smith, Tatiana Malyuta, William S. Mandrick, Chia Fu, Kesny Parent, Milan Patel, "Horizontal Integration of Warfighter Intelligence Data: A Shared Semantic Resource for the Intelligence Community", STIDS Conference, 2012.
  • Barry Smith, Tatiana Malyuta, David Salmen, William Mandrick, Kesny Parent, Shouvik Bardhan, Jamie Johnson, "Ontology for the Intelligence Analyst", Crosstalk: The Journal of Defense Software Engineering, 2012.
  • David Salmen, Tatiana Malyuta, Alan Hansen, Shaun Cronen, Barry Smith, "Integration of Intelligence Data through Semantic Enhancement", STIDS Conference, 2011.

22. Questions?