Data Management and Representations in Ecce and CMCS


Description
Data Management and Representations in Ecce and CMCS. Theresa L. Windus Pacific Northwest National Laboratory Environmental Molecular Sciences Laboratory Molecular Science Software Group. Outline. Some “definitions” Data and task representations Ecce CMCS Summary Acknowledgement.
Transcripts
Slide 1

Data Management and Representations in Ecce and CMCS
Theresa L. Windus
Pacific Northwest National Laboratory
Environmental Molecular Sciences Laboratory
Molecular Science Software Group

Slide 2

Outline
Some "definitions"
Data and task representations
Ecce
CMCS
Summary
Acknowledgement

Slide 3

Data and metadata (one researcher's data is another researcher's metadata)
$\Delta H^{\circ}_{\mathrm{atomiz},0}(\mathrm{CH_3OOH}) = 522.09 \pm 2.02$ kcal/mol [calculated, G3//B3LYP, T. Windus, more at http://...]
data: value and uncertainty
units: kcal/mol
quantity: enthalpy of atomization
species: methylhydroperoxide, CAS# 3031-73-0
temperature: 0 K
computed: G3//B3LYP
creator: T. Windus using Ecce
more information: http://avatar.emsl.pnl.gov:8080/Ecce/.../CH3OOH/.../GxEnergy
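The annotated value on this slide can be pictured as a small record in which the number travels together with its metadata. The sketch below is illustrative only: the class name `AnnotatedValue` and all field and key names are hypothetical and are not the actual Ecce or CMCS schema. The numbers are the ones read from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class AnnotatedValue:
    """A number plus the metadata needed to interpret it (hypothetical schema)."""
    value: float
    uncertainty: float
    units: str
    metadata: dict = field(default_factory=dict)

    def describe(self) -> str:
        # Render the value together with its uncertainty and units.
        return f"{self.value} ± {self.uncertainty} {self.units}"


# Values as read from the slide; the dictionary keys are assumptions.
atomization = AnnotatedValue(
    value=522.09,
    uncertainty=2.02,
    units="kcal/mol",
    metadata={
        "quantity": "enthalpy of atomization",
        "species": "methylhydroperoxide",
        "cas": "3031-73-0",
        "temperature": "0 K",
        "method": "G3//B3LYP",
        "creator": "T. Windus",
    },
)
print(atomization.describe())
```

Keeping the metadata in the same object as the value means a consumer never receives the bare number without the information needed to judge it.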

Slide 4

Metadata Converts Scientific Data into Knowledge
Metadata provides identification and documentation for scientific data.
  Example: Attaching an owner, creation date, unique identifier, and type to data.
  Example: Tracing data to program versions, and possibly to bugs in that version.
Metadata documents the context and value of the data.
  Example: The theoretical atomization energy of methylhydroperoxide (and its uncertainty) from Ecce (used as input to ATcT) contains information identifying the species and the quantity, units, the theoretical method used, vibrational frequencies and geometry, reference to source file, creator, etc.
Metadata facilitates cross-scale transfer of data.
  Example: Can show a chain of data sources, including input parameters and configuration files, across scales.
  Example: Can retrieve literature references which describe this data.
Metadata allows users to comment on the data and its quality.
  Example: Can be used for scientific peer review of data.
Metadata is essential for effective collaboration.
  Example: Scientific data becomes more usable to others when it is documented.
Annotation is another term for metadata. Annotations can be added by either the data owner or a third party.

Slide 5

Data Pedigree: A Special Kind of Metadata
Data pedigree, or data provenance, is a relationship which gives a "line of ancestors".
Pedigree allows for the classification and tracking of scientific data, and for the identification of the data's ultimate origin, possibly across scales.
Pedigree includes the sequence of steps necessary to reproduce the data.
Data is linked, for example, to projects, references, inputs, and outputs.
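The "line of ancestors" idea can be sketched as a small directed graph that is walked backwards from a result to its origins. This is a minimal illustration, not the CMCS implementation: the relation name `has_input` echoes the "Pedigree - hasInput" label that appears later in the slides, while the function names and node names are hypothetical.

```python
# Maps each item to the items it was directly derived from.
has_input = {}


def link(output: str, source: str) -> None:
    """Record that `output` was derived from `source`."""
    has_input.setdefault(output, []).append(source)


def ancestors(item: str) -> list:
    """Return every item that `item` was derived from, nearest first."""
    seen = []
    queue = list(has_input.get(item, []))
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(has_input.get(current, []))
    return seen


# Hypothetical chain: an energy comes from an output file, which comes
# from an input file, which was built from a geometry and a basis set.
link("energy", "nwchem_output")
link("nwchem_output", "nwchem_input")
link("nwchem_input", "geometry")
link("nwchem_input", "basis_set")
print(ancestors("energy"))
```

Walking the links breadth-first lists the nearest sources first, which matches the idea of tracing a value back to its ultimate origin step by step.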

Slide 6

Knowledge Grid
A set of scalable tools, middleware, and services
for the creation, analysis, dissemination, evaluation, and use
of data, information, and knowledge
by individuals, groups, and communities …
A digital place for performing "all" aspects of science

Slide 7

Ecce & NWChem
Ecce – Extensible Computational Chemistry Environment
  comprehensive problem solving environment
  common graphical user interfaces
  scientific modeling management
  seamless transfer of information between applications
  persistent data storage through DAV
  integrated scientific data management
  tools for ensuring efficient use of computing resources across a distributed network
  visualization of multi-dimensional data structures
  http://ecce.emsl.pnl.gov
NWChem – massively parallel computational chemistry program
  Energetics, geometries, frequencies, etc. at various levels of theory
  http://www.emsl.pnl.gov/docs/nwchem

Slide 8

Ecce is… (cont.)

Slide 9

Ecce Architecture

Slide 10

Distributed Authoring and Versioning (DAV)
An early web service (XML commands over HTTP)
A widely adopted standard for metadata/data transport
Put/Get data with arbitrary properties (dynamic)
Properties can be discovered and accessed independently
DASL, Versioning, Transactions, …
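As an illustration of "XML commands over HTTP", the sketch below builds the XML body of a WebDAV PROPPATCH request, the standard method for setting arbitrary properties on a resource. It only constructs the body and does not contact a server. The function name `proppatch_body` is hypothetical, and the use of `http://www.emsl.pnl.gov/ecce` as the property namespace is an assumption based on the property names shown in the example metadata later in the slides.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"
# Assumed namespace, taken from the property names in the example metadata.
ECCE_NS = "http://www.emsl.pnl.gov/ecce"

# Serialize the DAV namespace with the conventional "D" prefix.
ET.register_namespace("D", DAV_NS)


def proppatch_body(properties: dict) -> bytes:
    """Build a PROPPATCH body that sets each name/value pair as a property."""
    update = ET.Element(f"{{{DAV_NS}}}propertyupdate")
    set_element = ET.SubElement(update, f"{{{DAV_NS}}}set")
    prop = ET.SubElement(set_element, f"{{{DAV_NS}}}prop")
    for name, value in properties.items():
        child = ET.SubElement(prop, f"{{{ECCE_NS}}}{name}")
        child.text = value
    return ET.tostring(update, encoding="utf-8")


body = proppatch_body({"theory": "SCF/RHF", "runtype": "ESP"})
print(body.decode("utf-8"))
```

A client would send this body with the HTTP method PROPPATCH to the resource URL; a PROPFIND request retrieves properties in a similar XML envelope.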

Slide 11

What does the WebDAV protocol provide?

Slide 12

Accessing WebDAV Server from Windows 2000

Slide 13

Accessing WebDAV Server Using Browser

Slide 14

Accessing WebDAV Server Using Ecce

Slide 15

Ecce Physical Model
Calculations are referred to as a "virtual document" since we distribute the structure across multiple physical objects.
Physical collections and resources are URI addressable.
Collections are unordered and allow mixed content.

Slide 16

Calculation Setup
[Diagram labels: Basis Set Tool; Builder; Calculation Editor; Template File; Parameters; .edml File; Geometry; ESP; Basis Set; Theory Details; Runtype Details; Basis Set Reformatting Script; ai.input; Input Deck; Perl; Python]

Slide 17

Output Parsing
[Diagram labels: Output; Parse Descriptor; Text Block 1 ... Text Block N; Parse Script 1 ... Parse Script N; Perl; Ecce Database; Job Monitor; Calculation Viewer]
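The labels on this slide suggest that program output is divided into text blocks, each handled by its own parse script under the direction of a parse descriptor, with the results stored in the Ecce database. The sketch below illustrates that dispatch pattern only. Everything in it is an assumption: the descriptor layout, the regular expressions, the property name `ENERGY`, and the sample output lines are invented for illustration and are not real NWChem output or Ecce parse scripts (which the slide indicates are written in Perl).

```python
import re


def first_group_as_float(match):
    """Stand-in for a parse script: convert the captured text to a float."""
    return float(match.group(1))


# Hypothetical parse descriptor: property name, pattern that recognizes the
# relevant text, and the function that parses it.
PARSE_DESCRIPTOR = [
    ("CPUSEC", re.compile(r"cpu time:\s*([0-9.eE+-]+)"), first_group_as_float),
    ("ENERGY", re.compile(r"total energy\s*=\s*([0-9.eE+-]+)"), first_group_as_float),
]


def parse_output(text: str) -> dict:
    """Scan the output line by line and dispatch matches to their parsers."""
    properties = {}
    for line in text.splitlines():
        for name, pattern, parser in PARSE_DESCRIPTOR:
            match = pattern.search(line)
            if match:
                properties[name] = parser(match)
    return properties


# Invented sample output, used only to exercise the dispatch logic.
sample = "cpu time: 0.96\ntotal energy = -40.19\n"
print(parse_output(sample))
```

Keeping the patterns in a descriptor table, separate from the parsing functions, means support for a new property can be added without touching the dispatch loop.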

Slide 18

Example metadata
On the calculation:
http://www.emsl.pnl.gov/ecce:contenttype=ecceCalculation
http://www.emsl.pnl.gov/ecce:resourcetype=VIRTUAL_DOCUMENT
http://www.emsl.pnl.gov/ecce:createdWith=v3.2
http://www.emsl.pnl.gov/ecce:owner=d39974
http://www.emsl.pnl.gov/ecce:application=NWChem
http://www.emsl.pnl.gov/ecce:theory=SCF/RHF
http://www.emsl.pnl.gov/ecce:spinmultiplicity=Singlet
http://www.emsl.pnl.gov/ecce:currentVersion=v3.2
http://www.emsl.pnl.gov/ecce:creationdate=Mon, 22 Mar 2004 17:24:00 GMT
http://www.emsl.pnl.gov/ecce:reviewed=false
http://www.emsl.pnl.gov/ecce:runtype=ESP
http://www.emsl.pnl.gov/ecce:launch_machine=arunta
http://www.emsl.pnl.gov/ecce:launch_nodes=1
http://www.emsl.pnl.gov/ecce:launch_rundir=/home/d39974/ecceruns
http://www.emsl.pnl.gov/ecce:launch_totalprocs=1
http://www.emsl.pnl.gov/ecce:launch_user=d39974
http://www.emsl.pnl.gov/ecce:launch_maxmemory=0
http://www.emsl.pnl.gov/ecce:launch_remoteShell=ssh
http://www.emsl.pnl.gov/ecce:job_jobid=13858
http://www.emsl.pnl.gov/ecce:job_path=/home/d39974/ecceruns/tracebug/esp
http://www.emsl.pnl.gov/ecce:job_clienthost=arunta
http://www.emsl.pnl.gov/ecce:startdate=Mon, 22 Mar 2004 17:25:11 GMT
http://www.emsl.pnl.gov/ecce:version=Thu May  8 13:16:51 PDT 2003 Version 4.5
http://www.emsl.pnl.gov/ecce:state=Complete
http://www.emsl.pnl.gov/ecce:completiondate=Mon, 22 Mar 2004 17:25:14 GMT
DAV:resourcetype=<D:collection/>
DAV:creationdate=2004-03-22T17:24:38Z
DAV:getlastmodified=Mon, 22 Mar 2004 17:24:38 GMT
DAV:getetag="b2805d-1000-926a8180"
DAV:supportedlock=
DAV:getcontenttype=httpd/unix-directory
On the molecule:
http://www.emsl.pnl.gov/ecce:empiricalFormula=H4C
http://www.emsl.pnl.gov/ecce:charge=0.000000
http://www.emsl.pnl.gov/ecce:useSymmetry=false
http://www.emsl.pnl.gov/ecce:symmetrygroup=C1
DAV:creationdate=2004-03-22T17:24:38Z
DAV:getcontentlength=386
DAV:getlastmodified=Mon, 22 Mar 2004 17:24:38 GMT
DAV:getetag="b28064-182-926a8180"
DAV:executable=F
DAV:supportedlock=
DAV:getcontenttype=chemical/x-ecce-mvm

Slide 19

Example MVM file
title: demo
type: molecule
num_atoms: 1065
atom_info: symbol cart
atom_list:
O -2.37400 -3.09100 13.5210
H -1.91600 -2.20200 14.0480
...
pdb_list: H O5* RC 1 157D A H H5T RC 1 157D A …
attr_list: -0.622300 1 0 0.429500 1 0 …
atom_type_list: OH HO …
num_bonds: 1028
bond_list:
2 1 1.00000
1 3 1.00000
…

Slide 20

XML format for Properties
<?xml version="1.0" encoding="utf-8" ?>
<value name="CPUSEC" units="second">9.60000000000000e-01</value>
<?xml version="1.0" encoding="utf-8" ?>
<vector name="MLKNSHELL" rows="7" units="e" rowLabel="Unknown" rowLabels="1 2 3 4 5 6 7">1.99199825923126e+00 1.18803456337004e+00 3.08260463820159e+00 9.34340637068915e-01 9.34340635555820e-01 9.34340634042729e-01 9.34340632529639e-01</vector>
<?xml version="1.0" encoding="utf-8" ?>
<tsvectable name="GEOMTRACE" rows="5" units="Angstrom" columns="3" vectors="1" rowLabel="Atom,Coordinate" rowLabels="0 1 2 3 4" columnLabel="Coordinate" vectorLabel="Coordinate" columnLabels="X Y Z">
<step number="1">0.000000000000000e+00 0.000000000000000e+00 -6.755000000000000e-01 -6.755000000000000e-01 6.755000000000000e-01 6.755000000000000e-01 6.755000000000000e-01 6.755000000000000e-01 6.755000000000000e-01 -6.755000000000000e-01 -6.755000000000000e-01 -6.755000000000000e-01 6.755000000000000e-01 -6.755000000000000e-01</step>
<step number="2">6.767628142309400e-15 -6.950100046595310e-09 1.390021315920880e-08 -6.239857395114590e-01 -6.239857464615680e-01 6.239857534116811e-01 6.239857568867110e-01 6.239857499366001e-01 6.239857707869190e-01 6.239857742619920e-01 -6.239857812120860e-01 -6.239857603617700e-01 -6.239857916372510e-01 6.239857846871540e-01 -6.239857777370440e-01</step>
<step number="3">6.549446678833860e-15 1.124467050187860e-09 -2.248938851918010e-09 -6.252750669032320e-01 -6.252750631744280e-01 6.252750594456050e-01 6.252750588833910e-01 6.252750626121890e-01 6.252750514257610e-01 6.252750508635410e-01 -6.252750471347340e-01 -6.252750583211300e-01 -6.252750428437061e-01 6.252750465725070e-01 -6.252750503012980e-01</step>
</tsvectable>
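A property document of the kind shown on this slide can be read back with a standard XML parser. The sketch below handles only the `value` and `vector` elements and relies on the `name` and `units` attributes visible in the slide. The function name `parse_property` is hypothetical, and this is not Ecce's own reader.

```python
import xml.etree.ElementTree as ET


def parse_property(xml_text: str):
    """Return (name, units, data) for a <value> or <vector> document."""
    # Parse from bytes so the utf-8 encoding declaration is honored.
    root = ET.fromstring(xml_text.encode("utf-8"))
    numbers = [float(token) for token in root.text.split()]
    if root.tag == "value":
        return root.get("name"), root.get("units"), numbers[0]
    if root.tag == "vector":
        return root.get("name"), root.get("units"), numbers
    raise ValueError(f"unsupported property element: {root.tag}")


scalar = (
    '<?xml version="1.0" encoding="utf-8" ?>'
    '<value name="CPUSEC" units="second">9.60000000000000e-01</value>'
)
print(parse_property(scalar))
```

A `tsvectable` would need extra handling for its `step` children and row and column labels, which is omitted here.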

Slide 21

Crossing the Molecular to Thermodynamic Scales
Pedigree is essential to moving data across scales.
[Diagram: data model for vinoxy spanning Ecce, NWChem, Gaussian, CMCS, and Active Tables. Labels: Optimization and Frequencies; B3LYP; 6-31G*; NWChem Input File; NWChem Output File; Vibrational Mode Animated GIF; Input Parameters; Properties; Gaussian Input; Gaussian Output; Energy; QCISD; QCISD(T,FC); MP2; MP2(FC); G3MP2large; G3(MP2)B3LYP; Hf Vinoxy; NASA File. Legend: Pedigree - hasInput; Pedigree - hasOutput]
