IST 210 Association of Information - PowerPoint PPT Presentation

Presentation Transcript

  1. IST 210 Organization of Data Database and the Web 1

  2. References • ASP Tutorial from MSDN http://msdn.microsoft.com/workshop/server/asp/asptutorial.asp

  3. HTML/VB Script/SQL (diagram labels: HTML, SQL, Internet, HTML)

  4. HTML VB Script SQL

  5. Create Dynamic Web Applications • Static Web application • Request with a URL (e.g., http://www.psu.edu), which contains three components: protocol, web server name, and folder path to an HTML page • Server simply sends back the page • From static to dynamic web pages • Take user input and respond accordingly • Allow access to information stored in a database • https://aspdb.aset.psu.edu/ist210tsb4/example.asp • https://aspdb.aset.psu.edu/ist210tsb4/student.html • https://aspdb.aset.psu.edu/ist210tsb4/studentlist.asp
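
The three URL components named on this slide can be pulled apart with Python's standard `urllib.parse` module. This is only an illustrative sketch; the function name `url_components` and the dictionary keys are assumptions made for the example, not part of the course material.

```python
from urllib.parse import urlparse


def url_components(url: str) -> dict:
    """Split a URL into protocol, web server name, and folder path."""
    parts = urlparse(url)
    return {
        "protocol": parts.scheme,  # e.g. "https"
        "server": parts.netloc,    # e.g. "aspdb.aset.psu.edu"
        "path": parts.path,        # e.g. "/ist210tsb4/student.html"
    }


if __name__ == "__main__":
    print(url_components("https://aspdb.aset.psu.edu/ist210tsb4/student.html"))
```

For a static page the server only needs the path to find a file; for a dynamic page the same path names a program or script to run.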

  6. Web Pages with Database Contents • Web pages contain the results of database queries. How do we generate such pages? • Common Gateway Interface (CGI) • Web server creates a new process when a program interacts with the database. • Web server communicates with this program via CGI • Program generates result page with content from the database • Problem: a separate process must be started for each request, which is not efficient.
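
A CGI program of the kind described here might look roughly like the sketch below. It is written in Python with an in-memory SQLite database so that it is self-contained; the `student` table, its rows, and the function names are hypothetical and only stand in for whatever database the real program would query.

```python
import sqlite3
from html import escape


def demo_database() -> sqlite3.Connection:
    """Create a throwaway database (hypothetical table and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE student (name TEXT, major TEXT)")
    conn.executemany(
        "INSERT INTO student VALUES (?, ?)",
        [("Ann", "IST"), ("Bob", "CSE")],
    )
    return conn


def build_page(conn: sqlite3.Connection) -> str:
    """Query the database and return a CGI response: headers, blank line, HTML."""
    rows = conn.execute("SELECT name, major FROM student ORDER BY name").fetchall()
    items = "".join(
        f"<li>{escape(name)} ({escape(major)})</li>" for name, major in rows
    )
    body = f"<html><body><ul>{items}</ul></body></html>"
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # Under CGI the web server starts this script as a new process per request
    # and relays whatever it writes to standard output back to the browser.
    print(build_page(demo_database()))
```

The per-request process start-up is exactly the cost the slide identifies as the problem.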

  7. Application Servers • In CGI, each page request results in the creation of a new process, which is generally inefficient • Application server: Piece of software between the web server and the applications • Functionality: • Hold a set of threads or processes for performance • Database connection pooling (reuse a set of existing connections) • Integration of heterogeneous data sources • Transaction management involving several data sources • Session management
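
Connection pooling, one of the functions listed on this slide, can be sketched as a queue of already-open connections that requests borrow and return. The class below is a minimal illustration using Python's `queue` and `sqlite3` modules; the name `ConnectionPool` and its methods are assumptions for the example and do not describe any particular application server.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Keep a fixed set of open connections and hand them out on demand."""

    def __init__(self, size: int, database: str = ":memory:"):
        self._idle = queue.Queue()
        for _ in range(size):
            # check_same_thread=False lets a pooled connection be used by
            # whichever worker thread borrows it.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks while every connection is in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return it to the pool instead of closing it

    def idle_count(self) -> int:
        return self._idle.qsize()


if __name__ == "__main__":
    pool = ConnectionPool(size=2)
    with pool.connection() as conn:
        print(conn.execute("SELECT 1").fetchone())
```

Because connections are returned rather than closed, the cost of opening one is paid only when the pool is created.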

  8. Other Server-Side Processing • Java Servlets: Java programs that run on the server and interact with the server through a well-defined API. • JavaBeans: Reusable software components written in Java. • Java Server Pages and Active Server Pages: Code inside a web page that is interpreted by the web server

  9. Active Server Pages (ASP) • ASP is a programming model that allows dynamic, interactive Web pages to be created on the server. • ASP runs in-process with the server, and is optimized to handle a large volume of users. • When an ‘.asp’ file is requested, the Web server calls ASP, which reads the requested file, executes any commands, and sends the generated HTML page back to the browser.
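
The read, execute, and send-back cycle described on this slide can be imitated with a toy renderer. ASP marks output expressions with `<%= ... %>`; real ASP evaluates script expressions there, whereas this Python sketch only substitutes named values from a dictionary. The function name `render` and the sample template are assumptions for illustration.

```python
import re

# Matches an ASP-style output expression such as <%= name %>.
OUTPUT_TAG = re.compile(r"<%=\s*(\w+)\s*%>")


def render(template: str, values: dict) -> str:
    """Replace each <%= key %> with the matching value; leave other text alone."""
    return OUTPUT_TAG.sub(lambda match: str(values[match.group(1)]), template)


if __name__ == "__main__":
    page = "<html><body><p>Hello, <%= name %>!</p></body></html>"
    print(render(page, {"name": "IST 210"}))
```

The browser receives only the generated HTML, never the template with its script delimiters.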

  10. Active Server Pages (ASP)

  11. ASP Code • Combination of three types of syntax: • Text • HTML tags • ASP scripts

  12. ASP Scripts • ASP scripts can be written in • VBScript <SCRIPT LANGUAGE=VBScript> • JavaScript <SCRIPT LANGUAGE=JavaScript> • ActiveX Components • Client-side vs. Server-Side • Client-side scripts are downloaded to and executed on the client machine. (Problem: some features may not be supported by some browsers) • Server-side scripts run directly on the server and generate data to be viewed by the browser in HTML. No concern for browser capability.

  13. ASP Code • Script code is executed by the server • Generates HTML, on-the-fly, when requested • ASP code is browser independent. • ASP code can be viewed at the server using a text editor • The browser cannot directly view the source code of an ASP program

  14. ActiveX Data Objects (ADO) • Programming extension of ASP supported by Microsoft IIS for database connectivity. • Supports the following key features: • Independently-created objects. • Support for stored procedures. • Support for different cursor types. • Batch updating. • Support for limits on number of returned rows. • Designed as an easy-to-use interface to OLE DB.

  15. Getting User Input From a Form • Connection – establishes the link between the application program and the database • Recordset – contains data returned from a specific action on the database • Command – allows you to run commands against a database
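
The three roles on this slide (connection, command, recordset) have rough counterparts in most database APIs. The sketch below uses Python's `sqlite3` module purely as an analogy; it is not ADO, and the `student` table, its rows, and the function names are hypothetical.

```python
import sqlite3


def open_demo_connection() -> sqlite3.Connection:
    """Connection: the link between the program and the database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE student (name TEXT, major TEXT)")
    conn.executemany(
        "INSERT INTO student VALUES (?, ?)",
        [("Ann", "IST"), ("Bob", "CSE"), ("Cy", "IST")],
    )
    return conn


def students_in(conn: sqlite3.Connection, major: str) -> list:
    # Command: the statement to run, with a placeholder for user input.
    command = "SELECT name FROM student WHERE major = ? ORDER BY name"
    cursor = conn.execute(command, (major,))
    # The fetched rows play the role of a recordset.
    return [row[0] for row in cursor.fetchall()]


if __name__ == "__main__":
    print(students_in(open_demo_connection(), "IST"))
```

Passing the user's value as a parameter, rather than pasting it into the SQL text, keeps form input from being interpreted as SQL.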

  16. Extensible Markup Language (XML)

  17. Question: What’s the difference between the world of documents and databases?

  18. Documents vs Databases • Document world: plenty of small documents; usually static; implicit structure (section, paragraph); tagging; human friendly; content (form/layout, annotation); paradigms (“Save as”, WYSIWYG); meta-data (author name, date, subject) • Database world: a few large databases; usually dynamic; explicit structure (schema); records; machine friendly; content (schema, data, methods); paradigms (Atomicity, Consistency, Isolation, Durability); meta-data (schema description)

  19. What to do with them • Documents: editing, printing, spell-checking, counting words, retrieving, searching • Database: updating, cleaning, querying

  20. The thin line • The line between the document world and the database world is not clear. • In some cases, both approaches are legitimate. • An interesting middle ground is data formats -- of which XML is an example

  21. A common form of data extraction • Query: Find the names and telephones of all employees in IST • Input: <doc1> <employee> <name> John Doe </name> <contact-info> <address> … </address> <tel> 123 7456 </tel> <email> jd@psu.edu</email> </contact-info> <dept> IST </dept> </employee> <employee> … </employee> ... </doc1> • Output: John Doe 123 7456; Jane Dee 234 5678; …
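
The extraction on this slide can be sketched with Python's `xml.etree.ElementTree`. The sample document below is an assumption: the slide elides most of it, so Jane Dee's department and the third employee (Sam Poe, in CSE) are invented here only to show the filter working, and the address elements are left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data modelled on the slide's <doc1> fragment.
SAMPLE = """
<doc1>
  <employee>
    <name> John Doe </name>
    <contact-info><tel> 123 7456 </tel><email> jd@psu.edu</email></contact-info>
    <dept> IST </dept>
  </employee>
  <employee>
    <name> Jane Dee </name>
    <contact-info><tel> 234 5678 </tel></contact-info>
    <dept> IST </dept>
  </employee>
  <employee>
    <name> Sam Poe </name>
    <contact-info><tel> 345 6789 </tel></contact-info>
    <dept> CSE </dept>
  </employee>
</doc1>
"""


def names_and_tels(xml_text: str, dept: str) -> list:
    """Return (name, telephone) pairs for every employee in the given department."""
    root = ET.fromstring(xml_text)
    result = []
    for employee in root.findall("employee"):
        if employee.findtext("dept", default="").strip() == dept:
            name = employee.findtext("name", default="").strip()
            tel = employee.findtext("contact-info/tel", default="").strip()
            result.append((name, tel))
    return result


if __name__ == "__main__":
    for name, tel in names_and_tels(SAMPLE, "IST"):
        print(name, tel)
```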

  22. Lineage (WWW Consortium) • Standard Generalized Markup Language (SGML – Late 1980s) • Hypertext Markup Language (HTML – Early 1990s) • Extensible Markup Language (XML – Late 1990s) • Diagram axes: Ease of Use, Flexibility

  23. Need • A doctor wants to send your medical record to a specialist: <html> <p>Patient G. Washington is allergic to penicillin</p> </html> • HTML provides a way for all computers to read Internet documents, but how can a computer read the data?

  24. HTML • Lingua franca for publishing hypertext on the World Wide Web • Designed to describe how a Web browser should arrange text, images and push-buttons on a page. • Easy to learn, but does not convey structure. • Fixed tag set. • Example: <HTML> <HEAD><TITLE>Welcome to IST210</TITLE></HEAD> <BODY> <H1>Introduction</H1> <IMG SRC="ist.jpeg" WIDTH="200" HEIGHT="150"> </BODY> </HTML> • Figure callouts: opening tag, closing tag, “bachelor” tag, text (PCDATA), attribute name, attribute value

  25. The Structure of XML • XML consists of tags and text • Tags come in pairs: <date> ... </date> • They must be properly nested • Good: <date> <day> ... </day> ... </date> • Bad: <date> <day> ... </date> ... </day> • (You can’t do <i> ... <b> ... </i> ... </b> in HTML)
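
The nesting rule on this slide is something any XML parser enforces. As a small sketch, Python's `xml.etree.ElementTree` raises `ParseError` on the "bad" form; the helper name `is_well_formed` and the sample day value are assumptions for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(is_well_formed("<date><day>12</day></date>"))  # properly nested
    print(is_well_formed("<date><day>12</date></day>"))  # tags overlap
```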

  26. XML text • XML has only one “basic” type: text. • It is bounded by tags, e.g. <title> G. Washington </title> <year> 2001 </year> (2001 is still text) • XML text is called PCDATA (for parsed character data). • Its characters are Unicode, typically encoded as UTF-8 or UTF-16. • Later we shall see how new types are specified by XML-data
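
That "2001 is still text" shows up directly when the element is parsed: the parser hands back a string, and any conversion to a number is the application's job. A brief Python sketch, with the helper name `year_as_int` assumed for illustration:

```python
import xml.etree.ElementTree as ET


def year_as_int(xml_text: str) -> int:
    """Parse a <year> element and convert its text content to an integer."""
    element = ET.fromstring(xml_text)
    # element.text is a str (here with surrounding spaces), not a number.
    return int(element.text.strip())


if __name__ == "__main__":
    raw = ET.fromstring("<year> 2001 </year>").text
    print(type(raw).__name__, repr(raw))
    print(year_as_int("<year> 2001 </year>") + 1)
```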

  27. XML structure Nesting tags can be used to express various structures. E.g. A tuple (record) : <person> <name>G. Washington</name> <tel>(703) 111 1000</tel> <email>gw@mtvernon.com</email> </person>

  28. XML structure (cont.) • We can represent a list by using the same tag repeatedly: <addresses> <person> ... </person> <person> ... </person> <person> ... </person> ... </addresses>
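
Slides 27 and 28 together describe how database-style records map onto XML: one element per tuple, and a repeated element for a list. A minimal Python sketch of that mapping follows; the function name `to_addresses` is an assumption, and the second record in the demo is invented sample data.

```python
import xml.etree.ElementTree as ET


def to_addresses(records: list) -> str:
    """Turn (name, tel, email) tuples into an <addresses> list of <person> elements."""
    root = ET.Element("addresses")
    for name, tel, email in records:
        person = ET.SubElement(root, "person")  # one element per tuple
        ET.SubElement(person, "name").text = name
        ET.SubElement(person, "tel").text = tel
        ET.SubElement(person, "email").text = email
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_addresses([
        ("G. Washington", "(703) 111 1000", "gw@mtvernon.com"),
        ("A. Example", "(000) 000 0000", "a@example.com"),  # hypothetical row
    ]))
```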

  29. Terminology • The segment of an XML document between an opening and a corresponding closing tag is called an element. <person> <name> G Washington </name> <tel> (703) 111 1000 </tel> <tel> (703) 111 1001 </tel> <email> gw@mtvernon.com </email> </person> • Figure callouts: “element”; “element, a sub-element of”; “not an element”

  30. XML is tree-like • Root: person • Children: name (G Washington), tel ((703) 111 1000), tel ((703) 111 1001), email (gw@mtvernon.com)
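
The tree view of the person element can be reproduced by walking the parsed document recursively. This Python sketch prints one indented line per node; the function name `outline` is an assumption for the example.

```python
import xml.etree.ElementTree as ET

PERSON = (
    "<person><name> G Washington </name>"
    "<tel> (703) 111 1000 </tel><tel> (703) 111 1001 </tel>"
    "<email> gw@mtvernon.com </email></person>"
)


def outline(element, depth: int = 0) -> list:
    """Return one indented line per element: tag name, then its text if any."""
    text = (element.text or "").strip()
    lines = ["  " * depth + element.tag + (": " + text if text else "")]
    for child in element:  # children are visited in document order
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(outline(ET.fromstring(PERSON))))
```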

  31. Mixed Content An element may contain a mixture of sub-elements and PCDATA <airline> <name> Agony Airways </name> <motto> US’s <dubious> favorite</dubious> airline </motto> </airline> Data of this form is not typically generated from databases. It is needed for consistency with HTML.
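
Mixed content is where a tree API has to keep track of text that sits between sub-elements. In Python's `xml.etree.ElementTree`, text before the first child is stored in `.text` and text following a child in that child's `.tail`. The sketch below, with the assumed helper name `pieces`, splits the motto from this slide accordingly.

```python
import xml.etree.ElementTree as ET

MOTTO = "<motto>US’s <dubious>favorite</dubious> airline</motto>"


def pieces(element) -> list:
    """List the text runs of an element with one level of sub-elements."""
    parts = [element.text or ""]         # text before the first child
    for child in element:
        parts.append(child.text or "")   # text inside the child
        parts.append(child.tail or "")   # text after the child's closing tag
    return parts


if __name__ == "__main__":
    motto = ET.fromstring(MOTTO)
    print(pieces(motto))
    print("".join(motto.itertext()))
```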

  32. A Complete XML Document <?xml version="1.0"?> <person> <name> G Washington </name> <tel> (703) 111 1000 </tel> <email> gw@mtvernon.com </email> </person>

  33. Document Type Descriptors Imposing structure on XML documents

  34. Document Type Descriptors • Document Type Descriptors (DTDs) impose structure on an XML document. • There is some relationship between a DTD and a schema • The DTD is a syntactic specification.

  35. Example: The Address Book <person> <name> MacNiel, John </name> <greet> Dr. John MacNiel </greet> <addr>1234 Huron Street </addr> <addr> Rome, OH 98765 </addr> <tel> (321) 786 2543 </tel> <fax> (321) 786 2543 </fax> <tel> (321) 786 2543 </tel> <email> jm@abc.com </email> </person> • Figure callouts: exactly one name; at most one greeting; as many address lines as needed (in order); mixed telephones and faxes; as many as needed

  36. Specifying the structure • name to specify a name element • greet? to specify an optional (0 or 1) greet element • name,greet? to specify a name followed by an optional greet

  37. Specifying the structure (cont) • addr* to specify 0 or more address lines • tel | fax a tel or a fax element • (tel | fax)* 0 or more repeats of tel or fax • email* 0 or more email elements

  38. Specifying the structure (cont) • So the whole structure of a person entry is specified by: name, greet?, addr*, (tel | fax)*, email* • This is known as a regular expression.
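
Because the content model is a regular expression over element names, it can be checked with an ordinary regex engine. The Python sketch below rewrites `name, greet?, addr*, (tel | fax)*, email*` as a pattern over the space-terminated sequence of child tag names. This is an illustration of the idea, not how a validating XML parser is implemented; `PERSON_MODEL` and `matches_person_model` are names assumed for the example.

```python
import re
import xml.etree.ElementTree as ET

# name, greet?, addr*, (tel | fax)*, email*
# Each tag name is followed by a single space in the sequence being matched.
PERSON_MODEL = re.compile(r"name (greet )?(addr )*((tel|fax) )*(email )*")


def matches_person_model(xml_text: str) -> bool:
    """Check the order and count of a <person> element's children."""
    person = ET.fromstring(xml_text)
    sequence = "".join(child.tag + " " for child in person)
    return PERSON_MODEL.fullmatch(sequence) is not None


if __name__ == "__main__":
    ok = ("<person><name>n</name><greet>g</greet><addr>a</addr><addr>b</addr>"
          "<tel>1</tel><fax>2</fax><tel>3</tel><email>e</email></person>")
    bad = "<person><name>n</name><greet>g</greet><greet>h</greet></person>"
    print(matches_person_model(ok), matches_person_model(bad))
```

The address-book entry from slide 35 has the child sequence name, greet, addr, addr, tel, fax, tel, email, which this pattern accepts.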

  39. Summary • XML is a new data format. Its main virtues are widespread acceptance and the ability to handle semistructured data (data without schema) • The emerging combination of database and XML provides a powerful tool for delivering content over the web