Cognitionis
The little I know

XML (eXtensible Markup Language)


A markup language is an artificial language that annotates text with instructions about its structure, its content, or how it should be displayed; in other words, it adds meta-information (information about information). Examples: TeX, SGML, HTML, XML, XHTML, SOAP.

SGML (Standard Generalized Markup Language, an ISO standard since 1986) is a metalanguage for defining document markup languages. It descends from IBM’s GML, developed in the 1960s by Charles Goldfarb, Edward Mosher and Raymond Lorie. Example of a language defined in SGML: HTML.

XML (eXtensible Markup Language) is a general-purpose specification for creating custom markup languages. It started as a simplified subset of SGML and is designed to be relatively human-legible. In 1996 the W3C decided to rework SGML into something stricter and much simpler to use and parse, and recruited Jon Bosak from Sun as the project leader. In 1998 the W3C published XML as a fee-free open standard. Examples: XHTML, RSS, SVG, SOAP.


DTD (Document Type Definition), now largely superseded by XML Schema, defines the structure and syntax of an XML document: the allowed tags and their order, hierarchy and nesting. A DTD uses its own declaration syntax rather than XML. It can be declared internally in the XML document or as an external reference (file.dtd). Example persons.dtd:

<!ELEMENT persons (person*)>   <!-- * = [0,n]: zero or more person elements -->
<!ELEMENT person (name,sex?)>  <!-- ? = [0,1]: sex is optional -->
<!ELEMENT name (#PCDATA)>      <!-- requires #PCDATA (parsed character data) -->
<!ELEMENT sex (#PCDATA)>

Example person.xml:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE persons SYSTEM "persons.dtd">
<persons>
  <person>
    <name>John Smith</name>
    <sex>male</sex>
  </person>
</persons>
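
As a rough illustration of what the DTD constrains, the following Python sketch parses the example document with the standard library's xml.etree.ElementTree and checks the content model by hand. ElementTree does not validate against a DTD, so check_persons is a hypothetical helper written only for this note, not a real validator.

```python
import xml.etree.ElementTree as ET

# The example document, inlined so the sketch is self-contained.
PERSON_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE persons SYSTEM "persons.dtd">
<persons>
  <person>
    <name>John Smith</name>
    <sex>male</sex>
  </person>
</persons>
"""


def check_persons(root):
    """Hand-written check mirroring persons.dtd (hypothetical helper)."""
    # <!ELEMENT persons (person*)>: only person children, zero or more.
    if root.tag != "persons":
        return False
    for person in root:
        if person.tag != "person":
            return False
        # <!ELEMENT person (name,sex?)>: name first, then an optional sex.
        child_tags = [child.tag for child in person]
        if child_tags not in (["name"], ["name", "sex"]):
            return False
    return True


# The parser reads the DOCTYPE line but does not load persons.dtd.
root = ET.fromstring(PERSON_XML)
is_valid = check_persons(root)
names = [name.text for name in root.iter("name")]
print(is_valid, names)
```

A validating parser would do this check automatically from the DTD; the point here is only to show how each declaration maps to a rule.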

Limitations: DTDs do not support namespaces or data types. Solution: XML Schema (XSD), from the W3C.

DTDs -> XSDs (XML Schemas)

XS, XSD, XSDL (XML Schema Definition Language) (W3C 2001)

Solves the DTD limitations by supporting namespaces and data types. Example:

<?xml version="1.0" encoding="utf-8" ?>
<xs:schema elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Name" type="xs:string" />
        <xs:element name="Sex">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:enumeration value="Male" />
              <xs:enumeration value="Female" />
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
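
To make the namespace and data-type points concrete, here is a small Python sketch that reads the schema with xml.etree.ElementTree and extracts the values allowed for Sex. The Python standard library has no XSD validator, so this only inspects the schema; sex_is_allowed is a hypothetical helper, not part of any library.

```python
import xml.etree.ElementTree as ET

# The schema from this section, inlined so the sketch is self-contained.
XSD = b"""<?xml version="1.0" encoding="utf-8" ?>
<xs:schema elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Name" type="xs:string" />
        <xs:element name="Sex">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:enumeration value="Male" />
              <xs:enumeration value="Female" />
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

# Map the xs: prefix to the XML Schema namespace for the path expressions.
NS = {"xs": "http://www.w3.org/2001/XMLSchema"}

schema = ET.fromstring(XSD)
sex_decl = schema.find(".//xs:element[@name='Sex']", NS)
allowed_sex = [e.get("value") for e in sex_decl.findall(".//xs:enumeration", NS)]


def sex_is_allowed(value):
    """Check a value against the enumeration (hypothetical helper)."""
    return value in allowed_sex


print(allowed_sex, sex_is_allowed("Male"), sex_is_allowed("male"))
```

Note that the enumeration is case-sensitive, so the lowercase "male" used in person.xml would not satisfy this particular schema.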

XSL (eXtensible Stylesheet Language):

XSL consists of three parts:

  • XSLT – a language for transforming XML documents
  • XPath – a language for navigating in XML documents
  • XSL-FO – a language for formatting XML documents

XSLT (eXtensible Stylesheet Language Transformations)

XPath (XML Path Language)… navigates XML… study TUTORIAL (1,2,3,4)
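
A minimal sketch of XPath-style navigation, using the limited XPath subset that Python's xml.etree.ElementTree supports (full XPath needs a third-party library). The second and third person entries are made-up sample data added only to make the queries interesting.

```python
import xml.etree.ElementTree as ET

# Sample data: the first person comes from person.xml, the others are invented.
root = ET.fromstring(
    "<persons>"
    "<person><name>John Smith</name><sex>male</sex></person>"
    "<person><name>Jane Doe</name><sex>female</sex></person>"
    "<person><name>Alex Roe</name></person>"
    "</persons>"
)

# Path steps: every name that is a child of a person under the root.
all_names = [n.text for n in root.findall("./person/name")]

# Predicate on child text: only persons whose sex element is "male".
male_names = [n.text for n in root.findall("./person[sex='male']/name")]

# Positional predicate: XPath positions start at 1, so this is the second person.
second_name = root.find("./person[2]/name").text

print(all_names, male_names, second_name)
```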

XSL-FO (XSL Formatting Objects): formatting


XQuery as an extension of XPath: Although both XPath and XQuery perform some of the same functions, XPath provides simplicity and XQuery provides additional power and flexibility.

  • XQuery is the language for querying XML data
  • XQuery is to XML what SQL is to relational databases
  • XQuery is built on XPath expressions
  • XQuery is supported by all the major database engines (IBM, Oracle, Microsoft, etc.)
  • XQuery is a W3C Recommendation

PARSERS

IMPORTANT: SAX vs DOM

SAX (Simple API for XML): fast, sequential and event-driven; the document is read once, from top to bottom. (Originally a Java API.)
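
A minimal sketch of the SAX style using Python's xml.sax module (SAX started in Java, but the model is the same). The parser walks the document once and calls back into a handler; NameHandler and its attributes are names invented for this example.

```python
import xml.sax

PERSON_XML = b"""<persons>
  <person>
    <name>John Smith</name>
    <sex>male</sex>
  </person>
</persons>
"""


class NameHandler(xml.sax.ContentHandler):
    """Collects the text of every name element as the parser streams by."""

    def __init__(self):
        super().__init__()
        self.events = []      # order in which start/end events arrive
        self.names = []       # collected name texts
        self._in_name = False
        self._buffer = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))
        if name == "name":
            self._in_name = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_name:
            self._buffer.append(content)

    def endElement(self, name):
        self.events.append(("end", name))
        if name == "name":
            self.names.append("".join(self._buffer))
            self._in_name = False


handler = NameHandler()
xml.sax.parseString(PERSON_XML, handler)
print(handler.names)
```

Nothing is kept in memory except what the handler chooses to store, which is why SAX is fast but cannot go back to an earlier element.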

DOM (Document Object Model): slower, but builds a tree structure in memory that allows forward and backward navigation and random access; a W3C standard. The XML DOM defines a standard way of accessing and manipulating XML documents. Knowing the XML DOM is a must for anyone working with XML. The DOM represents an XML document as a tree structure, with elements, attributes, and text as nodes:

[Figure: DOM node tree]
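
For contrast with SAX, a minimal DOM sketch using Python's xml.dom.minidom, one of several DOM implementations. It shows the tree being navigated forward, backward and upward, and then modified in place. The document is written without whitespace between tags so that sibling navigation lands on elements rather than on whitespace text nodes.

```python
from xml.dom import minidom

# The whole document is loaded into memory as a tree of nodes.
doc = minidom.parseString(
    "<persons><person><name>John Smith</name><sex>male</sex></person></persons>"
)

name = doc.getElementsByTagName("name")[0]

# Text is itself a node: the first child of the name element.
name_text = name.firstChild.data

# Random access: move forward, backward and upward from any node.
sex = name.nextSibling
back = sex.previousSibling
parent_tag = name.parentNode.tagName

# Manipulation: change the text node and serialize the tree again.
sex.firstChild.data = "female"
result = doc.documentElement.toxml()

print(name_text, parent_tag, result)
```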

XML Utils:

IMPORTANT. See http://www.tutorialized.com/view/tutorial/Improve-performance-of-XML-apps-using-Xerces-C/35630

http://www.ibm.com/developerworks/xml/library/x-xercc/?ca=dgr-xw766x-xercc

http://www.ibm.com/developerworks/xml/library/x-xercc2/


BNF (Backus-Naur Form): BNF is a formal meta-syntax used to express context-free grammars (CFGs). It is one of the most commonly used meta-syntactic notations for specifying the syntax of programming languages, command sets, etc. However, pure BNF is rather limited, so its two variants EBNF and ABNF have become more popular.
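
To give a feel for the notation, here is a tiny made-up BNF grammar for sums of unsigned integers, with a Python recursive-descent recognizer that follows it rule by rule. The grammar and the function names are invented for this example only.

```python
# Grammar (BNF), invented for illustration:
#   <sum>    ::= <number> | <number> "+" <sum>
#   <number> ::= <digit> | <digit> <number>
#   <digit>  ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"

DIGITS = "0123456789"


def parse_number(text, pos):
    """Match <number> at pos; return the position after it, or -1 on failure."""
    end = pos
    while end < len(text) and text[end] in DIGITS:
        end += 1
    return end if end > pos else -1


def parse_sum(text, pos=0):
    """Match <sum> at pos; return the position after it, or -1 on failure."""
    end = parse_number(text, pos)
    if end == -1:
        return -1
    # Second alternative: <number> "+" <sum>
    if end < len(text) and text[end] == "+":
        return parse_sum(text, end + 1)
    # First alternative: a single <number>
    return end


def matches(text):
    """True if the whole string is derivable from <sum>."""
    return parse_sum(text) == len(text)


print(matches("12+7+305"), matches("12+"), matches("1++2"))
```

Each grammar rule becomes one function, and each alternative becomes a branch, which is why BNF maps so directly onto hand-written parsers.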


TIMEML…