Corporations have a tremendous amount of stored information. On top of this,
new information is being created every day. A small but critical portion of
this information is stored in highly structured and well-defined formats in
relational databases. However, most of the information is on paper, in
e-mail, in word processing documents, in spreadsheets, in PDF files, in
engineering diagrams, and so on.
Ever since the initial XML draft in 1996, there has been an ongoing
discussion of the semantic Web. A Google search for the exact expression "the
semantic Web" returns about 1.2 million Web pages. Clearly there has been and
continues to be a lot of discussion about the semantic Web. However, the
semantic Web is still being worked on. This is mainly because very little
information on the Web has been semantically tagged. It may be more
prudent to start on a smaller s...
This article covers the use of XML in building a data warehouse. For the most
part, XML has been promoted as a mechanism for exchanging relatively small
amounts of data, such as orders and shipping documents. In this article I
describe an approach we used for transferring large sets of data (e.g., all
the accounts in a bank). The three key concepts are the definition of fields
using XML, the use of common metadata, and the use of an external CDATA
section. Before covering the details, I'll introduce some data warehouse
issues and show why XML is a good solution.
My team built ...
Will many of the features in XML Schema be widely used? In particular, I
agree that it is better to have an XML language for specifying document
layout rather than the DTD language. On the other hand, I am not sure that
the document layout should be strongly typed. The nightmare scenario is where
a customer cannot place a large order because an XML document is invalid.
Assume a company has an average order size of $50,000 with a current
order range of $3,000 to $220,000. It would be reasonable to set a criterion
in the Purchase Order Schema for this company to set a total...