XML is often said to be based on HTML. (More accurately, HTML and XML are
both derived from SGML.)
HTML is a *markup language*. It takes a big block of text, and inserts
one or two small marks to indicate section headings, hyperlinks, text
emphasis, and so on. But basically it's a textual document, just with a
few formatting marks.
XML is intended to be a universal way to represent all data. Which isn't
nearly the same thing. It's called eXtensible Markup Language, but it's
really eXtensible Container Format.
It is no secret that human-readable file formats tend to be much, much
more portable (and extensible). Suddenly you don't have to deal with
things like big-endian vs little-endian, signed vs unsigned, 32-bit vs
64-bit, etc. (No, instead you have to deal with the details of ASCII
number formatting, e.g., is ".5" acceptable? Or must it be "0.5"?)
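
To make the trade-off concrete, here's a rough Python sketch. The value and variable names are mine, and the JSON check is only there as one example of a text grammar that happens to reject ".5"; it isn't meant to say anything about XML itself.

```python
import json
import struct

# Binary: the same 32-bit integer has two possible byte layouts.
value = 1
big = struct.pack(">i", value)      # big-endian bytes
little = struct.pack("<i", value)   # little-endian bytes

# Reading big-endian bytes as little-endian silently gives the wrong number.
misread = struct.unpack("<i", big)[0]

# Text: no byte-order problem, but now the number *syntax* is the question.
# Python's float() is happy with either spelling...
python_accepts = float(".5") == float("0.5")

# ...whereas the JSON grammar insists on a digit before the decimal point.
try:
    json.loads(".5")
    json_accepts = True
except json.JSONDecodeError:
    json_accepts = False
```

So the portability problem doesn't vanish, it just moves from byte layout to agreeing on a number grammar.
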
It's also no secret that textual formats are less efficient. (But hey,
it never stopped PostScript!)
The nice thing about XML is that, since it's a standard, anybody who
wants to can make up some format based on XML, and then anyone who
understands XML has some small chance of figuring out what it all means.
There are standard XML parsing and processing libraries. There are
standard tools for searching, sorting and transforming XML into other XML.
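
As a small illustration of the "standard libraries" point, here's a sketch using Python's built-in `xml.etree.ElementTree`. The `<library>` document and its element names are invented for the example; any made-up XML format would do just as well.

```python
import xml.etree.ElementTree as ET

# A made-up format; the parser needs no prior knowledge of it.
DOC = """<library>
  <book year="2005"><title>Beta</title></book>
  <book year="1999"><title>Alpha</title></book>
  <book year="2005"><title>Gamma</title></book>
</library>"""

root = ET.fromstring(DOC)

# Searching: a limited XPath subset is supported out of the box.
from_2005 = [b.findtext("title") for b in root.findall("book[@year='2005']")]

# Sorting: elements are ordinary objects, so sorted() works on them.
by_year = sorted(root.findall("book"), key=lambda b: int(b.get("year")))

# Transforming XML into other XML: build a new tree from the old one.
index = ET.Element("index")
for book in by_year:
    entry = ET.SubElement(index, "entry", year=book.get("year"))
    entry.text = book.findtext("title")

result = ET.tostring(index, encoding="unicode")
```

None of this code knows what a "book" is. It only knows XML, which is more or less the selling point.
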
The problem is... sometimes XML isn't a good fit. From what I can tell,
SVG works reasonably well. But something like MathML is... impossible to
read or write by hand. It's just absurd. The format is clearly and
obviously designed for ease of machine manipulation, not for humans.
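
For a feel of what that means, here's my own attempt at presentation MathML for the tiny expression "x^2 + 1", wrapped in Python so the elements can be counted. Treat the markup as a sketch rather than canonical MathML; a real tool might emit it slightly differently.

```python
import xml.etree.ElementTree as ET

# What a human would type.
plain = "x^2 + 1"

# Roughly the same expression in presentation MathML.
MATHML = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mrow>"
    "<msup><mi>x</mi><mn>2</mn></msup>"
    "<mo>+</mo>"
    "<mn>1</mn>"
    "</mrow>"
    "</math>"
)

root = ET.fromstring(MATHML)

# Count every element in the tree, including the root.
element_count = sum(1 for _ in root.iter())
```

That's seven elements for three symbols and an exponent. It's trivial for a program to walk, and miserable to type.
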
There is also the minor detail that XML is actually quite a lot more
complex than most people realise. Most people think that "XML" just
means "write stuff in little angle brackets". In fact there is much,
much more to it than that. It's quite unnecessarily complicated, in fact!
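
Here's a small taste of the "more to it" part, again as a Python sketch with an invented document and invented namespace URIs. It only touches namespaces, character/entity references and CDATA; DTDs, schemas, processing instructions and the rest are left out.

```python
import xml.etree.ElementTree as ET

# The namespace URIs here are placeholders made up for the example.
DOC = """<doc xmlns="http://example.com/ns" xmlns:x="http://example.com/other">
  <item x:id="1">a &lt; b &#38; c</item>
  <item><![CDATA[<not a tag>]]></item>
</doc>"""

NS = "{http://example.com/ns}"
root = ET.fromstring(DOC)

# Namespaces: the element is not really called "item" any more,
# so the naive lookup finds nothing.
naive = root.findall("item")
items = root.findall(NS + "item")

# Entity and character references are decoded by the parser.
first_text = items[0].text

# Prefixed attributes get a namespace-qualified name too.
first_attrs = items[0].attrib

# CDATA sections are just another way of writing text.
second_text = items[1].text
```

That's three separate mechanisms in a five-line document, and none of them are the angle brackets.
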