plope


Why Not XML

Phillip Eby explains with great clarity why XML isn't always a good idea when you're using Python.

In this article, Phillip explains why XML, while a godsend for Java programmers, is regarded with some disdain by a lot of Pythonistas.

Created by chrism
Last modified 2004-12-03 11:18 AM

I can hear a wailing noise

While Paul Everitt is several hours' drive and two countries away from me, I thought I heard a faint wailing and gnashing of teeth coming from a westerly direction...

misguided elitism among Python programmers about XML

Since PJE had a nice rant, I'll have a rant back:

I think a lot of the disdain for XML from Python programmers is nothing more than elitism. *Misguided* elitism. The other part that I suspect plays a role for some people is finding an excuse not to learn more about it (see, XML sucks, Python is superior, so me not knowing a lot about it is fine). Finally, it's ignoring that there is more than one way to do the same thing, and each has its own benefits and drawbacks. There are drawbacks to using XML, and there are drawbacks to using pure Python. Yes, there are misapplications of XML; there are probably very many. There is a lot of chaff. That doesn't mean it isn't applied successfully in very many cases.

The only reasons to use XML according to PJE are:

"If you're not implementing an existing XML standard for interoperability reasons, creating some kind of import/export format, or creating some kind of XML editor or processing tool, then Just Don't Do It."

and

"The only exception to this is if your target audience really really needs XML for some strange reason. Like, they refuse to learn Python and will only pay you if you use XML, or if you plan to give them a nice GUI for editing the XML, and the GUI in question is something that somebody else wrote for editing XML and you get to use it for free. There are also other, very rare, architectural reasons to need XML."

The tone of the text gives the impression that the set of problems listed is a small class of problems that can safely be ignored by smug and superior Python programmers. Nothing could be further from the truth -- just the class of problems described is in fact huge. A large fraction of IT problems has to do with interoperability; solutions very frequently involve tying a number of systems together. Like it or not, XML is used in a lot of cases where interoperability between disparate systems is required. There are strategic investment concerns (is my data future-proof? Can I hire experts?) and decoupling concerns (I want this Java system to work with this data, and this Python system too, oh, and there's this browser-based JavaScript application too). XML can be helpful in these areas and deciding to use it is in many cases an eminently sensible decision, not a "strange reason".

XML is standardized, cross-platform, cross-framework, and cross-programming language. Many of the surrounding specifications are well-implemented in a diversity of programming languages. Using XML, like the architecture of the web, is a way to manufacture serendipity, to borrow a phrase from Jon Udell. If your data is in XML, suddenly you can apply query languages and transformation tools that you couldn't apply if you hadn't thought about XML representations. Suddenly you can use your data in ways you couldn't use before.
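To make the query-language point concrete, here is a minimal sketch using only Python's standard library. The order data and its element and attribute names are invented for illustration; ElementTree supports only a limited subset of XPath, so a fuller XPath or XSLT engine would be needed for more complex queries.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: the <orders>/<order> vocabulary is made up for this example.
ORDERS_XML = """
<orders>
  <order id="1" status="shipped"><customer>alice</customer><total>30.00</total></order>
  <order id="2" status="pending"><customer>bob</customer><total>12.50</total></order>
  <order id="3" status="shipped"><customer>carol</customer><total>7.25</total></order>
</orders>
""".strip()

# XPath-style path with an attribute predicate (part of ElementTree's XPath subset).
SHIPPED = "./order[@status='shipped']"


def shipped_customers(xml_text):
    """Return the customer names of all shipped orders, in document order."""
    root = ET.fromstring(xml_text)
    return [order.findtext("customer") for order in root.findall(SHIPPED)]


def shipped_total(xml_text):
    """Sum the <total> values of all shipped orders."""
    root = ET.fromstring(xml_text)
    return sum(float(order.findtext("total")) for order in root.findall(SHIPPED))


if __name__ == "__main__":
    print(shipped_customers(ORDERS_XML))
    print(shipped_total(ORDERS_XML))
```

The same query string could be handed to an XPath implementation in Java or JavaScript without changing the data, which is the kind of reuse argued for here.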

I can list a few other architectural places where the use of XML can make sense that seem to fall outside of PJE's description, though he of course intelligently covered himself by saying there are other rare cases where it might make sense.

Having a standard, neutral content representation like XML can make sense when you need to decouple parts of your application from each other, because you have either different teams implementing the different parts, or you're implementing the different parts on very different platforms, or both. In addition, making the content representation explicit in the form of XML may aid comprehensibility of the system, as suddenly the data flow becomes a lot more clear and can be treated as something separate.

An example: visual template designers take the XML output of some processing component of the application, developed by quite different programmers. They do not need to learn any APIs that may call deep into the backend application and could be used the wrong way; they only need to worry about extracting the information they need for presentation from an XML structure.

Another place where XML can make sense is if you need a domain specific language. These often make sense -- it can force people to think declaratively about a problem instead of finding dirty ad-hoc ways out all the time. Using XML can make your life easier, as the parser is already available, along with many other possible tools. Of course sometimes designing your own grammar from scratch may be worth doing, but in many cases it's not necessary and XML is good enough. It's a useful tool in the toolbox in that case.
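As a rough illustration of the "parser is already available" argument, the sketch below defines a tiny declarative rule language in XML and interprets it with the standard library. The `<rules>` and `<field>` elements and their `name`, `type` and `required` attributes are hypothetical, invented only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical mini-language: each <field> declares a name, a type and whether it is required.
RULES_XML = """
<rules>
  <field name="title" type="str" required="true"/>
  <field name="pages" type="int" required="false"/>
</rules>
""".strip()

# Map the type names used in the rule file onto Python types.
TYPES = {"str": str, "int": int}


def load_rules(xml_text):
    """Parse the rule document into (name, type, required) tuples."""
    root = ET.fromstring(xml_text)
    return [
        (field.get("name"), TYPES[field.get("type")], field.get("required") == "true")
        for field in root.findall("field")
    ]


def validate(record, rules):
    """Check a dict against the rules and return a list of error messages."""
    errors = []
    for name, expected_type, required in rules:
        if name not in record:
            if required:
                errors.append(f"missing required field: {name}")
        elif not isinstance(record[name], expected_type):
            errors.append(f"wrong type for field: {name}")
    return errors


if __name__ == "__main__":
    rules = load_rules(RULES_XML)
    print(validate({"title": "Why Not XML", "pages": 3}, rules))
    print(validate({"pages": "3"}, rules))
```

No grammar or tokenizer had to be written; the cost is the verbosity of the XML syntax, which is the trade-off under debate.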

So, Python people, resistance is futile. Be assimilated by the XML collective. Feel free to be smarter about it than most: stay sensible and stay critical about where XML makes sense and where it doesn't, but throw out the disdain that makes you unable to see valid opportunities for its use.

PJE is very smart. I am sure his view of it all is actually a bit more nuanced than his tone. Looking at Chandler source code (as I suspect PJE was doing) explains the outburst pretty well. But don't let it fool any Python programmer into thinking they can safely ignore XML because it is not worth considering by someone of superior abilities with a superior programming language. Most likely, you cannot and you shouldn't.