Thursday, April 14, 2011

Manipulating XML Using Python

I work with XML-related content on a day-to-day basis at work. I come from a .NET background and have written dozens of applications that use the DOM to manipulate XML. Recently, I've started broadening my horizons to include more languages. I've now written a few applications in Python that do tasks similar to my .NET applications, and there's one area that I always find lacking: XML manipulation with etree. Perhaps I'm mistaken, but it appears from community pages that etree is the de facto standard in Python for manipulating XML. Sure, it does *most* things correctly, but every once in a while I can't help but stop and think, "this was a whole lot easier with such-and-such method in .NET," or, "why does etree.xpath() work when etree.find() doesn't?" Why are there two ways to do essentially the same thing in the same class library anyway?
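One likely explanation for the find()/xpath() split: in lxml, xpath() evaluates full XPath 1.0 expressions, while find() and findall() accept only the smaller ElementPath subset that the standard library's xml.etree.ElementTree also implements. Here is a minimal sketch of that limit using only the standard library (which has no xpath() method at all). The catalog/item document and its ids are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this sketch.
root = ET.fromstring(
    "<catalog>"
    "<item id='a1'>first</item>"
    "<item id='b2'>second</item>"
    "</catalog>"
)

# Supported by find()/findall(): an exact attribute match.
exact = root.findall(".//item[@id='b2']")

# Not supported: XPath functions such as contains().
# ElementPath rejects the predicate instead of evaluating it.
try:
    root.findall(".//item[contains(@id,'2')]")
    function_predicate_ok = True
except SyntaxError:
    function_predicate_ok = False

# Fallback without xpath(): do the filtering in plain Python.
fallback = [el for el in root.iter("item") if "2" in el.get("id", "")]

print([el.text for el in exact])
print(function_predicate_ok)
print([el.text for el in fallback])
```

So an expression that behaves under xpath() can fail under find() simply because it uses a part of XPath that the lighter path language never implemented.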

One area where etree really lacks cohesive support is mixed-content XML, where text and child elements are interleaved inside the same parent (Some Text some more text even more text.). Dealing with tails and leading text in this sort of situation is a nightmare, yet this kind of content is completely normal in the XML I work with.
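To make the tail problem concrete, here is a small sketch using the standard library's xml.etree.ElementTree. The p and b tags are my assumption, since the markup in the example above did not survive, and remove_keep_tail is a hypothetical helper of my own, not part of any library.

```python
import xml.etree.ElementTree as ET

# Assumed markup: the <p> and <b> tags are illustrative only.
para = ET.fromstring("<p>Some Text <b>some more text</b> even more text.</p>")
bold = para.find("b")

# The text is split across three places:
#   para.text -> "Some Text "         (before the first child)
#   bold.text -> "some more text"     (inside the child)
#   bold.tail -> " even more text."   (after the child, but stored ON the child)
full_text = "".join(para.itertext())


def remove_keep_tail(parent, child):
    """Remove child from parent without losing the text that follows it.

    A plain parent.remove(child) would drop child.tail as well, so the tail
    is first moved onto the previous sibling's tail or the parent's text.
    """
    siblings = list(parent)
    index = siblings.index(child)
    tail = child.tail or ""
    if index == 0:
        parent.text = (parent.text or "") + tail
    else:
        previous = siblings[index - 1]
        previous.tail = (previous.tail or "") + tail
    parent.remove(child)


remove_keep_tail(para, bold)
result = ET.tostring(para, encoding="unicode")
print(full_text)
print(result)
```

In a DOM, the trailing text would be its own sibling node. Here it hangs off the preceding element, which is why even a simple removal needs this kind of bookkeeping.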

Maybe I'm going about this the wrong way. Maybe there's a better option out there that I haven't considered yet. Maybe I'm just not used to seeing the DOM in a Pythonic way. What do you think?
