I've finally scraped together some time to finish and release a new version of ijson: 1.0. Here's what's new:
- support for YAJL 2
- pure Python parser
- compatibility with Python 2 and Python 3
I've already posted about this in detail not that long ago. To summarize, ijson now works with both existing versions of YAJL and also has a pure Python parser, which is probably the fastest among pure Python JSON parsers and runs faster than the YAJL backend under PyPy.
One interesting thing missing from that post is the solution to testing all those parsing backends using the same set of tests.
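As a minimal sketch of one common way to run the same tests against several backends (not necessarily what ijson's own test suite does): put the tests in a plain mixin class and generate a real TestCase subclass per backend. The two backends below are hypothetical stand-ins built on the standard json module so the example runs anywhere; the names StdlibBackend, StrippingBackend, ParseTests and TEST_CASES are made up for illustration.

```python
import json
import unittest


# Hypothetical stand-ins for real parsing backends. Each one exposes the
# same parse(text) function, which is all the shared tests rely on.
class StdlibBackend(object):
    @staticmethod
    def parse(text):
        return json.loads(text)


class StrippingBackend(object):
    @staticmethod
    def parse(text):
        return json.JSONDecoder().decode(text.strip())


BACKENDS = {'Stdlib': StdlibBackend, 'Stripping': StrippingBackend}


class ParseTests(object):
    """Backend-agnostic tests.

    This is deliberately not a TestCase, so the test runner never
    collects it on its own (there is no backend to test yet).
    """
    backend = None

    def test_object(self):
        self.assertEqual(self.backend.parse('{"a": [1, 2]}'), {'a': [1, 2]})

    def test_invalid(self):
        with self.assertRaises(ValueError):
            self.backend.parse('{"a": ')


# Generate one real TestCase subclass per backend.
TEST_CASES = {}
for name, backend in BACKENDS.items():
    classname = name + 'ParseTests'
    TEST_CASES[classname] = type(
        classname, (ParseTests, unittest.TestCase), {'backend': backend})

# Publish the generated classes at module level so test discovery finds them.
globals().update(TEST_CASES)
```

Saved as a test module, this can be run with python -m unittest; every test method then runs once per backend, and a backend that fails to import could simply be left out of BACKENDS.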
I believe we're already past the point where there's any question that all new Python code should work with Python 3. However, since I hadn't released anything for a while, this was actually my first encounter with the problem. Having looked at what others are doing (namely, requests and Django), I get that the original idea of a one-time code conversion using 2to3 doesn't work for libraries: you have to support both versions for some time. And from what little I know, it can be done in basically two ways:
- sacrifice beauty by having bilingual hybrid code, or
- maintain two variants of the same code base
After pondering it for some time I figured that I hate the first approach less. Ijson is not particularly big and it's all temporary anyway. I hope that in a year or so, when most Python code has switched and Ubuntu ships Python 3 by default, I'll be able to drop 2.x support (and also use the new yield from in Python 3.3!).
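To illustrate why yield from is tempting for a library built out of chained generators, here is a small sketch. The generator names events, items_portable and items_py33 are hypothetical and are not ijson's API.

```python
def events(chunks):
    # Hypothetical low-level generator: yields one "event" per input chunk.
    for chunk in chunks:
        yield chunk.upper()


def items_portable(chunks):
    # Works on both Python 2 and Python 3: re-yield every value by hand.
    for event in events(chunks):
        yield event


def items_py33(chunks):
    # Python 3.3+ only: delegate to the inner generator in one statement.
    yield from events(chunks)
```

Both wrappers produce the same sequence of values; the second one is just shorter, but it is a syntax error on Python 2, which is why it has to wait until 2.x support is dropped.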
The whole bilinguality patch ended up being of quite manageable size and consists mostly of bytes/unicode type casting. All the helper functions are neatly collected in a single module, compat.py, an approach borrowed from Kenneth Reitz's requests.
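For a rough idea of what such a module looks like, here is a minimal sketch of a compat-style module. The helper names IS_PY2, text_type, bytes_type, to_text and to_bytes are assumptions made for illustration, not the actual contents of ijson's compat.py.

```python
import sys

IS_PY2 = sys.version_info[0] < 3

if IS_PY2:
    text_type = unicode  # noqa: F821 (this branch only runs on Python 2)
    bytes_type = str
else:
    text_type = str
    bytes_type = bytes


def to_text(value, encoding='utf-8'):
    """Return value as a text string, decoding it if it is a byte string."""
    if isinstance(value, bytes_type):
        return value.decode(encoding)
    return value


def to_bytes(value, encoding='utf-8'):
    """Return value as a byte string, encoding it if it is a text string."""
    if isinstance(value, text_type):
        return value.encode(encoding)
    return value
```

The rest of the code base then imports these helpers instead of checking the Python version itself, so all the version-specific branching stays in one place and is easy to delete later.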
The good thing is that the compatibility code doesn't measurably affect performance. However, the pure Python parser is a little (about 6%) slower under Python 3 than under Python 2 (which, as I understand, is the usual story).