Yesterday I released version 2.0 of the streaming JSON parser ijson. It mostly includes bug fixes accumulated over the last year, and the only reason to change the major part of the version number was that `import ijson` doesn't do any discovery magic anymore.
Previously, when you did `import ijson`, it first went on a trial-and-error search for the latest version of the C library yajl and, if none was found, fell back to the Python backend. This approach proved to be buggy and unpredictable: simply moving your app into another environment might have introduced different behavior, like being significantly slower on a machine without yajl or exposing bugs present in one backend but not the other.
So, following the "explicit is better than implicit" commandment, I dropped the discovery: `import ijson` now always loads the safe pure Python backend. You can explicitly import any backend with `import ijson.backends.<name> as ijson`.
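If an application still wants the old "fast backend first, Python as a fallback" preference, it can spell that choice out itself. The following is a minimal sketch, not part of ijson: `first_importable` is a hypothetical helper, the backend module names `yajl2` and `python` are assumed to exist in your installed version, and it assumes an unavailable backend surfaces as an `ImportError`.

```python
import importlib


def first_importable(module_names):
    """Return the first module in module_names that imports, or None."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Assumption: a backend whose C library is missing fails this way.
            continue
    return None


# The preference order is now an explicit decision made by the application.
# The result is None if neither module can be imported.
ijson = first_importable(["ijson.backends.yajl2", "ijson.backends.python"])
```

The difference from the old behavior is that the order lives in your own code, so moving to another environment cannot silently change it behind your back.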
You might argue that `import ijson` is still not explicit enough, but I didn't want to force users to always use a full backend name. Because "practicality beats purity".
- Fixed breakage when a multi-byte UTF-8 character was split by a buffer boundary.
- Python backend now accepts custom buffer size as an argument.
- Always return integer values as 'type int' even if spelled like
- Use wheels as a distribution format.
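The first fix in the list is easy to reproduce outside of ijson. The sketch below is not ijson's code; `decode_chunks` and its `buf_size` parameter are hypothetical names used only for illustration. It shows why decoding each chunk on its own breaks when a character straddles two reads, and how the standard library's incremental decoder carries the leftover bytes over to the next chunk.

```python
import codecs
import io


def decode_chunks(byte_stream, buf_size):
    """Yield text decoded from byte_stream, read buf_size bytes at a time."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    while True:
        chunk = byte_stream.read(buf_size)
        if not chunk:
            # final=True raises if the input ended in the middle of a character.
            decoder.decode(b"", final=True)
            return
        # Bytes of an unfinished character are kept inside the decoder
        # and completed by the next chunk.
        text = decoder.decode(chunk)
        if text:
            yield text


data = '"привет"'.encode("utf-8")  # each Cyrillic letter takes two bytes

# An odd buffer size puts chunk boundaries in the middle of characters.
text = "".join(decode_chunks(io.BytesIO(data), buf_size=3))
```

Calling `data[3:6].decode("utf-8")` directly raises `UnicodeDecodeError`, because that slice ends with the first byte of a two-byte character.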
Also, the lexer is now reimplemented as a generator and simplified a little bit. It's now only 46 lines of code. Funny thing, though: this change made it slightly faster on CPython but slightly slower on PyPy. Looks like PyPy really likes objects and doesn't mind all the `self.something` references and myriads of method calls. Go figure :-).
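To give an idea of what a generator-based lexer looks like, here is a simplified sketch. It is not the actual ijson lexer: the function name `lex`, the regular expression and the error handling are all assumptions made for illustration. All state lives in local variables instead of `self.something` attributes, and the generator refills its buffer whenever a lexeme might continue into the next chunk.

```python
import io
import re

# A run of literal/number characters, or any single non-space character.
LEXEME_RE = re.compile(r'[a-z0-9eE.+-]+|\S')


def lex(stream, buf_size=16):
    """Yield JSON lexemes from a text stream read in buf_size chunks."""
    buf = stream.read(buf_size)
    while True:
        match = LEXEME_RE.search(buf)
        if match is None:
            # Nothing but whitespace in the buffer: refill or stop.
            buf = stream.read(buf_size)
            if not buf:
                return
            continue
        start = match.start()
        if match.group() == '"':
            # String: look for the closing unescaped quote, refilling as needed.
            end = start + 1
            while True:
                if end >= len(buf):
                    data = stream.read(buf_size)
                    if not data:
                        raise ValueError('incomplete string lexeme')
                    buf += data
                elif buf[end] == '\\':
                    end += 2  # skip the backslash and the escaped character
                elif buf[end] == '"':
                    break
                else:
                    end += 1
            end += 1  # include the closing quote
        else:
            # A lexeme touching the buffer end may continue in the next chunk.
            while match.end() == len(buf):
                data = stream.read(buf_size)
                if not data:
                    break
                buf += data
                match = LEXEME_RE.search(buf, start)
            end = match.end()
        yield buf[start:end]
        buf = buf[end:]


source = '{"a": [1, 2.5e3, true], "b\\"c": null}'
lexemes = list(lex(io.StringIO(source), buf_size=4))
```

The tiny `buf_size` is there only to force numbers, literals and strings to cross chunk boundaries; a real parser would use a much larger buffer.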