Python's “batteries included” standard library seems to have a battery missing. And the argument that “we never had it before” has worn thin. It is time that Python offered a full range of collection classes out of the box, including sorted ones.
Many long-term Python programmers are used to holding unsorted collections and sorting them only when needed. This is a perfectly sensible approach that works well for many use cases. Furthermore, Python's sort algorithm is highly optimized and works especially well for partially sorted collections, so this approach is usually very fast.
In fact, some developers seem so used to sorting when needed that they can be dismissive of the value of intrinsically sorted collections. Yet most mainstream languages (such as Java and C++) provide both sorted and unsorted collections and leave it to their users to decide which is the appropriate kind of collection to use. Yet Python, for most of its history—and despite claiming to have a “batteries included” standard library—has denied its users this choice (at least, out of the box).
Back in 2012, the release of Python 3.1 at last included an intrinsically-sorted collection: the collections.OrderedDict class. This provides an insertion-ordered dictionary, and is extremely useful. I, personally use it myself, and I'm seeing more and more uses of it in other people's code. The most recent documentation shows newly added functionality and some interesting use cases.
Implementing a sorted list in Python is easy using the bisect module. And it isn't much more difficult to create a sorted list that accepts a key function. Since it is so easy, it might be argued that there is no point adding a sorted list to the standard library. But in practice, time after time developers create their own custom sorted list—a sure sign that such a class belongs in the standard library. There are, of course, many examples to choose from, but one of those that ought to be considered is the www.grantjenks.com/docs/sortedcontainers package.
Implementing an intrinsically sorted dictionary, i.e., a dictionary that
is sorted using its keys' <
operator, or implementing a
key-function sorted dictionary isn't difficult—unless you want
decent performance that is. I've implemented a red-black tree in pure
Python and achieved absolutely dismal performance, so now I use my own
dict
subclass in conjunction with a custom sorted list. The
fact is that Python's built-in dict
is implemented in C and is
extremely efficient, so it is very hard to compete with.
Yet, time after time, the
need for an intrinsically sorted dictionary arises. And again, people
keep coming up with their own solutions. Some use an in-memory SQLite
database, others implement pure Python solutions (like the one in the www.grantjenks.com/docs/sortedcontainers
package), and still others wrap the C++ map
class (which is an
intrisically sorted dictionary). Why do so many developers go to so much
trouble to create these things? Can they all be wrong, should they all
be using dict
and sorting on demand? Or could it be that for
their use cases what they need really is an intrisically sorted
dictionary?
A look on PyPI soon reveals just how many different implementations there are.
It is my hope that the Python developers will implement—or
adopt—an intrisically sorted list and dictionary. It could start
out as pure Python (like the www.grantjenks.com/docs/sortedcontainers
package), and perhaps over time will end up being backed by a C
implemention as happened with the decimal
module.
Your Privacy • Copyright © 2006 Qtrac Ltd. All Rights Reserved.