Aug 18, 2010
 

I’ve been bitten twice by this subtlety of Python’s __cmp__() method, so I thought I’d try to save you the same pain (and perhaps by writing it down I won’t repeat it… again).

When dealing with objects, it’s often useful to be able to compare these objects:

if mySwartz > yourSwartz:
    print "Suck it."

An obvious (and wrong) __cmp__() method would be:

def __cmp__(mine, yours):
    if mine.val < yours.val:
        return -1
    if mine.val > yours.val:
        return 1
    return 0

Looks good, right? Well, what if some yahoo tries to compare two totally dissimilar things, e.g.:

if mySwartz > yourForce:
    print "Neener Neener."

Such a comparison clearly makes no sense, so let’s update __cmp__() to check for similar data types:

def __cmp__(mine, yours):
    if not isinstance(yours, Swartz):
        return -1
    if mine.val < yours.val:
        return -1
    if mine.val > yours.val:
        return 1
    return 0

Good right? WRONG. Consider the following ridiculous scenario:

mySwartz.val == yourSwartz.val

While I’m sure no such thing could happen, let’s imagine for the sake of argument that two objects might be compared on an internal value that just might not be unique across all instances of the object. What would happen if you had to retrieve your object from a container?

storage = []
storage.append(yourSwartz)
storage.append(mySwartz)
# time passes
for sw in storage:
    if sw == mySwartz:
        print "I found my Swartz!"
        print "...wait... this is your Swartz... gah!"

Yeah yeah, you could rewrite this to yada yada, that’s not the point. The point is that while the values of the objects are the same, they are not the same object, and Python uses the __cmp__() method for less than, greater than, AND equality. If you’re used to thinking of your objects as pointers, you probably expect == to be an identity comparison, and it isn’t!
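To see the same trap outside of a loop, here is a minimal sketch. It assumes Python 3, where __cmp__() is no longer consulted and equality comes from __eq__() instead; the Swartz class below is a hypothetical stand-in that compares on val only, just like the __cmp__() above.

```python
class Swartz:
    """Hypothetical stand-in for the post's class; equality looks only at val."""

    def __init__(self, val):
        self.val = val

    def __eq__(self, other):
        if not isinstance(other, Swartz):
            return NotImplemented
        return self.val == other.val


your_swartz = Swartz(5)
my_swartz = Swartz(5)
storage = [your_swartz, my_swartz]

# list.index() searches with ==, so it stops at the first equal-valued object.
position = storage.index(my_swartz)
print(position)  # 0, which is your_swartz's slot, not my_swartz's
```

Anything that searches a container by equality (index(), remove(), the in operator) will behave the same way.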

If you’re faced with a situation where you want to be able to easily sort objects, but still need to be able to identify them uniquely when they share an internal value, augment your __cmp__() routine to check for id as well:

def __cmp__(mine, yours):
    if not isinstance(yours, Swartz):
        return -1
    if mine.val < yours.val:
        return -1
    if mine.val > yours.val:
        return 1
    if id(mine) < id(yours):
        return -1
    if id(mine) > id(yours):
        return 1
    return 0
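For anyone on Python 3, where __cmp__() is gone, the same value-then-id tie-break can be sketched with rich comparison methods. This is only an illustration under that assumption: the _key() helper is my own invention, and functools.total_ordering is used to fill in the remaining operators from __eq__() and __lt__().

```python
from functools import total_ordering


@total_ordering
class Swartz:
    """Hypothetical Python 3 version: order by val, break ties by id()."""

    def __init__(self, val):
        self.val = val

    def _key(self):
        # id() is unique among live objects, so two distinct instances
        # never produce the same key even when their val matches.
        return (self.val, id(self))

    def __eq__(self, other):
        if not isinstance(other, Swartz):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Swartz):
            return NotImplemented
        return self._key() < other._key()


your_swartz = Swartz(5)
my_swartz = Swartz(5)
storage = [your_swartz, my_swartz]

position = storage.index(my_swartz)
print(position)  # 1: the equal-valued your_swartz no longer matches
```

Note that the relative order of two equal-valued objects is arbitrary here, since it depends on whatever id() happens to return for each of them.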

If you’re a Python guru and think I’m missing something, please share. Otherwise:

if mySwartz > yourSwartz:
    print "I see my Swartz is bigger than yours..."

  2 Responses to “Not less than, not greater than, but not equal either!”

  1. If you want to test object identity, use the ‘is’ operator rather than the equals operator.

  2. So, rather than using == to compare identities in your code, use “is” as Tim suggests (Thanks Tim). This allows you to strip the id() checks from the __cmp__() method and let it be used strictly for value comparisons. It is probably a good idea to allow for comparison against None (a null-pointer sort of thing) but to raise a proper exception when comparing against anything other than a Swartz or an instance of one of its subclasses.

        def __cmp__(mine, yours):
            if yours is None:
                return 1
            if not isinstance(yours, Swartz):
                raise TypeError
            if mine.val < yours.val:
                return -1
            if mine.val > yours.val:
                return 1
            return 0
    

    There are also “rich comparison” methods (__eq__(), __lt__(), and friends) that provide more flexibility than __cmp__(), and one should consider the implications for __hash__() when implementing custom comparators. See the Python Data Model documentation on this.
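Putting the two comments together, a rough sketch of the rich-comparison approach might look like the following. It assumes Python 3 and a hypothetical Swartz class; comparisons look at val only, __hash__() is kept consistent with __eq__(), and identity is left to the “is” operator.

```python
from functools import total_ordering


@total_ordering
class Swartz:
    """Hypothetical value-only comparison using rich comparison methods."""

    def __init__(self, val):
        self.val = val

    def __eq__(self, other):
        if not isinstance(other, Swartz):
            return NotImplemented  # let Python try the other operand
        return self.val == other.val

    def __lt__(self, other):
        if not isinstance(other, Swartz):
            return NotImplemented  # ordering against other types raises TypeError
        return self.val < other.val

    def __hash__(self):
        # Equal objects must hash equally; this assumes val is hashable and
        # is not changed while the object sits in a set or dict.
        return hash(self.val)


your_swartz = Swartz(5)
my_swartz = Swartz(5)
storage = [your_swartz, my_swartz]

# Identity check: only the very same object passes.
found = [sw for sw in storage if sw is my_swartz]
print(len(found))  # 1
```

With this arrangement my_swartz == your_swartz is true, my_swartz is your_swartz is false, and ordering a Swartz against an unrelated type such as a string raises TypeError without any explicit raise in the class. Comparing against None with == simply comes out false.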
