Taking Your Logging to the Next Level Part II: Blood Feud

True Story Follows

In my last post I wrote about some techniques for adding a decorator that logs all of a function's local variables while otherwise keeping your actual code clean and free of logging clutter. I used this at work to keep logging as decoupled as possible from the rest of the code in the application. I also wrote briefly about how nicely this fits when most of the entities in your application have UUID identifiers.

At this point, we can grep through the logs for a particular UUID and trace every event that has occurred for a particular entity, or for an object in a particular state. But who wants to look at UUIDs all day? Grepping is nice for isolating events that have happened, but if we are indeed using UUIDs for most of our identifiers, it becomes mentally taxing to piece together the context around a particular event and the other objects involved. A UUID is in essence 16 random bytes, so what kind of masochist wants to read through UUIDs in a log?

My Solution

In my experience with the gfycat API, one of the nice things I noticed was that unique identifiers were built from English nouns. This made URLs much more human-friendly and easy on the eyes. I applied the same concept to logging: I mapped UUIDs to nouns, and I included the noun in every object's __repr__ method.

So for example:

from functools import cached_property  # or any equivalent cached_property decorator


class UUIDRepr(object):
    """ Object to be inherited from such that it can be represented in logs with
    a deterministic UUID. """

    def __repr__(self):
        return "<{class_name}: {instance_uuid} ({friendly_noun})>".format(
            class_name=type(self).__name__,
            instance_uuid=self.__instance_uuid__,
            friendly_noun=interpolate_uuid_to_friendly_noun(self.__instance_uuid__),
        )

    @cached_property
    def __instance_uuid__(self):
        return create_instance_uuid(self)
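The `create_instance_uuid` helper above is left undefined; how you derive the UUID depends on your application. One possibility, assuming each object exposes some stable unique key (here a hypothetical `pk` attribute), is to build a deterministic UUID from the class name and that key with `uuid.uuid5`:

```python
import uuid

# Hypothetical namespace for this application's object UUIDs; any fixed
# namespace UUID works, as long as it never changes.
_APP_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.internal")


def create_instance_uuid(obj):
    """Derive a deterministic UUID from an object's class name and its key.

    Assumes the object exposes a stable `pk` attribute; any stable,
    unique key would work just as well.
    """
    return uuid.uuid5(_APP_NAMESPACE, "{}:{}".format(type(obj).__name__, obj.pk))
```

Because `uuid5` is a pure function of its inputs, the same object always logs with the same UUID (and therefore the same noun) across processes and restarts.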

I can create some base objects that everything in the repository inherits from, and I can add a bit of logic to associate a deterministic UUID with each object. That UUID can then be mapped to a deterministic index into a large list of nouns:

import uuid

UUID_LEN = 16  # a UUID is 16 bytes
MAX_UUID = uuid.UUID(bytes=b"\xff" * UUID_LEN)


def interpolate_uuid_to_friendly_noun(obj_uuid):
    interpolation_value = uuid_to_integer_value(obj_uuid)
    max_uuid_int = uuid_to_integer_value(MAX_UUID)

    interpolation_percent = float(interpolation_value) / max_uuid_int
    # clamp so that MAX_UUID itself doesn't index one past the end of the list
    interpolated_index = min(
        int(interpolation_percent * len(FRIENDLY_NOUNS)), len(FRIENDLY_NOUNS) - 1
    )

    return FRIENDLY_NOUNS[interpolated_index]


def uuid_to_integer_value(obj_uuid):
    # there are 2**128 (about 3.4e+38) possible UUIDs and only several
    # thousand nouns, so the mapping just needs to be deterministic --
    # conveniently, uuid.UUID already exposes its 128-bit integer value
    return obj_uuid.int

And now all objects when logged will look something like this:

'<ArbitraryClass: 8044829c-ea8c-b1c8-b9ee-9ea54edad116 (banana)>'

Now, as we traverse our logs, we can more easily discern which objects are being passed into particular functions, and we can follow more than a single entity at a time (since we're scanning text with our eyes now instead of searching for an exact UUID string match).
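Putting the pieces together, here is a minimal self-contained sketch of the whole idea. The tiny `FRIENDLY_NOUNS` list and the random per-instance UUID are stand-ins for the real noun list and the deterministic UUID logic; it uses `functools.cached_property` (Python 3.8+):

```python
import uuid
from functools import cached_property

# Tiny hypothetical noun list; a real deployment would use thousands of nouns.
FRIENDLY_NOUNS = ["apple", "banana", "cherry", "mango", "papaya"]

MAX_UUID_INT = (1 << 128) - 1  # largest possible 128-bit UUID value


def interpolate_uuid_to_friendly_noun(obj_uuid):
    # scale the UUID's integer value onto an index into the noun list,
    # clamping so the maximum UUID doesn't index one past the end
    percent = obj_uuid.int / MAX_UUID_INT
    index = min(int(percent * len(FRIENDLY_NOUNS)), len(FRIENDLY_NOUNS) - 1)
    return FRIENDLY_NOUNS[index]


class UUIDRepr:
    @cached_property
    def __instance_uuid__(self):
        # a random UUID per instance; a deterministic mapping from a
        # primary key (e.g. via uuid.uuid5) would also work here
        return uuid.uuid4()

    def __repr__(self):
        return "<{}: {} ({})>".format(
            type(self).__name__,
            self.__instance_uuid__,
            interpolate_uuid_to_friendly_noun(self.__instance_uuid__),
        )


class ArbitraryClass(UUIDRepr):
    pass


print(repr(ArbitraryClass()))
```

Because the UUID is cached on the instance, every log line for a given object carries the same noun, which is exactly what makes the logs scannable.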

The End