October 6, 2022

An Unexpected Encounter with Python Class Attributes

Recently, I encountered a mysterious bug in my Python application. The cause of the bug was my poor understanding of:

  • how class instances in Python initialise their attributes, and
  • how instance attributes and class attributes are resolved.

In this post I will explain my incorrect assumptions about this part of Python.

People with Hobbies

The following Python snippet defines the Person class, with a method for adding a hobby to the person's list of hobbies:

class Person:
  hobbies = []

  def add_hobby(self, hobby):
    self.hobbies.append(hobby)

That's marvellous. With this class, we can instantiate some persons (people in human-speak) and give them hobbies:

tolkien = Person()
tolkien.add_hobby("writing")

elvis = Person()
elvis.add_hobby("music")

Now, pause for a moment.

elevator music starts playing

Ponder for a while.

elevator music suddenly stops

Did you spot the mistake?

I had been programming in Python for more than five years, and I didn't!

Unforeseen Consequences

What do you think the following two lines would print?

print(f"Tolkien's hobbies: {tolkien.hobbies}")
print(f"Elvis's hobbies: {elvis.hobbies}")

This is what I expected it to print:

Tolkien's hobbies: ['writing']
Elvis's hobbies: ['music']

Here's what it actually prints:

Tolkien's hobbies: ['writing', 'music']
Elvis's hobbies: ['writing', 'music']

What?! Why are both hobbies listed for both persons?

The Incident

My bug was rooted in this statement in the Person class:

  hobbies = []

I thought this statement made sure that every new Person's hobbies attribute was set to a new empty list.[^1]

However, it actually sets the class attribute Person.hobbies to an empty list. The statement is only evaluated once, namely when the class is first evaluated, and no new hobby lists are made when new Person instances are made.

Resolving Hobbies

When appending to the person's hobby list in add_hobby, the attribute self.hobbies is used. When looking up that attribute, Python first looks for a variable named hobbies in the instance's namespace. If that fails, it tries to find the variable in the namespace of the instance's class! So, it finds the Person.hobbies attribute in the class's namespace.

The calls to add_hobby mutated that single list in the class attribute, resulting in all hobbies from different person instances being added to the same list.

We can verify that they resolve to the same list by evaluating the expression:

tolkien.hobbies is elvis.hobbies

which results in True.

This behaviour is well documented in the Python tutorial which goes to show that I should have RTFM.

Why it behaves like this is unclear to me (perhaps I should read some more). To me, it would make sense for Python to raise an AttributeError when trying to access an instance attribute that doesn't exist on the instance.

Feel free to reach out if you can enlighten me.

The Fix

Here's a corrected version of the Person class:

class Person:
  def __init__(self):
    self.hobbies = []

  def add_hobby(self, hobby):
    self.hobbies.append(hobby)

In this version, the hobbies attribute is assigned to the instance's namespace in the __init__ method, which is executed when an instance is initialised.

This way, every person instance gets its own list of hobbies in its instance attributes.

1: This is vaguely how it works in some other programming languages like Java and PHP.