The MAAS team’s mission was to port MAAS from Python 2.7 to Python 3.5.

This post has a rundown of how we prepared, how we did the port itself, and what we learned. It follows on from Porting MAAS to Python 3, which gives an overview of the port. They’re both written from an engineering perspective, but this post contains a lot more technical detail.

What we did to prepare

We used a few features of Python 2.7 that are meant to help prepare for a port to Python 3, and we also devised one of our own. At the top of every module and script we:

imported unicode_literals, absolute_import, and print_function from __future__,
selected new-style classes by default with __metaclass__ = type,
forbade the use of str with str = None.

The last of these forced the use of bytes or unicode and so brought decisions about encoding and decoding to the fore. Sadly it couldn’t prevent implicit coercion between these types.

With an irritating tenacity I would also recommend the use of dict.view{keys,values,items} in code reviews.

In Python 2.7 these methods exhibit the closest behaviour to Python 3’s dict.{keys,values,items}. They’re also converted cleanly by 2to3 whereas, for example, dict.keys() is converted to list(dict.keys()) and dict.iterkeys() is converted to iter(dict.keys()).

The intermediate lists that arise are somewhat wasteful and very often unnecessary, but it’s hard for 2to3 to know this because the dict fixer doesn’t consider the context (and doing so may be a task more suited to a human in any case). Using the dict.view* variants gives it a hint.
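
To show why the intermediate lists are often unnecessary, here is a small sketch of how dict views behave in Python 3. The dictionary and its contents are invented for the example.

```python
# Python 3: dict.keys(), dict.values() and dict.items() return views.
allocations = {"node-a": 4, "node-b": 8}

keys_view = allocations.keys()        # what dict.viewkeys() is converted to
keys_list = list(allocations.keys())  # what 2to3 makes of dict.keys()

allocations["node-c"] = 16

# The view tracks the dictionary; the list is a snapshot, and a copy.
view_sees_new_key = "node-c" in keys_view
list_sees_new_key = "node-c" in keys_list

# Iteration, membership tests and aggregation need no intermediate list.
total = sum(allocations.values())
```

The list is only needed when the code mutates the dictionary while iterating, or really does want a snapshot.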

Unfortunately old habits die hard, and I ended up spending a lot of time manually reverting these kinds of changes from 2to3’s patches.

The process

A bug in 2to3 meant that all __future__ imports needed to be reformatted onto a single line (see reformat-future-imports-on-single-line.py):
```
 bzr ls --kind=file --recursive --versioned --null | \
   xargs -r0 python python3/reformat-future-imports-on-single-line.py
```
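
The script itself isn't reproduced in this post. As a rough idea of what such a step involves, here is a hypothetical sketch, not MAAS's actual script. It assumes the imports were written in a parenthesised, multi-line form, which the post doesn't state; the function name and the regular expression are likewise made up for illustration.

```python
import re

# Matches "from __future__ import ( ... )" spread over several lines.
PARENTHESISED_FUTURE_IMPORT = re.compile(
    r"^from __future__ import \(\s*([^)]*?)\s*\)", re.MULTILINE)


def join_future_imports(source):
    """Collapse a parenthesised __future__ import onto a single line."""
    def collapse(match):
        names = [name.strip() for name in match.group(1).split(",")]
        # A trailing comma leaves an empty name behind; drop it.
        return "from __future__ import " + ", ".join(n for n in names if n)
    return PARENTHESISED_FUTURE_IMPORT.sub(collapse, source)


before = (
    "from __future__ import (\n"
    "    absolute_import,\n"
    "    print_function,\n"
    "    unicode_literals,\n"
    "    )\n"
    "\n"
    "import os\n"
)
after = join_future_imports(before)
```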

The str = None lines also needed to be removed (see remove-str-equals-none-shim.py):

 bzr ls --kind=file --recursive --versioned --null | \
   xargs -r0 python python3/remove-str-equals-none-shim.py

We converted MAAS’s code directory by directory, but worked with patches instead of getting 2to3 to write directly:
```
 2to3 --nofix=callable src/${subcomponent} > \
   python3/fix-${subcomponent}.diff
```
We reviewed the patches to sanity check them and to remove unnecessary conversions. We committed each patch again, then applied it:
```
 patch -p0 < python3/fix-${subcomponent}.diff
```

The __metaclass__ = type lines and all remaining shims were next to go (see remove-all-shims.py):

 bzr ls --kind=file --recursive --versioned --null | \
   xargs -r0 python python3/remove-all-shims.py
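
The shim-removal scripts aren't shown here either. A minimal sketch of the idea follows, under the assumption that the shims sit on lines of their own at module level. It handles only the two shims named in this post; what "all remaining shims" covered isn't stated, and the names below are hypothetical.

```python
# Hypothetical: the only shims this sketch knows about.
SHIM_LINES = {"str = None", "__metaclass__ = type"}


def remove_shims(source):
    """Drop module-level shim lines, leaving everything else untouched."""
    kept = [
        line for line in source.splitlines(keepends=True)
        # rstrip() rather than strip(): indented lines are left alone.
        if line.rstrip() not in SHIM_LINES
    ]
    return "".join(kept)


module_source = (
    "__metaclass__ = type\n"
    "str = None\n"
    "\n"
    "class Node:\n"
    "    pass\n"
)
cleaned = remove_shims(module_source)
```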

We got the tests passing, committing as we went. Problematic tests were skipped like so:
```
 @skip("PYTHON3-TEMPORARY-DURING-PORT")
```

We did this last step for tests that depended on code that had not yet been ported. Instead of pushing that work onto the stack we would just skip the tests and move on. Later on we revisited these tests (which, marked distinctively, were easy to find) and got them all working.
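
As an illustration of why a distinctive marker makes these tests easy to find again, here is a sketch using unittest.skip from the standard library. That this is the same skip decorator used above is an assumption, and the test case is invented.

```python
import unittest

PORT_SKIP_REASON = "PYTHON3-TEMPORARY-DURING-PORT"


class TestExample(unittest.TestCase):

    @unittest.skip(PORT_SKIP_REASON)
    def test_depends_on_unported_code(self):
        self.fail("Not reached while the skip is in place.")

    def test_already_ported(self):
        self.assertEqual(b"maas".decode("ascii"), "maas")


# Run the tests and collect the reasons given for any skips.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestExample)
result = unittest.TestResult()
suite.run(result)
skip_reasons = [reason for _test, reason in result.skipped]
```

The marker shows up in the test results as well as in the source, so a plain text search for it is enough to list what is left to do.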

Observations

On the usefulness of annotations

Python 3.5 has the typing module in the standard library, the use of which results in quite readable type annotations. This was more useful than I expected.

I was often trying to keep in mind many disparate parts of the code base and I found it was much more convenient to have type information in the function signature rather than in the docstring, or discernible only from reading the code or call-sites.

I started to pine for tooling to enforce those annotations. Duck-typing doesn’t mean that anything goes: arguments still need to look and quack like the duck you’re expecting.

ABCs and this new and related typing module make it possible to describe the ducks you’re looking for. It seems a shame not to take full advantage of it.
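
For a flavour of what this looks like, here is a small sketch. The function and its arguments are invented; only the typing and collections.abc names come from the standard library.

```python
from collections.abc import Mapping, Sequence
from typing import Dict, Optional


def hostname_for(leases: Dict[str, str], mac: str) -> Optional[str]:
    """Return the hostname leased to `mac`, or None if there is none."""
    return leases.get(mac)


# ABCs describe the duck rather than a concrete class.
tuple_is_sequence = isinstance(("a", "b"), Sequence)
dict_is_mapping = isinstance({"a": 1}, Mapping)
set_is_sequence = isinstance({"a", "b"}, Sequence)  # sets are unordered
```

The signature says what goes in and what comes out without anyone having to read the body or the call-sites.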

I could not get mypy to install. From what I can tell, this is the big boss of type annotations in Python. It can statically analyze your program and discover typing mistakes. But I didn’t have time to figure out what was wrong and learn how to use it. Another day.

However, I would settle for checking annotations at run-time if I could get it working quickly, so I put together the short typecheck module.

By decorating functions and methods with @typecheck.typed and adding annotations I could make type-related issues shallower, by which I mean that the code would crash closer to where the problem originated. This made an immediate difference, especially when unravelling byte/Unicode string issues.
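
The typecheck module isn't reproduced here. To give an idea of the technique, the following is a much-simplified sketch of a run-time checking decorator, not the module itself. It only checks annotations that are plain classes, ignores typing generics, and doesn't check default values.

```python
import functools
import inspect


def typed(func):
    """Check plain-class annotations on arguments and the return value."""
    signature = inspect.signature(func)
    hints = func.__annotations__
    simple_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )

    def check(label, value, expected):
        # Anything that is not a plain class (including a missing
        # annotation, which arrives here as None) is ignored.
        if isinstance(expected, type) and not isinstance(value, expected):
            raise TypeError("%s: %s must be %s, got %s" % (
                func.__name__, label, expected.__name__,
                type(value).__name__))

    @functools.wraps(func)
    def checked(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            # Skip *args and **kwargs; their annotations describe elements.
            if signature.parameters[name].kind in simple_kinds:
                check("argument %r" % name, value, hints.get(name))
        result = func(*args, **kwargs)
        check("return value", result, hints.get("return"))
        return result

    return checked


@typed
def shout(text: str) -> bytes:
    return text.upper().encode("utf-8")
```

Calling shout(b"maas") now raises TypeError at the call, naming the argument, instead of an AttributeError from somewhere inside the function body.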

This approach is imperfect and simplistic, sure. There’s none of that uncanny magic you get with, say, Haskell, where a program that merely compiles actually stands a good chance of doing what you meant, first time. But it is valuable all the same; it is another layer of defence.

Annotations and typecheck combined also remove the need to document the types of function arguments and return values separately, and with it the risk of that documentation being out of date, a state towards which documentation rapidly decays.

The Big One: Byte and Unicode strings

Almost all difficulties in this port were caused by Python 2’s automatic coercion of byte strings into Unicode strings and vice-versa. That one language feature has a lot of sloppy code to answer for. It has also made it hard for even the most systematic developer to live free from the shadow of UnicodeError and its spawn.
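
For contrast, here is how Python 3 treats the same kind of mixing. This is a tiny sketch with made-up values.

```python
greeting = b"hello"

# Python 3 never coerces implicitly: mixing the two types raises TypeError.
try:
    greeting + " world"
except TypeError:
    mixing_failed = True
else:
    mixing_failed = False

# Byte strings and Unicode strings never compare equal either.
equal_across_types = (b"hello" == "hello")

# Conversions have to be spelled out, with an explicit encoding.
text = greeting.decode("ascii") + " world"
```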

It is cold-sweat-inducing to realise that the following code, which works fine in Python 2:

import json
from urllib2 import urlopen

response = urlopen('http://example.com/')
data = json.load(response)

is actually complete bollocks because it disregards the encoding of the response (i.e. the charset in the Content-Type header).

Python 3 forces you to fix this, but the temptation is to do something like:

data = json.loads(response.read().decode("utf-8"))

which is a different class of bollocks because, although UTF-8 is common, it still disregards the encoding of the response. So, Python 3 gives us a big shove in the right direction but can’t yet magically fix faulty reasoning.
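
One way to respect the response's encoding is to read the charset from the Content-Type header before decoding. The sketch below does this with the standard library's email.message.Message; the function name and the UTF-8 fallback for a missing charset are assumptions made for the example.

```python
import json
from email.message import Message


def decode_json(body, content_type, default_charset="utf-8"):
    """Decode a JSON body using the charset from its Content-Type header."""
    header = Message()
    header["Content-Type"] = content_type
    # Falls back to default_charset when the header names no charset.
    charset = header.get_content_charset(failobj=default_charset)
    return json.loads(body.decode(charset))


# A Latin-1 body that would raise UnicodeDecodeError if decoded as UTF-8.
body = '{"name": "caf\u00e9"}'.encode("iso-8859-1")
data = decode_json(body, "application/json; charset=ISO-8859-1")
```

With a live response from urllib.request.urlopen, response.headers is already such a message object, so response.headers.get_content_charset() gives the same information directly.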

We used unicode_literals in our code. In code reviews we would check for correct encoding and decoding. We forbade the use of str. These things helped, I am sure, but I expected far fewer surprises from our own code; you might even say I was shocked at how much coercion between bytes and unicode was going on once Python 3 was there to coax it out.

Fixing these issues was, at a guess, over half of the work required to port MAAS.

In retrospect I wish there had been a way to disable automatic coercion in Python 2, although I suspect it would have been unworkable in practice; that’s Python 3’s big feature after all. A more selective Unicode-only literals feature with a corresponding unicodeonly type (and a converse bytesonly type) that Python would never automatically coerce to a byte string might have been a workable way to improve the sorry string story in Python 2.

Sorting disparate types

Python 3 doesn’t allow sorting of different types unless they explicitly support it. However, one important part of MAAS relied on being able to do so.

MAAS’s Web API publishes a description document; a blob of JSON that describes all the objects and calls available. The CLI client downloads this once and refers back to it when generating sub-commands and options. When the server’s API is updated we need to detect that the client is working from an outdated description.

To do this, the server renders a canonical representation of the description document and calculates an SHA1 hash from it. This is included in the description that the client downloads, and the server also sends it in an X-MAAS-API-Hash header in every HTTP response. The client can compare the server’s hash with the local hash; if they differ, the API has changed.
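
The general shape of this check can be sketched as follows. This is not MAAS's code: json.dumps with sort_keys=True stands in for MAAS's own canonical rendering, the function names are hypothetical, and the response headers are modelled as a plain dict.

```python
import hashlib
import json


def hash_description(description):
    """Hash a stable rendering of the description document."""
    canonical = json.dumps(
        description, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def is_outdated(local_hash, response_headers):
    """True if the server advertises a hash different from the local one."""
    server_hash = response_headers.get("X-MAAS-API-Hash")
    return server_hash is not None and server_hash != local_hash


# The same content in a different key order hashes identically.
hash_a = hash_description({"doc": "MAAS API", "resources": ["nodes"]})
hash_b = hash_description({"resources": ["nodes"], "doc": "MAAS API"})
hash_c = hash_description({"doc": "MAAS API", "resources": ["nodes", "tags"]})
```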

Rendering the canonical representation is where the problem lies. We want to ensure a consistent ordering, and we had relied on Python 2’s built-in rules for a few types:

None < Numeric/Boolean < String < Tuple

We reproduced this by creating wrappers — KeyCanonicalNone, KeyCanonicalNumeric, KeyCanonicalString, and KeyCanonicalTuple — that sort correctly with respect to one another. A function, key_canonical, wraps disparate objects according to type, and can be used with sorted:

sorted(disparate_objects, key=key_canonical)

This solved our problem and we were back in business.
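
The wrapper classes aren't shown in this post. A simpler way to get the same ordering, swapped in here purely for illustration, is to have the key function return a (rank, value) tuple, so that values are only ever compared with others of the same kind. This is a sketch of that alternative, not the KeyCanonical* implementation.

```python
def key_canonical(value):
    """Sort key giving None < numeric/boolean < string < tuple."""
    if value is None:
        return (0, 0)
    if isinstance(value, (bool, int, float)):
        return (1, value)
    if isinstance(value, str):
        return (2, value)
    if isinstance(value, tuple):
        # Apply the same rule to each element, recursively.
        return (3, tuple(key_canonical(item) for item in value))
    raise TypeError("Cannot sort %r canonically." % (value,))


disparate_objects = ["b", None, (1, "a"), 3, "a", True, (None,), 2.5]
ordered = sorted(disparate_objects, key=key_canonical)
```

Sorting the same list without the key raises TypeError in Python 3.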

Things that 2to3 misses

I’m a Bad Person because these are bugs and I didn’t capture enough context at the time to be able to report them, nor have I tried to reproduce them since:

string.letters is not automatically changed to string.ascii_letters.
Imports of __builtin__ are changed to builtins, but references to __builtin__ are not updated.
Imports of urllib2 are changed to urllib.*, but some references are missed.
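
For reference, these are the Python 3 spellings that had to be put in by hand in those cases:

```python
import builtins                      # Python 2: __builtin__
import string
from urllib.error import URLError    # Python 2: urllib2.URLError
from urllib.request import urlopen   # Python 2: urllib2.urlopen

# string.letters is gone in Python 3; string.ascii_letters exists in both.
letters = string.ascii_letters

# References need the new module name too, not just the import line.
builtin_open = builtins.open
```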

Miscellaneous

Not all of twisted.conch has been ported. This means that we can no longer support the little-known introspect service in MAAS. It’s a niche service for developer-driven debugging and it’s not enabled by default, so we dropped it.
2to3 converts things like isinstance(thing, (bytes, unicode)) to isinstance(thing, (bytes, str)), but it’s likely that we only want either str or bytes in Python 3.
sudo_write_file conflated its core mission (writing a file as another user via sudo) with encoding the file content. I changed it to instead raise TypeError if the given content is not a byte string, so that encoding must be done by the caller.
atomic_write also conflated its mission: it expected text content and silently encoded it as UTF-8. It will now raise TypeError if the content is not a byte string; again, encoding must be done by the caller.
TFTP paths are always byte strings. Other paths are often, but not always, represented as Unicode strings. This caused some difficulty.
Integer division: we had to change many expressions like a / b into a // b to ensure integer results.
When testing web interactions, content coming from Django is always a byte string. We used django.conf.settings.DEFAULT_CHARSET to decode. Strictly, however, we should have checked the Content-Type header.
Python 3 cushions us by wrapping sys.std{in,out,err} in io.TextIOWrappers, but when forking processes you are presented with the underlying reality: byte streams. The question arises: which character encoding should we use? The LANG and LC_* environment variables typically coordinate these kinds of understandings between processes. A new select_c_utf8_locale() function was created to select the C.UTF-8 locale. For cooperating applications this will mean we can reliably use UTF-8.
Command-line arguments given to subprocess’s functions should be Unicode strings, and it will encode them as appropriate. I did check further: that bit is implemented in C, but the result is very similar to calling os.fsencode().
Python 3 has only new-style classes. Classes that explicitly inherit from object can be amended to implicitly inherit from object instead.
No one seems to have paid any attention to the years of deprecation warnings about Exception.message. I conclude that deprecation warnings are more useful as retrospective justifications for breaking someone else’s application than they are useful in getting that person to update their application in time.
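
A few of the points above, sketched in code. The two functions are hypothetical stand-ins, not MAAS's atomic_write or select_c_utf8_locale, and which environment variables to reset is an assumption.

```python
import os


def write_bytes(path, content):
    """Write `content` to `path`; encoding is the caller's job."""
    if not isinstance(content, bytes):
        raise TypeError(
            "Content must be bytes, got %s." % type(content).__name__)
    with open(path, "wb") as stream:
        stream.write(content)


def utf8_environment(environ):
    """Return a copy of `environ` that selects the C.UTF-8 locale."""
    env = {
        name: value for name, value in environ.items()
        if name not in ("LANG", "LANGUAGE") and not name.startswith("LC_")
    }
    env["LANG"] = "C.UTF-8"
    env["LC_ALL"] = "C.UTF-8"
    return env


# Integer division: / is always true division in Python 3.
true_quotient = 7 / 2      # 3.5
floor_quotient = 7 // 2    # 3
negative_floor = -7 // 2   # -4, since // rounds towards negative infinity

# os.fsencode() turns a Unicode string into bytes for the OS.
encoded_argument = os.fsencode("--name=maas")
```

The result of utf8_environment(os.environ) can be passed as the env argument to subprocess's functions so that the child process agrees on the encoding.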

That’s it

I hope you find this useful when making your own plans. Good luck!