On Twitter I had a thread going this year in which I tried to reflect on the bugs I found throughout the year: how to avoid that kind of bug, what can be learned from it, and so on. I'm porting the idea over to here (I'm still both here and on Twitter, we'll see how that goes).
Recently I fixed a bug in PyPy's time.strftime. It was using a unicode helper function that takes as arguments a byte buffer containing a utf-8 encoded string, as well as the number of code points in it. strftime was using this API wrong and passing the number of bytes instead.
After finding the bug we tried to make this API more robust by adding a check to the function that counts the code points in the byte buffer and complains if that count differs from the second argument. This check shouldn't be on by default for performance reasons, but it's on during testing.
The reason the bug went unnoticed for so long is that if you test only with ASCII characters everything works, because the number of bytes equals the number of code points in that case. Lesson: write tests with a wider range of characters.
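To make that concrete, here's a minimal illustration (plain Python, not the PyPy internals) of how the two counts diverge as soon as non-ASCII characters appear:

```python
s = "café"
b = s.encode("utf-8")
print(len(s))  # 4 code points
print(len(b))  # 5 bytes: 'é' needs two bytes in utf-8
# Passing len(b) where a helper expects the code point count only goes
# unnoticed as long as the input is pure ASCII, where the numbers agree.
```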
Another bug, this time in itertools.tee: tee has an optimization that uses the __copy__ method of the iterator if it has one, instead of carefully using its generic implementation. However, PyPy got it wrong and copied the *iterable* instead of the iterator.
https://foss.heptapod.net/pypy/pypy/-/issues/3852
This works in simple tests, but in more complicated situations it gives nonsense.
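Here's a small sketch (plain Python, no PyPy internals) of why the distinction matters:

```python
import itertools

it = iter([1, 2, 3, 4])
next(it)                 # consume the 1; the iterator now points at 2
a, b = itertools.tee(it)
print(list(a), list(b))  # [2, 3, 4] [2, 3, 4]: tee must copy iterator state
# Copying the *iterable* instead would restart from 1, silently replaying
# already-consumed elements. Tests that never advance the iterator before
# calling tee can't tell the difference.
```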
🪲, also present in CPython. On Linux, if you pass MSG_TRUNC as a flag to socket.recv (which calls recv in its implementation) it will return the size of the packet, not the number of bytes written into the output buffer.
https://foss.heptapod.net/pypy/pypy/-/issues/3864
This confused the logic in socket.recv: it led to an assertion error in PyPy (trying to read too many characters from the output buffer) and garbled characters in CPython. Fixed in PyPy by not reading more than the buffer size from the buffer in that case.
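A minimal Linux-only sketch of the behavior (using a UNIX datagram socketpair for brevity; the report is about socket.recv in general):

```python
import socket

a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
a.send(b"x" * 100)

# The C-level recv() returns the real packet size (100), even though only
# 10 bytes fit into the buffer. Code that trusts the return value then
# reads past the valid data: an assertion error on PyPy (pre-fix),
# garbage bytes on CPython.
data = b.recv(10, socket.MSG_TRUNC)
```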
CPython bug: https://github.com/python/cpython/issues/69121
someone could fix this! probably not super hard.
I learned again that I know nothing about network programming :-(
Fixed a bug in PyPy's 3.9 parser (based on the new PEG parsing approach introduced in CPython 3.9). The parser would report a valid generator expression in a function call as lacking parentheses, but only if there was another syntax error further down in the file. E.g.:
f(x for x in y)
if a:
pass
This would report the error on line 1 (which is actually fine) instead of line 3.
The bug was an oversight: an 'if' was left out of the logic when porting from CPython. It shows that error cases are often not tested enough.
Seems I neglected my bug thread a little bit! I ran into two interesting bugs this week that I wanted to write about.
One is threading-related: a Python program with a bunch of threads crashes on PyPy, but not on CPython.
It turns out the project was missing a lock around this kind of code:
request_id = self._next_id
self._next_id += 1
and was handing out the same request id multiple times.
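The fix is the standard one: guard the read-and-increment pair with a lock. A minimal sketch (class and attribute names are made up):

```python
import threading

class RequestIds:  # hypothetical class, for illustration only
    def __init__(self):
        self._lock = threading.Lock()
        self._next_id = 0

    def allocate(self):
        with self._lock:  # makes the read-then-increment pair atomic
            request_id = self._next_id
            self._next_id += 1
        return request_id
```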
This bug was possible on CPython too, but happened rarely there. On PyPy it happened very reliably, due to PyPy's higher performance and slight differences in when the GIL is released. This is a pattern we see regularly: a latent threading bug that only manifests on PyPy.
Since then I've also learned that you can use itertools.count() as an atomic counter.
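That gives an even simpler, lock-free version of the sketch above:

```python
import itertools

_ids = itertools.count()

def next_request_id():
    # count.__next__ is implemented at the interpreter level and runs
    # atomically under the GIL, so no two threads get the same value
    return next(_ids)
```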
The second bug was in the HPy implementation of PyPy. We aren't quite sure we've understood it 100%, but it looks like we managed to confuse ourselves with metaclasses. Say we have a metaclass that is created from an HPy extension, i.e. C code. If that metaclass is instantiated, its instances are themselves also types.
In one code path we were reading a slot from the newly instantiated type, as opposed to the metatype. Most code doesn't have C-defined metaclasses, but numpy does, leading to a crash.
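A pure-Python sketch of the setup (the actual bug was in C-level slot lookup; the names here are illustrative):

```python
class Meta(type):
    pass

class X(metaclass=Meta):    # instantiating the metaclass yields a type
    pass

assert isinstance(X, Meta)  # X is an instance of the metaclass...
assert isinstance(X, type)  # ...and at the same time a type itself

# The buggy code path read a slot from X (the freshly created type)
# where it should have read it from Meta (the metatype).
```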
Amusingly enough, soon after writing about a CPython bug involving class versioning ( https://mastodon.social/@cfbolz/111708848209751692 ), I found one in PyPy too:
If you have an instance x of a class X, and both the instance and the class have an attribute f, reading x.f will return the instance attribute if the class attribute X.f is not a data descriptor (a data descriptor is something like a property). The lookup x.f will be cached in the interpreter, to not have to do any dictionary lookups when it is performed repeatedly.
But there was a case of missing cache invalidation: we can *make* X.f a data descriptor later, by adding methods to its class after the fact. If the x.f cache had already been filled before that, the lookup kept returning the wrong, stale result.
The fix is to only fill the cache if X.f is an instance of an immutable class.
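Here's a pure-Python sketch of the hazard the cache has to deal with (no caching shown; the names are made up):

```python
class Desc:
    def __get__(self, obj, owner=None):
        return "from class"

class X:
    f = Desc()

x = X()
x.__dict__["f"] = "from instance"
print(x.f)  # "from instance": Desc has no __set__, so the instance wins

# Now make Desc a data descriptor by adding __set__ to it after the fact:
Desc.__set__ = lambda self, obj, value: None
print(x.f)  # "from class": data descriptors take precedence
# A cache filled before the mutation would keep answering "from instance".
```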
Feels hard to learn something from this, apart from the fact that Python's object model is a sprawling mess 🤷‍♀️
Another week, another bug: PyPy's JIT assumes that property objects are immutable, but they can totally be mutated, by calling their `__init__` method again later. This led to miscompiles where the old property getter was still called in already JIT-compiled code.
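A short demonstration of the mutation (plain Python; works because accessing a property on the class returns the property object itself):

```python
class A:
    x = property(lambda self: 1)

a = A()
print(a.x)                    # 1

A.x.__init__(lambda self: 2)  # re-initialize the *same* property object
print(a.x)                    # 2 -- but JIT-compiled code that had baked
                              # in the old getter would still return 1
```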
Fixing it (without losing performance) involves changing the fields of the property to be "quasi-immutable". This means they can be mutated, but the JIT should assume that this happens very rarely. If it does happen, all the callers of the property will get invalidated and recompiled. I found the bug when commenting on a CPython issue where they plan to add similar mechanisms to CPython: https://github.com/faster-cpython/ideas/issues/645#issuecomment-1905491165
@cfbolz Seeing PyPy and faster-cpython folks interacting is the stuff of dreams :)
PyPy has a lot of history with experiments and optimizations that could be useful to faster-cpython development. I see it's been a long time since papers and talks were added to extradoc (and even then, they focus on novel ideas). Would writing down how PyPy does the things faster-cpython is investigating help them?
I also wonder whether faster-cpython's research could help with speeding up PyPy's interpreter.
@danzin we don't really write papers any more, not enough academics left in PyPy 🤷‍♀️. Also, the core technology is really stable and hasn't been changing that much (and the small changes that do happen often end up on the blog).
Using some of the techniques that CPython has been applying to make its interpreter faster in order to speed up PyPy's warmup is a possibility, but it would require significant effort (the Faster CPython team is pretty big, PyPy's really isn't).
Ouch, @mgorny found and reported that PyPy's unicode .expandtabs method was simply giving wrong results in non-ASCII situations :-(. While fixing it, it also turned out to be quadratic?! Fixed now, but I was impressed by the badness.
@hpk @mgorny digging deeper, it was written by Guenter Jantzen (whom I don't know) in 2003, so it was basically quadratic from the first implementation on, and that property survived a whole lot of refactorings
https://github.com/pypy/pypy/commit/65ff28c60376#diff-58edf75816640e8633647c2f5d9c50814f490608c19c8f31079e25761ccfaa21L623
This is what I get for posting about finding bugs on social media: I now get invited to be a reviewer for scam journals on pesticide research.
actually I was wrong! I didn't get the review invitation to the pesticide journal because of my posting about bugs!
I got it due to bad pattern matching in some system. The name of the tool the paper describes has a very small edit distance to the string "PyPy".
The description of a bug where the JIT would trace indefinitely: https://mastodon.social/@cfbolz/113142067087813944
Found a PyPy JIT bug that happens when you compile Python code that repeatedly accesses a huge list at fixed offsets >= 2**15. It led to the JIT failing with an assertion error.
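A hypothetical sketch of the shape of code that triggers it (the exact reproducer isn't in this post; the 2**15 threshold is from the description above):

```python
lst = [0] * 40000

def hot_loop():
    total = 0
    for _ in range(100000):  # run long enough to get JIT-compiled
        total += lst[32768]  # constant offset 32768 == 2**15 doesn't fit
                             # a signed 16-bit field, tripping the assert
    return total
```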
I wrote the code that caused the crash. It was likely caused by me misunderstanding that the method `append_int` only works for... shorts 🤦‍♀️
I tried just now to find out who called the method `append_int`. Of course it turns out that was also me. Oh well.
Another longer thread about a JIT deoptimization bug is here: https://mastodon.social/@cfbolz/113980112206754464