In [1]:
# (Open to Scott standing at podium, on title slide. Joe is hiding in the audience.)
#
# Scott: Hey everyone, thank you all for coming out today. We've just received word
# from Joe and Scott via text message that they're not going to be able to
# make it today due to a freak scheduling accident. But we know that you were
# all super excited to learn about Python bytecode, so I'm gonna try to do my
# best to give the talk in their place.
#
# Scott: I took a look at their abstract a minute ago...it looks like they
# were going to explain CPython's internal code representation, they were
# going to show how to create a function "from scratch" (whatever that means),
# and they were going to show how to change the bytecode of Python functions
# at runtime.
#
#
# Scott: Okay, so it sounds like this talk is supposed to be about mucking
# with compiled Python functions. I guess that means we need a function
# to muck with.
def add(a, b):
return a + b
add(1, 2)
Out[1]:
In [2]:
# Scott: We've got our add() function. There must be some way for us to access
# its bytecode if we're going to play with it.
# All the magic stuff in Python starts and ends with double
# underscores. Maybe we can find the bytecode.
# add.__<TAB>
# add.__code__
# add.__code__.co_<TAB>
# add.__code__.co_argcount
# add.__code__.co_consts
add.__code__.co_code
# Scott: These don't really look like printable characters.
# Scott: Maybe we can understand these better if we look at the byte values as integers.
Out[2]:
In [3]:
print(list(add.__code__.co_code))
# Scott: Hrm. There's definitely some structure here.
# I see (124, 0, 0) followed by (124, 1, 0). Maybe one of
# these corresponds to 'a' and the other corresponds to 'b'?
# (confused noises)
#
# Scott: I think I'm a little out of my depth here. Is there anyone in the audience that
# knows how bytecode works?
#
# Joe stands in audience, wearing a 'BYTECODE EXPERT' shirt.
# He turns to face the audience, looking around to see if anyone
# else will answer the call to action.
#
# After a moment, he shrugs and says:
#
# Joe: Well, **I'm** actually a Certified Bytecode Expert.
#
# Scott: Great, why don't you come on up here.
# Can we get a mic for him? (Joe is already mic'ed)
#
# Joe: It's okay, I brought my own.
# (Joe scrambles up onto the stage.)
# Scott: Wait, you brought you're own micr--
# Joe: Let's get back on track here. You were going in the right direction.
# The integers in that list are the instructions
# that the interpreter executes when you call add().
# There's actually a built-in module for displaying them
# in human-readable form.
# Why don't you import 'dis'?
# import this
# Scott: <starts reading>
# Joe: No, not 'this': dis. It's the 'disassembly' module.
# Scott: Ah.
import dis
# Joe: You can do `dis.dis` of a function to show the disassembly.
dis.dis(add)
# Scott: What does all this mean?
# Joe: The dis function prints a human-readable representation
# of the instructions in the code object.
#
# 1. There are 8 bytes in that code object, but there are only 4 instructions:
# LOAD_FAST, LOAD_FAST, BINARY_ADD, RETURN_VALUE.
# 2. The first three bytes in the list correspond to the first LOAD_FAST.
# The second three bytes
# correspond to the second LOAD_FAST, the last two bytes correspond to the BINARY_ADD
# and RETURN_VALUE, respectively.
# Scott: Why does LOAD_FAST get three bytes, but BINARY_ADD and RETURN_VALUE only get one?
# Joe: 3. The two bytes after 124 are the **arguments** to the LOAD_FAST instruction. They
# tell the interpreter which local variable to load.
# (124, 0, 0) says to load local variable zero.
# (124, 1, 0) says to load local variable one.
#
# 4. In general, whenever an instruction takes an argument,
# the next two bytes contain the argument to the instruction.
# Scott: Why are those one and zero? Don't we want to load a and b?
# Joe: Argument instructions are usually represented as little-endian 16-bit integers.
# For instructions with an argument, the last two dis columns show the value of the
# argument as an integer, followed by the actual meaning of the argument.
#
# In this case, dis says the first local variable is 'a' and the second local
# variable is 'b'.
# Scott: Okay. So LOAD_FAST of 0 tells Python to load 'a'.
# Where exactly are we loading it **to**?
# Joe: LOAD instructions push values onto a stack where they can be
# manipulated by other instructions. The BINARY_ADD instruction
# doesn't have an argument in the bytecode because it always pops
# the top two values off the stack, adds them, and pushes the result
# back onto the stack.
# Scott: I think I understand:
#
# At the start of the function the stack is empty, we load a and b,
# and then we get to the BINARY_ADD. When we execute the add, we
# pop 'a' and 'b' off the stack, add them together,
# and push the result back onto the stack.
# Finally, we hit the RETURN_VALUE instruction, which pops the top
# value off the stack and returns it to the caller.
# Joe: Perf.
# Scott: What are those numbers next to the instruction names?
# Joe: Those tell you where in the bytecode the instruction is located.
# The first instruction starts at index 0, and the second instruction starts at index 3
# because indices 1 and 2 were used for arguments.
# Scott: And 6 and 7 are next to each other because BINARY_ADD is one-byte instruction?
# Joe: Yep.
# Scott: What's the number on the top left
# Joe: That's the line number for the instructions on or below that line.
# It's a little easier to see on a function with multiple lines.
In [6]:
# Scott: Ok...how about this?
def add_with_assign(a, b):
x = a + b
return x
dis.dis(add_with_assign)
# Scott: So this says that the first four instructions correspond to the fourth Python line,
# and the next two instructions correspond to the 5th Python line.
# Joe: Yep. Why don't you try something a little more complex.
In [7]:
def abs(x):
if x > 0:
return x
else:
return -x
dis.dis(abs)
# Scott: <talk about LOAD_CONST> Argument is 1, but the value is zero.
# <talk about COMPARE_OP> How does Python know that 4 means "greater than"?
# Joe: Not all instruction arguments are indices.
# COMPARE_OP takes an enum value as its argument.
# There are entries for less than, equal, not equal, etc.
# That next instruction:
#
# Luckily, POP_JUMP_IF_FALSE does exactly what it says: it pops the top value off the stack and
# jumps to the instruction at the index of its argument if that value is falsey.
# Otherwise it continues onto the next instruction as normal.
#
# Scott: Okay. So when we get to the POP_JUMP_IF_FALSE, if the result of COMPARE_OP
# is truthy, we continue on to the next instruction. Otherwise,
# we jump to LOAD_FAST at index 16?
# Joe: Yep. Those arrows are dis' way of noting that byte 16 is a jump target.
# Scott: Okay. I think I understand the first branch, and I understand the second branch.
# What about the final LOAD_CONST/RETURN_VALUE?
# I don't think we can ever hit those instructions.
# Joe: Yep, that's dead code.
# CPython uses a fairly simple code generation algorithm.
# One of the rules is that if the body of a function doesn't end in a return statement,
# an implicit return of None is always inserted. In this case, even though it looks like
# our function ends in a return value, CPython considers the if-statement to be
# the last statement, so we get the dummy return even though it'll never be hit.
# Scott: That seems kinda wasteful...
# Joe: In most programs, an extra four unused bytes at the end of a function isn't a big deal.
# That's only half the size of a pointer.
# The CPython team decided that eliminating those four bytes isn't worth
# the additional complexity it would add to the compiler.
# Scott: Okay, I guess that makes sense.
# Is there any way we could remove that code if we wanted to?
# Joe: Well...you don't **have** to use the CPython compiler to make a code object. You can
# just construct one yourself like any other object.
In [8]:
# Scott: Alright, well, let's write our own abs then!
# Joe: Hold on there killer. Why don't I get you started with something a little
# simpler. How about addone?
# Scott: Alright, fine. Let's write it out in Python first so we know what we're going for:
def addone(x):
return x + 1
# Scott: Okay, so that's our target. You said I need to construct a code object.
# Where do I find a code constructor?
# Joe: The types module provides a CodeType.
# from types import CodeType
# print(CodeType.__doc__)
# Scott: Not for the faint of heart...great.
# my_code = CodeType(1, # argcount (Scott)
# 0, # kwonlyargcount (Scott)
# 1, # nlocals (Joe)
# 2, # stacksize (Joe)
# Joe: stacksize tells Python how much space to allocate for storing values on the stack.
# We need enough slots to hold the maximum number of objects that ever
# will appear on the stack simultaneously.
#
# Scott: Okay, so, the biggest the stack is ever going to be is right before the add when
# both 'x' and 1 are on the stack. So the stacksize should be two.
# Scott: The next entry is `flags`. What's the deal with those?
# Joe: `flags` is a bitmask of options for the code object.
# I've prepared some material to illustrate the flags in depth.
# If you could be so kind as to press the down arrow on your keyboard.
# Scott: (presses down, then looks up, a beat)
# Scott: (confused sputtering)
# Joe: (brushing off Scott's confusion) Let's get back on track here...if
# take a look at the slide, you can see that there's a lot of different options.
In [9]:
# Used to determine if certain extra optimizations can be made.
# In practice, means "is this code for a function (as opposed to a module or class body)".
CO_OPTIMIZED = 0x0001
# Should a new locals dict be allocated every time this code is executed?
CO_NEWLOCALS = 0x0002
CO_VARARGS = 0x0004
CO_VARKEYWORDS = 0x0008
# Are we defined inside another function?
CO_NESTED = 0x0010
# Are we a generator?
CO_GENERATOR = 0x0020
# Do we share any variable cells with another function.
# We could infer this from other info on the code object,
# but this used for optimization purposes by the interpreter.
CO_NOFREE = 0x0040
# Are we an async-def'd coroutine or a types.coroutine?
CO_COROUTINE = 0x0080
CO_ITERABLE_COROUTINE = 0x0100
In [10]:
# __future__ flags
CO_FUTURE_DIVISION = 0x2000
CO_FUTURE_ABSOLUTE_IMPORT = 0x4000
CO_FUTURE_WITH_STATEMENT = 0x8000
CO_FUTURE_PRINT_FUNCTION = 0x10000
CO_FUTURE_UNICODE_LITERALS = 0x20000
# Compiled with enhanced inequality syntax.
CO_FUTURE_BARRY_AS_BDFL = 0x40000
# Python 3.5 backwards-compat flag.
CO_FUTURE_GENERATOR_STOP = 0x80000
In [11]:
# code(argcount, kwonlyargcount, nlocals, stacksize, flags, codestring,
# constants, names, varnames, filename, name, firstlineno,
# lnotab[, freevars[, cellvars]])
my_code = CodeType(1, # argcount
0, # kwonlyargcount
1, # nlocals
2, # stacksize
(CO_OPTIMIZED | CO_NEWLOCALS | CO_NOFREE),
# (Resume Typing)
bytes([124, 0, 0, 100, 0, 0, 23, 83]),
(1,), # constants
(), # names
('x',), # varnames
'<string>', # filename
'addone', # name
42, # firstlineno
b"", # lnotab
(), # freevars
()) # cellvars
# Scott: Okay, codestring. (types, 'bytes([])') What are our bytes?
# Joe: (quickly, but not memorized) 124, 0, 0, 100, 0, 0, 23, 83
# Scott: Could you elaborate a little bit on that?
# Joe: Well, looking at our function, we need to load 'x', load 1, do a binary add,
# and then return the result. To load x, we need a LOAD_FAST instruction.
# The opcode for LOAD_FAST is 124. 'x' is the only local variable, so the argument is 0.
# Next we want to load 1 with a LOAD_CONST instruction. The opcode for LOAD_CONST is 100.
# We're only using one constant, so we can put the value 1 at index 0.
# We saw earlier that the opcodes for BINARY_ADD and RETURN_VALUE were 23 and 83.
# Scott: It's impressive that you can just rattle that off.
# Joe: Well, you should expect nothing less from a certified bytecode expert.
# Scott: Okay. So if we're doing a LOAD_CONST of zero, I guess we want a tuple containing just one?
# Joe: (thumbs up)
# Scott: Okay, names?
# Joe: Names should be a tuple containing the names of global variables and
# attributes that appear in the function.
# We don't have any, so you just want an empty tuple.
# Scott: One empty tuple, coming right up. Okay...varnames. How is that different from names?
# Joe: Varnames is a tuple containing the names of the local variables for the function,
# in the order that they're indexed by LOAD_FAST and STORE_FAST.
# Scott: So that's just 'x'? (types `('x',))
# Joe: (affirmative noises)
# Joe: The next few entries don't really matter for a hand-written code object:
# filename is normally the name of the file where the code was defined.
# We don't have a file, but the convention for exec is the word 'string'
# in angle brackets. (Scott types)
# Next is the name of the code object. (Scott types 'addone')
# Then we have the first line number where the instructions appear.
# Pick your favorite number.
# After that is the line number table. It's a bytes object representing a mapping
# from instructions to their line number. We don't really care about line numbers here,
# so you can just put an empty bytes object.
# Joe: Last but not least, we have the freevars and the cellvars. These are the names of
# variables that are shared with other functions.
# Since we set CO_NOFREE, those should be empty tuples. (Scott types)
# Scott: Alright, moment of truth: (Scott executes the cell. Hopefully it works.)
In [12]:
# Scott: Let's fire this baby up!
my_code(1)
# Scott: What gives?
# Joe: A code object isn't executable by itself. What you really need is a **function**.
# Scott: Alright, let me guess: I need FunctionType from the types module.
In [13]:
# from types import FunctionType
# print(FunctionType.__doc__)
# Scott: This doesn't look so bad.
my_addone = FunctionType(my_code, {})
In [14]:
my_addone(1)
# Scott: We did it! I wonder how different is our implementation from Python's...
dis.dis(addone)
print("--------")
dis.dis(my_addone)
# Scott: Other than the made up line numbers, these look pretty identical.
# Joe: Well, not quite. If you look over at the raw argument column, CPython is loading
# a constant at index 1, but we're loading a constant at index 0.
# Scott: Oh, weird. I wonder what's at index 0.
In [15]:
print(addone.__code__.co_consts)
print(my_addone.__code__.co_consts)
# Scott: Why is None in CPython's co_consts. Nothing ever touches it.
# Joe: Every code object created by the CPython compiler has None at index 0.
# Scott: So our hand-crafted code object is more compact than Python's.
# Joe: Ever so slightly, yes.
In [16]:
# Scott: Hold up. If CPython is just loading values at certain indices from the
# co_consts, what happens if I change them?
my_addone.__code__.co_consts = (2,)
# Joe: Code objects are read-only to guard against those kinds of shenanigans.
# If that worked, anyone who depended on your addone function would suddenly
# find themselves with an addtwo function. Probably not what they wanted.
# Scott: Fair enough. So there's no way to modify existing code objects?
# Joe: You can't modify them in place, but it's easy enough to copy all the attributes
# from one code object to another, and you can swap out any values you want to
# change when you're doing your copying.
# Scott: Okay. So maybe we can write some sort of *function*... that
# performs a *functional* update... on a function?
In [17]:
def update(f, **kwargs):
"A function that performs a functional update on a function."
code = f.__code__
attrs = [
'co_argcount', 'co_kwonlyargcount', 'co_nlocals', 'co_stacksize',
'co_flags', 'co_code', 'co_consts', 'co_names', 'co_varnames',
'co_filename', 'co_name', 'co_firstlineno', 'co_lnotab',
'co_freevars', 'co_cellvars',
]
newcode = CodeType(*(kwargs.get(a, getattr(code, a)) for a in attrs))
return FunctionType(
newcode, f.__globals__, f.__name__, f.__defaults__, f.__closure__,
)
# Joe: I took the liberty of preparing this one ahead of time.
# Scott: I'm changing my password when this is over.
In [3]:
# Scott: Okay, so if I do update(my_addone, co_consts=(2,)),
# I can *transform* my_addone into my_addtwo.
my_addtwo = update(my_addone, co_consts=(2,))
my_addtwo(2)
Out[3]:
In [8]:
# Joe: That's cute, but, you know there's only so much you can do with the
# code metadata. The **real** power comes from manipulating the bytecode itself.
# Scott: (a little bit indignant) Okay, well, co_code is just another attribute.
# If we can swap out one constants tuple for another, we can just as well swap
# out the code string.
# Joe: Now you're cooking with gas. How about we write a function that
# replaces 23s with 20s in a function's instructions.
# Scott: 23 and 20?
# Joe: Sorry, BINARY_ADD and BINARY_MULTIPLY. Sometimes I forget that not everyone is
# a certified bytecode expert.
def add_to_mul(f):
old = f.__code__.co_code
new = old.replace(bytes([23]), bytes([20]))
return update(f, co_code=new)
multwo = add_to_mul(addtwo)
multwo(5)
# Joe: See, bytecode hacking isn't so hard once you know how everything works.
#
# (a beat)
#
# Scott: (Scott peers long and hard at the code on the screen,
# his brow furrowed with concentration. Finally, he speaks:)
# I think there's a bug here.
#
# Joe: (immediately dismissive) How could there be a bug?
# **All** we did was replace all the BINARY_ADD opcodes
# with BINARY_MULTIPLY opcodes.
#
# Scott: But remember, you told me that not all the bytes are opcodes.
# Some of them are **arguments**.
#
# Joe: Okay...so? 23 and 20 only ever mean BINARY_ADD and BINARY_MULTIPLY.
# (increasingly agitated). Those would **NEVER** appear as arguments...
# (suddenly unsure of himself) ...right?
#
# Scott: Well, what if we wrote a function that needed 23 local variables?
#
# Joe: Who would write a function with **23** local variab--
#
# Scott: I actually happen to have a function like that lying around.
Out[8]:
In [11]:
from string import ascii_lowercase
def get_x(a, b, c, d, e, f, g, h, i, j, k, l, m,
n, o, p, q, r, s, t, u, v, w, x, y, z):
"A function with **26** local variables."
return x
# Scott: This function has 26 local variables.
# If we call it normally, it returns the
# 23rd local variable, 'x'.
# get_x(*ascii_lowercase)
# Scott: But if we apply add_to_mul to get_x...
add_to_mul(get_x)(*ascii_lowercase)
# Joe: (ahh shit...) It turns into `get_u`.
Out[11]:
In [ ]:
# Scott: I think this could have been a lot worse too.
# At least there was still a local variable at index
# 20 for us to use. What would have happened if the
# index was out of bounds entirely?
baddone = update(addone, co_consts=())
baddone(3)
# Joe: Yeah, that looks pretty bad. I think you crashed the interpreter.
# Now that I think about it, there are other cases that are
# just as bad or worse. Remember those JUMP instructions we talked about
# earlier? The arguments there were just indices into the bytecode.
# If those ever got changed, who knows what would happen?
# Scott: Does that mean we can't ever insert or delete instructions from a
# code object?
# Joe: Yeah, we'd have to recompute all those offsets.
# Scott: Oof...
# (pause as Joe and Scott reflect on the surprising complexity they've seen)
# Joe: Didn't you say that Joe and Scott wrote a library for dealing with
# some of these issues?
# Scott: Yeah, codetransformer. I actually downloaded it right before I came on
# stage. Maybe they've got some ideas for making bytecode hacking
# a little easier.
In [3]:
from codetransformer.transformers import ordereddict_literals
@ordereddict_literals
def make_dict(a, b, c):
return {a: 1, b: 2, c: 3}
make_dict('a', 'b', 'c')
Out[3]: