Introduction
In this post we are going to explore how to interactively debug Python programs using a function that hooks into another function, starts a Python shell and arbitrarily overwrites local variables. This method does not rely on any third-party libraries and uses only what Python’s standard library gives us. I have used this method to debug and explore numerical algorithms and I think it is a real timesaver when you don’t want to use some external debugger. [1]
The method explored in this post works with CPython 3.11.x. It does not work with PyPy and probably doesn’t work with other Python implementations either.
Why is such a hook helpful?
Imagine you have a complex and/or complicated function that deals with some state in a way that is hard to understand.
To get a better feel for the function you might start adding print statements everywhere to see how the state changes over time.
This becomes unnecessarily cumbersome and messy.
It would be much better to hook into interesting parts of your programs once and then take a look at all values present at that time.
Another use case: In very time-consuming functions you might serialize interim results and write them to files so you can resume the function from that result to save time. However, sometimes you are not sure what data you actually need. It would be much easier if we could do this interactively.
What we need is a hook that lets us interact with the program while it is running, freely changing its state.
Hooking into Python
In this section we are going to develop a function to hook into arbitrary parts of a Python program. In doing so we will learn a bit about how to read and modify the interpreter stack. If you are only interested in the outcome, you can skip to the next section.
Starting a shell
How do we get an interactive hook going?
What we need first is an interactive Python shell.
We can use the interact function from the code module to start a REPL within our program.
When called, it halts further execution of the program and only resumes it once the REPL is closed.
Let’s look at an example application we want to debug:
import code

GLOBAL_VAR = 100

def f(n):
    return n + 1

def main():
    print("Starting program.")
    a = 1

    # Where we want to hook
    code.interact()

    b = f(a)
    c = f(GLOBAL_VAR)
    print(a)
    print(b)
    print(c)

if __name__ == "__main__":
    main()
The call to interact is already present, so let’s run this program. The interactive shell can be exited with CTRL+D.
Starting program.
Python 3.11.3 (main, Apr 7 2023, 20:13:31) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> a
Traceback (most recent call last):
File "<console>", line 1, in <module>
NameError: name 'a' is not defined
>>> GLOBAL_VAR
Traceback (most recent call last):
File "<console>", line 1, in <module>
NameError: name 'GLOBAL_VAR' is not defined
>>> a = 100
>>> ^D
now exiting InteractiveConsole...
1
2
101
We can see the familiar Python shell banner when the REPL (called InteractiveConsole) starts.
Now, we can import modules, define variables and evaluate expressions.
However, as we can see, we can work with neither the local nor the global variables of the program.
This also means that we cannot work with any functions in the program without tediously importing them.
We can fix that.
Passing local and global variables
Let’s add these variables to our REPL.
We can access the local and global variables in the current scope using the locals and globals functions.
Additionally, we can set the local variables for our interactive console by passing a dictionary with str keys to the interact function with the keyword argument local.
So, passing local variables to the REPL is a very small alteration:
code.interact(local=locals())
With this change, we can access local variables:
Starting program.
Python 3.11.3 (main, Apr 7 2023, 20:13:31) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> a
1
>>> ^D
now exiting InteractiveConsole...
1
2
101
Additionally, we can add global variables into the mix. The order in which we add variables is important: globals go in first, so that a local variable shadows a global variable of the same name.
# Create locals for the interactive shell
shell_locals = dict()
shell_locals.update(globals())
shell_locals.update(locals())
# Where we want to hook
code.interact(local=shell_locals)
Now, we have a way to not just access the global and local variables, but also have our first way of exploring their relationship with functions in our program.
Starting program.
Python 3.11.3 (main, Apr 7 2023, 20:13:31) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> a
1
>>> GLOBAL_VAR
100
>>> f(a)
2
>>> ^D
now exiting InteractiveConsole...
1
2
101
Hooking from another function
While the functionality we have achieved is already helpful, it makes the code hard to read.
We do not want to manually collect local variables and make the call to interact each time.
It would be much better to provide a function that computes the variables passed to the shell and starts it automatically.
Let’s create this function:
def hook():
    # Create locals for the interactive shell
    shell_locals = dict()
    shell_locals.update(globals())
    shell_locals.update(locals())
    # Start the shell
    code.interact(local=shell_locals)
Now, we can replace the call to interact with hook, but this poses a problem: The local variables of hook obviously are not the same as those of the function that called hook (from now on called the caller).
What we have to do is to access the local variables of that caller by looking at the stack.
Unsurprisingly, the interpreter stack contains one stack frame for each function call.
We are able to inspect these frames through FrameInfo objects, which contain the frame objects themselves, as well as information about the executing function, respective filenames and line numbers.
The frame objects themselves contain a few interesting attributes: [2]
- f_back: next outer frame object (this frame’s caller)
- f_builtins: builtins namespace seen by this frame
- f_code: code object being executed in this frame
- f_globals: global namespace seen by this frame
- f_lasti: index of last attempted instruction in bytecode
- f_lineno: current line number in Python source code
- f_locals: local namespace seen by this frame
- f_trace: tracing function for this frame, or None
Of note are f_locals and f_globals.
Both contain dictionaries with keys of type str mapping to arbitrary objects.
They can be understood as the local and global scope that the function with that specific stack frame sees. So we can replace locals and globals with these attributes. This only raises the question of how to get access to the stack.
Python features the inspect module to provide introspection into functions, classes and… the stack, using the stack function.
The function returns a list of the FrameInfo objects of our current stack, the first element being the info for the frame of the current function.
Thus, the second element in that list is the frame for the caller of the currently executing function.
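As a minimal sketch of this lookup, the following compares what locals() sees inside a helper with what the caller's frame exposes. The names caller_locals and demo are made up for illustration and are not part of the hook developed in this post.

```python
import inspect


def caller_locals():
    """Return (locals of the caller, locals of this helper) as plain dicts."""
    helper_only = "visible to locals() in here, not to the caller"
    # Index 0 is this function's own FrameInfo, index 1 is the caller's.
    info = inspect.stack()[1]
    try:
        return dict(info.frame.f_locals), dict(locals())
    finally:
        # Drop the reference to the frame record once we are done with it.
        del info


def demo():
    a = 1
    b = "two"
    return caller_locals()


from_caller, from_helper = demo()
print(from_caller)          # the caller's a and b
print(sorted(from_helper))  # only names local to caller_locals
```

Calling locals() inside the helper only ever shows the helper's own names, which is why the hook has to go through the stack instead.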
However, when accessing these frame objects we have to be careful according to the interpreter stack documentation:
Keeping references to frame objects, as found in the first element of the frame records these functions return, can cause your program to create reference cycles. Once a reference cycle has been created, the lifespan of all objects which can be accessed from the objects which form the cycle can become much longer even if Python’s optional cycle detector is enabled. If such cycles must be created, it is important to ensure they are explicitly broken to avoid the delayed destruction of objects and increased memory consumption which occurs.
To not risk any problems with the garbage collector we have to explicitly delete the reference to the caller’s stack frame.
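One way to follow this advice is a try/finally block, so the reference is dropped even if the code in between raises. This is only a sketch: caller_function_name is a hypothetical helper, and inspect.currentframe() is used here as a lighter alternative to inspect.stack() (it may return None on implementations without Python stack frame support).

```python
import inspect


def caller_function_name():
    """Return the name of the calling function without keeping its frame."""
    frame = inspect.currentframe()
    if frame is None:  # implementation without Python stack frame support
        return None
    try:
        caller = frame.f_back
        return caller.f_code.co_name if caller is not None else None
    finally:
        # Explicitly drop our reference so no cycle can outlive the call.
        del frame


def outer():
    return caller_function_name()


print(outer())  # outer
```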
Putting all of this together gives us the following function:
import code
import inspect

def hook():
    # Get stack frame for caller
    stack = inspect.stack()[1]
    frame = stack.frame

    # Copy locals and globals of caller's stack frame
    locals_copy = dict(frame.f_locals)
    globals_copy = dict(frame.f_globals)
    shell_locals = dict()
    shell_locals.update(globals_copy)
    shell_locals.update(locals_copy)

    # Start interactive shell
    code.interact(local=shell_locals)

    # Delete frame to avoid cyclic references
    del stack
    del frame
Now the function we want to hook contains only a single function call to hook:
def main():
    print("Starting program.")
    a = 1

    # Where we want to hook
    hook()

    b = f(a)
    c = f(GLOBAL_VAR)
    print(a)
    print(b)
    print(c)
Although the call to the function looks innocent, it is capable of reading the caller’s local and global variables.
Mutating the stack frame
Now, we come to the most interesting functionality of our hook.
We want to be able to make changes to our running program’s variables from within the REPL!
While the interact function doesn’t return anything, it does modify the dictionary of local variables we pass to it.
This means that we can use this dictionary, after the REPL has run, to write back values and references that might have changed.
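This can be checked without typing anything into a shell. The sketch below relies on the readfunc parameter of code.interact, which replaces the function the console uses to read a line. The helper run_scripted_shell is hypothetical: it feeds prepared lines to the REPL and signals the end of input with EOFError, just as CTRL+D would.

```python
import code


def run_scripted_shell(lines, namespace):
    """Run code.interact on prepared input lines instead of the keyboard."""
    pending = iter(lines)

    def readfunc(prompt):
        try:
            return next(pending)
        except StopIteration:
            # Same signal the console receives on CTRL+D.
            raise EOFError from None

    # Empty banner and exitmsg keep the console quiet.
    code.interact(banner="", readfunc=readfunc, local=namespace, exitmsg="")


shell_locals = {"a": 1}
run_scripted_shell(["a = 100", "b = a + 1"], shell_locals)
print(shell_locals["a"], shell_locals["b"])  # 100 101
```

After the shell returns, the dictionary holds both the changed value of a and the newly created b, which is exactly the information the hook needs for writing values back.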
To do this, we can update the f_locals attribute of our caller’s frame.
However, these changes will be discarded if we don’t use a specific function (PyFrame_LocalsToFast) from the Python C API to apply these changes. [3]
To do this we can use the pythonapi symbol exported from the ctypes module to call the function with the frame object as its first argument and an integer as its second argument.
locals_to_update = dict()
for key, value in shell_locals.items():
    if key in locals_copy:
        locals_to_update[key] = value
frame.f_locals.update(locals_to_update)
ctypes.pythonapi.PyFrame_LocalsToFast(ctypes.py_object(frame), ctypes.c_int(0))
After calling the REPL, shell_locals will contain the modified (and possibly new) locals from the shell.
We build a new dictionary that only contains the keys that are also present in the caller’s locals.
After updating the f_locals from the caller, we call the API function.
The first argument is the frame encoded as an arbitrary Python object.
The second argument is a flag that specifies whether variables that are missing from the updated dictionary should be deleted, 0 meaning that the values should not be deleted and 1 meaning that they should.
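Stripped of the shell, the write-back step can be sketched as follows. The names set_caller_local and demo are hypothetical, and the comments describe the behavior on CPython 3.11. Newer CPython releases changed how f_locals behaves (PEP 667), so the explicit API call may be unnecessary there, and other implementations may not offer it at all.

```python
import ctypes
import inspect


def set_caller_local(name, value):
    """Overwrite a local variable in the calling function."""
    frame = inspect.currentframe().f_back
    try:
        # Step 1: change the dictionary view of the caller's locals.
        frame.f_locals[name] = value
        # Step 2: copy the dictionary back into the frame's fast locals.
        # The second argument 0 leaves variables missing from the dict alone.
        ctypes.pythonapi.PyFrame_LocalsToFast(
            ctypes.py_object(frame), ctypes.c_int(0)
        )
    finally:
        del frame


def demo():
    a = 1
    set_caller_local("a", 1000)
    return a


print(demo())  # 1000 on CPython 3.11
```

Note that f_locals is read only once between the update and the API call; on CPython 3.11 every further access refreshes the dictionary from the frame and would discard the pending change.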
This can be added to our hook function, to let us manipulate the program from our REPL.
import code
import ctypes
import inspect

def hook():
    # Get stack frame for caller
    stack = inspect.stack()[1]
    frame = stack.frame

    # Copy locals and globals of caller's stack frame
    locals_copy = dict(frame.f_locals)
    globals_copy = dict(frame.f_globals)
    shell_locals = dict()
    shell_locals.update(globals_copy)
    shell_locals.update(locals_copy)

    # Start interactive shell
    code.interact(local=shell_locals)

    # Update caller's locals with modified locals from shell
    locals_to_update = dict()
    for key, value in shell_locals.items():
        if key in locals_copy:
            locals_to_update[key] = value
    frame.f_locals.update(locals_to_update)
    ctypes.pythonapi.PyFrame_LocalsToFast(ctypes.py_object(frame), ctypes.c_int(0))

    # Delete frame to avoid cyclic references
    del stack
    del frame
Using this new function in our running example shows how we can modify the running program to change its behavior.
Starting program.
Python 3.11.3 (main, Apr 7 2023, 20:13:31) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> a = 1000
>>> ^D
now exiting InteractiveConsole...
1000
1001
101
Now, a single call to hook can change the local state of any function or method.
We could have extended the functionality to also overwrite globals.
However, that is a messy procedure and arguably causes more problems than it solves, since we want to use the hook to inspect and modify local state first and foremost.
With the core functionality taken care of, we can now clean up the banner message and take care of logging before wrapping it all up.
Logging the correct line number
It would be helpful to log when the hook was entered and exited and it would be even more helpful if we could tell the logger the line number and file of the caller instead of the hook itself.
Luckily, FrameInfo objects have this info already present.
When using the logging module we can pass loggers a custom LogRecord that contains the correct code location.
Given the FrameInfo of the caller (the variable stack from before) and some msg we want to log, we can create the corresponding LogRecord like so:
logging.LogRecord(
    name="root",
    level=logging.DEBUG,
    pathname=stack.filename,
    lineno=stack.lineno,
    msg=msg,
    args=None,
    exc_info=None,
    func=stack.function,
)
In our example, we log the hook on the DEBUG level, since it shouldn’t be used in production anyway.
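To see what such a record looks like once formatted, here is a small self-contained sketch. The path, line number and function name are placeholders standing in for the values a FrameInfo would supply, and the format string is just an example.

```python
import logging

# Placeholder call site we want the log line to point at.
record = logging.LogRecord(
    name="root",
    level=logging.DEBUG,
    pathname="/some/path/example.py",
    lineno=42,
    msg="Starting interactive shell.",
    args=None,
    exc_info=None,
    func="main",
)

formatter = logging.Formatter("%(levelname)s (%(filename)s:%(lineno)d): %(message)s")
line = formatter.format(record)
print(line)  # DEBUG (example.py:42): Starting interactive shell.
```

The filename and line number in the output come from the record, not from the place where the record was created, which is the effect we want for the hook.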
Additionally, we can create a better banner message that the interactive shell should show.
This banner is then passed to interact as its first argument.
Furthermore, we can set the exitmsg argument of interact to an empty string to suppress the exit message.
banner = f"Hooked at {stack.filename}:{stack.lineno} - Exit with CTRL+D"
code.interact(banner, local=shell_locals, exitmsg="")
Now, we can put it all together and put the hook function to use.
The hook function
The finished function to hook into Python programs looks like this (and can also be viewed, downloaded and commented on here):
#!/usr/bin/env python3
import code
import ctypes
import inspect
import logging


def hook(
    banner_msg: str | None = None,
    overwrite_locals: bool = False,
    logger: str = "root",
):
    """Hooks into the caller, starting an interactive shell.

    Copies globals and locals from the caller into the shell and local
    variables from the shell will be copied back to the caller.

    Args:
        banner_msg:
            Additional message to display on shell startup
        overwrite_locals:
            If `True`, overwrites locals in the caller function with locals
            from the interactive shell
        logger:
            Name of the logger to use for reporting the hook
    """
    # Get stack frame for caller
    stack = inspect.stack()[1]
    frame = stack.frame

    # Set up log function
    def log(msg: str):
        logging.getLogger(logger).handle(
            logging.LogRecord(
                name=logger,
                level=logging.DEBUG,
                pathname=stack.filename,
                lineno=stack.lineno,
                msg=msg,
                args=None,
                exc_info=None,
                func=stack.function,
            )
        )

    # Log start of REPL
    log("Starting interactive shell.")

    # Copy locals and globals of caller's stack frame
    locals_copy = dict(frame.f_locals)
    globals_copy = dict(frame.f_globals)
    shell_locals = dict()
    shell_locals.update(globals_copy)
    shell_locals.update(locals_copy)

    # Format banner
    banner = f"Hooked at {stack.filename}:{stack.lineno} - Exit with CTRL+D"
    if banner_msg:
        banner = f"{banner_msg}\n{banner}"

    # Start interactive shell
    code.interact(banner, local=shell_locals, exitmsg="")

    # Update caller's locals with modified locals from shell
    if overwrite_locals:
        locals_to_update = dict()
        for key, value in shell_locals.items():
            if key in locals_copy:
                locals_to_update[key] = value
        frame.f_locals.update(locals_to_update)
        ctypes.pythonapi.PyFrame_LocalsToFast(ctypes.py_object(frame), ctypes.c_int(0))

    # Log exit from REPL
    log("Returned from interactive shell. Resuming execution.")

    # Delete frame to avoid cyclic references
    del stack
    del frame
Let us explore its uses.
Simple debugging
Let’s assume we are prototyping a little script to hash file contents.
We read a filepath from stdin and then print a hash calculated from the file to stdout.
A problem we can already anticipate is that the file might be missing, in which case a FileNotFoundError will be raised when we try to open it.
Quickly writing the idea down, we might end up with the following code:
filehasher.py (buggy)
#!/usr/bin/env python3

from hashlib import sha256


def hash_file(filepath: str):
    h = sha256(b"")
    with open(filepath, "r") as f:
        while True:
            chunk = f.read(1024)
            if len(chunk) == 0:
                break
            else:
                h.update(chunk)
    return h.hexdigest()


def main():
    try:
        fp = input("Give me a file: ")
        file_hash = hash_file(fp)
        print(f"Hash: {file_hash}")
    except FileNotFoundError:
        print("The file does not exist!")


if __name__ == "__main__":
    main()
When running this program, we are in for a surprise:
Give me a file: hooking.py
Traceback (most recent call last):
File "/some/path/filehasher.py", line 28, in <module>
main()
File "/some/path/filehasher.py", line 21, in main
file_hash = hash_file(fp)
^^^^^^^^^^^^^
File "/some/path/filehasher.py", line 14, in hash_file
h.update(chunk)
TypeError: Strings must be encoded before hashing
Seems like there is something wrong with the hash_file function.
Let’s hook into it at the point when the file has been opened.
#!/usr/bin/env python3

from hashlib import sha256
from hooking import hook


def hash_file(filepath: str):
    h = sha256(b"")
    with open(filepath, "r") as f:
        hook()
        while True:
            chunk = f.read(1024)
            if len(chunk) == 0:
                break
            else:
                h.update(chunk)
    return h.hexdigest()
Now we can take a closer look:
Give me a file: hooking.py
Hooked at /some/path/filehasher.py:10 - Exit with CTRL+D
>>> chunk = f.read(1024)
>>> type(chunk)
<class 'str'>
>>> help(h.update)
Help on built-in function update:
update(obj, /) method of _hashlib.HASH instance
Update this hash object's state with the provided string.
>>> h.update(chunk)
Traceback (most recent call last):
File "<console>", line 1, in <module>
TypeError: Strings must be encoded before hashing
>>> h.update(chunk.encode())
>>> h.hexdigest()
'92153f71bd98802d596f9bf4e39c497ae9f91abb5d6ac2b0a9295dd7682aca7b'
>>> type(chunk.encode())
<class 'bytes'>
Eureka!
It turns out that the update function needs to receive bytes instead of str.
The reason that the read method on our file returns str instead of bytes is the way the file is opened.
Its mode argument needs to be "rb" to read the file in binary mode and receive values of type bytes.
def hash_file(filepath: str):
    h = sha256(b"")
    with open(filepath, "rb") as f:
        while True:
            chunk = f.read(1024)
            if len(chunk) == 0:
                break
            else:
                h.update(chunk)
    return h.hexdigest()
Now, the program works as intended:
Give me a file: hooking.py
Hash: 8cafaa5d451bedfa778ef7135edc0688255a2df9dae5a57b6466526434ca39fc
Inspecting objects and values at runtime
Sometimes results from a long-running algorithm are so interesting or important that we would like to interactively inspect them. This can be the case for machine learning pipelines. Let’s look at this example:
datascience.py
#!/usr/bin/env python3

import logging
from hooking import hook
from sklearn import ensemble
from sklearn.datasets import fetch_kddcup99
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import fbeta_score, make_scorer


def main():
    # Set up logging
    logging.basicConfig(
        level=logging.DEBUG,
        format="[%(asctime)s %(levelname)s (%(filename)s:%(lineno)d)]: %(message)s",
    )

    # Get some data
    X, y = fetch_kddcup99(return_X_y=True, percent10=True)

    # Transform nominal labels to numbers
    transport_protocol_encoder = LabelEncoder().fit(X[:, 1])
    application_protocol_encoder = LabelEncoder().fit(X[:, 2])
    subset_encoder = LabelEncoder().fit(X[:, 3])
    class_encoder = LabelEncoder().fit(y)
    X[:, 1] = transport_protocol_encoder.transform(X[:, 1])
    X[:, 2] = application_protocol_encoder.transform(X[:, 2])
    X[:, 3] = subset_encoder.transform(X[:, 3])
    y = class_encoder.transform(y)

    # Split data
    X_train, X_test, y_train, y_test = train_test_split(X, y)

    # Create fbeta scorer
    scorer = make_scorer(
        lambda y, ypred: fbeta_score(y, ypred, beta=1.4, average="micro"),
        greater_is_better=True,
    )

    # Train model
    clf = ensemble.RandomForestClassifier(n_jobs=8)
    clf = clf.fit(X_train, y_train)

    # Inspect model
    hook()


if __name__ == "__main__":
    main()
We can use hook to interactively inspect the created model after training.
Hooked at /some/path/datascience.py:46 - Exit with CTRL+D
>>> scorer(clf, X_test, y_test)
0.9997975806843393
>>> from sklearn.metrics import matthews_corrcoef
>>> mcc = make_scorer(matthews_corrcoef)
>>> mcc(clf, X_test, y_test)
0.9996559488619614
>>> [t.tree_.node_count for t in clf.estimators_]
[601, 651, 679, 753, 593, 557, 639, 709, 663, 551, 635, 539, 689, 709, 743, 743, 723, 673, 673, 645, 533, 685, 579, 601, 501, 589, 599, 483, 653, 615, 729, 831, 703, 523, 609, 525, 673, 549, 689, 525, 707, 523, 521, 691, 629, 599, 651, 637, 477, 633, 523, 701, 693, 575, 571, 755, 587, 631, 649, 555, 663, 585, 637, 655, 601, 717, 627, 561, 653, 681, 759, 831, 675, 643, 615, 669, 591, 677, 697, 701, 763, 651, 543, 651, 659, 655, 573, 687, 713, 713, 811, 589, 505, 661, 673, 601, 599, 605, 515, 641]
>>> sum([t.tree_.node_count for t in clf.estimators_]) / len(clf.estimators_)
637.16
Essentially, hook lets us do supercharged print-debugging.
We can look at data, take it apart and compute new metrics that we might not have thought of beforehand.
Saving and replacing data structures
Another technique I have found useful is to serialize data structures or objects for later use.
This is especially helpful when it takes a long time to compute something that you want to store for later use.
In the case of our machine learning algorithm we can use the pickle module to serialize a trained classifier and save it to a file.
Hooked at /some/path/datascience.py:46 - Exit with CTRL+D
>>> clf
RandomForestClassifier(n_jobs=8)
>>> import pickle
>>> f = open("model.pickle", "wb+")
>>> pickle.dump(clf, f)
>>> f.close()
Conversely, it’s possible to load pickled data in a hook.
This can be used to debug algorithms that might have failed earlier with the pickled data.
Imagine that you have a pipeline that first takes a long time to produce some intermediate result and that intermediate result is then responsible for causing a bug later down the line.
Using hook, we could hook into exception handlers, store the intermediate result and then, in another run of the program, use the hook again to insert our faulty data for more testing.
Hooked at /some/path/datascience.py:46 - Exit with CTRL+D
>>> import pickle
>>> f = open("model.pickle", "rb")
>>> clf = pickle.load(f)
>>> clf
RandomForestClassifier(n_jobs=8)
Just don’t forget to add overwrite_locals=True to the hook call when you actually want to overwrite something!
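The save and load steps typed into the shell above can also be wrapped in two small helpers, so they are one call each from inside a hook. This is a sketch with hypothetical names (save_object, load_object), and the dictionary stands in for an expensive intermediate result.

```python
import os
import pickle
import tempfile


def save_object(obj, path):
    """Serialize obj to path; handy to call from inside a hook."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_object(path):
    """Load a previously pickled object. Only use on files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for an expensive intermediate result.
intermediate = {"weights": [0.1, 0.2, 0.7], "epoch": 12}
path = os.path.join(tempfile.mkdtemp(), "intermediate.pickle")
save_object(intermediate, path)
restored = load_object(path)
print(restored == intermediate)  # True
```

Using with blocks closes the file even if pickling fails halfway, which is easy to forget when typing open and close by hand in a REPL.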
Pitfalls and Problems
This function is not without its caveats.
Using exit
Calling exit in the shell will not exit your hook, but the entire program. (╯°□°)╯︵ ┻━┻
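One possible workaround is to catch SystemExit around the shell. The sketch assumes that exit surfaces as a SystemExit raised out of code.interact, which is what CPython 3.11 does. The helper shell_that_survives_exit is hypothetical, and the scripted readfunc only stands in for keyboard input so that the example runs unattended.

```python
import code


def shell_that_survives_exit(namespace, readfunc=None):
    """Start a REPL; leaving it via SystemExit only ends the REPL."""
    try:
        code.interact(banner="", readfunc=readfunc, local=namespace, exitmsg="")
    except SystemExit:
        # Swallow the exit request so the surrounding program keeps running.
        pass


# Scripted input standing in for a user who leaves the shell by exiting.
# "raise SystemExit" is what exit() boils down to.
script = iter(["a = 5", "raise SystemExit"])
namespace = {}
shell_that_survives_exit(namespace, readfunc=lambda prompt: next(script))
print("still running, a =", namespace["a"])  # still running, a = 5
```

The trade-off is that a deliberate request to terminate the whole program from within the shell is swallowed as well.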
Unused variables
In theory the hook function could use any variable in a caller function.
However, linters are not aware of this.
If you keep variables around without using them, only so hook can later be used to inspect them, linters will warn you of unused variables.
This is only a minor annoyance, but should still be kept in mind.
No Completion
The tab completion that the Python shell usually provides is absent from the hook.
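Completion can probably be restored with the standard library's rlcompleter module, as sketched below. The helper enable_tab_completion is hypothetical and would have to be called with the shell's dictionary before the shell starts. It assumes a readline module is available, which is not the case on every platform, and Python builds linked against libedit may need a different bind string.

```python
import rlcompleter


def enable_tab_completion(namespace):
    """Point readline's tab completion at the names in namespace."""
    completer = rlcompleter.Completer(namespace)
    try:
        import readline
    except ImportError:
        # No readline on this platform; completion stays unavailable.
        return completer
    readline.set_completer(completer.complete)
    readline.parse_and_bind("tab: complete")
    return completer


completer = enable_tab_completion({"GLOBAL_VAR": 100})
print(completer.complete("GLOBAL_V", 0))  # GLOBAL_VAR
```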
Reference mutation
Here is a simple program with a hook:
#!/usr/bin/env python3

from hooking import hook

class A:
    def __init__(self):
        self.x = 1

a = A()
hook()
print("a.x:", a.x)
As we can see, we have not supplied hook with the overwrite_locals argument, so we should not be able to overwrite anything, right?
Well, let’s look at the program in action:
Hooked at /some/path/file.py:10 - Exit with CTRL+D
>>> a
<__main__.A object at 0x1007bff10>
>>> a.x
1
>>> a.x = 100
>>> a
<__main__.A object at 0x1007bff10>
>>> ^D
a.x: 100
As we can see, we have not overwritten any locals, but we have still changed the program’s state.
We have not changed the reference to the object a.
That stays intact.
However, since these variables are references and we only perform a shallow copy of the locals and globals in our program, we can mutate referenced objects however we like.
A fix for this would be using the deepcopy function from the copy module to create copies of said objects instead of copying references.
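The difference between the two kinds of copy can be sketched on a small dictionary standing in for the caller's locals (the name caller_locals is only illustrative):

```python
import copy


class A:
    def __init__(self):
        self.x = 1


caller_locals = {"a": A()}

# Shallow copy: new dictionary, same A instance inside.
shallow = dict(caller_locals)
shallow["a"].x = 100
print(caller_locals["a"].x)  # 100

# Deep copy: the A instance itself is duplicated.
deep = copy.deepcopy(caller_locals)
deep["a"].x = 200
print(caller_locals["a"].x)  # still 100
```

Deep-copying everything a real caller sees is not always possible, though: objects such as modules or open files generally cannot be deep-copied, so a hook using this approach would have to skip or special-case them.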
Since we don’t overwrite the local variables (references to the object) in this example, this can also cause some strange behavior.
Hooked at /some/path/file.py:10 - Exit with CTRL+D
>>> a.x = 100
>>> a = A()
>>> a.x = 200
>>> ^D
a.x: 100
Changing the state of the untouched reference did cause a change, but a change to the new object (which overwrites the old reference) didn’t.
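The same session can be replayed without a keyboard to make the two effects visible side by side. This sketch uses the readfunc parameter of code.interact to feed the three lines to the shell; scripted_shell is a hypothetical helper and shell_locals plays the role of the shallow copy the hook creates.

```python
import code


class A:
    def __init__(self):
        self.x = 1


def scripted_shell(lines, namespace):
    """Run code.interact on prepared lines; EOFError acts like CTRL+D."""
    pending = iter(lines)

    def readfunc(prompt):
        try:
            return next(pending)
        except StopIteration:
            raise EOFError from None

    code.interact(banner="", readfunc=readfunc, local=namespace, exitmsg="")


a = A()
shell_locals = {"a": a, "A": A}  # shallow copy of what the caller sees
scripted_shell(["a.x = 100", "a = A()", "a.x = 200"], shell_locals)

print(a.x)                     # 100: mutation went through the shared reference
print(shell_locals["a"].x)     # 200: the new object lives only in the shell
print(shell_locals["a"] is a)  # False
```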
Conclusion
We have constructed a function that allows us to hook into Python programs.
It only relies on the introspection tools given to us by the standard library and is free of external dependencies.
Therefore, the introduced hooking module can easily be used with most kinds of Python software projects that can be run locally in a development environment.
If anything, this post also highlights why one should never use eval or exec in production code.
By using what Python gives us, we are able to freely inspect and mutate the whole program state, which can have dangerous consequences for data integrity and security.
[1] This idea is also presented in DigitalOcean’s tutorials, where the author goes into a bit more detail on Python’s code module. In this post, however, we focus more on reading and writing stack frames and on providing a hook that works well in any project without much setup.
[2] The descriptions for the attributes have been quoted from Python’s documentation.
[3] This workaround is also described in a post on the PyDev blog.