Hooking Into Python Programs


Introduction

In this post we are going to explore how to interactively debug Python programs using a function that hooks into another function, starts a Python shell and can overwrite arbitrary local variables. This method does not rely on any third-party libraries and uses only what Python’s standard library gives us. I have used this method to debug and explore numerical algorithms and I think it is a real timesaver when you don’t want to use an external debugger.[1]

Important Note

The method explored in this post works with CPython in version 3.11.x. It does not work with PyPy and probably doesn’t with other Python implementations.

Why is such a hook helpful? Imagine you have a complex and/or complicated function that deals with some state in a way that is hard to understand. To get a better feel for the function you might start adding print statements everywhere to see how the state changes over time. This becomes unnecessarily cumbersome and messy. It would be much better to hook into interesting parts of your programs once and then take a look at all values present at that time.

Another use case: In very time consuming functions you might serialize interim results and write them to files so you can resume the function from that result to save time. However, sometimes you are not sure what data you actually need. It would be much easier if we could do this interactively.

What we need is a hook that lets us interact with the program while it is running, freely changing its state.

Hooking into Python

In this section we are going to develop a function to hook into arbitrary parts of a Python program. In doing so we will learn a bit about how to read and modify the interpreter stack. If you are only interested in the outcome, you can skip to the next section.

Starting a shell

How do we get an interactive hook going? What we need first is an interactive Python shell. We can use the interact function from the code module to start a REPL within our program. When called, it halts further execution of the program and only resumes it once the REPL is closed.

Let’s look at an example application we want to debug:

import code

GLOBAL_VAR = 100

def f(n):
    return n + 1

def main():
    print("Starting program.")
    a = 1

    # Where we want to hook
    code.interact()

    b = f(a)
    c = f(GLOBAL_VAR)
    print(a)
    print(b)
    print(c)

if __name__ == "__main__":
    main()

The call to interact is already present, so let’s run this program. The interactive shell can be exited by CTRL+D.

Starting program.
Python 3.11.3 (main, Apr  7 2023, 20:13:31) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> a
Traceback (most recent call last):
  File "<console>", line 1, in <module>
NameError: name 'a' is not defined
>>> GLOBAL_VAR
Traceback (most recent call last):
  File "<console>", line 1, in <module>
NameError: name 'GLOBAL_VAR' is not defined
>>> a = 100
>>> ^D
now exiting InteractiveConsole...
1
2
101

We can see the familiar Python shell banner when the REPL (called InteractiveConsole) starts. Now, we can import modules, define variables and evaluate expressions. However, as we can see, we can neither work with local nor global variables of the program. This also means that we cannot work with any functions in the program without tediously importing them. We can fix that.
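
The isolation of the console’s namespace can also be shown without typing into a REPL. The following is a minimal sketch, assuming that a line is fed programmatically to code.InteractiveConsole (the class behind interact) through its push method. The function name isolated_console_demo is made up for this illustration.

```python
import code


def isolated_console_demo():
    # Hypothetical demo function, not part of the program above
    a = 1

    # No namespace is passed, so the console starts with a fresh one
    console = code.InteractiveConsole()

    # Has the same effect as typing "a = 100" at the >>> prompt
    console.push("a = 100")

    # The function's local is untouched; only the console's namespace changed
    return a, console.locals["a"]


print(isolated_console_demo())
```

The assignment ends up in the console’s own dictionary, which is why the program later still prints the original value of a.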

Passing local and global variables

Let’s add these variables to our REPL. We can access the local and global variables in the current scope using the locals and globals functions. Additionally, we can set the local variables for our interactive console, by passing a dictionary with str keys to the interact function with the keyword argument local. So, passing local variables to the REPL is a very small alteration:

code.interact(local=locals())

With this change, we can access local variables:

Starting program.
Python 3.11.3 (main, Apr  7 2023, 20:13:31) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> a
1
>>> ^D
now exiting InteractiveConsole...
1
2
101

Additionally, we can add global variables into the mix. The order in which we add variables is important, since we want to overwrite global variables by local variables.

# Create locals for the interactive shell
shell_locals = dict()
shell_locals.update(globals())
shell_locals.update(locals())
# Where we want to hook
code.interact(local=shell_locals)

Now, we have a way to not just access the global and local variables, but also have our first way of exploring their relationship with functions in our program.

Starting program.
Python 3.11.3 (main, Apr  7 2023, 20:13:31) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> a
1
>>> GLOBAL_VAR
100
>>> f(a)
2
>>> ^D
now exiting InteractiveConsole...
1
2
101
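
The effect of the update order can be checked in isolation. This is a small sketch with a hypothetical helper, merge_namespaces, that takes two plain dictionaries standing in for globals() and locals().

```python
def merge_namespaces(global_ns, local_ns):
    """Merge two namespaces; names in local_ns shadow those in global_ns."""
    # Hypothetical helper for illustration only
    merged = dict()
    merged.update(global_ns)  # globals go in first ...
    merged.update(local_ns)   # ... so that locals overwrite them
    return merged


merged = merge_namespaces({"x": "global", "y": 1}, {"x": "local"})
print(merged)
```

Swapping the two update calls would let a global variable hide a local variable of the same name, which is the opposite of what the function being debugged sees.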

Hooking from another function

While the functionality we have achieved is already helpful, it makes the code hard to read. We do not want to manually collect local variables and make the call to interact each time. It would be much better to provide a function that computes the variables passed to the shell and starts it automatically. Let’s create this function:

def hook():
    # Create locals for the interactive shell
    shell_locals = dict()
    shell_locals.update(globals())
    shell_locals.update(locals())
    # Start the shell
    code.interact(local=shell_locals)

Now, we can replace the call to interact with hook, but this poses a problem: the local variables of hook are obviously not the same as those of the function that called hook (from now on called the caller). What we have to do is access the local variables of that caller by looking at the stack.

Unsurprisingly, the interpreter stack contains one stack frame for each function call. We are able to inspect these frames through FrameInfo objects, which contain the frame objects themselves, as well as information about the executing function, respective filenames and line numbers. The frame objects themselves contain a few interesting attributes.[2]

Of note are f_locals and f_globals. Both contain dictionaries with keys of type str mapping to arbitrary objects. They can be understood as the local and global scope that the function with that specific stack frame sees. So we can replace locals and globals with these attributes. This only raises the question of how to get access to the stack.

Python features the inspect module to provide introspection into functions, classes and… the stack, using the stack function. The function returns a list of the FrameInfo objects of our current stack, the first element being the info for the frame of the current function. Thus, the second element in that list is the frame for the caller of the currently executing function.

[Figure: stack diagram with the frames labeled caller and hook and the attributes f_locals and f_globals.]
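
Reading a caller’s variable through the stack can be sketched in a few lines. The helper peek_caller and the function outer are hypothetical names used only here. The sketch assumes CPython and already drops its reference to the frame information in a finally block.

```python
import inspect


def peek_caller(name):
    """Return the caller's function name and the value of one of its locals."""
    # Hypothetical helper: index 0 is peek_caller itself, index 1 is its caller
    info = inspect.stack()[1]
    try:
        return info.function, info.frame.f_locals[name]
    finally:
        # Drop the reference to the frame so that no reference cycle lingers
        del info


def outer():
    secret = 42
    return peek_caller("secret")


print(outer())
```

Although outer never passes secret as an argument, peek_caller can read it from the caller’s frame.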

However, when accessing these frame objects we have to be careful according to the interpreter stack documentation:

Keeping references to frame objects, as found in the first element of the frame records these functions return, can cause your program to create reference cycles. Once a reference cycle has been created, the lifespan of all objects which can be accessed from the objects which form the cycle can become much longer even if Python’s optional cycle detector is enabled. If such cycles must be created, it is important to ensure they are explicitly broken to avoid the delayed destruction of objects and increased memory consumption which occurs.

To not risk any problems with the garbage collector we have to explicitly delete the reference to the caller’s stack frame.

Putting all of this together gives us the following function:

import code
import inspect

def hook():
    # Get stack frame for caller
    stack = inspect.stack()[1]
    frame = stack.frame

    # Copy locals and globals of caller's stack frame
    locals_copy = dict(frame.f_locals)
    globals_copy = dict(frame.f_globals)
    shell_locals = dict()
    shell_locals.update(globals_copy)
    shell_locals.update(locals_copy)

    # Start interactive shell
    code.interact(local=shell_locals)

    # Delete frame to avoid cyclic references
    del stack
    del frame

Now the function we want to hook contains only a single function call to hook:

def main():
    print("Starting program.")
    a = 1

    # Where we want to hook
    hook()

    b = f(a)
    c = f(GLOBAL_VAR)
    print(a)
    print(b)
    print(c)

Although the call to the function looks innocent, it is capable of reading the caller’s local and global variables.

Mutating the stack frame

Now, we come to the most interesting functionality of our hook. We want to be able to make changes to our running program’s variables from within the REPL! While the interact function doesn’t return anything, it does modify the dictionary of local variables we pass to it. This means that we can use this dictionary, after the REPL has run, to write back values and references that might have changed. To do this, we can update the f_locals attribute of our caller’s frame. However, these changes will be discarded if we don’t use a specific function (PyFrame_LocalsToFast) from the Python C API to apply them.[3]

To do this we can use the pythonapi symbol exported from the ctypes module to call the function with the frame object as its first argument and an integer as its second argument.

locals_to_update = dict()
for key, value in shell_locals.items():
    if key in locals_copy:
        locals_to_update[key] = value
frame.f_locals.update(locals_to_update)
ctypes.pythonapi.PyFrame_LocalsToFast(ctypes.py_object(frame), ctypes.c_int(0))

After calling the REPL, shell_locals will contain the modified (and possibly new) locals from the shell. We build a new dictionary that only contains the keys that are also present in the caller’s locals. After updating the f_locals of the caller, we call the API function. The first argument is the frame wrapped as a generic Python object. The second argument is a flag that specifies whether variables that are missing from the updated dictionary should be deleted: 0 means that they should be kept and 1 means that they should be deleted. This can be added to our hook function to let us manipulate the program from our REPL.

import code
import ctypes
import inspect


def hook():
    # Get stack frame for caller
    stack = inspect.stack()[1]
    frame = stack.frame

    # Copy locals and globals of caller's stack frame
    locals_copy = dict(frame.f_locals)
    globals_copy = dict(frame.f_globals)
    shell_locals = dict()
    shell_locals.update(globals_copy)
    shell_locals.update(locals_copy)

    # Start interactive shell
    code.interact(local=shell_locals)

    # Update caller's locals with modified locals from shell
    locals_to_update = dict()
    for key, value in shell_locals.items():
        if key in locals_copy:
            locals_to_update[key] = value
    frame.f_locals.update(locals_to_update)
    ctypes.pythonapi.PyFrame_LocalsToFast(ctypes.py_object(frame), ctypes.c_int(0))

    # Delete frame to avoid cyclic references
    del stack
    del frame

Using this new function in our running example shows how we can modify the running program to change its behavior.

Starting program.
Python 3.11.3 (main, Apr  7 2023, 20:13:31) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> a = 1000
>>> ^D
now exiting InteractiveConsole...
1000
1001
101

Now, a single call to hook can change the local state of any function or method. We could have extended the functionality to also overwrite globals. However, that is a messy procedure and arguably causes more problems than it solves, since we want to use the hook to inspect and modify local state first and foremost.
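
The write-back step can also be exercised without a REPL. The following is a rough sketch, assuming CPython. The helper set_caller_local is a hypothetical name, and the getattr guard is my own precaution in case an interpreter build does not export PyFrame_LocalsToFast. As far as I know, newer CPython versions (3.13 and later) write changes to f_locals through to the frame directly, in which case the API call has nothing left to do.

```python
import ctypes
import inspect


def set_caller_local(name, value):
    """Overwrite one local variable in the calling function (CPython only)."""
    # Hypothetical helper for illustration only
    frame = inspect.stack()[1].frame
    try:
        # Access f_locals exactly once: on CPython 3.12 and earlier, every
        # access refreshes the dictionary from the frame, which would
        # discard a change that has not been applied yet
        frame.f_locals[name] = value

        # Apply the dictionary to the frame's actual local variables
        sync = getattr(ctypes.pythonapi, "PyFrame_LocalsToFast", None)
        if sync is not None:
            sync(ctypes.py_object(frame), ctypes.c_int(0))
    finally:
        # Drop the frame reference to avoid reference cycles
        del frame


def demo():
    a = 1
    set_caller_local("a", 1000)
    return a


print(demo())
```

This is the same sequence as in hook, reduced to a single variable: modify the dictionary, then ask the interpreter to apply it to the frame.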

With the core functionality taken care of, we can now clean up the banner message and take care of logging before wrapping it all up.

Logging the correct line number

It would be helpful to log when the hook was entered and exited and it would be even more helpful if we could tell the logger the line number and file of the caller instead of the hook itself. Luckily, FrameInfo objects have this info already present. When using the logging module we can pass loggers a custom LogRecord that contains the correct code location. Given the stack of the caller and some msg we want to log, we can create the corresponding LogRecord like so:

logging.LogRecord(
        name="root",
        level=logging.DEBUG,
        pathname=stack.filename,
        lineno=stack.lineno,
        msg=msg,
        args=None,
        exc_info=None,
        func=stack.function,
    )

In our example, we log the hook on the DEBUG level, since it shouldn’t be used in production anyway.
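
To see that a hand-built record really carries the caller’s location, it can be sent to a logger whose handler simply collects records. This is a minimal sketch. ListHandler, log_at, the logger name hook_demo and the file path are all made up for the example.

```python
import logging


class ListHandler(logging.Handler):
    """Collects records in a list so they can be inspected afterwards."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def log_at(logger_name, pathname, lineno, function, msg):
    """Log msg as if it had been emitted at the given code location."""
    record = logging.LogRecord(
        name=logger_name,
        level=logging.DEBUG,
        pathname=pathname,
        lineno=lineno,
        msg=msg,
        args=None,
        exc_info=None,
        func=function,
    )
    logging.getLogger(logger_name).handle(record)


# Hypothetical logger name and code location, chosen only for this example
handler = ListHandler()
demo_logger = logging.getLogger("hook_demo")
demo_logger.addHandler(handler)
demo_logger.propagate = False

log_at("hook_demo", "/some/path/app.py", 13, "main", "Starting interactive shell.")
record = handler.records[0]
print(record.filename, record.lineno, record.funcName)
```

A format string that uses %(filename)s and %(lineno)d would therefore show the location that was passed in, not the location of the logging call itself.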

Additionally, we can create a better banner message that the interactive shell should show. This banner is then passed to interact as its first argument. Furthermore, we can set the exitmsg argument of interact to an empty string to suppress it.

banner = f"Hooked at {stack.filename}:{stack.lineno} - Exit with CTRL+D"
code.interact(banner, local=shell_locals, exitmsg="")

Now, we can put it all together and put the hook function to use.

The hook function

The finished function to hook into Python programs looks like this (and can also be viewed, downloaded and commented on here):

#!/usr/bin/env python3

import code
import ctypes
import inspect
import logging


def hook(
    banner_msg: str | None = None,
    overwrite_locals: bool = False,
    logger: str = "root",
):
    """Hooks into the caller, starting an interactive shell.

    Copies globals and locals from the caller into the shell and local
    variables from the shell will be copied back to the caller.

    Args:
        banner_msg:
            Additional message to display on shell startup
        overwrite_locals:
            If `True`, overwrites locals in the caller function with locals
            from the interactive shell
        logger:
            Name of the logger to use for reporting the hook
    """
    # Get stack frame for caller
    stack = inspect.stack()[1]
    frame = stack.frame

    # Set up log function
    def log(msg: str):
        logging.getLogger(logger).handle(
            logging.LogRecord(
                name=logger,
                level=logging.DEBUG,
                pathname=stack.filename,
                lineno=stack.lineno,
                msg=msg,
                args=None,
                exc_info=None,
                func=stack.function,
            )
        )

    # Log start of REPL
    log("Starting interactive shell.")

    # Copy locals and globals of caller's stack frame
    locals_copy = dict(frame.f_locals)
    globals_copy = dict(frame.f_globals)
    shell_locals = dict()
    shell_locals.update(globals_copy)
    shell_locals.update(locals_copy)

    # Format banner
    banner = f"Hooked at {stack.filename}:{stack.lineno} - Exit with CTRL+D"
    if banner_msg:
        banner = f"{banner_msg}\n{banner}"

    # Start interactive shell
    code.interact(banner, local=shell_locals, exitmsg="")

    # Update caller's locals with modified locals from shell
    if overwrite_locals:
        locals_to_update = dict()
        for key, value in shell_locals.items():
            if key in locals_copy:
                locals_to_update[key] = value
        frame.f_locals.update(locals_to_update)
        ctypes.pythonapi.PyFrame_LocalsToFast(ctypes.py_object(frame), ctypes.c_int(0))

    # Log exit from REPL
    log("Returned from interactive shell. Resuming execution.")

    # Delete frame to avoid cyclic references
    del stack
    del frame

Let us explore its uses.

Simple debugging

Let’s assume we are prototyping a little script to hash file contents. We read a file path from stdin and then print a hash calculated from the file to stdout. One problem we can already anticipate is that the file might be missing, in which case a FileNotFoundError is raised when we try to open it. Quickly writing the idea down, we might end up with the following code:

filehasher.py (buggy)
#!/usr/bin/env python3

from hashlib import sha256


def hash_file(filepath: str):
    h = sha256(b"")
    with open(filepath, "r") as f:
        while True:
            chunk = f.read(1024)
            if len(chunk) == 0:
                break
            else:
                h.update(chunk)
    return h.hexdigest()


def main():
    try:
        fp = input("Give me a file: ")
        file_hash = hash_file(fp)
        print(f"Hash: {file_hash}")
    except FileNotFoundError:
        print("The file does not exist!")


if __name__ == "__main__":
    main()

When running this program, we are in for a surprise:

Give me a file: hooking.py
Traceback (most recent call last):
  File "/some/path/filehasher.py", line 28, in <module>
    main()
  File "/some/path/filehasher.py", line 21, in main
    file_hash = hash_file(fp)
                ^^^^^^^^^^^^^
  File "/some/path/filehasher.py", line 14, in hash_file
    h.update(chunk)
TypeError: Strings must be encoded before hashing

Seems like there is something wrong with the hash_file function. Let’s hook into it at the point when the file has been opened.

#!/usr/bin/env python3

from hashlib import sha256
from hooking import hook


def hash_file(filepath: str):
    h = sha256(b"")
    with open(filepath, "r") as f:
        hook()
        while True:
            chunk = f.read(1024)
            if len(chunk) == 0:
                break
            else:
                h.update(chunk)
    return h.hexdigest()

Now we can take a closer look:

Give me a file: hooking.py
Hooked at /some/path/filehasher.py:10 - Exit with CTRL+D
>>> chunk = f.read(1024)
>>> type(chunk)
<class 'str'>
>>> help(h.update)
Help on built-in function update:

update(obj, /) method of _hashlib.HASH instance
    Update this hash object's state with the provided string.

>>> h.update(chunk)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
TypeError: Strings must be encoded before hashing
>>> h.update(chunk.encode())
>>> h.hexdigest()
'92153f71bd98802d596f9bf4e39c497ae9f91abb5d6ac2b0a9295dd7682aca7b'
>>> type(chunk.encode())
<class 'bytes'>

Eureka! It turns out that the update method needs to receive bytes instead of str. The reason the read method on our file returns str instead of bytes lies in the way the file is opened. Its mode argument needs to be "rb" to read the file in binary mode and receive values of type bytes.

def hash_file(filepath: str):
    h = sha256(b"")
    with open(filepath, "rb") as f:
        while True:
            chunk = f.read(1024)
            if len(chunk) == 0:
                break
            else:
                h.update(chunk)
    return h.hexdigest()

Now, the program works as intended:

Give me a file: hooking.py
Hash: 8cafaa5d451bedfa778ef7135edc0688255a2df9dae5a57b6466526434ca39fc
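
The difference between the two file modes can also be checked in isolation. The sketch below uses two hypothetical helpers, read_types and can_hash, and a throwaway temporary file instead of the script itself.

```python
import os
import tempfile
from hashlib import sha256


def read_types(path):
    """Return the types produced by reading a file in text and binary mode."""
    with open(path, "r") as text_file, open(path, "rb") as binary_file:
        return type(text_file.read(4)), type(binary_file.read(4))


def can_hash(data):
    """Report whether a hash object accepts data without raising TypeError."""
    try:
        sha256(b"").update(data)
    except TypeError:
        return False
    return True


# Write a small throwaway file to read back
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("hello world")

try:
    print(read_types(tmp.name))
finally:
    os.remove(tmp.name)

print(can_hash(b"hello"), can_hash("hello"))
```

Text mode yields str and binary mode yields bytes, and only the latter is accepted by the hash object.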

Inspecting objects and values at runtime

Sometimes results from a long running algorithm are so interesting or important that we would like to interactively inspect them. This can be the case for machine learning pipelines. Let’s look at this example:

datascience.py
#!/usr/bin/env python3

import logging
from hooking import hook
from sklearn import ensemble
from sklearn.datasets import fetch_kddcup99
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import fbeta_score, make_scorer


def main():
    # Set up logging
    logging.basicConfig(
        level=logging.DEBUG,
        format="[%(asctime)s %(levelname)s (%(filename)s:%(lineno)d)]: %(message)s",
    )

    # Get some data
    X, y = fetch_kddcup99(return_X_y=True, percent10=True)

    # Transform nominal labels to numbers
    transport_protocol_encoder = LabelEncoder().fit(X[:, 1])
    application_protocol_encoder = LabelEncoder().fit(X[:, 2])
    subset_encoder = LabelEncoder().fit(X[:, 3])
    class_encoder = LabelEncoder().fit(y)
    X[:, 1] = transport_protocol_encoder.transform(X[:, 1])
    X[:, 2] = application_protocol_encoder.transform(X[:, 2])
    X[:, 3] = subset_encoder.transform(X[:, 3])
    y = class_encoder.transform(y)

    # Split data
    X_train, X_test, y_train, y_test = train_test_split(X, y)

    # Create fbeta scorer
    scorer = make_scorer(
        lambda y, ypred: fbeta_score(y, ypred, beta=1.4, average="micro"),
        greater_is_better=True,
    )

    # Train model
    clf = ensemble.RandomForestClassifier(n_jobs=8)
    clf = clf.fit(X_train, y_train)

    # Inspect model
    hook()


if __name__ == "__main__":
    main()

We can use hook to interactively inspect the created model after training.

Hooked at /some/path/datascience.py:46 - Exit with CTRL+D
>>> scorer(clf, X_test, y_test)
0.9997975806843393
>>> from sklearn.metrics import matthews_corrcoef
>>> mcc = make_scorer(matthews_corrcoef)
>>> mcc(clf, X_test, y_test)
0.9996559488619614
>>> [t.tree_.node_count for t in clf.estimators_]
[601, 651, 679, 753, 593, 557, 639, 709, 663, 551, 635, 539, 689, 709, 743, 743, 723, 673, 673, 645, 533, 685, 579, 601, 501, 589, 599, 483, 653, 615, 729, 831, 703, 523, 609, 525, 673, 549, 689, 525, 707, 523, 521, 691, 629, 599, 651, 637, 477, 633, 523, 701, 693, 575, 571, 755, 587, 631, 649, 555, 663, 585, 637, 655, 601, 717, 627, 561, 653, 681, 759, 831, 675, 643, 615, 669, 591, 677, 697, 701, 763, 651, 543, 651, 659, 655, 573, 687, 713, 713, 811, 589, 505, 661, 673, 601, 599, 605, 515, 641]
>>> sum([t.tree_.node_count for t in clf.estimators_]) / len(clf.estimators_)
637.16

Essentially, hook lets us do supercharged print-debugging. We can look at data, take it apart and compute new metrics that we might not have thought of beforehand.

Saving and replacing data structures

Another technique I have found useful is to serialize data structures or objects for later use. This is especially helpful when it takes a long time to compute something that you want to store for later use. In the case of our machine learning algorithm we can use the pickle module to serialize a trained classifier and save it to a file.

Hooked at /some/path/datascience.py:46 - Exit with CTRL+D
>>> clf
RandomForestClassifier(n_jobs=8)
>>> import pickle
>>> f = open("model.pickle", "wb+")
>>> pickle.dump(clf, f)
>>> f.close()

Conversely, it’s possible to load pickled data in a hook. This can be used to debug algorithms that might have failed earlier with the pickled data. Imagine that you have a pipeline that first takes a long time to produce some intermediate result and that intermediate result is then responsible for causing a bug later down the line. Using hook, we could hook into exception handlers, store the intermediate result and then, in another run of the program, use the hook again to insert our faulty data for more testing.

Hooked at /some/path/datascience.py:46 - Exit with CTRL+D
>>> import pickle
>>> f = open("model.pickle", "rb")
>>> clf = pickle.load(f)
>>> clf
RandomForestClassifier(n_jobs=8)

Just don’t forget to add overwrite_locals=True to the hook call when you actually want to overwrite something!
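
If this save and load pattern is used often, it can be wrapped in two small helpers that can be called from inside the hook. This is only a sketch: save_object and load_object are hypothetical names, and a plain dictionary stands in for an expensive intermediate result.

```python
import os
import pickle
import tempfile


def save_object(obj, path):
    """Pickle obj to path; the with block closes the file even on errors."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_object(path):
    """Load a pickled object from path (only use files you trust)."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Round trip through a temporary directory with a stand-in result
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "intermediate.pickle")
    save_object({"weights": [0.1, 0.2], "epoch": 3}, path)
    restored = load_object(path)

print(restored)
```

Since unpickling can execute arbitrary code, loading should be limited to files that were written by your own earlier runs.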

Pitfalls and Problems

This function is not without its caveats.

Using exit

exit will not exit your hook, but the entire program. (╯°□°)╯︵ ┻━┻
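
The reason is that exit raises SystemExit, which the console lets through on purpose. One possible way around this, sketched below under that assumption, is to catch SystemExit around the console call. The helper push_guarded is a hypothetical name, and the sketch feeds lines to an InteractiveConsole directly instead of starting a real shell. In hook, the same try/except would go around the call to code.interact.

```python
import code


def push_guarded(console, line):
    """Feed one line to the console; return False if it tried to exit."""
    # Hypothetical helper for illustration only
    try:
        console.push(line)
    except SystemExit:
        # Swallow the exit request so that only the shell would end,
        # not the whole program
        return False
    return True


console = code.InteractiveConsole()
print(push_guarded(console, "x = 1"))
print(push_guarded(console, "raise SystemExit(0)"))
```

Whether swallowing the exit is desirable is a design choice. Sometimes aborting the whole program from the hook is exactly what is wanted.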

Unused variables

In theory the hook function could use any variable in a caller function. However, linters are not aware of this. If you keep variables around without using them, only so hook can later be used to inspect them, linters will warn you of unused variables. This is only a minor annoyance, but should still be kept in mind.

No Completion

The tab completion that the Python shell usually provides is absent from the hook.
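
Completion could probably be added back with the standard library’s rlcompleter and readline modules. The following is an untested-in-the-hook sketch under the assumption that readline is available on the platform. build_completer and enable_tab_completion are hypothetical names, and the namespace passed in would be the shell_locals dictionary of hook.

```python
import rlcompleter


def build_completer(namespace):
    """Return a readline-style completer function bound to namespace."""
    return rlcompleter.Completer(namespace).complete


def enable_tab_completion(namespace):
    """Try to install tab completion; return False if readline is missing."""
    try:
        import readline
    except ImportError:  # for example on Windows
        return False
    readline.set_completer(build_completer(namespace))
    # Assumption: GNU readline syntax; libedit-based builds may need
    # a different binding string
    readline.parse_and_bind("tab: complete")
    return True


# The completer can be queried directly, without a terminal
complete = build_completer({"GLOBAL_VAR": 100})
print(complete("GLOBAL_", 0))
```

Calling enable_tab_completion(shell_locals) right before code.interact would be the natural place for it in hook.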

Reference mutation

Here is a simple program with a hook:

#!/usr/bin/env python3

from hooking import hook

class A:
    def __init__(self):
        self.x = 1

a = A()
hook()
print("a.x:", a.x)

As we can see, we have not supplied hook with the overwrite_locals argument, so we should not be able to overwrite anything, right? Well, let’s look at the program in action:

Hooked at /some/path/file.py:10 - Exit with CTRL+D
>>> a
<__main__.A object at 0x1007bff10>
>>> a.x
1
>>> a.x = 100
>>> a
<__main__.A object at 0x1007bff10>
>>> ^D
a.x: 100

As we can see, we have not overwritten any locals, but we have still changed the program’s state. We have not changed the reference to the object a; that stays intact. However, since these variables are references and we only make a shallow copy of the locals and globals in our program, we can mutate the referenced objects however we like.

Side Note

A fix for this would be to use the deepcopy function from the copy module to create copies of said objects instead of copying references.
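
The difference between the two kinds of copy can be sketched as follows. The function mutate_through_copies is a hypothetical name, and the class A mirrors the one from the example program.

```python
import copy


class A:
    def __init__(self):
        self.x = 1


def mutate_through_copies():
    a = A()
    namespace = {"a": a}

    shallow = dict(namespace)        # copies the reference to the same object
    deep = copy.deepcopy(namespace)  # copies the object itself

    shallow["a"].x = 100  # visible through the original reference
    deep["a"].x = 200     # only changes the independent copy

    return a.x, deep["a"].x


print(mutate_through_copies())
```

Note that not every object can be deep-copied. Modules, for example, cannot, and a caller’s globals usually contain some, so a real implementation would likely have to filter or fall back to references.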

Since we don’t overwrite the local variables (references to the object) in this example, this can also cause some strange behavior.

Hooked at /some/path/file.py:10 - Exit with CTRL+D
>>> a.x = 100
>>> a = A()
>>> a.x = 200
>>> ^D
a.x: 100

Changing the state of the untouched reference did cause a change, but a change to the new object (which overwrites the old reference) didn’t.

Conclusion

We have constructed a function that allows us to hook into Python programs. It only relies on the introspection tools given to us by the standard library and is free of external dependencies. Therefore, the introduced hooking module can easily be used with most kinds of Python software projects that can be run locally in a development environment.

If anything, this post also highlights why one should never use eval nor exec in production code. By using what Python gives us, we are able to freely inspect and mutate the whole program state, which can have dangerous consequences for data integrity and security.


  1. This idea is also presented in DigitalOcean’s tutorials, where the author goes into a bit more detail on Python’s code module. In this post, however, we focus more on reading and writing stack frames and on providing a hook that works well in any project without much setup.

  2. The descriptions for the attributes have been quoted from Python’s documentation.

  3. This workaround is also described in a post on the PyDev blog.