Jupyter notebooks are pretty neat. It takes only a couple of clicks or keystrokes to create a new cell, write some code, and run it. You can skip all the steps of creating a new file, naming the file, writing an entrypoint/main function, and switching to a terminal window to run the code.
One of the most powerful features of the notebook paradigm is that you can swap in and out bits of functionality just by running cells in a different order. You can update a function and then only re-run the pieces of code that depend on that change. This is incredible for exploration and prototyping, but with this great flexibility comes a serious drawback: Since there's no well-defined execution order for cells, your notebook can get into a state where running the cells top to bottom produces errors.
In my own experience, notebooks are most often used by data scientists and researchers to create plots and test theories. And while not always essential, if someone else can't reproduce those plots by re-running the notebook, then the notebook loses a lot of its value. A core value of science is reproducibility, so wouldn't it be great if our coding environment made it easier to achieve that lofty goal?
Observable
In case you're not familiar with it, I want to quickly introduce Observable, an online platform for writing JavaScript notebooks. It was created by Mike Bostock, who you may recognize as the author of the D3.js data visualization library.
Observable's key innovation is bringing reactivity to notebooks -- execution of one cell can trigger execution of others. Each cell registers a variable that other cells can depend on. When a cell executes, the updated output automatically propagates to all cells that depend on it, keeping all cells in sync.
Cado
After a few years as a full-time Python developer, wishing I could use Observable and wishing my notebooks were smarter, I decided to build a proof of concept called Cado that brings the reactive notebook paradigm to Python.
Naturally, the app has a mascot named Avo.
Over the next few sections I'll go through four key features that made Cado possible.
Python exec and locals
First, we need a way to run cells. Fortunately, Python exposes runtime access to its own interpreter. You can define some code as a string and execute it with the built-in exec function.
This code outputs:
This code alone is the meat of a cell implementation. We can execute a cell and capture its output, errors, and any variables it defines. Capturing locals will be important later when we validate which variables can be registered as the output of a cell. And capturing stdout and stderr gives us control over how we display a cell's I/O streams.
> It's worth noting that exec has full access to your Python environment, so be careful when executing untrusted code.
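To make the idea concrete, here is a minimal sketch of a cell runner built on exec. The run_cell helper and the shape of its return value are hypothetical names chosen for illustration, not Cado's actual API. It assumes a cell's inputs are passed in as globals and that whatever the cell assigns at its top level ends up in a separate locals dict.

```python
import contextlib
import io
import traceback


def run_cell(source, inputs=None):
    """Execute a cell's source and capture its streams and local variables.

    `run_cell` is a hypothetical helper for illustration only.
    """
    stdout, stderr = io.StringIO(), io.StringIO()
    # The inputs act as globals; top-level assignments land in cell_locals.
    cell_globals = dict(inputs or {})
    cell_locals = {}
    with contextlib.redirect_stdout(stdout), contextlib.redirect_stderr(stderr):
        try:
            exec(source, cell_globals, cell_locals)
        except Exception:
            # Write the traceback to the captured stderr instead of crashing.
            traceback.print_exc()
    return {
        "stdout": stdout.getvalue(),
        "stderr": stderr.getvalue(),
        "locals": cell_locals,
    }


result = run_cell("total = x + 1\nprint('total is', total)", inputs={"x": 41})
print(result["stdout"])   # total is 42
print(result["locals"])   # {'total': 42}
```

One caveat with this sketch: when exec receives separate globals and locals dicts, the code runs as if it were inside a class body, so a function defined in the cell cannot see the cell's other top-level variables. A real implementation would need to decide how to handle that.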
Enforcing the DAG
Next, let's introduce the reactivity. We'll need to track which cells depend on which other cells so the execution of one can trigger propagation to others.
You may be able to guess an appropriate data structure for dependency tracking. We can use a directed acyclic graph (DAG). Each cell is a node in the graph, and there is a directed edge from cell A to cell B if B depends on A.
Figure 1: Circles represent cells in the notebook and arrows represent data flow. The numbers have been assigned to the cells so that running the cells in order from 1 to 7 ensures dependencies are run before they are needed.
We can build up the graph from a list of cells where each cell has an ID, a list of input variables, and a single output variable. And while building the graph, we can also enforce a few important invariants of our notebook:
1. Each output variable is produced by exactly one cell (no two cells output the same variable)
2. Each input variable corresponds to an output of some cell (can't rely on undefined variables)
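Here is a rough sketch of how the graph could be built while checking those two invariants. The Cell dataclass and build_graph function are hypothetical names for illustration, and the graph is represented as a plain dict from each cell ID to the set of cell IDs it depends on.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    # Hypothetical cell record: an ID, input variable names, one output name.
    id: str
    inputs: list = field(default_factory=list)
    output: str = ""


def build_graph(cells):
    """Return {cell_id: set of cell IDs it depends on} or raise ValueError."""
    producer = {}  # output variable name -> ID of the cell that produces it
    for cell in cells:
        if cell.output in producer:
            # Invariant 1: no two cells may output the same variable.
            raise ValueError(
                f"variable {cell.output!r} is output by both "
                f"{producer[cell.output]!r} and {cell.id!r}"
            )
        producer[cell.output] = cell.id

    graph = {}
    for cell in cells:
        deps = set()
        for name in cell.inputs:
            if name not in producer:
                # Invariant 2: every input must be some cell's output.
                raise ValueError(f"cell {cell.id!r} uses undefined variable {name!r}")
            deps.add(producer[name])
        graph[cell.id] = deps
    return graph


cells = [Cell("load", [], "raw"), Cell("clean", ["raw"], "tidy")]
graph = build_graph(cells)
print(graph)  # {'load': set(), 'clean': {'load'}}
```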
Finally, we want to avoid a bad state where the DAG has a cycle in it. If the notebook had a cycle, then there wouldn't be a valid execution order for the cells, so we can use a depth-first search (DFS) to detect cycles and display an informative error to the user if one is found.
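A minimal sketch of DFS cycle detection might look like the following. It assumes the graph is a dict mapping each cell ID to the IDs it depends on, and find_cycle is a hypothetical name. Returning the cells along the cycle, rather than just True or False, makes it easier to show the user an informative error.

```python
def find_cycle(graph):
    """Return a list of cell IDs forming a cycle, or None if there is none.

    `graph` maps each cell ID to an iterable of the IDs it depends on.
    """
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {node: UNVISITED for node in graph}
    path = []  # cells on the current DFS path, in visit order

    def visit(node):
        state[node] = IN_PROGRESS
        path.append(node)
        for dep in graph.get(node, ()):
            dep_state = state.get(dep, UNVISITED)
            if dep_state == IN_PROGRESS:
                # Reached a cell already on the current path: a cycle.
                return path[path.index(dep):] + [dep]
            if dep_state == UNVISITED:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for node in graph:
        if state[node] == UNVISITED:
            found = visit(node)
            if found:
                return found
    return None


print(find_cycle({"a": {"b"}, "b": {"a"}}))    # ['a', 'b', 'a']
print(find_cycle({"a": set(), "b": {"a"}}))    # None
```

This version is recursive, so a very long chain of cells could hit Python's recursion limit. An explicit stack would avoid that.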
With these pieces in place, whenever a user interacts with the notebook, we can use the dependency graph to figure out which cells need to be executed first and where outputs should be propagated. Maintaining this DAG is pretty valuable!
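As a sketch of both questions, the standard library's graphlib module (Python 3.9+) can produce a valid execution order, and a simple traversal can find every cell downstream of a change. The execution_order and downstream_of helpers and the example cell names are made up for illustration. The graph is assumed to map each cell ID to the set of cell IDs it depends on.

```python
from graphlib import TopologicalSorter


def execution_order(graph):
    """Return cell IDs ordered so that dependencies come before dependents."""
    # TopologicalSorter expects {node: predecessors}, which matches our graph.
    return list(TopologicalSorter(graph).static_order())


def downstream_of(graph, cell_id):
    """Return every cell that directly or transitively depends on cell_id."""
    # Invert the graph: dependency -> set of cells that depend on it.
    dependents = {}
    for node, deps in graph.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(node)
    seen, stack = set(), [cell_id]
    while stack:
        for child in dependents.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


graph = {"load": set(), "clean": {"load"}, "plot": {"clean"}, "stats": {"clean"}}
order = execution_order(graph)
affected = downstream_of(graph, "clean")
print(order)     # "load" first, then "clean", then "plot" and "stats"
print(affected)  # the set containing "plot" and "stats"
```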
Caching execution results
Once I got this far, I found a bug. Did you catch it? A cell runs its dependencies before running itself, and a cell also propagates its output to its dependents. So cell A triggers execution of its dependent cell B, then B triggers execution of its dependency A, and so on. The notebook gets itself into an infinite loop of reactivity.
We need a way to stop updates from propagating forever. Also, if a cell executes, but produces the same output as the last time it was run, we wouldn't want to waste compute triggering its dependents to re-execute unnecessarily.
Both of these problems can be solved by caching the results of cell executions! A cell only needs to be re-executed if one of its inputs has changed since the last time it was run. Much more efficient!
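Here is a minimal sketch of that caching rule. The CellCache class is a hypothetical name, each cell is modeled as a plain function of its inputs, and inputs are assumed to be comparable with ==. It is not how Cado necessarily stores results.

```python
class CellCache:
    """Remember each cell's last inputs and output (illustrative only)."""

    def __init__(self):
        self._entries = {}  # cell_id -> (snapshot of inputs, output)

    def run(self, cell_id, func, inputs):
        """Return (output, executed); skip execution if inputs are unchanged."""
        entry = self._entries.get(cell_id)
        if entry is not None and entry[0] == inputs:
            return entry[1], False  # cache hit: nothing to re-run
        output = func(**inputs)
        # Shallow snapshot: in-place mutation of an input would go unnoticed.
        self._entries[cell_id] = (dict(inputs), output)
        return output, True


calls = []


def double(x):
    calls.append(x)  # record each real execution
    return x * 2


cache = CellCache()
first = cache.run("double", double, {"x": 2})
second = cache.run("double", double, {"x": 2})
third = cache.run("double", double, {"x": 3})
print(first, second, third)  # (4, True) (4, False) (6, True)
print(calls)                 # [2, 3]
```

Because a cell whose inputs are unchanged returns immediately, the back-and-forth between a cell and its dependents stops as soon as nothing new is produced.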
Impure cells
But what if a cell is not a pure function of its inputs? For example, a cell might read from a file on disk or make a network request via some API. Then it could produce different outputs even if its inputs haven't changed. The caching strategy described above would fail in this case.
In these cases, the user can mark the cell as impure, and it will always re-execute when any of its descendants (any cell downstream of it) is run. If this sounds like a lot of extra work, running these impure cells over and over again, you're right.
There is one further optimization we can make though. The update of an impure cell only cascades if the new output differs from the cached output. The impure cell may run often, but its outputs need not cascade unnecessarily.
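A sketch of that optimization, under the assumption that outputs can be compared with ==: the impure cell always runs, but it only reports a change when its new output differs from the cached one. The refresh_impure function and the fake file reader are hypothetical, standing in for a real read from disk or a network call.

```python
def refresh_impure(cell_id, func, cache):
    """Always re-run an impure cell; return True if dependents must update."""
    new_output = func()
    changed = cell_id not in cache or cache[cell_id] != new_output
    cache[cell_id] = new_output
    return changed


# Stand-in for reading a file whose contents change between the 2nd and 3rd read.
fake_reads = iter(["v1", "v1", "v2"])


def read_file():
    return next(fake_reads)


cache = {}
results = [refresh_impure("reader", read_file, cache) for _ in range(3)]
print(results)  # [True, False, True]
```

The second run still executes read_file, but since the output is unchanged, nothing downstream needs to re-execute.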
The cost of equality checks on cached outputs is the main source of complexity in Cado's implementation. For objects with imprecise equality semantics, letting users define what equality means for each output variable is an interesting UX challenge that I don't have a great answer for yet.
Web interface
As with Jupyter, the Cado server also serves the user interface. When you run the cado up command, a single Python process serves the FastAPI WebSocket API as well as the React frontend, which connects to the socket automatically.
Figure 2: Updates propagate from cells to cells that depend on them.
Figure 3: Cells use cached outputs from dependencies rather than re-executing their dependencies.
Figure 4: Running a cell triggers execution of all dependencies that don't have cached outputs.
Figure 5: Cycles are automatically detected.
Figure 6: Cells cannot rely on input variables that aren't outputs of other cells.
Figure 7: Output variables must be unique across all cells.
Figure 8: Cells are draggable (using Framer Motion), something I always thought Jupyter notebooks should support.
Wrapping up
I hope someday the reactive notebook paradigm gains traction in the Python ecosystem, if not as a default, perhaps as an opt-in setting. With the right user interface design, the benefits of reactivity could far outweigh the added complexity. Using the same tricks Observable used to make data visualization more interactive, we can make data science more reproducible.
: I can hear the grumbling protestations of emacs/vim power users. I get it, notebooks aren't for everyone. But if you're prototyping in Python and aren't horribly allergic to the computer mouse then they're worth a shot!
: Another cool thing you can do with this DAG is safely convert a notebook to a script. You can prefix local variables in each cell with the cell ID, topologically sort the cells with the DAG, then concatenate the cell contents. You're guaranteed a working script that you can fold into your Python codebase as a module.