Welcome to Software Development on Codidact!
How can I properly type-hint methods in different files that would lead to circular imports?
I am writing a Python package where I have two classes in different files that (indirectly) depend on each other. If I didn't care about type hints, there would be no problem, but I do like static type hinting, so I need each class to annotate the methods of the other. Of course, this leads to a circular import, but using forward references, I managed to arrive at the following code:
# process.py
from helpers import Helper

class Process:
    def do_something(self, helper: Helper):
        ...

# helpers.py
class Helper:
    def update(self, process: "Process"):
        ...
This code runs without problems, but when running mypy, it complains that Name "Process" is not defined.
I know I could silence these errors, but I was wondering whether there is a "proper" way to fix the type-hinting in this case.
2 answers
Import modules rather than names first to avoid a circular reference in the import statements; then use forward declarations, as before, to avoid a circular reference in the type annotations - like so:
# process.py
import helpers

class Process:
    def do_something(self, helper: "helpers.Helper"):
        ...

# helpers.py
import process

class Helper:
    def update(self, process: "process.Process"):
        ...
Circular imports are entirely possible in Python; problems only occur due to circular references in the imported code if and when that code runs at import time. It's important to keep in mind here that class and def are statements in Python, so when they appear at top level they represent code that will run immediately. (The effect of "running" def is to create the object representing the function and assign it to the corresponding name; similarly for classes.)
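As a small illustration of that point (the events list is just a name made up for this sketch), the body of a class statement runs as soon as the statement is reached, while the body of a def only runs when the function is called:

```python
events = []

class Process:
    # The class body executes as soon as the class statement runs.
    events.append("class body ran")

    def do_something(self):
        # The method body executes only when the method is called.
        events.append("method body ran")

events.append("after the class statement")
snapshot = list(events)   # state before any method has been called

Process().do_something()

print(snapshot)
print(events)
```

At the point where snapshot is taken, the class body has already run but the method body has not.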
Background information about the import process
When Python evaluates an import statement (at least by default; the import system provides many hooks to customize the behaviour), and once it has determined that the module actually needs to be loaded (and that there is a file to load it from, and where that file is), it first creates an empty module object and stores it under the appropriate name in a global module dict (available by default as sys.modules), the same one that it uses to check whether a module has already been loaded. It then evaluates the top-level code of that file, using the attributes of the module object as the global namespace for that code.
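The steps just described can be imitated in a few lines. This is only a rough sketch of the default behaviour, not the real import machinery: it skips finders, loaders and bytecode caching, and load_module_sketch and sketch_demo are names invented for the illustration.

```python
import sys
import types

def load_module_sketch(name: str, source: str) -> types.ModuleType:
    """Simplified imitation of the default import steps."""
    module = types.ModuleType(name)   # 1. create an empty module object
    sys.modules[name] = module        # 2. register it before running any code
    try:
        # 3. run the top-level code, using the module's attributes as globals
        exec(compile(source, f"<{name}>", "exec"), module.__dict__)
    except BaseException:
        del sys.modules[name]         # a failed import is not left registered
        raise
    return module

# "sketch_demo" is a made-up module name used only for this illustration.
demo = load_module_sketch(
    "sketch_demo",
    "import sys\n"
    "seen_during_execution = __name__ in sys.modules\n"
    "VALUE = 42\n",
)
print(demo.seen_during_execution, demo.VALUE)
```

The module's own top-level code can already find itself in sys.modules while it is still running, which is the property that makes import loops workable.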
This has a few important implications:
- import is, of course, also a statement, and thus any top-level imports in the module being imported will follow the same logic, recursively. However, because an object was stored in sys.modules before executing the module code, an ordinary loop of import statements doesn't cause a problem by itself. If we import process in the above example, it will import helpers, which will import process - which will find the empty module object in sys.modules, and therefore not attempt to locate process.py again. (As a historical note, this didn't always work properly in all cases: see https://github.com/python/cpython/issues/74210 and https://github.com/python/cpython/issues/61836 .)
- However, a problem emerges with circular imports when the top-level code depends on the contents of the imported module. If we import process, such that helpers finds an empty process module in the sys.modules lookup, it will not be able to use any attributes of process that haven't yet been assigned. Python automatically converts the resulting AttributeError into an ImportError internally. Since 3.8, Python can inspect the stack to add "(most likely due to a circular import)" to the exception message, as opposed to ImportErrors caused by modules that simply don't define the name in question. (I couldn't find a reference to this in the documentation, but I confirmed it manually by comparing ceval.c in the source across versions.)
- Normally, we put import statements at the top of the code to avoid confusion. But they are just ordinary statements that can run at any time. In particular, it's possible to define global variables in process.py before the import helpers line, and then have helpers.py import them from process.
- It's also possible to have an import inside the code of a function, which will then not be attempted until the function is called. (It will be attempted every time the function is called, but the cached module ordinarily will be immediately found every time after the first.) However, that doesn't help with the current case, because both the class and def statements will run immediately.
- It's possible for the top-level code to replace the object stored in sys.modules - for example, by defining a _Implementation class and then assigning sys.modules[__name__] = _Implementation(). In this case, the global names from the module still get attached to the original module object, not whatever was stored into sys.modules (i.e. they won't interfere with the class instance; the class code can still use those names, because the module object is still acting as the class' global namespace). This can be used to get the effect of modules that seem to have dynamically determined "magic" attributes (by implementing __getattr__ or __getattribute__ in the class).
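To see the first two points side by side, here is a self-contained sketch that writes four throwaway modules into a temporary directory and imports them. All the demo_* module names are invented for this illustration; the "ok" pair only uses import module, while the "bad" pair uses from module import name at top level.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical module names, prefixed to avoid clashing with real modules.
SOURCES = {
    # Cycle using "import module": nothing is looked up at import time.
    "demo_ok_a": """
        import demo_ok_b
        class A:
            def use(self, b: "demo_ok_b.B"): ...
    """,
    "demo_ok_b": """
        import demo_ok_a
        class B:
            def use(self, a: "demo_ok_a.A"): ...
    """,
    # Cycle using "from module import name": the name is needed immediately.
    "demo_bad_a": """
        from demo_bad_b import B
        class A: ...
    """,
    "demo_bad_b": """
        from demo_bad_a import A
        class B: ...
    """,
}

with tempfile.TemporaryDirectory() as tmp:
    for name, source in SOURCES.items():
        Path(tmp, f"{name}.py").write_text(textwrap.dedent(source))
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        ok_module = importlib.import_module("demo_ok_a")  # completes normally
        try:
            importlib.import_module("demo_bad_a")
            error = None
        except ImportError as exc:
            # demo_bad_b asked for A before demo_bad_a had defined it.
            error = exc
    finally:
        sys.path.remove(tmp)

print(ok_module.A, sys.modules["demo_ok_b"].B)
print(type(error).__name__, error)
```

The first pair imports cleanly even though the modules refer to each other; the second pair fails with an ImportError because demo_bad_b needs the name A while demo_bad_a is still only partially initialized.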
Because of the imports, MyPy should now have enough information to resolve the string forward references and understand what type is being named. Meanwhile, since the forward reference is only a string, it doesn't cause a complaint from Python at runtime: when Python constructs the __annotations__ for the update and do_something functions, it just stores those strings rather than having to look up any other names.
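A quick way to check this at runtime, as a sketch that puts both classes in one file for brevity: the string annotation is stored as-is, and it is only turned into a class when something explicitly asks for that, here via typing.get_type_hints with the name supplied by hand.

```python
import typing

class Helper:
    def update(self, process: "Process"):  # "Process" is not defined yet
        ...

# The annotation is stored as a plain string; no name lookup has happened.
raw = Helper.update.__annotations__
print(raw)  # {'process': 'Process'}

class Process:
    ...

# Resolution happens only on request. Names are normally looked up in the
# function's module globals; here the name is passed in explicitly.
resolved = typing.get_type_hints(Helper.update, localns={"Process": Process})
print(resolved["process"] is Process)
```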
I eventually also found out that the typing module has a global variable TYPE_CHECKING that can be used to only import during type-checking.
Concretely, the following code for helpers.py seems to type-check fine as well.
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from process import Process

class Helper:
    def update(self, process: "Process"):
        ...
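As a sanity check that the guarded import really is skipped at runtime, here is a variation where the guarded line would fail if it ever ran (module_that_is_never_imported is a deliberately made-up name, not a real module):

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Never executed at runtime. The module name is invented for this
    # sketch and would raise ImportError if this line actually ran.
    from module_that_is_never_imported import Process

class Helper:
    def update(self, process: "Process"):
        ...

print(TYPE_CHECKING)
print(Helper.update.__annotations__)
```

TYPE_CHECKING is False at runtime, so the import cycle never forms; a type checker treats the constant as true and still sees the import.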
The other answer probably is the better way to go in the context of circular imports, since this functionality is mainly intended for costly imports (according to the docs).