
Comments on Open file in script's own folder

Parent

Open file in script's own folder

+9
−0

I have a Python script that needs to access some data (or configuration) file in its very own folder. For example, say script.py does something like this:

with open('data.txt') as file:
    data = file.read()

The script will find the file, data.txt, if it is run in the terminal via python script.py from the same folder the script itself is in. But I want to be able to run the script from any other folder, using a relative or absolute path: python path/to/script.py. In that case it fails to find the data file, raising FileNotFoundError.

How can I make sure the script finds the external file in its own folder?


Post
+3
−0

Theory

Relative paths are relative to the current working directory of the Python process. This can be checked from within Python using the os.getcwd function and set using the os.chdir function.
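As a rough illustration of the point above, the following sketch creates a throwaway folder, switches into it, and shows that a bare file name is looked up there. The temporary folder and the name data.txt are stand-ins chosen for the example, not part of the original question's setup.

```python
import os
import tempfile

original_cwd = os.getcwd()

# Create a throwaway folder containing data.txt (illustrative name).
with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "data.txt"), "w") as file:
        file.write("hello")

    # Make the throwaway folder the current working directory.
    os.chdir(folder)

    # A relative name is now resolved against that folder.
    resolved = os.path.abspath("data.txt")
    found = os.path.exists("data.txt")

    # Restore the original working directory before the folder is removed.
    os.chdir(original_cwd)

print(resolved, found)
```

After the final os.chdir call, the same relative name would be looked up in the original working directory instead.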

Whenever Python loads a module, a __file__ attribute may be set on the resulting module object, giving the path to the source code file that the module is based on. (Normally, this will happen for all user code; but the attribute could be missing for built-in modules like sys, or it could be the name of a file containing compiled code if the module was implemented in C.) Within the code for that module, the module's attributes are simply its global variables, so __file__ can be read there like any other global. (In the REPL, however, no __file__ global variable is available.)
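A small sketch of the difference, assuming CPython: json is used here only as an example of a module loaded from a file, and getattr with a default avoids an AttributeError where the attribute is missing.

```python
import json
import sys

# A module loaded from a file records where it came from.
json_source = getattr(json, "__file__", None)

# sys is compiled into the interpreter, so it has no __file__ (on CPython).
sys_source = getattr(sys, "__file__", None)

print(json_source)
print(sys_source)
```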

The technique

Therefore, the code in script.py can simply check the __file__ global variable to find the path to script.py; and the containing folder for script.py can thus be determined by parsing that path, in any number of ways. The simplest way is to take advantage of the pathlib standard library module (introduced in Python 3.4), as shown in Moshi's answer.

In Python 3.9 and above, the same technique also works if script.py was run directly as a script (that is, as the __main__ module), because __file__ is then an absolute path. However, in Python 3.4 through 3.8, the __file__ attribute in that case holds the path as it was given on the command line, which may be relative. (For example, if the command was python script.py, the __file__ value will be simply 'script.py', and the pathlib logic will produce just '.'.)

As a result, on those versions this technique will break if the current working directory has changed between the time the script was started and the time that the path taken from __file__ is used.
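The problem can be seen without an older interpreter by feeding pathlib the relative value directly. In this sketch the string 'script.py' stands in for what __file__ would contain in that situation.

```python
from pathlib import Path

# Stand-in for __file__ on Python 3.4-3.8 after running `python script.py`.
relative_file = "script.py"

folder = Path(relative_file).parent   # Path('.')
data_path = folder / "data.txt"       # Path('data.txt'), still relative

# A relative path is resolved against whatever the cwd is when it is used.
print(folder, data_path, data_path.is_absolute())
```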

To avoid this issue, make sure to determine __file__ immediately, compute an absolute path immediately, and store that path until needed:

from pathlib import Path

# when the script starts
here = Path(__file__).parent.absolute()

# later, possibly after an `os.chdir` call
with open(here / 'data.txt') as file:
    data = file.read()
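To see that the stored absolute path survives a later directory change, here is a self-contained sketch. The helper folder_of is hypothetical, and the temporary folder plays the role of the script's own folder; in a real script the argument would simply be __file__.

```python
import os
import tempfile
from pathlib import Path


def folder_of(script_file):
    """Return the absolute folder containing script_file (hypothetical helper)."""
    return Path(script_file).parent.absolute()


original_cwd = os.getcwd()

with tempfile.TemporaryDirectory() as folder:
    Path(folder, "data.txt").write_text("hello")

    # Simulate starting the script from inside its own folder,
    # where a relative __file__ would be just 'script.py'.
    os.chdir(folder)
    here = folder_of("script.py")

    # Simulate a later os.chdir call elsewhere in the program.
    os.chdir(original_cwd)

    # The stored path is absolute, so the file is still found.
    data = (here / "data.txt").read_text()

print(data)
```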

Legacy support (Python 3.3 and earlier)

In older versions of Python, __file__ is still documented to be present on module objects (and thus be available as a global variable), with the same purpose. However, it is not guaranteed to be an absolute path. (For an absolute import, it will depend upon the entry in sys.path that was used for the import.) Further, pathlib is not available.

Thus, the necessary code might look like:

import os

# when the script starts
here = os.path.split(os.path.abspath(__file__))[0]

# later, possibly after an `os.chdir` call
with open(os.path.join(here, 'data.txt')) as file:
    data = file.read()

Here, os.path.abspath creates an absolute path, and os.path.split splits it into two parts - the "path" (i.e., to the containing folder) and the filename itself.
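For reference, a short sketch of what os.path.split returns. The path below is made up for illustration; os.path.dirname is shown as an equivalent shortcut for taking the first element.

```python
import os

# Illustrative absolute path; any path to a file behaves the same way.
script_path = os.path.join(os.sep, "home", "user", "project", "script.py")

# Split into the containing folder and the file name.
folder, filename = os.path.split(script_path)

# os.path.dirname returns the first element of that pair directly.
same_folder = os.path.dirname(script_path)

print(folder, filename)
```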


Credit to kindall and wim on Stack Overflow for the corresponding research about the semantics of __file__. kindall uncovered the legacy semantics, which agf then described in that answer; later, wim edited to supply the updated semantics in Python 3.4+.


1 comment thread

Also mention package_data? (2 comments)
tripleee‭ wrote 11 months ago

Should this also cover the use of a package_data specification in the packaging files? It's slightly out of scope for the actual question, but arguably the "correct" solution for anything which gets properly packaged and distributed.

Karl Knechtel‭ wrote 11 months ago

That seems completely unrelated to me. Controlling where the file ends up when it's packaged (so that the existing code properly refers to its path) is a completely separate problem from actually specifying the path in the code. It should be addressed by a separate Q&A. I can imagine that question linking back here in order to explain the context of the packaging problem being solved; but I wouldn't make the link bidirectional.