Welcome to Software Development on Codidact!
What is the point of pipx?
Background
Many Python programs now recommend installing with pipx, and there is a sense that you shouldn't install with pip anymore, you should use tools like pipx.
Main Question
However, what does pipx actually do that makes it such a preferred alternative to pip?
Thoughts
I checked their docs and what I found doesn't really make sense. In sum:
- pipx is described as a package manager, but package managers install files in system locations whereas pipx installs them in user locations. Also, package managers already have python-... packages where it makes sense.
- It mentions that unlike pip, it is specifically for CLI apps. But what exactly does pip not do? AFAIK executable packages just have a wrapper script in ~/.local/bin/ that calls them. This doesn't seem worth a whole program.
- It talks about PyPI as an "app store", which sounds weird. Yes, people can and do distribute on PyPI, but there are major differences which pipx hardly closes.
I do see that it mentions isolating envs. I can see how it is not straightforward with pip to install each CLI app in a venv, but also make it available in PATH. So is that all pipx is, CLI apps in venv? This seems like a rather inefficient way to handle packaging (see also "static link everything").
3 answers
The following users marked this post as Works for me:
User | Comment | Date |
---|---|---|
matthewsnyder | (no comment) | Sep 5, 2023 at 15:20 |
They are tools for different audiences. pipx does not replace pip.
In some more detail, pip answers the question "As a Python developer, how can I install Python packages and their dependencies?" whereas pipx answers the question "As a user, how can I conveniently install a tool which is available via PyPI, without learning anything about Python, and without ending up in a situation where two packages I installed have trouble coexisting because they have conflicting or otherwise incompatible dependencies?"
Seen from this angle, it's pretty clear that the use cases are different.
pipx takes care to encapsulate each installation so that, behind the scenes, the installation effectively has its own virtual environment which gets activated when you run a pipx-installed command. Thus, any dependencies pipx installs are specific to, and separated from, any other Python packages installed elsewhere on your system somehow.
Even as a Python developer, I have some tools that I install because they are convenient to have available on my system, not particularly because they help me with Python or as a developer. The CLI for Amazon AWS is a good example.
Also, on shared servers, I can install tools like ruff without messing with the system or breaking anything for other users.
Any instructions which recommend installation with pipx, then, are meant for consumers of the utility package. The instructions basically imply, "if you know how to use pip, and prefer to use that for your use case (for example, to install this package as a dependency for a Python project of your own), by all means use pip instead if you like."
Obviously, pipx only makes sense for packages which are useful as a standalone CLI utility. For your Python development needs, pip remains the recommended installation tool, and the only one which makes sense for a library you want to use from your own Python code, directly or indirectly.
Well, to start, it is not an alternative to pip. It's built on top of pip and exclusively deals with applications. pip is more of a development tool, while pipx is aimed at end-users (who may also be developers).
The bulk of your question seems to be complaining that loose analogies aren't tight analogies. I don't know why you're taking these analogies so strongly. Here is the relevant text from that page: "It's roughly similar to macOS's brew, JavaScript's npx, and Linux's apt." and "In a way, it turns Python Package Index (PyPI) into a big app store for Python applications." (emphasis mine)
As a final note before turning to more substantive topics, an application does not need to be large and complicated to be valuable. In this case, the main value proposition is ergonomics/ease-of-use. A secondary benefit is that it provides an abstraction layer. You don't need to know how it works to use it, and that means how it works can change without changing how you use it.
So, why use this instead of pip?
Mainly, pip provides no isolation. This makes it a headache to deal with different libraries and applications that have dependencies on different versions of the same libraries. It also means installing or upgrading a package might unwittingly break something else. Common practice is to use tools like virtualenv, Conda, venv, and others to get isolation, but these are much more oriented to developers. Using these together to run an application with isolation would require several commands and manually keeping track of various "environments". This "environment" concept is great for developers who want to install several libraries within a single environment, but for applications there's (usually) no reason to have different applications "see" the same environment. This "environment" concept just complicates the flow that pipx is aiming for.
Okay, why use this instead of a system package manager?
Mainly, most package managers need someone else to package things for you. If they haven't, then oh well. Package managers also usually don't provide isolation and make it very difficult to have multiple versions of the same "package". Their ethos is usually to provide a "known good" selection of packages, but this makes them slow to change and often incomplete and out-of-date. A small but real annoyance is that system packages often have slightly different names than the corresponding Python packages, meaning you need an extra "look up the corresponding system package" step to install something you see on PyPI or GitHub.
What does pipx provide?
- Easy Python application installation with no need to worry about conflicting versions of things.
- Addition of such applications to the PATH allowing them to be used seamlessly.
- Easy uninstallation and upgrade.
- Use of the PyPI namespace.
- Ephemeral environments for one-off executions.
- For developers of Python applications, less need to deal with unknown mixtures of libraries.
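The "ephemeral environments" item can be pictured roughly as follows. This is a minimal sketch built only on the standard library's venv and tempfile modules, not pipx's actual implementation. The helper name run_in_throwaway_env is hypothetical, and the package installation step is deliberately left out so the sketch runs offline.

```python
import os
import subprocess
import tempfile
import venv
from pathlib import Path


def run_in_throwaway_env(code: str) -> str:
    """Create a temporary venv, run a Python snippet inside it, then delete it.

    Hypothetical helper for illustration only.
    """
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = Path(tmp) / "env"
        # with_pip=False keeps the sketch fast and offline; a real tool would
        # need pip here to install the requested application first.
        venv.EnvBuilder(with_pip=False).create(env_dir)
        bin_dir = env_dir / ("Scripts" if os.name == "nt" else "bin")
        python = bin_dir / ("python.exe" if os.name == "nt" else "python")
        result = subprocess.run(
            [str(python), "-c", code],
            check=True,
            capture_output=True,
            text=True,
        )
        # The temporary directory (and the venv in it) is removed on exit.
        return result.stdout.strip()
```

Running a snippet such as "import sys; print(sys.prefix != sys.base_prefix)" through this helper reports that the interpreter is inside a virtual environment, and nothing is left behind afterwards.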
How does pipx work?
I refer you to How pipx works, but, as a quick summary as of August 2023, it does the "obvious" thing. It installs each application in its own virtual environment via venv. It then adds symbolic links to ~/.local/bin. It is very slightly cleverer than that and will reuse a virtual environment for the packaging tools themselves.
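The two steps in that summary could be sketched like this. It is an illustration under stated assumptions rather than pipx's real code: the function names create_app_env and expose_command and the directory layout are hypothetical, a POSIX-style venv layout (bin/) is assumed, and the install step needs network access.

```python
import subprocess
import venv
from pathlib import Path


def create_app_env(package: str, venvs_root: Path) -> Path:
    """Create a dedicated venv for one package and pip-install it there.

    Hypothetical helper; assumes a POSIX venv layout and network access.
    """
    env_dir = venvs_root / package
    venv.EnvBuilder(with_pip=True).create(env_dir)
    python = env_dir / "bin" / "python"
    # The package and all of its dependencies land in this environment only.
    subprocess.run([str(python), "-m", "pip", "install", package], check=True)
    return env_dir


def expose_command(env_dir: Path, command: str, user_bin: Path) -> Path:
    """Symlink one executable from the venv into a directory that is on PATH."""
    user_bin.mkdir(parents=True, exist_ok=True)
    link = user_bin / command
    if link.is_symlink() or link.exists():
        link.unlink()  # replace a stale link from an earlier install
    link.symlink_to(env_dir / "bin" / command)
    return link
```

Under these assumptions, calling create_app_env("ruff", some_venvs_dir) followed by expose_command(env_dir, "ruff", Path.home() / ".local" / "bin") would make the command available without activating anything, because the installed script already points at the interpreter inside its own venv.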
This does mean you will have duplication of library code. This is a common problem with many language-specific package managers, not just Python's. It is a real problem as it is very easy to have many gigabytes of wasted space in duplicated dependencies. There's also a lot of time wasted installing duplicate dependencies. The best solution I've seen to this problem is to adopt Nix's approach. The key to this approach is to allow not just installing different versions of the same library side-by-side, but also the same library version with different dependencies side-by-side. This allows a single shared environment where dependencies are never duplicated, though you can definitely have multiple very similar libraries installed. This requires all dependencies to be easily identifiable and builds to be deterministic, which are properties that are hard to ensure in the Python ecosystem. The only language-specific package manager I'm aware of that uses this approach is Haskell's cabal-install.
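To make the "same library version with different dependencies side-by-side" idea concrete, here is a toy sketch of keying an install by a hash of the library plus the exact dependencies it was built against. This is not Nix's or cabal-install's real hashing scheme; the function store_key and the package name somelib are hypothetical, and somedep is only a placeholder name.

```python
import hashlib
from typing import Iterable


def store_key(name: str, version: str, dep_keys: Iterable[str]) -> str:
    """Derive an install key from a library and its exact dependency keys.

    Toy illustration: identical inputs always give the same key, and any
    change in a dependency changes the key of everything built on it.
    """
    digest = hashlib.sha256()
    digest.update(f"{name}=={version}".encode())
    for key in sorted(dep_keys):  # sorted so dependency order does not matter
        digest.update(b"\n" + key.encode())
    return f"{digest.hexdigest()[:12]}-{name}-{version}"


# Two builds of the same library version against different dependency versions.
old_dep = store_key("somedep", "1.2.3", [])
new_dep = store_key("somedep", "2.3.1", [])
lib_with_old = store_key("somelib", "1.0", [old_dep])
lib_with_new = store_key("somelib", "1.0", [new_dep])
```

Because the two somelib builds get different keys, they can sit next to each other in one shared store, while any two applications that need exactly the same build resolve to the same key and share a single copy.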
As I mentioned before, pipx could, in theory, transparently switch to such an approach if it ever became available.
Why not use pipx?
There are plenty of use-cases pipx is not aimed at where it would be unnecessary or irrelevant, e.g. system images in an immutable infrastructure setup. For the use-case it is targeted at, i.e. personal machines, the only good options are to restrict yourself to "standard" system packages, if possible, or to use pipx. The other options are to manually do what pipx does or forgo isolation and enter dependency hell.
Dependency conflicts are the problem pipx aims to solve, in the context of installing CLI programs.
When you install a Python package, by default pip will also install its dependencies so that you don't get ImportErrors when trying to use the package. These dependencies are explicitly configured by the package developer. When you install multiple packages that both have the same dependency, the version of that dependency may complicate things.
For example, say you pip install foo, which depends on somedep>=2, and pip decides to install somedep 2.3.1. Then you pip install bar, which requires somedep==1.2.3. To ensure that bar works, pip will uninstall somedep 2.3.1 and instead install somedep 1.2.3. Presumably, foo is incompatible with somedep 1.*, hence the constraint, so foo will now stop working. At a high level, the problem here is that foo and bar are actually mutually exclusive due to a dependency conflict.
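The conflict in this example can be checked mechanically. The sketch below is a deliberately tiny stand-in for real version resolution: it only understands the ">=" and "==" forms used above, compares dotted integers, and does not follow the full Python version specifier rules. The helper names and the extra available version 2.0.0 are made up for illustration.

```python
def parse(version: str) -> tuple:
    """Turn '2.3.1' into (2, 3, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version: str, spec: str) -> bool:
    """Check a version against a '>=X' or '==X' specifier (only these forms)."""
    if spec.startswith(">="):
        return parse(version) >= parse(spec[2:])
    if spec.startswith("=="):
        return parse(version) == parse(spec[2:])
    raise ValueError(f"unsupported specifier: {spec}")


def compatible_versions(available, specs):
    """Return the versions that satisfy every specifier at once."""
    return [v for v in available if all(satisfies(v, s) for s in specs)]


available = ["1.2.3", "2.0.0", "2.3.1"]  # hypothetical releases of somedep
foo_needs = ">=2"      # foo's requirement on somedep
bar_needs = "==1.2.3"  # bar's requirement on somedep

only_foo = compatible_versions(available, [foo_needs])
only_bar = compatible_versions(available, [bar_needs])
together = compatible_versions(available, [foo_needs, bar_needs])
```

Each requirement on its own has candidates, but together is empty: no single installed copy of somedep can serve both packages in one environment, which is exactly the situation separate environments avoid.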
The classic Python solution to dependency conflicts is to create separate virtual environments for foo and for bar. But if the Python package happens to be a CLI tool, your shell will not see the command until you activate the virtual environment.
pipx saves you from this extra step by automatically putting each package in a venv and providing wrapper scripts that run it.
Notably, the python-... packages many distros provide are also vulnerable to the dependency conflict problem. However, they also have more wiggle room for workarounds. For example, they can ignore the dependency versions specified in the package, and provide their own dependencies which have better compatibility. In Python, dependencies are not detected automatically by usage, but specified arbitrarily by the developer. Often, Python developers mistakenly make requirements too general or too specific, which makes it harder to find compatible versions. By ignoring the original developer's (incorrect) version specs, it becomes easier to find a workable set of dependencies. Of course, distro maintainers can also incorporate venvs into python-... packages to get around the problems. All of this and more is feasible when you are creating a distro package, but can be tedious when manually installing packages, hence pipx exists for the latter case.