On a regular basis, I find myself the odd-man-out when it comes to talking about how to work with Python on Debian systems. I'm going to write this and post it so that I might be able to point people at my thoughts without having to write the same email in response to each thread that pops up.
Turns out I don't fit in with the Debian hardliners (which is to say, the
pip sucks and shouldn't exist), nor do I fit in with the Python
hardliners (which is to say
dpkg are out of date, and neither have
a place on a Development machine).
I think our discourse on this topic has become petty and stupid in general. Let's all try to step back and drop a bit of the attitude.
pip doesn't suck, and neither does
The truth is, both sides are wrong. As with any subject, the real answer here is much more nuanced than either side presents it. I'm going to try and present my opinion on this, in the way that both my Pythonista self and my Debianite self see the issue. Hopefully I can keep this short, to the point, and caked with logic.
The case for
dpkg (the Debianite in me)
In defense of
apt, imagine having to install
on all your systems when you install. It'd be hell on earth.
Imagine having a user try to do this. It's insane to assume that
end-users will be using
pip for this purpose.
pip is fun and all, but it's also installing 100% untrusted code to your
system (perhaps as root, if you're using
sudo for some reason),
and hasn't been reviewed for software freeness, which is something Debian
(and Debian users) take seriously. This isn't even to mention the hell that
pip wreaks on
dpkg controlled files / packages.
Try to remember how much of your system running (yes, right now) is running
because of Python or Python modules. Try to imagine how much of a pain in
the ass it'd be if you couldn't boot into
GNOME to use
nm-applet to connect
to wifi to
pip install something. I'm sure even the most extreme pip'er
understands the need for Operating System level package management.
Debian also has a bigger problem scope - we're not maintaining a library
in Debian for kicks, we're maintaining it so that end user applications may
use the library. When we update something like
Django, we have to make sure
that we don't break anything using it (although, to be honest, the fact that we
package webapps is an entire rant for later) before we get to update it to the
Hell, with a few coffees, I could automate the process of releasing a
with a new upstream release, 100% unattended. I won't, however, since this is
an insane idea. Let's go over a brief list of things I do before uploading a
- Review the entire codebase for simple mistakes.
- Review the entire codebase for license issues.
- Review the entire codebase for files without source, and track down
(and include source for) any sourceless files (such as
- Get to know the upstream, get to know open bugs, write something using the lib, in case I need to debug later.
- Install the package.
- Test the package.
- Work out any Debian package issues (this is easy).
Now, a brief list of things I do before I update a package:
- Review the changes between the last uploaded version (in diff format, if it's sane, otherwise get the VCS and review), ensure all the above are still OK.
- Review for Debian-local issues (such as how it will upgrade, using
- Check to make sure it won't break any reverse dependencies.
- Review for bugfixes that I might need to bring back to the
- Figure out if I should (or even can) backport the package, if API is stable.
- Review for bugs (upstream or in Debian) that I need to mention in the debian/changelog.
Clearly, this isn't a quick-and-dirty task. It's not a matter of getting a
package updated (technically), it's a much more detailed process than that.
This is also why Debian is so highly regarded for its technical virtuosity,
and why the
ISS decided to deploy Debian in space,
despite other commercial distros such as
Red Hat, or
community distros, such as
It's also not Debian's job to package the world in the archive. This is an insane task, and it's not Debian's place to do it. We introduce libraries as things need them, not because we wrote some new library that someone may find slightly useful at some point in the future. maybe.
Upstream developers and language communities (not only Python here) tend to lose sight of why we're doing this in the first place, which is our users. This isn't some sort of technical pissing contest to see who can distribute the software in the best way. Debian-folk always keep end users as our highest priority.
I quote the
Debian Social Contract, when I say
that Our priorities are our users and free software. No one's trying to
get developers to use
dpkg to create software. In fact, as you'll see
below, I actively discourage using system modules for development.
The case for
pip (the Pythonista in me)
In defense of
pip, the idea that Debian will keep the latest versions of
packages is insane. The idea that we can keep pace with upstream releases is
nuts, and the idea that every upstream release on
pypi is ready to ship is
As a developer, I don't want to support every release, and I surely don't want
other people depending on some random snapshot.
Often times, I'll put stuff up on
pypi as a preview, or to release often, and
solicit feedback without having to give out instructions on using a
checkout (it's also easier to have them try a version from
pypi so I can
cross-ref the git tag to reproduce issues when they file them)
pypi is easy, ubiquitous and works regardless of the platform, which means
less of my development time is spent packaging stuff up for platforms I don't
really care about (
Windows), even though I value
feedback from users on those systems. The effort it takes to release something
is limited to
python setup.py sdist upload, and it's in a place (and in a
shape) that anyone can use it without having 10 sets of platform-local
Even ignoring all the above, when I'm writing a new app or bit of code,
I want to be sure I'm targeting the latest version of the code I depend on,
so that future changes to API won't hit me as hard. By following
along with my dependencies' development, I can be sure that my code breaks
early, and breaks in development, not production. Upstreams also tend to not
like bug reports against old branches, so ensuring I have the latest code from
pypi means I can properly file bugs.
Lastly, I prefer
virtualenv based setups for development, since I'm usually
working on many things at once. This often means version mismatches in
libraries, which brings in API changes (another whole rant here as well).
I don't want to keep installing and uninstalling packages to switch between
the two projects, and using a
chroot(8) means a lot of overhead and that it's
disconnected from my development environment / filesystem, so I resort to
virtualenv to isolate my Development environment.
I don't want to keep arguing about this. Just accept that the world's a big
place and that there exist use-cases that both
pip need to exist
and work in the way they're working now. At the very least, try and understand
there exist smart people on both sides, and no one is trying to screw anyone
over or keep their own little private club to themselves. Hopefully, going
forward, we can make sure that the integration between these two tools gets
better, not worse.
Help make this dream a reality. Contribute to a productive tone, not a destructive one. In short:
sudoalways. Don't tell people to use
dpkgwhen deploying system-wide.
- Understand people are going to package, and they will be more concerned about software using your library then keeping your library up to date.
- Understand Debian Developers and package maintainers have to do a lot of work when updating or sponsoring an upload.
- Understand upstream developers can't be bothered to fix every issue with every release (release early, release often) with some snapshot you introduced into unstable.
virtualenvin development setups, so we can upgrade your app when we upgrade the lib.