Monday, July 11
extracting all (?) of the filenames from packages available in debian
Note: See 2020/2/25 for an updated version that reflects Debian 10.0 (Buster).
I’m thinking about renaming a command-line utility, and I want to pick something that isn’t already taken. I decided that getting a list of the command names available in Debian packages, maybe with the addition of some lists like Wikipedia’s List of Unix commands, would be a decent start at this.
I thought that maybe apt-cache could give me what I wanted, but a quick look at the man page wasn’t that helpful. Google brought me to this writeup by Kevin van Zonneveld on apt-file, which lets you search for packages by filenames they contain. Something like so:
$ sudo apt-get install apt-file
$ apt-file update
$ apt-file search /some/path
This will show you packages which provide files containing /some/path. It’s supposed to take patterns (Perl regexps? The usual grep flavor? POSIX?), but I’m not quite sure whether I ever got it to just print all filenames.
Eventually I figured out that it keeps a cache of filenames in ~/.cache/apt-file/:
$ cd ~/.cache/apt-file && ls
ftp.us.debian.org_debian_dists_jessie_contrib_Contents-amd64.gz
ftp.us.debian.org_debian_dists_jessie_main_Contents-amd64.gz
ftp.us.debian.org_debian_dists_jessie_non-free_Contents-amd64.gz
ftp.us.debian.org_debian_dists_jessie-updates_main_Contents-amd64.gz
These have some commentary at the top, followed by data in the following format:
FILE LOCATION
bin/ash shells/ash
bin/bash shells/bash
bin/bash-static shells/bash-static
bin/bsd-csh shells/csh
...
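To make that layout concrete, here is a minimal sketch that splits such lines into a path and a package location using awk (a different tool from the cut used later in this post). It assumes the last whitespace-separated field is the location and everything before it is the path, so paths containing spaces stay in one piece. The split_contents name and the sample input are purely illustrative, and the sketch does not try to skip the commentary at the top of a real file.

```shell
# Hypothetical helper: read Contents-style lines on stdin and
# print "path<TAB>location" for each line with at least two fields.
# Assumption: the last whitespace-separated field is the package location.
split_contents() {
    awk 'NF >= 2 {
        loc = $NF
        path = $0
        # Drop the trailing location field and the whitespace before it.
        sub(/[ \t]+[^ \t]+[ \t]*$/, "", path)
        printf "%s\t%s\n", path, loc
    }'
}

# Example with two of the sample lines shown above.
printf '%s\n' 'bin/ash shells/ash' 'bin/bash shells/bash' | split_contents
```

On a real cache file the input would come from zcat rather than printf.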
Here is a dumb command to get the names of commands from these files:
zcat ~/.cache/apt-file/*.gz | \
egrep '^(usr/bin/|sbin/|bin/)' | \
cut -f1 -d' ' | \
perl -pe 's/^(.*)\/(.*)$/$2/' | \
sort | uniq > used_names.txt
Here is the resulting list: used_names.txt.
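Since the point of all this was to find a name that isn’t already taken, a quick lookup against the list might look like the sketch below. The name_is_taken function is a hypothetical helper, not part of any package, and the candidate names are made up; it assumes the list has one name per line, as the command above produces.

```shell
# Hypothetical helper: succeed if NAME appears as a whole line in LIST.
# Usage: name_is_taken NAME [LIST]   (LIST defaults to used_names.txt)
name_is_taken() {
    # -F: fixed string, -x: match the whole line, -q: exit status only.
    # -e keeps names that start with a dash from being read as options.
    grep -Fxq -e "$1" "${2:-used_names.txt}"
}

# Example: check a couple of candidate names, if the list exists here.
if [ -f used_names.txt ]; then
    for name in ash frobnicate; do
        if name_is_taken "$name"; then
            echo "$name: taken"
        else
            echo "$name: free"
        fi
    done
fi
```

Matching whole lines matters here: a plain substring grep would report a candidate like "bas" as taken just because "bash" is in the list.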