Monday, July 11
extracting all (?) of the filenames from packages available in debian
Note: See 2020/2/25 for an updated version that reflects Debian 10.0 (Buster).
I’m thinking about renaming a command-line utility, and I want to pick something that isn’t already taken. I decided that getting a list of the command names available in Debian packages, maybe with the addition of some lists like Wikipedia’s List of Unix commands, would be a decent start at this.
I thought that maybe apt-cache could give me what I wanted, but a quick look at the man page wasn’t that helpful. Google brought me to this writeup by Kevin van Zonneveld on apt-file, which lets you search for packages by filenames they contain. Something like so:
$ sudo apt-get install apt-file
$ apt-file update
$ apt-file search /some/path
This will show you packages which provide files whose paths contain /some/path. apt-file is supposed to take patterns (Perl regexps? The usual grep flavor? POSIX?), but I’m not quite sure whether I ever got it to just print all filenames.
Eventually I figured out that it keeps a cache of filenames in ~/.cache/apt-file:
$ cd ~/.cache/apt-file && ls
ftp.us.debian.org_debian_dists_jessie_contrib_Contents-amd64.gz
ftp.us.debian.org_debian_dists_jessie_main_Contents-amd64.gz
ftp.us.debian.org_debian_dists_jessie_non-free_Contents-amd64.gz
ftp.us.debian.org_debian_dists_jessie-updates_main_Contents-amd64.gz
These have some commentary at the top, followed by data in the following format:
FILE                LOCATION
bin/ash             shells/ash
bin/bash            shells/bash
bin/bash-static     shells/bash-static
bin/bsd-csh         shells/csh
...
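As a rough sketch of how this two-column layout could be parsed on its own, the helper below prints just the path column. The function name contents_paths is made up for illustration, and it assumes the commentary ends with a "FILE LOCATION" header row as described above; on input without that row it would print nothing. It drops the last whitespace-separated field rather than keeping the first, so a path that happens to contain a space should survive intact.

```shell
# Hypothetical helper: print only the path column of Contents-style input.
# Assumes a "FILE ... LOCATION" header row separates commentary from data.
contents_paths() {
    awk '
        # Skip everything up to and including the header row.
        !seen_header { if ($1 == "FILE" && $2 == "LOCATION") seen_header = 1; next }
        # Remove the last whitespace-separated field (the package location),
        # leaving the path untouched even if it contains spaces.
        NF >= 2 { sub(/[ \t]+[^ \t]+$/, ""); print }
    '
}

# Small stand-in input using the sample rows shown above.
printf '%s\n' \
    'placeholder commentary line' \
    'FILE                LOCATION' \
    'bin/ash             shells/ash' \
    'bin/bash            shells/bash' \
    | contents_paths
```

Feeding a real file through it would look something like zcat on one of the cached .gz files piped into contents_paths.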
Here is a dumb command to get the names of commands from these files:
zcat ~/.cache/apt-file/*.gz | \
  egrep '^(usr/bin/|sbin/|bin/)' | \
  cut -f1 -d' ' | \
  perl -pe 's/^(.*)\/(.*)$/$2/' | \
  sort | uniq > used_names.txt
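To see what the filter stages of that pipeline keep and drop, here is a small sketch that runs an equivalent filter over a few hand-written lines instead of the real cache. The function name command_names is hypothetical, the input rows are illustrative stand-ins in the same two-column format, and grep -E, sed and sort -u are swapped in for egrep, perl and sort | uniq, which should behave the same here.

```shell
# Hypothetical helper: the filter stages of the pipeline above,
# reading Contents-style lines on stdin.
command_names() {
    grep -E '^(usr/bin/|sbin/|bin/)' |   # keep paths under these directories
        cut -f1 -d' ' |                  # first column: the file path
        sed 's|.*/||' |                  # strip everything up to the last slash
        sort -u                          # sorted, de-duplicated names
}

# Illustrative input only, not taken from the real Contents files.
printf '%s\n' \
    'bin/bash            shells/bash' \
    'bin/ash             shells/ash' \
    'usr/bin/perl        perl/perl-base' \
    'usr/sbin/cron       admin/cron' \
    'usr/share/doc/bash/README shells/bash' \
    | command_names
```

One thing this makes visible: the pattern does not match usr/sbin/, so a line like the cron one is dropped. If names under usr/sbin/ should count as taken too, adding usr/sbin/ to the alternation would be one way to cover them.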
Here is the resulting list: used_names.txt.
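With a list like that in hand, checking a candidate name comes down to an exact whole-line match. A minimal sketch, assuming one name per line; the helper name name_is_taken and the candidate "frobnicate" are made up for illustration, and the demo uses a tiny temporary list rather than the real used_names.txt.

```shell
# Hypothetical helper: succeed if NAME appears as a whole line in LIST.
name_is_taken() {
    # -F: fixed string, -x: match the entire line, -q: exit status only
    grep -Fxq -- "$1" "$2"
}

# Demo against a small stand-in list.
list=$(mktemp)
printf '%s\n' ash bash bash-static bsd-csh > "$list"

for candidate in bash frobnicate; do
    if name_is_taken "$candidate" "$list"; then
        echo "$candidate: taken"
    else
        echo "$candidate: free"
    fi
done

rm -f "$list"
```

The -x flag matters here: without it, a candidate like "bas" would count as taken just because it is a substring of "bash".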