Sunday, July 12, 2009

Retrieving a List of User/Manually Installed Packages in Debian/Ubuntu

If you are like me and have been running a Linux system for a while, you sooner or later reach the point where you would really like to know which packages you installed yourself. In the case of Debian (and friends like Ubuntu), the installer pulls in a lot of packages, and there is no special indication (AFAIK) of which packages were installed by the installer and which were installed manually by the user. You might want the list in order to get up to speed quickly on a new installation or, even more likely, to clean up among all those packages that you installed because you thought you needed them, or needed only once or twice.

I have seen quite a lot of suggestions that you should work with the entire list of installed packages, but that is extremely cumbersome if your intention is to clean up, and if you are dealing with a new installation, you risk pulling in obsolete or conflicting packages. What I wanted was a list with just the packages that I had chosen to install myself.

My solution is a bit brute force, but it works very well. Using a virtual machine manager (I use VirtualBox since it is free), I set up a virtual machine with an installation similar to the one for which I wanted the list of installed packages. The difference is that on the virtual machine I never install any packages myself; I only keep the base installation up to date. On the virtual machine, I created a list of installed packages with the following command (borrowed from here with slight modification):

aptitude search '~i' | grep -v "^i A" | cut -d " " -f 4 > clean.txt
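The grep and cut stages of this command can be tried out on canned input, without aptitude installed. What follows is only a sketch: the helper name manual_names and the sample lines are made up for illustration, and real aptitude output may pad its columns differently.

```shell
#!/bin/sh
# Hypothetical helper (not from the original command): reads lines in the
# style of `aptitude search '~i'` on stdin and prints the names of the
# packages that are not flagged "A" (automatically installed).
manual_names() {
    grep -v '^i A' | cut -d ' ' -f 4
}

# Made-up sample lines imitating aptitude's output format.
printf '%s\n' \
    'i   bash       - GNU Bourne Again SHell' \
    'i A libc6      - GNU C Library' \
    'i   vim        - Vi IMproved' |
    manual_names
```

With this sample input, the sketch prints bash and vim, and the libc6 line is dropped because of its "A" flag.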

aptitude search ~i lists the installed packages, grep -v "^i A" removes the lines starting with "i A" (automatically pulled-in dependencies), and cut -d " " -f 4 extracts the package names (the fourth space-delimited field of each line). The result is written to the file clean.txt. Run the same command on the machine for which you want the list of installed packages, replacing clean.txt with another file name, say modified.txt. With the two files in the same directory (on whichever machine), run the following command to get the list of packages that are in modified.txt but not in clean.txt. join expects both files to be sorted, so pass them through sort first if it complains about the order (here is a reference to set operations available from the command line, and other useful commands):

join -v1 modified.txt clean.txt
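Since join relies on sorted input, a slightly more defensive variant sorts both lists in the same locale before comparing them. This is a minimal sketch; the function name only_in_first and the sample package lists are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the lines that occur in file $1 but not in
# file $2. Both files are sorted in the C locale first, because join
# expects its inputs to be ordered the same way it collates them.
only_in_first() {
    oif_tmp=$(mktemp -d) || return 1
    LC_ALL=C sort -u "$1" > "$oif_tmp/a"
    LC_ALL=C sort -u "$2" > "$oif_tmp/b"
    # -v 1 prints the unpairable lines of the first file only.
    LC_ALL=C join -v 1 "$oif_tmp/a" "$oif_tmp/b"
    rm -rf "$oif_tmp"
}

# Demo with made-up package lists.
demo_dir=$(mktemp -d)
printf '%s\n' vim bash git > "$demo_dir/modified.txt"
printf '%s\n' bash > "$demo_dir/clean.txt"
only_in_first "$demo_dir/modified.txt" "$demo_dir/clean.txt"
rm -rf "$demo_dir"
```

For the demo lists, this prints git and vim, the two names that appear only in modified.txt.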

If anyone has a simpler solution to the problem, please let me know, but once set up, this works like a charm.

2 comments:

  1. Nah!! Super linux makes you take a notebook and a pencil and write down the name of each package you manually, explicitly, install so that the next install you can install them again. Hahah too much linux for little things.

  2. Anyway, you can take out the grep and cut programs, like this:

    $ aptitude search '~i !~M'

    also:

    $ aptitude search '~i !~M !~pstandard !~pimportant !~prequired' > miniclean
