Linux Annoyances du jour…

It’s been noted that I tend to fly off the handle a bit at The World’s Largest Software Company, deservedly or not. But lest you, dear reader, feel I am one-sided in my condemnations, let us touch on some poor decisions made in the Linux camp as well.

Long ago in a college not so far away, it was decided that most of the proprietary Unix tools could be rewritten under the GPL, and made free. The tools chosen for this transition were the very basics of Unix shell operation: the command-line programs that we Unix geeks use day to day. We’re talking about the GNU Coreutils.

Now, in general, I consider the rewrite to be a good idea. The tools are aptly named – they are the core of the Unix environment. We’re talking ls and who here. Very basic stuff. But naturally, when a couple of programmers decide to rewrite something, they can’t help but ‘improve’ on it a little. This means adding some new features, throwing in some little tidbits to make the tool a little more interesting.

None of these ‘features’ was ever really vetted or examined as to whether it made sense or not. They were just tossed in willy-nilly, and now are in every Linux and Unix distribution on the planet.

I take as a prime example the addition of the ‘-h’ option to ‘ls’. In its basic concept, it sounds like a great idea. Let’s add a ‘human readable’ format to display the size of a file. Instead of counting digits to find out what order of magnitude a file is, just ‘ls -lh’ it, and you get a readable form:

dbs@boomer:~$ ls -l mbox
-rw-------  1 dbs dbs 8629133 2006-02-23 18:33 mbox
dbs@boomer:~$ ls -lh mbox
-rw-------  1 dbs dbs 8.3M 2006-02-23 18:33 mbox
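
To see where that 8.3M comes from, here is a rough Python sketch of the rule the output appears to follow: divide by 1024 until the number fits, round up, and show one decimal place only when the result is under 10. The function name ‘ls_style_size’ is hypothetical, and the rule is inferred from the listings in this post, not copied from the coreutils source.

```python
def ls_style_size(num_bytes: int) -> str:
    """Approximate the 'ls -h' size column (inferred rule, not GNU's code)."""
    if num_bytes < 1024:
        # Small files are printed as a bare byte count, with no suffix.
        return str(num_bytes)
    unit = 1
    for suffix in "KMGTPE":
        unit *= 1024
        # Size in tenths of this unit, rounded up (ceiling division).
        tenths = -(-num_bytes * 10 // unit)
        if tenths < 100:
            # Under 10 units: keep one decimal place, e.g. 8.3M or 4.0K.
            return f"{tenths // 10}.{tenths % 10}{suffix}"
        # 10 units or more: whole number, rounded up, e.g. 22K or 162K.
        whole = -(-num_bytes // unit)
        if whole < 1024 or suffix == "E":
            return f"{whole}{suffix}"


if __name__ == "__main__":
    for size in (58, 509, 4096, 8629133):
        print(f"{size:>10} -> {ls_style_size(size)}")
```

Under this rule the 8629133-byte mbox comes out as 8.3M: 8629133 / 1048576 is roughly 8.23, which rounds up to 8.3 at one decimal place.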

Simple, eh? Well, sure, except when you realize ‘ls’ is rarely used on just a single file. It’s used to compare and list out large directories, sorting things by size, or getting an overview of what you’re looking at. The bright lights who wrote the ‘-h’ option into ‘ls’ apparently had never considered anything approaching a human interface guideline, so we end up with some serious readability problems. Remember, this is meant to be HUMAN READABLE. I give, for reference, an example directory listing taken from my home dir:

-rw-r--r--   1 dbs dbs     58 2005-11-23 23:13 cipher.txt
-rw-r--r--   1 dbs dbs    40K 2005-07-05 22:52 claimit-backup.tgz
-rw-r--r--   1 dbs dbs   3.8K 2005-06-28 14:55 claimit.dump
-rw-r--r--   1 dbs dbs   6.6K 2005-07-05 23:48 claimit.tgz
-rw-r--r--   1 dbs dbs   162K 2005-09-09 12:07 commons-collections.jar
-rw-r--r--   1 dbs dbs   3.6M 2006-01-10 10:25 congo-20060109.tgz
-rw-r--r--   1 dbs dbs   4.7M 2005-07-25 22:41 cvsdir.tgz
-rwxr-xr-x   1 dbs dbs   3.1K 2005-11-20 10:55 dbs@boomer.homeport.org
-rw-------   1 dbs dbs   1.8K 2006-01-17 12:11 dead.letter
-rw-r--r--   1 dbs dbs    12K 2005-09-02 12:37 decisions.dump
drwxr-xr-x  20 dbs dbs   4.0K 2004-10-18 12:09 docs
drwxr-xr-x   5 dbs dbs   4.0K 2006-01-13 23:19 dumps
-rw-------   1 dbs dbs    509 2005-07-21 11:56 INBOX.Drafts
-rw-------   1 dbs dbs    22K 2006-02-09 12:52 INBOX.Sent
-rw-------   1 dbs dbs   3.0M 2006-02-22 19:57 INBOX.Trash
  • The formatting is inconsistent. A file too small to earn a magnitude suffix (a ‘K’, an ‘M’, or a ‘G’) gets no suffix at all; presumably a bare number means ‘Bytes’. It wouldn’t have been hard to put ‘B’ at the end for consistency, but that didn’t cross their minds. Additionally, some entries include a decimal point (3.0M), and others do not (22K). WHY?
  • The characters chosen are in capitals. This does not differentiate them from the digits in any meaningful way, so it’s very easy to mistake a letter for a digit, and you have to look very carefully to get real information out of the listing.
  • Because of the mixed formatting, it’s almost impossible to, at a glance, determine real file sizes. Looking at that listing, are there any files that look unusually large or small? I can’t tell. Geeks will point out that “well, if you’re looking for large files, you should have sorted by size” – well certainly, if I had a specific question, yes, but what’s the point of making a ‘human readable’ format that a human can’t read?
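
For comparison, here is a minimal sketch of what a more uniform format might look like, following the complaints above: every size gets a unit (including ‘B’ for bytes), scaled values always carry one decimal place, and the column is right-aligned so the unit letters line up. The name ‘uniform_size’ and the column width are assumptions for illustration; this is not an existing ‘ls’ option.

```python
def uniform_size(num_bytes: int) -> str:
    """Format a size with an explicit unit and a fixed-width number.

    Hypothetical alternative to the 'ls -h' column, not an existing tool.
    """
    units = ["B", "K", "M", "G", "T"]
    if num_bytes < 1024:
        # Bytes are exact, so print them as an integer, but still with a unit.
        return f"{num_bytes:>6d}B"
    value = float(num_bytes)
    idx = 0
    while value >= 1024 and idx < len(units) - 1:
        value /= 1024
        idx += 1
    # Always one decimal place (rounded to nearest), so 40K reads as 40.0K.
    return f"{value:>6.1f}{units[idx]}"


if __name__ == "__main__":
    for size in (58, 509, 3891, 40960, 3145728, 8629133):
        print(uniform_size(size))
```

Because every entry has the same width and the same number of decimals, a column of these values can be scanned by eye: the unit letter always sits in the same position, and a bare byte count is never mistaken for a scaled one.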

The ‘-h’ format is now well established in the Unix world, and probably will never go away. I’m assuming some shell hack wrote it 15 years ago on a lunch break, and it will never be removed or updated. ‘ls’ is such a core utility that changing its output in any way will raise a huge outcry from script-writers everywhere, because scripts that had been running perfectly for years will suddenly break.

Sure this is a small thing. I’m picking nits. But when I see BAD DESIGN decisions, I feel it’s my moral duty to stand up and foam at the mouth about it. Thank you, and good night.

    4 thoughts on “Linux Annoyances du jour…”

    1. What drives me crazy is the fact that ‘du -sh * | sort -n’ is useless.
      However, your rant moved me to go read the man page and figure out ‘ls -lSrh’…

    2. Lots of good points, esp. with respect to some of the inconsistency issues, but here’s the big counterargument for me:
      1. The -h flag makes individual entries much, much easier to read. It is a lot easier to know that something is 450MB, or 4.5G, than to count digits and figure out that there are 9 or 10 digits there. I agree that it’s less useful on long lists.
      2. It makes a great deal of sense to give the -h flag the same meaning on all tools that report disk consumption: df, du, ls, and so on.
      I hardly ever use ls -h, and when I do it is usually on only a single file. It’s *invaluable* with du and df, however. If I want to get a quick sense of the relative sizes of lots of files, I’ll leave off the -h flag and just use ls -l.

    3. “Additionally, some entries include a decimal point (3.0M), and others do not (22K). WHY?”
      Perhaps the writer chose to use a minimum of two digits of precision. To me, that is what “3.0” indicates.

    4. JB sez:
      “What drives me crazy is the fact that ‘du -sh * | sort -n’ is useless.”
      Agreed, but that’s a sorting / processing question, and is easily done with just using du -ks * | sort -nr
