It’s been noted that I tend to fly off the handle a bit at The World’s Largest Software Company, deservedly or not. But lest you, dear reader, feel I am one-sided in my condemnations, let us touch on some poor decisions made in the Linux camp as well.
Long ago in a college not so far away, it was decided that most of the proprietary Unix tools could be rewritten under the GPL, and made free. The tools chosen for this transition were the very basics of Unix shell operation: the command line programs that we Unix geeks use day in and day out. We’re talking about the GNU Coreutils.
Now, in general, I consider the rewrite to be a good idea. The tools are aptly named: they are the core of the Unix environment. We’re talking ls and who here. Very basic stuff. But naturally, when a couple of programmers decide to rewrite something, they can’t help but ‘improve’ on it a little. This means adding some new features, throwing in some little tidbits to make the tool a little more interesting.
As far as I can tell, none of these ‘features’ was ever really vetted or examined as to whether it made sense or not. They were just tossed in willy-nilly, and now they are in just about every Linux and Unix distribution on the planet.
I take as a prime example the addition of the ‘-h’ option to ‘ls’. In its basic concept, it sounds like a great idea. Let’s add a ‘human readable’ format to display the size of a file. Instead of counting digits to find out what order of magnitude a file is, just ‘ls -lh’ it, and you get a readable form:
dbs@boomer:~$ ls -l mbox
-rw------- 1 dbs dbs 8629133 2006-02-23 18:33 mbox
dbs@boomer:~$ ls -lh mbox
-rw------- 1 dbs dbs 8.3M 2006-02-23 18:33 mbox
Simple, eh? Well, sure, except when you realize ‘ls’ is rarely used on just a single file. It’s used to compare and list out large directories, sorting things by size, or getting an overview of what you’re looking at. The bright lights who wrote the ‘-h’ option into ‘ls’ apparently had never considered anything approaching a human interface guideline, so we end up with some serious readability problems. Remember, this is meant to be HUMAN READABLE. For reference, here is an example directory listing, taken from my home dir:
-rw-r--r--  1 dbs dbs   58 2005-11-23 23:13 cipher.txt
-rw-r--r--  1 dbs dbs  40K 2005-07-05 22:52 claimit-backup.tgz
-rw-r--r--  1 dbs dbs 3.8K 2005-06-28 14:55 claimit.dump
-rw-r--r--  1 dbs dbs 6.6K 2005-07-05 23:48 claimit.tgz
-rw-r--r--  1 dbs dbs 162K 2005-09-09 12:07 commons-collections.jar
-rw-r--r--  1 dbs dbs 3.6M 2006-01-10 10:25 congo-20060109.tgz
-rw-r--r--  1 dbs dbs 4.7M 2005-07-25 22:41 cvsdir.tgz
-rwxr-xr-x  1 dbs dbs 3.1K 2005-11-20 10:55 dbs@boomer.homeport.org
-rw-------  1 dbs dbs 1.8K 2006-01-17 12:11 dead.letter
-rw-r--r--  1 dbs dbs  12K 2005-09-02 12:37 decisions.dump
drwxr-xr-x 20 dbs dbs 4.0K 2004-10-18 12:09 docs
drwxr-xr-x  5 dbs dbs 4.0K 2006-01-13 23:19 dumps
-rw-------  1 dbs dbs  509 2005-07-21 11:56 INBOX.Drafts
-rw-------  1 dbs dbs  22K 2006-02-09 12:52 INBOX.Sent
-rw-------  1 dbs dbs 3.0M 2006-02-22 19:57 INBOX.Trash
The formatting is inconsistent. A file too small to earn a magnitude suffix (a ‘K’, an ‘M’, or a ‘G’) gets no suffix at all, and you are left to presume the bare number means bytes. It wouldn’t have been hard to put a ‘B’ at the end for consistency, but that didn’t cross their minds. Additionally, some entries include a decimal point (3.0M), and others do not (22K). WHY?
The suffix letters are capitals. A capital letter is the same height as the digits next to it, so nothing sets it apart from them visually. It’s very easy to mistake a letter for a digit, and you have to look very carefully to get real information out of the listing.
Because of the mixed formatting, it’s almost impossible to determine real file sizes at a glance. Looking at that listing, are there any files that look unusually large or small? I can’t tell. Geeks will point out that “well, if you’re looking for large files, you should have sorted by size.” Well, certainly, if I had a specific question, yes. But what’s the point of making a ‘human readable’ format that a human can’t read?
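To show how little it would take, here is a rough sketch of the kind of formatting I’m asking for: every size gets a suffix (including ‘B’ for plain bytes), every size gets exactly one decimal place, and everything is padded to the same width so the column lines up. This is purely my own illustration in Python. The function name consistent_size and the unit list are made up for the example and have nothing to do with how ‘ls’ is actually implemented. It also rounds to the nearest tenth, which evidently isn’t what ‘ls -h’ does, since the mbox file earlier shows up there as 8.3M.

```python
# Hypothetical helper for illustration only; not taken from coreutils or any real ls.
UNITS = ["B", "K", "M", "G", "T"]


def consistent_size(num_bytes: int) -> str:
    """Format a byte count with one decimal place and a unit suffix, every time."""
    value = float(num_bytes)
    unit = 0
    # Move up a unit whenever the rounded value would reach 1024
    # (powers of 1024, matching the K/M/G steps seen in the listing).
    while unit < len(UNITS) - 1 and round(value, 1) >= 1024:
        value /= 1024
        unit += 1
    # Fixed width, so decimal points and suffixes stack up in a neat column.
    return f"{value:6.1f}{UNITS[unit]}"


if __name__ == "__main__":
    for size in (58, 509, 22528, 3145728, 8629133):
        print(consistent_size(size))
```

Showing ‘58.0B’ is admittedly a bit silly, since you can’t have a fraction of a byte, but that’s the trade for having every row look the same. Pick your own rule if you like. Just pick ONE.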
This format is now well established in the Unix world, and probably will never go away. I’m assuming some shell hack wrote it 15 years ago on a lunch break, and it will never be removed or updated. ‘ls’ is such a core utility that changing its output in any way will raise a huge outcry from script-writers everywhere, because scripts that had been running perfectly for years will suddenly break.
Sure, this is a small thing. I’m picking nits. But when I see BAD DESIGN decisions, I feel it’s my moral duty to stand up and foam at the mouth about it. Thank you, and good night.