
February 5, 2012

Ross Codes! Sorting Human-Readable Numbers

I've been running Linux at home for a few years now. One of the things I like best about it is that things tend to be built up from lots of little command-line component programs instead of big GUI programs. This may seem like it makes it harder to use, but that's only true for things you plan on doing just once. If I want to, say, resize the 500 pictures I took of my little boy over the weekend (he is darned cute), I can do it with some big GUI tool where I load each picture, click Resize, move some sliders, hit OK, click Save As, and type in a new file name. Five hundred times. Or I can write this:

for x in *.jpg; do convert -geometry 1280x1024 "$x" output/"$x"; done

Having a rich command line available to me lets me do operations on large sets of data in batches, and that's a good thing because that's what computers are good at.

But that's a bit of a tangent. When I am working in Linux, I often find myself dealing with big numbers. File sizes. Free memory. Free disk space.

Because I rip all my DVDs to the hard drive, I'm very concerned about free disk space. So I'll run "df":


Filesystem             1K-blocks       Used  Available Use% Mounted on
torchwood:/mnt/store0 4326436544 3654545536  671891008  85% /mnt/store0
saxon:/mnt/store1     2130562560 1073968640 1056593920  51% /mnt/store1
saxon:/mnt/store2     2145245184  467011584 1678233600  22% /mnt/store2
badwolf:/mnt/store3   5768575488 4445833216 1322742272  78% /mnt/store3

But those numbers start to get blurry after a while. Fortunately, df has an option that makes its output "human readable", "-h":



Filesystem             Size  Used Avail Use% Mounted on
torchwood:/mnt/store0  4.1T  3.5T  641G  85% /mnt/store0
saxon:/mnt/store1      2.0T  1.1T 1008G  51% /mnt/store1
saxon:/mnt/store2      2.0T  446G  1.6T  22% /mnt/store2
badwolf:/mnt/store3    5.4T  4.2T  1.3T  78% /mnt/store3

A lot easier to read. Several of the standard Linux commands have a "-h" option, including ls and du. free has a similar "-m" option.
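For the curious, here is a rough Python sketch of how a 1K-block count could be turned into the short form. It is not df's actual code. The function name humanize is made up for illustration, and the rounding rule (always round up, one decimal place below 10) is an assumption that happens to match every figure in the two listings above.

```python
UNITS = "KMGTP"  # the input is already in 1K blocks, so K is the smallest unit


def humanize(kblocks: int) -> str:
    """Convert a count of 1K blocks to a short string such as '4.1T'.

    Assumption: values are rounded up, with one decimal place when the
    scaled value is below 10 and none otherwise.
    """
    power = 0
    while kblocks >= 1024 ** (power + 1) and power < len(UNITS) - 1:
        power += 1
    unit = 1024 ** power

    # Integer ceiling division avoids floating-point surprises.
    tenths = -(-kblocks * 10 // unit)
    if tenths < 100:
        return f"{tenths // 10}.{tenths % 10}{UNITS[power]}"
    whole = -(-kblocks // unit)
    return f"{whole}{UNITS[power]}"


print(humanize(4326436544))  # 4.1T
print(humanize(671891008))   # 641G
print(humanize(1056593920))  # 1008G
```

One simplification: a value that rounds up to exactly 1024 of a unit is printed as-is here rather than being bumped to the next unit.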

The disadvantage of the human-readable flag is sorting. The standard command for sorting output, sort, has a flag (-n) that makes it compare numbers numerically. But -n only reads the leading digits and ignores the unit suffix, so once the numbers have been mangled into ugly human-readable form, this breaks, and suddenly 1G sorts below 10K.
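To see the problem in miniature, here is a small Python sketch that imitates a numeric sort by looking only at the leading number of each entry and ignoring the suffix. The helper name leading_number and the sample sizes are invented for the demonstration. This is not how sort is implemented, just the same blind spot.

```python
import re


def leading_number(text: str) -> float:
    """Return the number at the start of the text, ignoring any unit suffix."""
    match = re.match(r"\s*(\d+(?:\.\d+)?)", text)
    return float(match.group(1)) if match else 0.0


sizes = ["10K", "1G", "512M"]

# Only 1, 10 and 512 are compared, so the gigabyte ends up first.
print(sorted(sizes, key=leading_number))  # ['1G', '10K', '512M']
```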

So I wrote this quick-and-dirty little Perl script, which sorts the lines in a document, properly ordering numbers that have been converted into "human readable" format in the style of df and du.

In case anyone finds it handy, this is hsort.
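The Perl script itself is not reproduced here. As a stand-in, this is a minimal Python sketch of the same idea, under a few assumptions of my own: each line is sorted by the first whitespace-separated field that looks like a human-readable number, suffixes are powers of 1024, and lines with no such field sort first. The names human_value, sort_key and the Python hsort function are hypothetical and may not match how the real script works.

```python
import re

# Assumption: suffixes are binary multiples, as df -h and du -h use.
MULTIPLIERS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3,
               "T": 1024**4, "P": 1024**5}
HUMAN = re.compile(r"^(\d+(?:\.\d+)?)([KMGTP]?)$", re.IGNORECASE)


def human_value(token: str):
    """Return the token's size in bytes, or None if it is not a size."""
    match = HUMAN.match(token)
    if not match:
        return None
    return float(match.group(1)) * MULTIPLIERS[match.group(2).upper()]


def sort_key(line: str) -> float:
    """Use the first field on the line that parses as a size."""
    for token in line.split():
        value = human_value(token)
        if value is not None:
            return value
    return 0.0  # lines without a size sort to the top


def hsort(lines):
    """Return the lines ordered by their human-readable size."""
    return sorted(lines, key=sort_key)


sample = ["1G videos", "10K notes", "512M photos", "4.1T store0"]
print(hsort(sample))
# ['10K notes', '512M photos', '1G videos', '4.1T store0']
```

To use something like this as a filter, the lines from standard input would be passed to hsort and the result written back out.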

Copyright 2010