UNIX Disk-Space Usage Report

Newtun

Storage is nice, especially if it doesn't rotate
Joined
Nov 21, 2002
Messages
484
Location
Virginia
Greetings, all.

At work, I recently got an email saying that a 1 TB file system on a UNIX server was 96% full and that I owned one of the 10 largest files. But that file accounted for well under 1% of the total space.

It occurred to me that instead of the system admins reporting a few real big files, it might be better for them to report directories that use a lot of space (though individual files therein may not be top-10).

So I started writing a script to try to do that, and yesterday, it found one directory that was using about 6% of the disk. That directory's owner needs to do some cleanup, IMHO.

Anyway, the (Korn shell) script takes a file-name parameter (defaulting to the current directory), finds the mount point of the file system that contains it, lists the entries there (which should mostly be immediate subdirectories), computes the total space used by each, sorts by that, and reports the 20 biggest "space hogs". I check that each subdirectory is in the same file system, and the -x option of du keeps each total limited to files in that file system.

If anyone has suggestions about how to improve the script, please let me know (the seds just add thousands-delimiting commas):

Code:
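# Directory to report on: first argument, or the current directory.
if test $# -eq 0
 then DIR=$PWD
 else DIR=$1
fi
# Print the df summary with thousands-delimiting commas added.
df -k "$DIR" | sed 's/  *(/ (/' | sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'
# This parsing assumes the first line of df -k reads "mountpoint (device): total-KB ...".
# A read at the end of a pipeline sets these variables in ksh, but not in bash.
df -k "$DIR" | head -1 | sed 's/  *)/)/' | read MNTPNT DEV COLON KBTOTALLOC REST
cd "$MNTPNT" || exit 1
echo "Top 20 Space Report for $MNTPNT:  KB used by subdirectory"
echo "   KB Disk      Pct of    Subdir      Owner        Group       Owner"
echo "    Space      Tot Alloc   Name         ID           ID         Name"
for DIR in *
 do
  # Skip entries that are mount points of other file systems.
  # ${MNTPNT%/} avoids a doubled slash when the mount point is "/".
  if test -z "`grep \" ${MNTPNT%/}/$DIR \" /etc/mnttab`"
   then    ####    SAME FILE SYSTEM / MOUNT POINT    !!!!
        du -xks "$DIR" 2>/dev/null
  fi
 done | sort -rn | head -20 |
 while read SPACE DIR
  do
   ls -dl "$DIR" | read PERMS DUMMY OWNER GROUP REST
   # Full name: the part of the passwd comment field before the first comma.
   NAME=`grep "^$OWNER:" /etc/passwd | cut -d: -f5 | cut -d "," -f1`
   # Percentage of total allocation; printf below rounds it to 2 decimals.
   PCT=`echo "4 k $SPACE 100 * $KBTOTALLOC / p" | dc`
   SPACEY=`echo $SPACE | sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'`
   printf "%12s%9.2f%%    %-12s%-12s%-12s%-s\n" "$SPACEY" "$PCT" "$DIR" "$OWNER" "$GROUP" "$NAME"
  done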
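
For anyone puzzling over the comma-adding sed in the script above, here is a minimal sketch of it pulled out into a function. The name commas is made up for illustration and is not part of the original script; the sed expression itself is the one from the script, and the sketch assumes a plain non-negative integer as input.

```shell
# commas: hypothetical helper that wraps the sed loop from the script.
# Assumes $1 is a non-negative integer with no existing separators.
commas() {
    # ":a" defines a label. Each pass of the s command puts one comma in
    # front of the last run of three digits that still has a digit before
    # it. "ta" jumps back to the label whenever a substitution was made,
    # so the loop ends once no run of four digits is left.
    printf '%s\n' "$1" | sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'
}

commas 1234567   # prints 1,234,567
commas 999       # prints 999 (nothing to do)
```

The loop works from right to left: 1234567 becomes 1234,567 on the first pass and 1,234,567 on the second, after which no substitution applies and sed prints the line.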
 

Mercutio

Fatwah on Western Digital
Joined
Jan 17, 2002
Messages
22,269
Location
I am omnipresent
There are actually some pretty well-known tools for looking at this stuff. I've used FileLight, which makes a lovely graphical display of usage.

On the other hand, your script looks like it would work perfectly well.
 

Newtun

I'm just a lowly end-user, so I don't know what tools the admins have (or could get).

But as I thought about it some more, I might modify it somewhat. It was originally intended to check the main "/home" kind of directories for users' usage, but for some other file systems, it might be better to be able to "drill down" into directories below the mount point. For instance, looking at /fu might show that 60% of the space is used by /fu/bar. So then, check out /fu/bar to see which of its subdirectories are using the most space.
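
A rough sketch of what that drill-down could look like, written as a plain sh function rather than as a change to the script above. The function name top_subdirs, its argument order, and the choice to show each percentage relative to the directory being examined (not the whole file system) are all assumptions made for illustration. It relies only on du -x, -k and -s, and it does not repeat the /etc/mnttab check, so a subdirectory that is itself a mount point would still be listed.

```shell
# top_subdirs: hypothetical helper, not part of the original script.
#   $1 = directory to examine (default: current directory)
#   $2 = number of entries to show (default: 20)
top_subdirs() {
    dir=${1:-.}
    count=${2:-20}
    # Total KB under the directory, staying on one file system (-x).
    total=$(du -xks "$dir" 2>/dev/null | awk '{print $1}')
    for sub in "$dir"/*; do
        # Only real directories; skip plain files and symbolic links.
        [ -d "$sub" ] && [ ! -L "$sub" ] || continue
        du -xks "$sub" 2>/dev/null
    done | sort -rn | head -n "$count" |
    awk -v total="$total" '{
        kb = $1
        sub(/^[0-9]+[ \t]+/, "")    # keep the path, spaces included
        pct = (total > 0) ? 100 * kb / total : 0
        printf "%12d %6.2f%%  %s\n", kb, pct, $0
    }'
}
```

With this, top_subdirs /fu would show /fu/bar near the top, and top_subdirs /fu/bar 10 would then list the ten largest subdirectories one level further down.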
 

Mercutio

When I was a lowly user on a hugely multi-user system, I'd abuse the crap out of /tmp and /var, since I had permission to write there. I actually unpacked my home directory out of my mail spool and stored it on /tmp so I could bypass the 2MB disk quota on /home. I kept refining my script until, by the time I graduated, I basically had a 47MB (huge at the time; the whole machine might've had 1GB of disk space) persistent directory structure full of binaries that dozens of people were using for their shell sessions. I heard there was chaos when they finally deleted my account on that machine.
 

Newtun

I don't trust /tmp or /var/tmp not to be flushed at random; this is a work environment.

But speaking of school days, in the mid-70's in math at Berkeley, I did take a few computer classes as well, using their "dumb" terminals (I don't even remember what kind, VT52s?).

Submitted for your amusement: already then, there was malware - spoofing/man-in-the-middle. Somebody wrote a program to present a login screen on the terminal, collect and save the username and password, then submit them to log into a child process.
 