bash progress monitor
I have a remote machine that is used to store and process XML files. Recently, I had need to duplicate a directory of XML files (e.g., cp -r a b). It’s not really germane to the subject here, but this particular server has a whack configuration and I gotta rant before I continue.
The office server (scrappy) has pretty good specs.
[scrappy ~]$ cat /proc/meminfo
MemTotal: 3980800 kB
[scrappy ~]$ cat /proc/cpuinfo
processor : 0
model name : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
cpu MHz : 2394.000
cache size : 4096 KB
processor : 1
model name : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
cpu MHz : 2394.000
cache size : 4096 KB
[scrappy ~]$ cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: SONY Model: DVD RW AW-Q170A Rev: 1.72
  Type: CD-ROM ANSI SCSI revision: 05
[scrappy ~]$ cat /proc/ide/hd?/model
ST3320620AS
Whoa! What’s my SATA drive doing attached to the IDE driver? When I compare to my home CentOS box (marmaduke), I see that its drives are connected differently. Yes, marmaduke has one HDD connected via the IDE driver (ST3320620A) but that drive is a PATA drive. The four SATA drives are connected via SATA drivers. (The SATA drives will be configured as a software RAID 10, stay tuned. There’s a xen project in the making.)
[marmaduke ~]$ cat /proc/scsi/scsi
Attached devices:
Host: scsi2 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA Model: ST3500630AS Rev: 3.AA
  Type: Direct-Access ANSI SCSI revision: 05
Host: scsi3 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA Model: ST3300620AS Rev: 3.AA
  Type: Direct-Access ANSI SCSI revision: 05
Host: scsi4 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA Model: ST3300620AS Rev: 3.AA
  Type: Direct-Access ANSI SCSI revision: 05
Host: scsi5 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA Model: ST3300620AS Rev: 3.AA
  Type: Direct-Access ANSI SCSI revision: 05
[marmaduke ~]$ cat /proc/ide/hd?/model
PIONEER DVD-RW DVR-111D
ST3320620A
scrappy was configured before arriving at the office by a friend of a friend who runs a PC shop. “But it was such a deal!” Yeah, right. Bunch of monkeys. How hard is it to configure the BIOS to use the SATA interface rather than the IDE interface?
Anyway, I don’t have time to rebuild scrappy right now so I live with the dismal disk performance. Here’s the problem at hand. I have numerous XML files, some largish and some smallish. I have several sets and each set has about 4000 files.
[scrappy ~]$ ls src/*xml | wc -w
4323
[scrappy ~]$ ls -l src/*xml | sort -n -r -k5
-rw-r--r-- 1 kelly kelly 315804120 Dec 19 15:46 0001.xml
-rw-r--r-- 1 kelly kelly 275651475 Dec 19 17:34 0002.xml
-rw-r--r-- 1 kelly kelly 260250994 Dec 19 16:15 0003.xml
-rw-r--r-- 1 kelly kelly 222402294 Dec 19 16:25 0004.xml
-rw-r--r-- 1 kelly kelly 204642813 Dec 19 15:52 0005.xml
. . .
-rw-r--r-- 1 kelly kelly 1467 Dec 19 19:15 4321.xml
-rw-r--r-- 1 kelly kelly 1467 Dec 19 16:01 4322.xml
-rw-r--r-- 1 kelly kelly 1098 Dec 19 19:19 4323.xml
I wanted to duplicate the set of files as I needed to run some prototype code that I didn’t trust to be non-destructive. Simple.
[scrappy ~]$ cp -r src tgt
However, the disk performance is agonizing. So bad that I leave it while I work on another machine. But I want to know the progress and see it as it changes. With six to ten shells open, I want something that can be resized to use minimal screen real estate. I want a quick command line progress monitor.
bash to the rescue. I didn’t want to create a script file so I just jack it right into the terminal’s command line. When you open the while loop, bash keeps prompting for more input (the > at the start of each continuation line) until you close the loop with the done keyword.
[scrappy ~]$ while 'true'; do
> ts=`date`
> src=`ls src/*xml 2>/dev/null | wc -w`
> tgt=`ls tgt/*xml 2>/dev/null | wc -w`
> echo -ne " ${ts} ${src} ${tgt} \r"
> sleep 1
> done
Fri Jan 9 15:20:17 PST 2009 4323 2304
Recall we’ve previously covered that 2>/dev/null hides the error message generated by ls if no file is found.
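To see the redirect at work outside the loop, here is a small sketch. The helper name count_quiet is made up for illustration. It also swaps in wc -l for wc -w: ls prints one name per line when its output is piped, so a filename containing a space is counted once instead of twice.

```shell
# count_quiet DIR: hypothetical helper, not part of the one-liner above.
# Prints how many names in DIR end in "xml"; prints 0 when there are none.
count_quiet() {
    # With bash's default settings an unmatched glob is passed to ls as-is,
    # ls complains on stderr, and 2>/dev/null throws that complaint away.
    # wc then sees empty input and reports 0.
    ls "$1"/*xml 2>/dev/null | wc -l
}
```

Running count_quiet tgt before the copy has created anything prints a zero count rather than a "No such file or directory" message.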
The components are stored in shell variables as a matter of convenience and displayed using echo.
echo is passed two switches. The -n switch suppresses the trailing newline so that the cursor remains on the same line as the displayed text. The -e switch makes echo interpret backslash escape sequences in the text. This is useful since I want to end the text with a carriage return (\r), which pushes the cursor back to the beginning of the line without advancing to the next one.
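The carriage-return trick is easy to try on its own. Here is a minimal sketch that uses printf, which interprets \r in its format string without needing any switches. The function name redraw_demo and the status text are invented for the example.

```shell
# redraw_demo [SECONDS]: hypothetical helper, not part of the one-liner above.
# Draws three status lines on top of one another, then moves on.
redraw_demo() {
    local i
    for i in 1 2 3; do
        # \r sends the cursor back to column 0 without starting a new line,
        # so the next printf overwrites this text.
        printf ' step %d of 3 \r' "$i"
        sleep "${1:-0}"   # optional pause between redraws (default: none)
    done
    printf '\n'           # finish with a real newline so the prompt lands below
}
```

Try redraw_demo 1 in a terminal to watch the line update in place. The trailing space before \r matters when a shorter line replaces a longer one, since leftover characters from the old text are not erased.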
After sleeping for one second, the script generates a new echo output which overwrites the old text. I suppose I could add a test to the script to break when ${src} equals ${tgt}.
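That break test might look something like the sketch below. It is wrapped in functions so it can be pasted once and reused; the names count_xml and watch_copy, and the optional interval argument, are my own inventions here. Two caveats: matching counts only mean every file has been created, not that cp has finished writing the last one, and two empty directories match immediately.

```shell
# count_xml DIR: hypothetical helper. Prints the number of entries in DIR
# whose names end in "xml", using a glob instead of parsing ls output.
count_xml() {
    local f n=0
    for f in "$1"/*xml; do
        # An unmatched glob stays literal, so skip names that do not exist.
        [ -e "$f" ] && n=$((n + 1))
    done
    echo "$n"
}

# watch_copy SRC TGT [SECONDS]: hypothetical helper. Redraws one status line
# until TGT holds as many xml files as SRC, then stops.
watch_copy() {
    local src_dir=$1 tgt_dir=$2 interval=${3:-1}
    local src tgt
    while true; do
        src=$(count_xml "$src_dir")
        tgt=$(count_xml "$tgt_dir")
        printf ' %s %s %s \r' "$(date)" "$src" "$tgt"
        [ "$src" -eq "$tgt" ] && break   # counts match: stop monitoring
        sleep "$interval"
    done
    printf '\n'   # leave the final status line on screen
}
```

With these defined, watch_copy src tgt does the same job as the loop above and returns to the prompt on its own once the counts agree.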
I don’t know why disk I/O is so slow on scrappy. Perhaps the mode is set to use programmed I/O rather than DMA. Who knows? Who cares? Both scrappy and marmaduke have Intel ICH8 SATA controllers. scrappy has a faster processor with more cache. Yet, marmaduke smokes it on disk throughput on either the SATA or IDE drives. Something is wonky.
I’d like to say that I can ignore the issue. I have way too much going on right now. But it bugs me.