BASH Script Performance

Today we will look at a bash code snippet and its performance issues. Let's first look at the problem and the implemented solution:

Problem: We needed to log the output of the ps command for all processes. This had to be done every minute, and the output had to be written to comma-separated files. So, here is what was implemented:

        pslog=`ps -e -opid,ppid,user,nlwp,pmem,vsz,rss,s,time,stime,pri,nice,pcpu,args|grep -v PID|sort -r -k 13,13`
        OLD_IFS=$IFS
        IFS=$'\n'
        logarr=( $pslog )
        for LOGLINE in ${logarr[@]}
        do
                LOGLINE=`echo $LOGLINE|awk '{OFS=",";print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14}'`
                echo $LOGLINE >> output
        done
        IFS=$OLD_IFS

This was working well and there were no issues. But suddenly we started seeing problems with the reported CPU usage. Whenever this script was running the CPU usage was high, especially if there were a lot of processes/threads on the system at that time. This code was part of a much larger code base, and at that point we did not know what was causing the problem.

So the investigation started. Luckily I was testing this on a Solaris box, so I took some DTrace scripts from the DTraceToolkit and started looking at what was happening. Before this, though, we had already spent quite some time trying to track down the problem and fixing a lot of other things, to no avail 🙁

After doing some dtracing, the problem was soon evident:

        pslog=`ps -e -opid,ppid,user,nlwp,pmem,vsz,rss,s,time,stime,pri,nice,pcpu,args|grep -v PID|sort -r -k 13,13`
        OLD_IFS=$IFS
        IFS=$'\n'
        logarr=( $pslog )
        for LOGLINE in ${logarr[@]}
        do
                LOGLINE=`echo $LOGLINE|awk '{OFS=",";print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14}'`
                echo $LOGLINE >> output
        done
        IFS=$OLD_IFS

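Looking at the code again: we store each line of the ps output in an array and then loop over it. For every line we run awk to print the fields comma-separated and append the result to the output file. That means a command substitution, a pipeline and a new awk process for every single line of ps output. Now, that looks like a bad design but not like something that should cause drastic CPU usage. Alas, that was the case.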
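To get a feel for the difference, here is a minimal sketch that times the per-line approach against a single awk pass. It does not use real ps output: the `sample` variable and the `per_line` and `single_pass` functions are made up for illustration, and the actual numbers will depend on the machine and on how many lines there are.

```shell
#!/bin/bash
# Hypothetical stand-in for the ps output: 200 lines with 14 fields each.
sample=$(
    i=1
    while [ "$i" -le 200 ]; do
        echo "$i 1 root 1 0.0 100 200 S 00:00 10:00 59 0 0.1 cmd$i"
        i=$((i + 1))
    done
)

# Per-line approach, as in the original loop:
# one pipeline and one awk process for every input line.
per_line() {
    printf '%s\n' "$sample" | while IFS= read -r line; do
        echo "$line" | awk '{OFS=","; print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14}'
    done
}

# Single-pass approach: one awk process handles the whole input.
single_pass() {
    printf '%s\n' "$sample" |
        awk '{OFS=","; print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14}'
}

# Both produce the same lines; only the number of processes started differs.
time per_line > /dev/null
time single_pass > /dev/null
```

With a few hundred lines the per-line version starts a few hundred awk processes plus the subshells around them, and that is repeated on every run of the script.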

So we changed the script to:

        ps -e -opid,ppid,user,nlwp,pmem,vsz,rss,s,time,stime,pri,nice,pcpu,args|grep -v PID|sed -e 's/ \{1,\}/,/g' >> output

A simple one-liner does the job. After making this change I am now relieved, as we do not see any unexpected CPU peaks 🙂
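One thing to keep in mind with a sed-based conversion is that it turns every run of spaces into a comma, including the padding at the start of a line and the spaces inside the command arguments. If that matters, a single awk pass is one possible variation. The sketch below is only an illustration: `ps_to_csv` is a hypothetical helper, and it assumes that none of the first 13 columns contains an embedded space and that the arguments themselves contain no commas.

```shell
#!/bin/bash
# ps_to_csv: hypothetical helper that reads ps-style text on stdin.
# It joins the first 13 columns with commas and keeps the rest of the
# line (the command and its arguments) together as one last field.
ps_to_csv() {
    awk '
        $1 == "PID" { next }                 # skip the header line
        {
            out = $1
            for (i = 2; i <= 13 && i <= NF; i++)
                out = out "," $i
            if (NF >= 14) {
                rest = $14
                for (i = 15; i <= NF; i++)
                    rest = rest " " $i       # re-join the arguments with single spaces
                out = out "," rest
            }
            print out
        }'
}

# Example with made-up lines instead of real ps output:
printf '%s\n' \
    '  PID  PPID USER' \
    ' 42 1 root 3 0.1 1024 512 S 00:01 10:00 59 0 0.5 /usr/bin/foo --bar baz' |
    ps_to_csv
```

In the real script the ps command would be piped into `ps_to_csv` in place of the grep and sed stages. Matching the header on the first column also avoids dropping a process whose command line happens to contain the string PID, which `grep -v PID` would do.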
