Posted on 05-12-2011 03:36 PM
Hello All.
I am not what I would consider a good shell scripter. I do aspire to improve though and thought peer review might point out flaws and/or better/easier ways of doing things. I think a severe issue with my capabilities is that I have never really started from the beginning, building upon the basics. I usually just look at another script, decipher what is happening and, modify the syntax and relevant parts for what I need. Much of this is often a blink guess as I have no idea why some things are happening - I just follow the leader.
I've been cleaning up some scripts and thought it would be nice to hear from wiser people what might be improved upon. It might even help out others like me. So, with that in mind, I have a script which I am using to manage uptime on machines. It restarts machines that have no console user after 13 days and if machines have a logged in user, alerts users starting with e-mail and then moving to screen alerts. I have it running once a day on all machines using the every 15 trigger.
Original credit for the script goes to Steve Wood. It's been changed a bit to suit our needs. I would love to hear how it might be improved in functionality or design. Any takers?
#!/bin/bash
## Originally based off of a script by Steve Wood
## Modified a "a bit" by Aaron Robinson
## 2011/05/13
## SET DEFAULT VALUES - Default values will be overwritten via $4 and $5 passed
# Set the number of days at which you wish to alert users
# Alert via e-mail
emailwarn=6
# Alert via screen popup
popupwarn=12
# Restart days if no user logged in
restardays=13
# USE VALUES FROM $4 and $5 IF PASSED
# Use $4 for e-mail warning
if [ "$4" != "" ]; then
emailwarn=$4
fi
# # Use $5 for popup warning
if [ "$5" != "" ]; then
popupwarn=$5
fi
# set some variables and grab the uptime
days=uptime | awk '{ print $4 }' | sed 's/,//g'
num=uptime | awk '{ print $3 }'
consoleuser=ls -l /dev/console | awk '{ print $3 }'
# Check to see if it's been up for short time
if [[ $days != "days" ]]; then
logger -i "$0: Machine is up less than 2 days. No business here. Exiting."
exit 0
# Just restart if nobody is logged in and the machine has been up for 7+ days.
elif [[ $days = "days" && $num -gt $restardays && $consoleuser = "root" ]]; then
logger -i "$0: No user is logged in and machine is up for $num days. Restarting machine."
/sbin/reboot
exit 0
# If someone is logged in
elif [[ $days = "days" && $consoleuser != "root" ]]; then
# If we are here, then there's someone logged in and it's been more than a couple of days. Now check how long they've been logged in and alert accordingly.
# Now send the correct notices depending on days up if [ $num -gt $emailwarn ]; then
# we want the computer name to ID the computer in the mail message
computername=systemsetup -getComputerName
# set the message message1="You have been logged into $computername which has not been restarted for $num days. It is" message2="important to restart machines regularly to order to run efficiently and to apply updates " message3="that have been pushed in the background. Please restart your machine ASAP. Thank you."
# grab the current user and query AD for the email address
CurrentUser=ls -l /dev/console | awk '{ print $3 }'
email=dscl "/Active Directory/All Domains/" -read /Users/$CurrentUser | grep EMailAddress | awk '{ print $2 }'
touch /Library/Application Support/JAMF/Receipts/UpTooLong.pkg
echo "$message1
$message2
$message3" | mail -s "Machine Up Too Long" $email
if [ $num -gt $popupwarn ]; then
# Screen Popup to logged in user
/Library/Application Support/JAMF/bin/jamfHelper.app/Contents/MacOS/jamfHelper -windowType hud -title "PLEASE RESTART" -description "Your computer has been up for $num days. Please restart ASAP"
fi
fi
fi
# now run a recon
# /usr/sbin/jamf recon
exit 0
Posted on 05-12-2011 05:13 PM
Here are some changes I'd recommend, and why I recommend them (the full .diff is attached):
< #!/bin/bash
#!/bin/sh
• I generally avoid using /bin/bash as the invoking shell, to avoid compatibility concerns. These days, it's moot, because pretty much any computer that has /bin/sh has /bin/bash.
< emailwarn=6
< popupwarn=12
< restardays=7
emailWarn=6 popupWarn=12 restartDays=7
• I like consistency.
• "restardays" was misspelled.
/usr/bin/logger -i "$0: Checking uptime."
• The next log entry makes much more sense with this one in place.
< days=uptime | awk '{ print $4 }' | sed 's/,//g'
< num=uptime | awk '{ print $3 }'
< consoleuser=ls -l /dev/console | awk '{ print $3 }'
unit=$(/usr/bin/uptime | /usr/bin/awk '{ print $4 }' | /usr/bin/sed 's/,//g') num=$(/usr/bin/uptime | /usr/bin/awk '{ print $3 }') consoleUser=$(ls -l /dev/console | /usr/bin/awk '{ print $3 }')
• I think "unit" is a more appropriate name than "days" for this variable.
• I find $( ) to be more readable than ` `. They are equivalent, but the big difference is that $( ) can be nested.
• When writing a script that will run as root it's a very good practice to ALWAYS use full paths for commands (except built-in commands). This helps prevent malicious commands from getting escalated privileges.
< if [[ $days != "days" ]]; then
if [ "$unit" != "days" ]; then
• More quotes help keep input sanitized. Remember little Bobby Tables<http://xkcd.com/327/>.
• I've never used [[ ]]; only [ ]. I'm not sure what the difference is, but I know how [ ] works.
< logger -i "$0: Machine is up less than 2 days. No business here. Exiting."
/usr/bin/logger -i "$0: Machine has been up less than 2 days. No business here. Exiting."
• Grammar fix.
< exit 0
• This is unneeded, since control falls off the end of the script outside of the if.
< elif [[ $days = "days" && $num -gt $restardays && $consoleuser = "root" ]]; then
elif (( "$num" > "$restartDays" )) && [ "$consoleUser" == "root" ]; then
• $days check is superfluous; by the time you get here, it's already been checked.
• Arithmetic comparison (( )) is probably what you want when comparing numbers.
< logger -i "$0: No user is logged in and machine is up for $num days. Restarting machine."
/usr/bin/logger -i "$0: No user is logged in and machine has been up for $num $unit. Restarting machine."
• Grammar fix.
• Practically, using "days" here makes no difference. It just feels right to me to use $unit.
subject="Machine Up Too Long"
• If you're going to parameterize the message, you may as well parameterize the subject too.
< message2="important to restart machines regularly to order to run efficiently and to apply updates "
message2="important to restart machines regularly to order to run efficiently and to apply updates"
• Extra space at the end of the line.
< email=dscl "/Active Directory/All Domains/" -read /Users/$CurrentUser | grep EMailAddress | awk '{ print $2 }'
email=$(/usr/bin/dscl "/Active Directory/All Domains/" -read "/Users/$currentUser" EMailAddress | /usr/bin/awk '{ print $2 }')
• The grep is unneeded; just pull the info you want.
< touch /Library/Application Support/JAMF/Receipts/UpTooLong.pkg
/usr/bin/touch "/Library/Application Support/JAMF/Receipts/UpTooLong.pkg"
• I like quoting my paths when they have spaces.
popupTitle="PLEASE RESTART" popupMessage="Your computer has been up for $num $unit. Please restart ASAP."
• Parameterizing messages.
< # Screen Popup to logged in user
# Display popup to logged in user
• I think that's what you meant to say here.
< exit 0
• Exit status is 0 if the last command executed succeeded. If it didn't for some reason, it might be useful to not hide that fact.
You may want to check for other users who might be logged in (try $(/usr/bin/who | /usr/bin/grep console) ). You could loop through them and display notices or send emails to each. And it might not be wise to assume nobody is logged in just because the login window is showing.
These might not matter for you right now, as you might not have Fast User Switching enabled. But one day, you (or someone else) might enable it, and now you have to fix your script (once you've found the cause of the very subtle bug). Better to be prepared, since you can do so.
You may want to use osascript to display the dialog instead of jamfHelper. This way you can include a Restart button (which can execute a restart via Apple Event, to allow unsaved documents and such to be handled appropriately).
Posted on 05-12-2011 11:09 PM
Thanks for the review! It's been educational for me (and hopefully other novices) to see another way of doing this especially with the why.
I did a lot of restructuring and changing of the script (like the restart) and did think of the potential multiple user issue. I didn't address it because we don't allow fast user switching or ssh for standard users. You're right though - it's better to address it now than discover it later.
Anyone have other suggestions?
BTW, implementation of this has resulted in ~90% of the machines to an uptime of less than a week. It used to be around 50% or less.
Posted on 05-13-2011 05:34 AM
i usually use just awk in place of grep + awk constructs and prefer heredoc style blocks to multiline echo statements. a slightly modified version is here, and you can see the diff from the original.
https://gist.github.com/2369b8ce9ac1c2737bf4
does not restarting often cause significant problems for your users?
Posted on 05-13-2011 06:41 AM
Thanks for your version.
Yes, lack of restarts was often an issue. Most of the 100 or so machines are used heavily with the creative suite and iTunes, firefox, etc. I got tired of "fixing" random issues with a restart. Uptimes of months were not uncommon.
I've found myself increasingly trying to use shell scripts and the Casper suite to tackle things like this - so much that I think I'm going to read a book!