UNIX/Linux & C Programming:
Chapter nn: Shell Programming



Coverage: [UPE] Chapters 3 and 5


Shell scripts

  • a shell script (or shell program) is a series of UNIX commands placed in an ASCII text file
  • each shell (ksh, bash, or csh) provides mechanisms for control (e.g., if, while, and for statements)
  • UNIX commands (filters) + shell variables + control mechanisms = powerful (interpreted) programming language
  • see [UIAN] Chapter 4 (Nutshell) or [PGUS] Chapter 12 (Sobell) for ksh information


return vs. exit

  • same difference as in C (i.e., same semantics in main; different semantics in functions)
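  • return allows you to return an exit status (an integer from 0 to 255) from a function; the caller reads it from $?
  • exit terminates the current shell (i.e., the whole script) with the given exit status, even when it is called inside a function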
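The difference can be sketched as follows. This is a minimal illustration, assuming ksh93 or bash (printf stands in for ksh's print so that both shells accept it); the function name is_even is hypothetical. The parentheses start a subshell, so the exit shown here ends only that subshell rather than the whole script.

```shell
# is_even is a hypothetical helper: return sets the function's exit status.
is_even() {
   if [ $(( $1 % 2 )) -eq 0 ]; then
      return 0      # success: control goes back to the caller
   fi
   return 1         # failure: still only leaves the function
}

if is_even 4; then
   printf '%s\n' "4 is even"
fi

# exit ends the shell that executes it; here that is only the ( ... ) subshell.
if ( exit 3 ); then
   status=0
else
   status=$?        # exit status of the subshell
fi
printf '%s\n' "subshell exited with status $status"
```

Had exit 3 appeared inside is_even without the parentheses, the script itself would have ended at that point.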


Command-line arguments

  • arguments given to a shell script on the command line when it is invoked are available through the variables $* (a space-separated list) and "$@" (a list with each argument double quoted separately)
  • individual arguments to the shell script are referenced as $1, $2, $3, ..., $9, and $0 is the name of the shell script itself
  • $# stores the number of command-line arguments passed (i.e., the shell's analog to C's argc, minus the command name)
  • examples:
    $ echo $* # prints all the command-line arguments
    $ echo $# # the number of command-line arguments (does not include the command name)
    $ echo $0 # prints the command name
    $ echo $1 # prints the first command-line argument
    $ shift n # shifts the arguments left by n (e.g., if n = 1, arg1 = arg2, arg2 = arg3, and so on)
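A common use of $#, $1, and shift together is to consume the arguments one at a time. The sketch below assumes ksh93 or bash and uses a hypothetical function, show_args; inside a function the positional parameters refer to the function's own arguments, which makes the idea easy to try without a separate script file.

```shell
# show_args is a hypothetical function; inside it $#, $1, and shift act on
# the function's own arguments exactly as they do on a script's.
show_args() {
   count=1
   while [ $# -gt 0 ]; do
      printf '%s\n' "arg $count: $1"
      shift                    # drop $1; the old $2 becomes the new $1
      count=$(( count + 1 ))
   done
}

show_args one "two words" three
```

Each pass through the loop prints the current $1 and then discards it, so the loop ends when $# reaches 0.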
    


"$*" vs. "$@"

  • same when unquoted: all arguments on command line, except the command name
  • "$*": all arguments on command line as one string ("$1 $2 ...")
  • "$@": all arguments on command line, individually quoted ("$1" "$2" ...)
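The effect of quoting is easiest to see by counting words. The sketch below assumes ksh93 or bash; count_args is a hypothetical helper, and set -- is used only to simulate a command line.

```shell
# count_args is a hypothetical helper that reports how many arguments it received.
count_args() {
   printf '%s\n' "$#"
}

set -- "a b" c d        # simulate the command line: prog "a b" c d

count_args $*           # unquoted: re-split on whitespace, prints 4
count_args $@           # unquoted: same re-splitting, prints 4
count_args "$*"         # one string "a b c d", prints 1
count_args "$@"         # three separate words, prints 3
```

Only "$@" preserves the argument "a b" as a single word, which is why it is the usual choice when passing arguments along to another command.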


for loops

  • a for loop can be used to iterate over all items in a list or array
  • syntax:
    for var [in list]
    do
       statements
    done
    
  • if in list is omitted in a for loop in a script, the list is assumed to be "$@" (i.e., all of the command-line arguments to the script, each kept as a separate word)
  • do and done must be on lines by themselves (or use ; statement separator)
  • examples:
    for name in Lucy Linus Lucia Larry Leisel; do
       print "Next person is $name."
    done
    exit 0
    
    #!/usr/bin/env ksh
    #print all arguments to a shell script
    for arg in "$@"; do
       print "$arg"
    done
    exit 0
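The short form of the loop, with the in list left out, can be sketched as below. The example assumes ksh93 or bash and wraps the loop in a hypothetical function, list_args, so that the positional parameters are the function's arguments.

```shell
# list_args is a hypothetical function; "for arg" with no "in list"
# walks the positional parameters one at a time.
list_args() {
   for arg
   do
      printf '<%s>\n' "$arg"
   done
}

list_args Lucy "Linus van Pelt" Larry
```

The angle brackets make the word boundaries visible: the quoted name arrives in the loop as one item.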
    


Illustrative script

    #!/usr/bin/env ksh
    
    echo '$* is ' $*
    echo '$@ is ' $@
    print '$# is ' $#
    print "The number of arguments to $0 was $#."
    
    print $0
    print $1
    print $2
    print $3
    print $#
    
    print
    
    #for file
    #for file in "$*"
    for file in "$@"
    do
       echo $file
    done
    
    exit 0
    
    $ ./prog "a b" c d
    $* is a b c d
    $@ is a b c d
    $# is 3
    The number of arguments to ./prog was 3.
    ./prog
    a b
    c 
    d
    3
    
    a b
    c 
    d
    


String operators

    ${<varname>:-<word>}      if <varname> exists and is not null, return its value; otherwise return <word>
    ${<varname>:=<word>}      if <varname> exists and is not null, return its value; otherwise set it to <word> and then return its value
    ${<varname>:?<message>}   if <varname> exists and is not null, return its value; otherwise print <varname>: followed by <message>, and abort the current command or script
    ${<varname>:+<word>}      if <varname> exists and is not null, return <word>; otherwise return null
    ${<varname>#<pattern>}    if <pattern> matches the beginning of the variable's value, delete the shortest part which matches and return the rest
    ${<varname>##<pattern>}   if <pattern> matches the beginning of the variable's value, delete the longest part which matches and return the rest
    ${<varname>%<pattern>}    if <pattern> matches the end of the variable's value, delete the shortest part which matches and return the rest
    ${<varname>%%<pattern>}   if <pattern> matches the end of the variable's value, delete the longest part which matches and return the rest
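The operators above can be tried directly at the prompt. The sketch below assumes ksh93 or bash; the variable names and the sample path are hypothetical and serve only as text to operate on. The :? form is left out because it would abort the script.

```shell
unset color                         # make sure the variable does not exist
printf '%s\n' "${color:-red}"       # red   (color itself stays unset)
printf '%s\n' "${color:+is set}"    # empty line: color is still unset
printf '%s\n' "${color:=blue}"      # blue  (and color is now assigned)
printf '%s\n' "${color:+is set}"    # is set

path=/usr/local/lib/libc.so.6       # hypothetical value, used only as sample text
printf '%s\n' "${path#*/}"          # usr/local/lib/libc.so.6  (shortest leading match)
printf '%s\n' "${path##*/}"         # libc.so.6                (longest leading match)
printf '%s\n' "${path%.*}"          # /usr/local/lib/libc.so   (shortest trailing match)
printf '%s\n' "${path%%.*}"         # /usr/local/lib/libc      (longest trailing match)
```

A useful mnemonic: # sits to the left of $ on many keyboards and trims from the left; % sits to the right and trims from the right.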


Hostname examples

    HOST=$(hostname | cut -d. -f1)
    HOST=$(hostname | awk -F. '{print $1}')
    HOST=${HOSTNAME%%.*}
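The third form needs no extra processes, but it relies on HOSTNAME being set, which not every shell guarantees. The sketch below shows the same pattern operators on a fixed, hypothetical host name so that the result does not depend on the machine; it assumes ksh93 or bash.

```shell
fqdn=www.cs.example.edu            # hypothetical name; a real script would use $(hostname)
short=${fqdn%%.*}                  # delete the longest trailing match of ".*"
domain=${fqdn#*.}                  # delete the shortest leading match of "*."
printf '%s\n' "$short" "$domain"
```

This prints www and then cs.example.edu.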
    


String variable comparisons

  • use within [[ <expression> ]]
  • [[ and ]] are each tokens and may only appear with whitespace on each side
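  • within <expression> you can use parentheses for grouping, the string comparison operators =, ==, !=, <, and > (< and > compare in dictionary order; there is no <= or >= for strings), and the logical operators !, &&, and ||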
  • examples:
    [[ $person = lucia ]]
    [[ $person != linus ]]
    [[ ($person != linus ) && ($person != lucia ) ]]
    
  • = is an overloaded operator (assignment and comparison): no spaces around it means assignment; spaces on each side (inside [[ ... ]]) mean comparison
  • string variables containing only digits can be treated as numbers
  • arithmetic relational operators (for strings representing integers): -lt, -le, -eq, -ge, -gt, and -ne, with the meanings their names suggest (e.g., -lt is less than)
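The distinction between string and arithmetic comparison matters as soon as the values have different lengths. The sketch below assumes ksh93 or bash (both provide [[ ... ]]); the variable names are hypothetical.

```shell
a=10
b=9

# -lt compares the values as integers: 10 < 9 is false.
if [[ $a -lt $b ]]; then numeric=less; else numeric="not less"; fi

# < compares the same values as strings: "10" sorts before "9".
if [[ $a < $b ]]; then lexical=less; else lexical="not less"; fi

printf '%s\n' "numeric: $numeric" "lexical: $lexical"
```

The two tests disagree, so the operator has to be chosen according to what the strings represent.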


if statement

  • syntax:
    if condition
    then
       statements
    [elif condition
    then
       statements]
    [else
       statements]
    fi
    
  • the keywords then/else/elif/fi take the place of C's curly braces ({}), which the shell reserves for other purposes (e.g., command grouping and ${...} parameter expansion)
  • the elif and else clauses are optional
  • example:
    if [[ $person = linus ]]
    then
       print $person is on the sixth floor.
    elif [[ $person = lucia ]]
    then
       print $person is on the fifth floor.
    elif [[ $person = linda ]]
    then
       print $person is on the fifth floor.
    else
       print "Who are you talking about?"
    fi
    
  • a condition can be anything which returns an exit status:
    options="-f -d -L"
    if print - $options | grep -q -e -d
       then print "option '-d' present in list."
    fi
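Because any command can serve as the condition, a test can be packaged as a function whose exit status is the answer. The sketch below assumes ksh93 or bash; has_option is a hypothetical helper. Unlike the grep call above, it matches whole words only (grep -x), so -d would not be found inside a longer option such as --debug.

```shell
# has_option is a hypothetical helper: it succeeds when $1 is one of the
# space-separated words in $2.
has_option() {
   # $2 is deliberately unquoted so that each word lands on its own line
   printf '%s\n' $2 | grep -q -x -F -e "$1"
}

options="-f -d -L"
if has_option -d "$options"; then
   printf '%s\n' "option '-d' present in list."
fi
if ! has_option -q "$options"; then
   printf '%s\n' "option '-q' absent from list."
fi
```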
    


Additional condition tests

    -n <string>            string not null?
    -z <string>            string null?
    -a <filename>          exists?
    -f <filename>          is plain file?
    -d <filename>          is directory?
    -L <filename>          is symbolic link?
    -s <filename>          exists and not empty?
    -r <filename>          read permission?
    -w <filename>          write permission?
    -x <filename>          execute permission?
    -O <filename>          your file?
    -G <filename>          your group?
    <file1> -nt <file2>    <file1> newer than <file2>?
    <file1> -ot <file2>    <file1> older than <file2>?

  • example:
    if [[ ! -f output.file ]]; then
       print "output.file does not exist."
    fi
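Several of these tests can be combined to classify a path. The sketch below assumes ksh93 or bash and the common mktemp utility; the function name describe and the file names are hypothetical.

```shell
# describe is a hypothetical function that classifies the path given as $1.
describe() {
   if [[ -d $1 ]]; then
      printf '%s\n' "directory"
   elif [[ -f $1 && -s $1 ]]; then
      printf '%s\n' "non-empty file"
   elif [[ -f $1 ]]; then
      printf '%s\n' "empty file"
   else
      printf '%s\n' "missing"
   fi
}

dir=$(mktemp -d)                       # scratch directory for the demonstration
: > "$dir/empty.file"                  # create a zero-length file
printf 'data\n' > "$dir/output.file"   # create a non-empty file

describe "$dir"
describe "$dir/output.file"
describe "$dir/empty.file"
describe "$dir/no.such.file"

rm -r "$dir"                           # clean up
```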
    


while statement

  • syntax:
    while condition
    do
       statements
    done
    
  • condition has the same syntax as in the if statement
  • can use break, continue, return, or exit inside a loop, each with the same meaning as in C
  • example:
    #!/usr/bin/env ksh
    # report type of executable file anywhere in search path
        
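    # the trailing : guarantees that path becomes empty after the last
    # directory has been tested, so the loop always terminates
    path=$PATH:
    while [[ -n $path ]]; do
       dir=${path%%:*}
       if [[ -x $dir/$1 && ! -d $dir/$1 ]]; then
          file $dir/$1
          exit 0
       fi
       # drop the directory just tested
       path=${path#*:}
    done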
    print "File not found."
    exit 1
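The use of break and continue inside a while loop can be sketched as follows; the example assumes ksh93 or bash, and the variable names are hypothetical.

```shell
i=0
sum=0
while [ "$i" -lt 10 ]; do
   i=$(( i + 1 ))
   if [ $(( i % 2 )) -eq 0 ]; then
      continue          # skip even values
   fi
   if [ "$i" -gt 7 ]; then
      break             # leave the loop before 9 is added
   fi
   sum=$(( sum + i ))
done
printf '%s\n' "sum=$sum i=$i"
```

The loop adds 1, 3, 5, and 7 and stops when i reaches 9, so it prints sum=16 i=9.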
    


ourwhich script

    #!/usr/bin/env ksh
    
    # insert code here to catch aliases
    
    integer exit_status=0
    
    if [[ $# -ne 0 ]]; then
       path=$(echo $PATH | sed 's/:/ /g')
       #print $path
       
       #for cmd in $*; do
       for cmd; do
          found=0
          for dir in $path; do
             # is $dir a directory
             # following if is superfluous
             if [[ -d $dir ]]; then
                #if [[ (-f $dir/$cmd) && (-x $dir/$cmd) ]]; then
                if [[ (-x $dir/$cmd) && (! -d $dir/$cmd) ]]; then
                   print $dir/$cmd
                   found=1
                   break
                fi
             fi
          done
          if [[ $found -eq 0 ]]; then 
            if [[ -n $PATH ]]; then
               print "$0: no $cmd in ($PATH)" >&2 
            else
               print "$0: no $cmd in ((null))" >&2 
            fi
            (( exit_status++ ))
          fi
       done
    else
       print "Usage: $0 [filename...]" >&2
       exit_status=255
    fi
    
    exit $exit_status
    
  • extend to handle more than one command-line argument akin to the which command on our system
  • extend to catch aliases akin to the which command on our system
  • will have problems with directory names containing a whitespace character
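One way around the whitespace problem is to let the shell split the path on colons only, by setting IFS, instead of turning colons into spaces with sed. The sketch below assumes ksh93 or bash; find_in_path is a hypothetical function, not part of the script above.

```shell
# find_in_path is a hypothetical function: $1 is the command name, $2 a
# PATH-style list. It prints the first match and returns 0, or returns 1.
find_in_path() {
   cmd=$1
   old_ifs=$IFS
   IFS=:                       # split only on colons, not on blanks
   set -f                      # no filename expansion of the pieces
   set -- $2                   # positional parameters = the directories
   set +f
   IFS=$old_ifs
   for dir
   do
      if [ -x "$dir/$cmd" ] && [ ! -d "$dir/$cmd" ]; then
         printf '%s\n' "$dir/$cmd"
         return 0
      fi
   done
   return 1
}

find_in_path sh "$PATH" || printf '%s\n' "no sh in ($PATH)" >&2
```

Quoting "$dir/$cmd" keeps a directory such as "my bin" intact. Empty path components, which conventionally mean the current directory, are not treated specially in this sketch.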


case selection

  • syntax:
    case expression in
       pattern1 )
          statements ;;
       pattern2 )
          statements ;;
    .
    .
    .
    esac
    
  • double semicolon (;;) is required to terminate <statements>
  • <statements> corresponding to the first pattern matching the <expression> are executed, after which the case statement terminates
  • <expression> is usually some variable's value
  • <patterns> can be plain strings, or they can be Korn shell patterns using *, ?, !, [], and so on (such as file-matching patterns)
  • a <pattern> can consist of several patterns separated by | (logical or), and the <pattern> can also be written as (pattern)
  • attractive for parsing/factoring command-line arguments
  • example:
    case $person in
       linus)
          print "Oh..He's on the tenth floor." ;;
       lucia | linda)
          print "They're out to lunch." ;;
       *)
          print "Hmm. Not sure." ;;
    esac
    
  • note that inside a case | does not act as a pipe
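Patterns with wildcards, alternatives, and character classes can be mixed in one case statement. The sketch below assumes ksh93 or bash; classify is a hypothetical function used only to exercise the patterns.

```shell
# classify is a hypothetical function used only to exercise the patterns.
classify() {
   case $1 in
      -h | --help)
         printf '%s\n' "help option" ;;
      -*)
         printf '%s\n' "other option" ;;
      *.c | *.h)
         printf '%s\n' "C source" ;;
      [0-9]*)
         printf '%s\n' "starts with a digit" ;;
      *)
         printf '%s\n' "something else" ;;
   esac
}

classify --help
classify -x
classify main.c
classify 42nd
classify README
```

Order matters: --help also matches -*, but the first matching pattern wins.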


Factoring arguments

    #usage -d -f a b c
    
    args=" "$*
    
    echo args:$args:
    
    # investigate the use of getopt and getopts
    options=${args%% ([a-zA-Z0-9]|/)*}
    
    options=$(echo $options | sed 's/^[ ]//')
    
    files=${args# $options }
    
    print - options:$options:
    print files:$files:
    
    #grep
    # -q: quiet; just return exit status
    # -e: following is a pattern, not an option; protects patterns with a leading -
    # -- (end of options) protects such a pattern as well
    
    #if print - $options | grep -q -e -d
    #if print - $options | grep -q -- -d
    #then
    #   print - "-d is present"
    #   echo "-d is present"
    #else
    #   print - "-d is absent"
    #fi
    
    for option in $options
    do
       case $option in
          -d)
             print "found a -d." ;;
          -f | -q)
             print - "-f or -q" ;;
          *)
             print "some other option(s)" ;;
       esac
    done
    
    exit 0
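The script's comment suggests investigating getopts. A minimal sketch of the same factoring with the getopts builtin follows, assuming ksh93 or bash; parse_args is a hypothetical function, and the option letters d, f, and q are taken from the case statement above.

```shell
# parse_args is a hypothetical function; the option letters d, f, and q
# mirror the usage line above (-d -f a b c).
parse_args() {
   OPTIND=1                         # restart option parsing on every call
   while getopts dfq option; do
      case $option in
         d)
            printf '%s\n' "found a -d." ;;
         f | q)
            printf '%s\n' "-f or -q" ;;
         *)
            printf '%s\n' "some other option(s)" ;;
      esac
   done
   shift $(( OPTIND - 1 ))          # discard the options just parsed
   printf '%s\n' "files:$*:"
}

parse_args -d -f a b c
```

getopts stops at the first argument that is not an option, so after the shift the positional parameters hold only the file names.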
    


Numeric variables

  • ksh variables are strings (by default) or integers, depending on how they are defined
  • A=100 makes A the string '100'; integer A=100 makes A the integer 100
  • integer is an alias for typeset -i
  • to manipulate numeric variables using C-style expressions, use either $(( <expression> )) to return the value of <expression>, or (( <expression> )) to return only an exit status
  • examples:
    $ integer x=1
    $ (( y=x*10 ))
    $ echo $y
    10
    $ (( x+=1 ))
    $ echo $x
    2
    $ print $x $y
    2 10
    $ integer a=10
    $ integer b=21
    $ (( a == 10 ))
    $ echo $?
    0
    $ integer X=$(( a+10 ))
    $ echo $X
    20
    $ X=$(( a == 10 ))
    $ echo $X
    1
    $ (( a == 10 ))
    $ echo $?
    0
    $ (( b < 20 ))
    $ echo $?
    1
    $ (( (a < 10) || (a > 100) ))
    $ echo $?
    1
    
  • within <expression> you can use parentheses for grouping, the arithmetic and bitwise operators +, -, *, /, %, <<, >>, &, |, ~, and ^, the relational operators <, >, <=, >=, ==, and !=, and the logical operators && and ||
  • further, within the $(( <expression> )) and (( <expression> )) syntax, variables need not be preceded by a dollar sign, and special characters need not be quoted or escaped
  • let is the same as (( <expression> )), except that the <expression> in the latter need not be quoted
  • another example of printing all arguments to a shell script:
    #!/usr/bin/env ksh
    
    integer i=1
    
    for arg in "$@"; do
       print "Argument $i is '$arg'."
       # any one of the following four lines increments i;
       # inside (( ... )) or after a let statement the $ may be omitted
       (( i++ ))
       #(( ++i ))
       #(( i += 1 ))
       #let i='i+1'
    done
    
    exit 0
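The two forms report a true comparison differently, which is easy to confuse. The sketch below assumes ksh93 or bash; the variable names are hypothetical.

```shell
a=10

value=$(( a == 10 ))        # arithmetic expansion yields the C-style result: 1
if (( a == 10 )); then      # (( ... )) yields an exit status: 0 when true
   status=0
else
   status=1
fi
printf '%s\n' "value=$value status=$status"

total=$(( (a + 5) * 2 % 7 ))    # grouping and C operators; no $ needed on a
printf '%s\n' "total=$total"
```

This prints value=1 status=0 and then total=2: inside the expression true is 1, as in C, while the exit status of a successful test is 0, as for any other command.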
    


Notes

  • spaces are very important
  • [[, ]], ((, and )) are tokens and must be delimited by whitespace
  • use == for arithmetic comparisons
  • use = for string comparisons
  • how could one do both in a single expression?
    • nest them, or
    • use [[ ... ]] && (( ... ))
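The second suggestion can be sketched as a small function whose exit status is the result of the combined test. The example assumes ksh93 or bash; on_fifth and its arguments are hypothetical.

```shell
# on_fifth is a hypothetical function: $1 is a name, $2 a floor number.
on_fifth() {
   # string comparison in [[ ... ]], arithmetic comparison in (( ... ))
   [[ $1 = lucia || $1 = linda ]] && (( $2 == 5 ))
}

if on_fifth lucia 5; then
   printf '%s\n' "lucia is on the fifth floor."
fi
if ! on_fifth linus 5; then
   printf '%s\n' "linus is not on the fifth floor."
fi
```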


Example: renaming multiple .c files to .cpp

  • the command line mv *.c *.cpp does not work. why?
  • nor will the find command work. why?
  • script to generate some empty input files:
    #!/usr/bin/env ksh
    
    #$1 = prefix
    #$2 = number of files desired
    
    prefix=$1
    integer i=1
    integer n=$2
    
    while (( i <= n )); do
       touch "${prefix}${i}.c"
       (( ++i ))
    done
    
    exit 0
    
  • rename (multiple move) script:
    #!/usr/bin/env ksh
    #rename (multiple move) script
    
    from=$1
    
    to=$2
    
    for file in *.$from; do
       mv "$file" "${file%.$from}.$to"
       print "${file%.$from}.$to"
    done
    
    exit 0
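Because the shell, not mv, expands wildcards, the renaming has to be done one file at a time. The sketch below is a variant of the rename script written as a function, assuming ksh93 or bash and the common mktemp utility; rename_ext is a hypothetical name. It quotes file names so that blanks survive and skips the case in which the pattern matches nothing.

```shell
# rename_ext is a hypothetical function: rename_ext c cpp renames every
# *.c file in the current directory to *.cpp.
rename_ext() {
   from=$1
   to=$2
   for file in *."$from"; do
      [ -e "$file" ] || continue          # the pattern matched nothing
      mv -- "$file" "${file%.$from}.$to"
   done
}

# demonstration in a scratch directory
work=$(mktemp -d)
( cd "$work" && : > prog1.c && : > "prog 2.c" && rename_ext c cpp && ls )
rm -r "$work"
```

The demonstration runs in a subshell so that the cd does not affect the calling shell.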
    


Array variables

  • an array variable provides a way to index a list of values
  • quite different from arrays in C and Perl; in the shell, we can define x[10] without first having defined elements 0..9
  • ${arrayname[*]} represents all elements of the array arrayname
  • items in an array can be accessed by position (first item is at index 0); $arrayname refers to ${arrayname[0]} (i.e., the first element of array arrayname)
  • the number of defined elements in an array variable is given by ${#arrayname[*]}
  • ${arrayname[$(( ${#arrayname[*]} - 1 ))]} accesses the last element of array arrayname
  • example:
    $ set -A people Lucy and Linus
    $ set -A others ${people[*]} and Larry and Lucia
    $ others[7]=and ; others[8]=Leisel
    $ print $people # prints first element of array people (i.e., ${people[0]})
    Lucy
    $ print ${people[0]} # same as above
    Lucy
    $ print ${people[1]} # prints second element of array people
    and
    $ print "The length of array others is ${#others[*]}." # prints length of array others
    The length of array others is 9.
    $ print ${others[$(( ${#others[*]} - 1 ))]} # prints last element of array others
    Leisel
    $ set -A files $(ls) # stores the output of ls in array files, one word per element
    
  • ${#arrayname[i]} represents the number of characters in element i of array arrayname:
    $ print ${#others} # print the number of characters in first element of array others (i.e., ${others[0]})
    4
    $ print ${#others[0]} # same as above
    4
    $ print ${#others[1]} # print the number of characters in second element of array others
    3
    
  • another example
    $ set -A today $(date)
    $ print ${today[*]}
    Thu Oct 12 16:03:44 EDT 2006
    $ print ${#today[*]}
    6
    $ print "${today[1]} ${today[2]}, ${today[5]}"
    Oct 12, 2006
    $ date | awk '{print $2 " " $3 ", " $6}'
    Oct 12, 2006
    $ date | awk 'BEGIN {OFS=" "} {print $2, $3 "," , $6}'
    Oct 12, 2006
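The array operations above can also be written with the name=( ... ) assignment form, which both ksh93 and bash accept (set -A is specific to ksh). The sketch below makes that substitution so that it runs in either shell; the variable x is hypothetical.

```shell
# name=( ... ) is accepted by both ksh93 and bash; set -A is ksh only.
people=(Lucy and Linus)
others=("${people[@]}" and Larry and Lucia)
others[7]=and
others[8]=Leisel

printf '%s\n' "${people[0]}"                 # Lucy
printf '%s\n' "${#others[*]}"                # 9
printf '%s\n' "${others[${#others[*]} - 1]}" # Leisel
printf '%s\n' "${#others[1]}"                # 3, the length of "and"

# Arrays are sparse: x[10] can exist without x[0]..x[9].
unset x
x[10]=ten
printf '%s\n' "${#x[*]}"                     # 1
```

For a sparse array such as x, the count of defined elements minus one is not the index of the last element, so the last-element idiom applies only to arrays without gaps.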
    


Restricted shells

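  • include #!/bin/ksh -r as the first line of the script (adjust the path to ksh for your system); #!/usr/bin/env ksh -r is unreliable because many systems, Linux among them, pass everything after the interpreter name as a single argument
  • enter ksh -r or rksh at the command prompt
  • motivation for a restricted shell
  • what does a restricted shell restrict? cd, among other things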


Shell programming vs. Linux Filter Style of Programming

  • filter script model / filter script:
    # P1
    cat | \
    # P2
    sed | \
    # P3
    awk | \
    ...
    # Pn
    sort

  • shell script model / shell script:
    # S1
    print
    # S2: P2
    sed
    # S3
    print
    # S4: P3
    awk
    # S5
    print
    # S6: P4 | P5
    ls | wc -l
    # S7
    print
    # S8: P6
    sed
    # S9
    print
    # S10: P7 | P8 | P9
    spell | sort | uniq
    # S11
    print
    # S12: P10
    date
    # S13
    print
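A concrete instance of the filter style is a word-frequency counter built entirely from standard filters. The sketch below assumes ksh93 or bash and the usual tr, sort, and uniq utilities; count_words is a hypothetical name, and the stage labels P1 to P5 follow the notation used in this section.

```shell
# count_words is a hypothetical filter: it reads text on standard input
# and prints each distinct word with its frequency, most frequent first.
count_words() {
   tr -cs 'A-Za-z' '\n' |       # P1: one word per line
   tr 'A-Z' 'a-z' |             # P2: fold to lower case
   sort |                       # P3: bring duplicates together
   uniq -c |                    # P4: count each distinct word
   sort -rn                     # P5: highest count first
}

printf '%s\n' "the cat and the hat and the bat" | count_words
```

The whole function is one pipeline: no variables or control statements are needed, in contrast to the shell script style, which interleaves such pipelines with other statements.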


Summary

    regular expressions + pattern matching + filters + pipes + programmable shell = powerful & flexible programming abstractions


References

    [PGUS] M.G. Sobell. A Practical Guide to the UNIX System. Addison-Wesley, Reading, MA, Third edition, 1995.
    [UIAN] A. Robbins. UNIX in a Nutshell. O'Reilly, Beijing, Third edition, 1999.
    [UPE] B.W. Kernighan and R. Pike. The UNIX Programming Environment. Prentice Hall, Upper Saddle River, NJ, Second edition, 1984.
