Shell Scripting

Scripting

So we know how to run commands from an interactive prompt, but what if we want to save those commands so that we can reuse them in the future? That's where scripting comes into play.

Basics of Scripting

You can type programs directly at the prompt, or write them into a file (a script):

#!/bin/sh
echo something
  • Open an editor (nano is recommended for beginners) and save the script as example-script

  • In your shell, run chmod +x example-script to make the file executable

  • You can now run your script as ./example-script

  • #!/bin/sh is known as the shebang; it specifies the interpreter that runs the script

  • echo is a command that prints its arguments to the standard output.
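
The steps above can also be carried out entirely from the shell. The sketch below is only an illustration: it assumes a scratch directory created with mktemp, and it uses a here-document as a stand-in for saving the file from an editor.

```shell
#!/bin/sh
# Assumption: work in a throwaway directory so nothing else is touched.
workdir=$(mktemp -d)

# Write the two-line script to a file (stand-in for saving it from an editor).
cat > "$workdir/example-script" <<'EOF'
#!/bin/sh
echo something
EOF

# Mark the file as executable, then run it by path.
chmod +x "$workdir/example-script"
output=$("$workdir/example-script")
echo "$output"

# Clean up the scratch directory.
rm -r "$workdir"
```

Without the chmod +x step, running the file by path would typically fail with a "Permission denied" error.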

More on Flags

Most command-line utilities take parameters using flags. They come in short form (-h) and long form (--help). Usually, running COMMAND -h or man COMMAND will give you a list of the flags the program takes.

  • Short flags can be combined: rm -r -f is equivalent to rm -rf or rm -fr

  • A double dash (--) signifies the end of command options; everything after it is treated as a positional parameter.

    • For example, to create a file called -v, use touch -- -v instead of touch -v

    • For example, to search a file called -v, grep pattern -- -v will work while grep pattern -v will not.
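
To see the double dash in action, here is a small sketch. The scratch directory and the word "needle" are assumptions made for the demonstration.

```shell
#!/bin/sh
# Assumption: a scratch directory, so the oddly named file is easy to remove.
workdir=$(mktemp -d)

# The commands run inside $(...), so the cd does not affect the calling shell.
found=$(
    cd "$workdir" || exit 1
    touch -- -v                 # create a file literally named -v
    echo "needle" > ./-v        # put one line of text in it
    grep needle -- -v           # search it; -- keeps -v from being read as a flag
)
echo "$found"

rm -r -- "$workdir"
```

Prefixing the name with ./ (as in ./-v) is another common way to stop a file name from being mistaken for a flag.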

Common Flags

There are a few flags that are widely accepted and have similar meanings across many programs:

  • -a commonly refers to all files (i.e. also including those that start with a period[^4])

  • -f usually refers to forcing something, e.g. rm -f

  • -h displays the help for most commands

  • -v usually enables a verbose output

  • -V usually prints the version of the command

Unix Directory Structure

Unix has a different directory structure from Windows.

There is no concept of drives.

Everything is files and directories. The root directory is /

We use a forward slash / instead of a backslash \

Linux specifically follows the Filesystem Hierarchy Standard (FHS)

Important Unix Directories

  • /bin, /sbin, /usr/bin, /usr/local/bin = executables; /opt = add-on software packages (which bring their own executables)

  • On Linux: /home = user home directories

  • On macOS: /Users = user home directories

  • /var/log = log files

  • /tmp = temporary files

  • /dev/urandom = random number generator

Shell Syntax

echo Hello

We've seen this command before, but we've never given its parts their proper names. Whatever we type at the prompt can be split into a COMMAND and its ARGs (short for arguments):

  • COMMAND ARG1 ARG2 ARG3

Variables

echo location
name=COM3
echo $name
  • Used to store text

  • name=value to set variable

  • $name to access variable
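
A short sketch of setting and reading variables. The quoting behaviour shown here goes slightly beyond the bullets above and is included as an extra detail; the greeting text is made up.

```shell
#!/bin/sh
name=COM3                 # no spaces around =, otherwise "name" is run as a command
greeting="Hello, $name"   # double quotes still expand $name
literal='Hello, $name'    # single quotes keep the text exactly as written

echo "$greeting"
echo "$literal"
```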

There are also a bunch of special variables we can use in our scripts:

  • $?: get exit code of the previous command

  • $1 to $9: arguments to a script

  • $0: name of the script itself

  • $#: number of arguments

  • $$: process ID of current shell
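
The special variables can be tried out without writing a separate script file. In this sketch, show_args is a hypothetical helper function; inside a function, $1, $2 and $# refer to the function's own arguments in the same way they refer to a script's arguments.

```shell
#!/bin/sh
# show_args is a hypothetical helper used only for this illustration.
show_args() {
    echo "first=$1 second=$2 count=$#"
}
args_report=$(show_args apple banana cherry)
echo "$args_report"

# $? holds the exit code of the most recent command: 0 means success.
test -d /
root_status=$?
echo "exit code of 'test -d /': $root_status"

# $$ is the process ID of the shell running this code.
echo "shell PID: $$"
```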

Environment Variables

On top of the variables you declare yourself, there are a bunch of global variables that are declared in order for your system to run. We call these environment variables. You can see the full list of environment variables using the command:

env
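
The text does not say how an ordinary variable becomes an environment variable. The sketch below assumes the standard export builtin for that step; COURSE_NAME is a hypothetical variable name, and ${VAR:-unset} simply prints "unset" when the variable is missing.

```shell
#!/bin/sh
# PATH is an environment variable that is set up before your shell starts.
echo "search path: $PATH"

# COURSE_NAME is a hypothetical variable used only for this illustration.
COURSE_NAME=COM3
unexported=$(sh -c 'echo "${COURSE_NAME:-unset}"')   # child process cannot see it yet

export COURSE_NAME
exported=$(sh -c 'echo "${COURSE_NAME:-unset}"')     # now the child inherits it

echo "before export: $unexported"
echo "after export:  $exported"
```

Only exported variables are passed on to the programs you launch, which is why they show up in the output of env.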

Quick Exercise

Create a script variable-example containing the code below, then try running it with various arguments.

#!/bin/sh
echo $0
echo $1
echo $2
echo $#

Loops

A loop is used to run a command multiple times.

For example:

for i in $(seq 1 5); do echo hello; done

Let's unpack this!

for x in list; do BODY; done

  • ; terminates a command -- equivalent to a newline

  • Split list into words, assign each word to x in turn, and run BODY each time

  • The list is split on "whitespace" -- we will get into this later

  • Unlike C, there are no curly braces; do and done delimit the body
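
Because ; and a newline are interchangeable, the same structure can be spread over several lines, which is easier to read in a script. A small sketch, in which the arithmetic expansion $((...)) is an addition not covered in the text:

```shell
#!/bin/sh
# Sum the numbers 1 to 5, one loop iteration per word in the list.
total=0
for i in 1 2 3 4 5; do
    echo "adding $i"
    total=$((total + i))    # $((...)) is shell arithmetic, an addition to the text
done
echo "total: $total"
```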

So, knowing the above,

for i in $(seq 1 5); do echo hello; done
  • $(seq 1 5)

    • Run the program seq with arguments 1 and 5

    • Substitute the $(...) block with the output of the program

    • Equivalent to

      for i in 1 2 3 4 5; do echo hello; done
  • echo hello

    • Everything in a shell script is a command

    • Here, it means run the echo command, with argument hello.

    • Commands are looked up in the directories listed in $PATH (a colon-separated list)

    • Find out where a command is located by running which COMMAND, e.g. which ls
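
The sketch below puts command substitution and the $PATH lookup side by side. It uses command -v, a standard shell builtin that reports where a command is found, as an assumed substitute for which, since which is not installed everywhere.

```shell
#!/bin/sh
# $(...) runs a command and substitutes its output in place.
numbers=$(seq 1 5)
for i in $numbers; do
    echo "number $i"
done

# command -v reports where a command is found, much like which does.
ls_location=$(command -v ls)
echo "ls is found at: $ls_location"

# $PATH is the colon-separated list of directories searched for commands;
# tr turns each colon into a newline so the directories print one per line.
echo "$PATH" | tr ':' '\n'
```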

Conditionals

if test -d /bin; then echo true; else echo false; fi;

Let's unpack this!

if CONDITION; then BODY; fi
  • CONDITION is a command.

  • If its exit code is 0 (success), then BODY is run.

  • Optionally, you can also hook in an else or elif

So, knowing the above,

if test -d /bin; then echo true; else echo false; fi;
  • test -d /bin

    • test is a program that provides various checks and comparisons; it exits with exit code 0 if the condition is true.

  • Alternate syntax: [ condition ], e.g. [ -d /bin ]
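
Here is a sketch of a conditional with elif and else branches, written across several lines. describe is a hypothetical helper name, and /no/such/path/here is assumed not to exist on your machine.

```shell
#!/bin/sh
# describe is a hypothetical helper that classifies a path.
describe() {
    if [ -d "$1" ]; then
        echo "directory"
    elif [ -f "$1" ]; then
        echo "regular file"
    else
        echo "something else or missing"
    fi
}

describe /                       # the root directory always exists
describe /no/such/path/here      # assumed not to exist
```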

Let's create a command that only prints directories
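
A first attempt might look like the sketch below. It is an assumption pieced together from the discussion that follows (which refers to for f in $(ls) and test -d $f), it is deliberately naive, and print_dirs_naive is a hypothetical name.

```shell
#!/bin/sh
# Naive version: loop over the output of ls and test each name.
# Deliberately unquoted, to match the pitfalls discussed in this section.
print_dirs_naive() {
    for f in $(ls); do
        if test -d $f; then
            echo $f
        fi
    done
}

print_dirs_naive
```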

Bug! Hold on! What if the directory is called "My Documents"?

  • for f in $(ls) expands to for f in My Documents

  • Will first perform the test on My, then on Documents

Argument Splitting

  • Bash splits arguments by whitespace (tab, newline, space)

  • Same problem somewhere else: test -d $f

  • If $f contains whitespace, test will fail with an error!

  • Use quotes to handle spaces in arguments: for f in "My Documents"

  • How do we fix our script?

  • What do you think for f in "$(ls)" does?
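
You can watch the splitting happen by counting arguments. In this sketch, count_args is a hypothetical helper that prints how many arguments it received.

```shell
#!/bin/sh
# count_args is a hypothetical helper that reports how many arguments it got.
count_args() {
    echo "$#"
}

dir="My Documents"
unquoted=$(count_args $dir)      # split on the space: My, Documents
quoted=$(count_args "$dir")      # kept together as one argument

echo "unquoted: $unquoted argument(s)"
echo "quoted:   $quoted argument(s)"
```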

Globbing

bash knows how to look for files using patterns:

  • Thus, for f in * loops over all files in the current directory

  • When globbing, each matching file becomes its own argument, even if its name contains spaces

  • However, you still need to quote the variable when you use it, e.g. test -d "$f"

You can also build more advanced patterns:

  • for f in a*: all files starting with a in the current directory

  • for f in foo/*.txt: all .txt files in foo

  • for f in foo/*/p??.txt: all .txt files with three-letter names starting with p, in the immediate subdirectories of foo
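
Putting globbing and quoting together, a directory-printing loop could be sketched as follows. print_dirs is a hypothetical name, and this is one possible version rather than the only fix.

```shell
#!/bin/sh
# print_dirs is a hypothetical name for the directory-printing loop.
print_dirs() {
    for f in *; do               # each match is one word, spaces included
        if test -d "$f"; then    # quotes keep a name with spaces as one argument
            echo "$f"
        fi
    done
}

print_dirs
```

Run in a directory that contains a folder called My Documents, this version prints the full name on one line instead of testing My and Documents separately.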

Whitespace issues

  • if [ $foo = "bar" ]; then: What's the issue?

  • What if $foo is empty? Then the only arguments left for [ are = and bar, and the test fails with an error

  • Possible workaround: [ x$foo = "xbar" ], but very hacky

  • Instead, use [[ CONDITION ]]: a bash built-in with special parsing, so an empty or unquoted variable does not break the test

  • Good news: it also allows && instead of -a, || instead of -o, etc.
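
The sketch below compares two ways of handling an empty variable. Note the assumptions: the first form relies on quoting "$foo", which is not mentioned in the bullets above but also works in plain sh, while the second form needs bash because [[ ]] is not part of sh.

```shell
#!/bin/bash
foo=""    # empty on purpose

# Quoting the variable keeps it as one (empty) argument, so [ still sees
# a left side, an operator and a right side.
if [ "$foo" = "bar" ]; then
    quoted_result="equal"
else
    quoted_result="not equal"
fi

# [[ ]] is parsed by bash itself, so the unquoted $foo is not word-split,
# and || can be used directly inside the brackets.
if [[ $foo = "bar" || $foo = "" ]]; then
    bracket_result="bar or empty"
else
    bracket_result="something else"
fi

echo "[ ]   says: $quoted_result"
echo "[[ ]] says: $bracket_result"
```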

Shellcheck

  • The mentioned problems are the most common bugs in shell scripts.

  • A good tool to check for these kinds of possible bugs in your shell script: https://www.shellcheck.net/

Composability

  • The shell is powerful in part because of composability

  • You can chain multiple programs together, rather than relying on one program that does everything

  • Remember The Unix Philosophy:

    1. Write programs that do one thing and do it well.

    2. Write programs to work together.

    3. Write programs to handle text streams, because that is a universal interface.

More Pipes

cat /var/log/sys*log | grep "Sep 10" | tail

  • cat /var/log/sys*log prints the system log

  • This output is fed into grep "Sep 10", which keeps only the entries from Sep 10 (today, in this example).

  • This output is then further fed into tail, which prints only the last 10 lines.
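
System log locations differ between machines, so here is a sketch of the same pipeline shape that runs anywhere. The log lines are hypothetical and are generated with printf instead of being read from /var/log.

```shell
#!/bin/sh
# Hypothetical log lines, generated with printf instead of read from /var/log.
recent=$(
    printf '%s\n' \
        "Sep 09 23:59 shutdown" \
        "Sep 10 08:00 boot" \
        "Sep 10 08:05 login" \
        "Sep 10 17:30 logout" |
    grep "Sep 10" |      # keep only the lines that mention Sep 10
    tail -n 2            # then keep only the last two of those
)
echo "$recent"
```

Each stage only sees the output of the stage before it, so tail never sees the Sep 09 line that grep filtered out.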

Streams

  • All programs launched have 3 streams:

    • STDIN: the program reads input from here

    • STDOUT: the program prints to here

    • STDERR: a second output that the program can choose to use.

  • By default, STDIN is your keyboard, STDOUT and STDERR are both your terminal

Stream Redirection

  • However, this can be changed!

  • a | b: makes STDOUT of a the STDIN of b.

  • a > foo: STDOUT of a goes to the file foo

  • a 2> foo: STDERR of a goes to the file foo

  • a < foo: STDIN of a is read from the file foo

  • a <<< some text: STDIN of a is read from what comes after <<<

  • You can also pipe to tee (look up in man what tee does)
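
A sketch of the three file redirections, assuming a scratch directory for the files and a path (/no/such/path/here) that does not exist, so that ls has something to complain about on STDERR:

```shell
#!/bin/sh
# Assumption: a scratch directory holds the files used for redirection.
workdir=$(mktemp -d)

# STDOUT goes to a file.
echo "hello" > "$workdir/out.txt"

# STDERR goes to a file; || true ignores the failing exit code of ls.
ls /no/such/path/here 2> "$workdir/err.txt" || true

# STDIN comes from a file: read takes the first line of out.txt.
read -r first_line < "$workdir/out.txt"
error_text=$(cat "$workdir/err.txt")

echo "read back from out.txt: $first_line"
echo "captured error message: $error_text"

rm -r "$workdir"
```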

So why is this useful?

It lets you manipulate output of a program!

  • ls | grep foo: all files whose names contain foo

  • ps | grep foo: all processes whose ps entry contains foo

  • On Linux: journalctl | grep -i intel | tail -n 5: last 5 system log messages with the word intel (case-insensitive)

  • Note that this forms the basis for data-wrangling, which will be covered later.

Grouping Commands (a; b) | tac

  • Run a, then b, and send all their output to tac[^7]

  • For example: (echo qwe; echo asd; echo zxc) | tac

Process Substitution b <(a)

  • Run a, generate a temporary file name for its output stream, and pass that filename to b

  • To demonstrate: echo <(echo a) <(echo b)

  • On Linux: diff <(journalctl -b -1 | head -n20) <(journalctl -b -2 | head -n20)

  • This shows the difference between the first 20 lines of the last boot log and the one before that.
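
journalctl is only available on some Linux systems, so here is a sketch of the same idea with stand-in commands. The two printf calls are hypothetical placeholders for real programs, and the example needs bash, since process substitution is not part of plain sh.

```shell
#!/bin/bash
# Compare the output of two commands without creating temporary files by hand.
# The two printf commands are hypothetical stand-ins for real programs.
# diff exits with a non-zero code when its inputs differ, hence the || true.
changes=$(diff <(printf 'a\nb\nc\n') <(printf 'a\nB\nc\n') || true)
echo "$changes"
```

The exact layout of the report depends on your diff implementation, but it should point at the second line, the only one that differs.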

Jobs

Jobs are used to run long-running commands in the background.

  • Use the & suffix

    • It will give back your prompt immediately.

    • For example: (for i in $(seq 1 100); do echo hi; sleep 1; done) &

    • Note that the running program still has your terminal as STDOUT. To avoid that, you can redirect STDOUT to a file.

    • This is especially handy for running 2 programs at the same time, like a server and a client: server & client

    • For example: nc -l 1234 & nc localhost 1234 <<< test

  • jobs: see all jobs

  • fg %JOB: bring the job with that id to the foreground (with no argument, bring the latest job to the foreground)

  • You can also background the current program: ^Z, then run bg

    • ^Z stops the current process and makes it a job.

    • bg runs the last job in the background.

  • $! is the PID of the last background process.
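
A sketch of a background job in a script. Two things here are assumptions added for the demonstration: the temporary file that collects the job's output, and the wait builtin, which pauses until the given job has finished.

```shell
#!/bin/sh
# Assumption: a temporary file collects the output of the background job.
logfile=$(mktemp)

# Start a slow command in the background, with STDOUT redirected to the file.
(sleep 1; echo "background work finished") > "$logfile" &
bg_pid=$!                       # PID of the job we just started
echo "started job with PID $bg_pid, the shell is free again"

# wait blocks until that job exits and returns its exit code.
wait "$bg_pid"
bg_status=$?
bg_output=$(cat "$logfile")
echo "job exited with $bg_status and wrote: $bg_output"

rm "$logfile"
```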

Some Exercises

  • Sometimes piping doesn't quite work because the command being piped into does not expect newline-separated input.

  • For example, the file command tells you the properties of a file.

  • Try running ls | file and ls | xargs file

  • What is xargs doing?
