BASH and Shell Scripting

IO and Redirecting

The following is derived from https://tldp.org/LDP/abs/html/io-redirection.html in a slightly different format. Re-written for my own edification.

There are always three open “files”: STDIN, STDOUT, and STDERR. Each gets a file descriptor: 0, 1, and 2, respectively. Descriptors 3 through 9 are also available for your own use.

# Redirecting STDOUT to file.  Will overwrite existing file or create a new file.
OUTPUT > file

# Append to existing file or create a new file.
OUTPUT >> file

# Effectively touch a file, or truncate an existing file
: > file
# or, if your shell supports it
> file

# Using specific file descriptor references:
# Redirect STDOUT to a file
1> file

# Redirect STDERR to a file
2> file

# Bash supports redirecting both STDOUT and STDERR with the following
# (the appending form, &>>, requires Bash 4 or later)
&> file

# M is a file descriptor; the default is 1 if not specified
# N is a file name
# Descriptor M is redirected to file N
M>N

# M, again, is a file descriptor; the default is 1 if not specified
# N is another file descriptor
# Descriptor M is redirected to wherever descriptor N currently points
M>&N

# Replacing > with >> will append to files instead of truncating and
# overwriting them

Take the following script, saved as output.sh:

#!/bin/bash

echo "to stdout" >&1
echo "more stdout" >&1
echo "to stderr" >&2
echo "more stderr" >&2

Running it in the following ways lets you redirect the output for different use cases:

# Running it, as-is, prints both the STDOUT and STDERR to the console
$ ./output.sh 
to stdout
more stdout
to stderr
more stderr

# Running it and redirecting STDERR to /dev/null
$ ./output.sh 2> /dev/null
to stdout
more stdout

# If we pipe the output and search for "more", grep only sees what was
# written to STDOUT. STDERR bypasses the pipe and goes straight to the
# console, unfiltered
$ ./output.sh | grep more
to stderr
more stderr
more stdout

# Redirecting STDERR to STDOUT
$ ./output.sh 2>&1 | grep more
more stdout
more stderr

# Redirecting both STDOUT and STDERR to /dev/null
# We first redirect STDOUT to /dev/null.  Then we redirect STDERR to
# STDOUT which is already defined to go to /dev/null.  Thus, both go
# to /dev/null
$ ./output.sh > /dev/null 2>&1
# no output :)

# Here we want to pipe ONLY STDERR to grep to search for some output.
# The pipe is set up first, so STDOUT starts out pointing at the pipe.
# We then redirect STDERR to where STDOUT is currently going (the pipe),
# and finally redefine STDOUT to go to /dev/null. Only STDERR is left
# flowing into grep.
$ ./output.sh 2>&1 1> /dev/null | grep more
more stderr
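
The same rules can be used to split the two streams into separate files. A minimal sketch, assuming a hypothetical emit function standing in for output.sh and a throwaway directory from mktemp:

```shell
#!/usr/bin/env bash
# emit is a hypothetical stand-in for output.sh: two lines per stream.
emit() {
  echo "to stdout"
  echo "more stdout"
  echo "to stderr" >&2
  echo "more stderr" >&2
}

workdir=$(mktemp -d)

# Send each stream to its own file.
emit > "$workdir/out.log" 2> "$workdir/err.log"

# Order matters: STDERR ends up in the file only because STDOUT
# was pointed at the file first.
emit > "$workdir/both.log" 2>&1

out_lines=$(wc -l < "$workdir/out.log")
err_lines=$(wc -l < "$workdir/err.log")
both_lines=$(wc -l < "$workdir/both.log")
echo "out=$out_lines err=$err_lines both=$both_lines"

rm -rf "$workdir"
```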

Reading Files via File Descriptors

# Redirects, or opens for reading, the file to STDIN.  Remember, 0 is the file
# descriptor for STDIN
0< file

# Will open a file for reading and writing, assigning it to the file descriptor
# defined by j.  If the file does not already exist, it will be created.
[j]<>file

# Create a file that contains "123456789"
echo "123456789" > file
# Open the file for reading and writing, assigning it to file descriptor 3
exec 3<> file
# Read the first 4 characters
read -n 4 <&3
# Write a character at the current offset, overwriting the 5th character
echo -n . >&3
# Close the file
exec 3>&-

cat file
1234.6789

Use Output of a Command as a Virtual File

TODO: come up with a good example for this.

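# Process substitution: runs the command and expands to a file name
# (typically /dev/fd/N, or a named pipe on systems without /dev/fd)
# that another command can read from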
<(seq 1 10)
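
One possible example: comm expects two sorted files, and process substitution can supply both without creating temporary files. This is only a sketch; the seq ranges are arbitrary, and the path printed at the end varies by system.

```shell
#!/usr/bin/env bash
# comm -12 keeps only the lines common to both inputs.
common=$(comm -12 <(seq 1 5) <(seq 3 8))
echo "$common"

# The substitution expands to a path, which can be printed directly.
# The exact value (for example /dev/fd/63) depends on the system.
fd_path=$(echo <(true))
echo "$fd_path"
```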

Variables

.bashrc and .bash_profile

How do you declare a function that you can use in all of your terminal sessions?

  • Add the function declaration to ~/.bash_profile or ~/.bashrc

How do you see the definitions of the functions declared for your shell?

declare -f

To see a specific one:

declare -f <name-of-func>
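
A short sketch of the round trip. mkcd is a made-up helper for illustration; in practice the definition would live in ~/.bashrc rather than in a script.

```shell
#!/usr/bin/env bash
# mkcd (hypothetical): create a directory and change into it.
mkcd() {
  mkdir -p -- "$1" && cd -- "$1"
}

# Print the definition as the shell currently sees it.
declare -f mkcd

# Print only the names of the functions that are defined.
declare -F
```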

Reading From Files

How do you read a text file, line-by-line?

while IFS= read -r line; do echo "$line"; done < /path/to/file

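The key is the read command. IFS= preserves leading and trailing whitespace, and -r stops backslashes from being treated as escape characters.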

To read the details of the read built-in, type help read in your terminal.
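
One gotcha: read returns non-zero on a final line that has no trailing newline, so a plain loop silently skips it. A sketch of a common workaround, using a temporary sample file created just for the demonstration:

```shell
#!/usr/bin/env bash
sample=$(mktemp)
# Three lines: one with leading spaces and a literal backslash,
# and a last line without a trailing newline.
printf 'first\n  indented \\n literal\nlast' > "$sample"

count=0
last_line=
# The || [ -n "$line" ] clause still processes a final unterminated line.
while IFS= read -r line || [ -n "$line" ]; do
  count=$((count + 1))
  last_line=$line
  printf '%d: %s\n' "$count" "$line"
done < "$sample"

rm -f "$sample"
```

Because the loop reads from a redirect rather than a pipe, count and last_line are still set after the loop ends.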

find

How do you find files by their inode number?

I have typically used this technique when I end up with a file whose name starts with a character that stops mv, rm, or other standard commands from working on it.

First, list the directory with ls -li. This displays each file and its inode. Once you have the inode, you can find the file by its inode number and then exec the command that you want to run on it.

find ./ -inum 1109845324 -exec ls {} \;
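
The whole procedure can be sketched end to end. The file name -rf and the temporary directory are made up for the demonstration, and taking the inode from the first field of ls -i assumes the usual "inode name" output.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)

# A file whose name starts with a dash, which many commands
# would parse as an option if it were passed bare.
touch "$workdir/-rf"

# ls -i prints "<inode> <name>"; keep the first field.
inode=$(ls -i "$workdir/-rf" | awk '{print $1}')

# Locate the file by inode and remove it.
find "$workdir" -inum "$inode" -exec rm -- {} \;

remaining=$(find "$workdir" -type f | wc -l)
echo "files left: $remaining"

rm -rf "$workdir"
```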

How do you exclude a directory, or directories from find?

Given the following directory structure

|-- dir1
|   |-- a
|   |   `-- b
|   |       `-- c
|   |-- d.txt
|   `-- f1.txt
|-- dir2
|   |-- d
|   |   |-- e
|   |   |   `-- f
|   |   `-- f3.txt
|   |-- d2.txt
|   `-- f2.txt
`-- dir3
    `-- f4.txt

Which can be created by the following command

mkdir -p dir1/a/b/c dir2/d/e/f dir3 && touch dir1/d.txt dir1/f1.txt dir2/d2.txt dir2/f2.txt dir2/d/f3.txt dir3/f4.txt

There are a couple of ways to go about this.

To remove “dir2” from a find of all files

find ./ -type d -name '*dir2' -prune -o -type f -print
./dir1/d.txt
./dir1/f1.txt
./dir3/f4.txt

To remove dir1 and dir2

find ./ -type d \( -name '*dir1' -o -name '*dir2' \) -prune -o -type f -print

Another way to filter out directories is -not -path <expr>. You can chain several of them together to exclude additional directories.

find ./ -type f -not -path '*dir1/*' -print
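
A sketch of the chained form, run against a cut-down copy of the test tree in a temporary directory. Note that -not -path only filters the results; unlike -prune, find still descends into the excluded directories.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
mkdir -p "$workdir/dir1/a" "$workdir/dir2/d" "$workdir/dir3"
touch "$workdir/dir1/f1.txt" "$workdir/dir2/f2.txt" \
      "$workdir/dir2/d/f3.txt" "$workdir/dir3/f4.txt"

# Each -not -path test removes one more directory from the results.
# The cd happens in a subshell so the printed paths stay relative.
remaining=$(cd "$workdir" && \
  find . -type f -not -path './dir1/*' -not -path './dir2/*' | sort)
echo "$remaining"

rm -rf "$workdir"
```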

What is the difference between -print and -print0 in find and why would you use either?

Per the find man page, -print prints the file name on STDOUT followed by a newline. If you are piping the output of find to some other program and there are newline characters, spaces, or other special characters in the names, use -print0. -print0 prints the full file name to STDOUT followed by a null character, which is then correctly interpreted by programs that can read null-delimited input, such as xargs with its matching -0 option.

If you do not use -print0 and are post-processing that output, some unpredictable and sometimes bad things can happen.

For example, if we add the following file to our set of test files

touch "blah foo.txt"

$ find ./ -type f -not -path '*dir1/*' -print0 | xargs -0 du -s
0	./dir2/d/f3.txt
0	./dir2/d2.txt
0	./dir2/f2.txt
0	./dir3/f4.txt
0	./blah foo.txt

$ find ./ -type f -not -path '*dir1/*' -print | xargs du -s
0	./dir2/d/f3.txt
0	./dir2/d2.txt
0	./dir2/f2.txt
0	./dir3/f4.txt
du: cannot access './blah': No such file or directory
du: cannot access 'foo.txt': No such file or directory
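
When the consumer is a shell loop rather than xargs, read -d '' does the same job as -0. A sketch assuming Bash (for read -d and arrays) and GNU sort (for -z), with two throwaway files:

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
touch "$workdir/plain.txt" "$workdir/blah foo.txt"

names=()
# -d '' makes read stop at each null byte instead of at a newline.
# Process substitution (not a pipe) keeps the loop in the current
# shell, so the array is still populated afterwards.
while IFS= read -r -d '' path; do
  names+=("${path##*/}")
done < <(find "$workdir" -type f -print0 | sort -z)

printf 'found: %s\n' "${names[@]}"
file_count=${#names[@]}

rm -rf "$workdir"
```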

How to use find to search for content in a file(s)

find ./ -name '*.html' -type f | xargs grep -E -i -l '@'

or

find ./ -name '*.html' -type f -exec grep -E -i -l '@' {} \;

or using regex

find ./ -regex '.*\.html'
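
A third variant worth knowing is -exec ... {} +, which batches many file names into a single grep call, much like xargs, while staying safe for names with spaces. The sample files below are invented for the sketch.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
printf '<a href="mailto:me@example.com">mail</a>\n' > "$workdir/contact.html"
printf '<p>no address here</p>\n' > "$workdir/about.html"
printf 'me@example.com\n' > "$workdir/notes.txt"

# Only .html files are searched; -l prints just the matching file names.
matches=$(find "$workdir" -type f -name '*.html' -exec grep -l '@' {} +)
echo "$matches"

rm -rf "$workdir"
```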

awk

parallel

grep (fgrep)

. grep (1)
. http://www.uccs.edu/~ahitchco/grep/
. print line numbers:
$ grep -n "foo" file.txt

. egrep: extended grep (extended regexes), equivalent to grep -E

. fgrep: fixed-string grep, just string literals, equivalent to grep -F

. print the negative result (lines that do not match):
$ grep -vn "foo" file.txt

    . The only command of the three that supports backreferences and
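
A quick sketch of the options above against a four-line sample file (the contents are made up to show the difference between a regex dot and a literal dot):

```shell
#!/usr/bin/env bash
sample=$(mktemp)
printf 'foo\nbar\nf.o\nfoobar\n' > "$sample"

# -n prefixes each matching line with its line number.
numbered=$(grep -n 'foo' "$sample")
# -v inverts the match; with -n the non-matching lines are numbered.
inverted=$(grep -vn 'foo' "$sample")
# -F treats the pattern as a literal string, so the dot is not a wildcard.
literal=$(grep -F 'f.o' "$sample")
# Without -F the dot matches any character, so "foo" and "foobar" match too.
regex_count=$(grep -c 'f.o' "$sample")

printf '%s\n---\n%s\n---\n%s\n---\n%s\n' \
  "$numbered" "$inverted" "$literal" "$regex_count"

rm -f "$sample"
```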

sed

echo Sunday | sed 's/day/night/' => Sunnight
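
By default s/// replaces only the first match on each line; the g flag replaces all of them. A small sketch with an arbitrary two-word input:

```shell
#!/usr/bin/env bash
# Without g, only the first "day" on the line is replaced.
first_only=$(echo 'Sunday Monday' | sed 's/day/night/')
# With g, every "day" on the line is replaced.
every=$(echo 'Sunday Monday' | sed 's/day/night/g')

echo "$first_only"
echo "$every"
```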

http://www.grymoire.com/Unix/Sed.html
http://lowfatlinux.com/linux-sed.html

xxd

Used to generate a hexdump or convert from hex. Can also generate a binary representation of data.

hexdump, od

Other utilities for picking through the bytes of a file.
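
A minimal sketch using od, since it is specified by POSIX and so is present on most systems, whereas xxd usually ships with Vim and may not be installed. The input string is arbitrary.

```shell
#!/usr/bin/env bash
# -An suppresses the offset column; -tx1 prints one hex byte at a time.
printf 'AB\n' | od -An -tx1

# Strip the spacing to get a compact hex string.
hex=$(printf 'AB\n' | od -An -tx1 | tr -d ' \n')
echo "$hex"

# -c shows printable characters and escapes such as \n instead of hex.
printf 'AB\n' | od -An -c
```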

xargs

xargs – build and execute command lines from standard input

http://coldattic.info/shvedsky/pro/blogs/a-foo-walks-into-a-bar/posts

In general you should always use the -0 option if you can provide null char delimited input.

To specify the position of the argument, use the -I option with a placeholder such as “{}”.
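
A sketch of the placeholder in use. With -I, xargs runs the command once per input line and substitutes the line wherever the placeholder appears; the backup/ prefix and .bak suffix are only illustrative.

```shell
#!/usr/bin/env bash
# Each input line replaces {} inside the argument.
result=$(printf '%s\n' alpha beta | xargs -I {} echo "backup/{}.bak")
echo "$result"
```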

If you are piping the output of find, which might have spaces in it, xargs will split on the spaces, which isn’t what you want. In that case, you need to use find with the -print0 option and invoke xargs with the -0 option as in the following example:

find ./ -type f -print0 | xargs -0 grep 'someString'

To run multiple processes in parallel (here, up to 4 at a time)

seq 15 | xargs --max-procs=4 -n 1 echo

xargs -L 1 tells it to run the command once for each line of STDIN

cat someFile.txt | xargs -L 1 ./some.script.pl
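
The difference between -L 1 (one command per line) and -n 1 (one command per argument) is easiest to see side by side. A sketch with echo standing in for the real script:

```shell
#!/usr/bin/env bash
# -L 1: each input line becomes one invocation with all of its words.
per_line=$(printf 'a b\nc d\n' | xargs -L 1 echo pair:)
# -n 1: each whitespace-separated word becomes its own invocation.
per_word=$(printf 'a b\nc d\n' | xargs -n 1 echo item:)

echo "$per_line"
echo "$per_word"
```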

seq

seq – print a sequence of numbers
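
A few common forms, sketched under the assumption of GNU seq (BSD seq handles the -s separator slightly differently at the end of the output):

```shell
#!/usr/bin/env bash
# seq FIRST INCREMENT LAST, with a space as the separator.
evens=$(seq -s ' ' 2 2 10)
# -w pads with leading zeros so every number has the same width.
padded=$(seq -s ' ' -w 8 10)
# -s sets the separator between numbers.
csv=$(seq -s , 1 5)

# Typical use: drive a loop.
total=0
for i in $(seq 1 4); do
  total=$((total + i))
done

echo "$evens"
echo "$padded"
echo "$csv"
echo "$total"
```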

TODO

rename


tr