Chapter 2 Working with Unix
2.1 Self-Help
Each of the commands that we've discussed so far is thoroughly documented, and you can view that documentation using the man command, where the first argument to man is the command you're curious about. Let's take a look at the documentation for ls:
man ls
LS(1) BSD General Commands Manual LS(1)
NAME
ls -- list directory contents
SYNOPSIS
ls [-ABCFGHLOPRSTUW@abcdefghiklmnopqrstuwx1] [file ...]
DESCRIPTION
For each operand that names a file of a type other than directory, ls
displays its name as well as any requested, associated information.
The controls for navigating man pages are the same as they are for less. You can also search within a man page for a particular option. For example, suppose you want to know how to get ls to print a long list. After typing man ls to open the page, type / to start a search, then type the word or phrase you're searching for, in this case long list, and press Enter. The page jumps to this entry:
-l (The lowercase letter ``ell''.) List in long format. (See
below.) If the output is to a terminal, a total sum for all the
file sizes is output on a line before the long listing.
Press the n key to jump to the next occurrence of the word, and Shift + n to go back to the previous occurrence. This method of searching also works in less. When you're finished looking at a man page, type q to get back to the prompt.
The man command works wonderfully when you know which command you want to look up, but what if you've forgotten its name? You can use apropos to search all of the available commands and their descriptions. For example, if you forget the name of your favourite command-line text editor, you could type apropos editor, which will print a list of results:
apropos editor
dyld(1) - the dynamic link editor
ed(1), red(1) - text editor
nano(1) - Nano's ANOther editor, an enhanced free Pico clone
pdisk(8) - Apple partition table editor
psed(1) - a stream editor
sed(1) - stream editor
vim(1) - Vi IMproved, a programmers text editor
zshzle(1) - zsh command line editor
Mach-O(5) - Mach-O assembler and link editor output
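On most systems, man -k performs the same search as apropos, so the following should give a similar list (your results will depend on what's installed on your machine):
man -k editor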
Both man and apropos are useful when the answer is only a few keystrokes away, but if you're looking for detailed examples and explanations, and you have access to a web browser, you're often better off using a search engine.
2.1.1 Summary
- Use man to look up the documentation for a command.
- If you can't think of the name of a command, use apropos to search for a word associated with that command.
- If you have access to a web browser, using a search engine might be better than man or apropos.
2.2 Wildcard
A wildcard is a character that represents other characters, much like how a joker in a deck of cards can represent other cards in the deck. Wildcards are a subset of metacharacters. The * ("star") wildcard represents zero or more of any character, and it can be used to match names of files and folders in the command line. For example, if I wanted to list all of the files in my Photos directory whose names start with "2017", I could do the following:
ls 2017*
ls 2017* literally means: list the files that start with "2017" followed by zero or more of any character. As you can imagine, wildcards are a powerful tool for working with groups of files that are similarly named.
- List the files with names ending in .jpg:
ls *.jpg
- List all files whose names contain a specific string:
ls *first_day*
- Move all files that start with "2017-" into the 2017 folder:
mv 2017-* 2017/
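Wildcards can also appear in more than one place in a name. For example, to list only the jpg files whose names start with "2017" (a sketch, assuming the Photos directory contains files with names like 2017-06-15-boston.jpg):
ls 2017*.jpg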
2.3 Regular Expression
Regular expressions are a language for parsing and manipulating text. You can use them to perform complex search-and-replace operations and to validate that text data is well-formed. Most programming languages, as well as many scripting languages, editors, applications, databases, and command-line tools, include regular expressions. Here we will focus on the Unix shell. One important thing to note is that the exact syntax and pattern-matching behavior can differ from language to language.
We will show how to search through text files using a list of the names of US states. You can download the file here.
grep
# get state names that contain at least one instance of the letter "x"
grep "x" states.txt
2.3.1 Metacharacters
Metacharacters are characters that can represent other characters. To take full advantage of all of the metacharacters we should use grep's cousin egrep. The first metacharacter we should discuss is the period:
. : represents any character
## search states.txt for the character “i”,
## followed by any character, followed by the character “g”
egrep "i.g" states.txt
## Virginia
## Washington
## West Virginia
## Wyoming
Besides characters that can represent other characters, there are also metacharacters called quantifiers, which allow you to specify the number of times a particular regular expression should appear in a string. One of the most basic quantifiers is +:
+ : represents one or more occurrences of the preceding expression
> s+as : one or more "s" followed by "as"
egrep "s+as" states.txt
## Arkansas
## Kansas
* : represents zero or more occurrences of the preceding expression (remember, + represents one or more)
egrep "s*as" states.txt
## Alaska
## Arkansas
## Kansas
## Massachusetts
## Nebraska
## Texas
## Washington
{ } : specifies how many occurrences of an expression to match, either an exact count or a range
egrep "s{2}" states.txt
## Massachusetts
## Mississippi
## Missouri
## Tennessee
> "`s{2}`" == "`ss`" (exactly two occurrences)
egrep "s{2,3}" states.txt
> "`s{2,3}`": state names that have between two and three adjacent occurrences of the letter “s”
( ) : creates capturing groups within regular expressions. You can use a capturing group to search for multiple occurrences of a string.
egrep "(iss){2}" states.txt
## Mississippi
> search for the string “iss” occurring twice in a state name
We could combine more quantifiers and capturing groups to dream up even more complicated regular expressions:
egrep "(i.{2}){3}" states.txt
## Mississippi
> three occurrences of an “i” followed by two of any character
2.3.2 Character Sets
For the next couple of examples we’re going to need some text data beyond the names of the states. Let’s just create a short text file from the console:
touch small.txt
echo "abcdefghijklmnopqrstuvwxyz" >> small.txt
echo "ABCDEFGHIJKLMNOPQRSTUVWXYZ" >> small.txt
echo "0123456789" >> small.txt
echo "aa bb cc" >> small.txt
echo "rhythms" >> small.txt
echo "xyz" >> small.txt
echo "abc" >> small.txt
echo "tragedy + time = humor" >> small.txt
echo "http://www.jhsph.edu/" >> small.txt
echo "#%&-=***=-&%#" >> small.txt
In addition to quantifiers there are also regular expressions for describing sets of characters:
\w : all "word" characters
\d : all "number" characters
\s : "space" characters
egrep "\w" small.txt
# abcdefghijklmnopqrstuvwxyz
# ABCDEFGHIJKLMNOPQRSTUVWXYZ
# 0123456789
# aa bb cc
# rhythms
# xyz
# abc
# tragedy + time = humor
# http://www.jhsph.edu/
The \w metacharacter matches all letters, numbers, and even the underscore character (_).
egrep "\d" small.txt
# 0123456789
egrep "\s" small.txt
# aa bb cc
# tragedy + time = humor
The -v flag (which stands for invert match) makes grep return all of the lines not matched by the regular expression.
- Add -v to the command to get the complement:
egrep -v "\w" small.txt
# #%&-=***=-&%#
Note that the character sets for regular expressions also have their inverse sets:
\W for non-words
\D for non-digits
\S for non-spaces
egrep "\W" small.txt
## aa bb cc
## tragedy + time = humor
## http://www.jhsph.edu/
## #%&-=***=-&%#
The returned strings all contain non-word characters. Note the difference between the results of using the invert flag -v versus using an inverse-set regular expression.
In addition to general character sets we can also create specific character sets using square brackets ([ ]) and including the characters we wish to match inside the brackets. For example, the regular expression for the set of vowels is [aeiou]. You can also create a regular expression for the complement of a set by placing a caret (^) at the beginning of the set. For example, the regular expression [^aeiou] matches all characters that are not vowels. Let's test both on small.txt:
egrep "[aeiou]" small.txt
## abcdefghijklmnopqrstuvwxyz
## aa bb cc
## abc
## tragedy + time = humor
## http://www.jhsph.edu/
egrep "[^aeiou]" small.txt
## abcdefghijklmnopqrstuvwxyz
## ABCDEFGHIJKLMNOPQRSTUVWXYZ
## 0123456789
## aa bb cc
## rhythms
## xyz
## abc
## tragedy + time = humor
## http://www.jhsph.edu/
## #%&-=***=-&%#
Every line in the file is printed, because every line contains at least one non-vowel!
If you want to specify a range of characters you can use a hyphen (-) inside the square brackets. For example, the regular expression [e-q] matches all of the lowercase letters between "e" and "q" in the alphabet, inclusive. Case matters when you're specifying character sets, so if you wanted to match only uppercase characters you'd need to use [E-Q]. To ignore the case of your match you could combine the character sets into the regex (short for regular expression) [e-qE-Q], or you could use the -i flag with grep to ignore case. Note that the -i flag works for any regular expression, not just character sets. Let's take a look at some examples using the regular expressions that we just described:
egrep "[e-q]" small.txt
## abcdefghijklmnopqrstuvwxyz
## rhythms
## tragedy + time = humor
## http://www.jhsph.edu/
egrep "[E-Q]" small.txt
## ABCDEFGHIJKLMNOPQRSTUVWXYZ
egrep "[e-qE-Q]" small.txt
## abcdefghijklmnopqrstuvwxyz
## ABCDEFGHIJKLMNOPQRSTUVWXYZ
## rhythms
## tragedy + time = humor
## http://www.jhsph.edu/
egrep -i "[E-Q]" small.txt
## abcdefghijklmnopqrstuvwxyz
## ABCDEFGHIJKLMNOPQRSTUVWXYZ
## rhythms
## tragedy + time = humor
## http://www.jhsph.edu/
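Ranges work for digits too. Only one line of small.txt contains numerals, so a search for the digit range should return just that line:
egrep "[0-9]" small.txt
## 0123456789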
2.3.3 Escaping, Anchors, Odds and Ends
How do you search for punctuation marks that are themselves used as metacharacters? For example, how would you find a plus sign (+) in a line of text, given that the plus sign is also a metacharacter? The answer is to put a backslash (\) before the plus sign in the regex, in order to "escape" its metacharacter functionality. Here are a few examples:
egrep "\+" small.txt
## tragedy + time = humor
egrep "\." small.txt
## http://www.jhsph.edu/
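The same trick works for other metacharacters. For example, escaping the asterisk lets us search for a literal "*", which should match only the last line of small.txt:
egrep "\*" small.txt
## #%&-=***=-&%#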
There are three more metacharacters that we should discuss, and two of them come as a pair:
- the caret (^), which represents the start of a line
- the dollar sign ($), which represents the end of a line
These "anchor characters" only match the beginnings and ends of lines when coupled with other regular expressions.
The following command will search for all lines that begin with "A":
egrep "^A" small.txt
## ABCDEFGHIJKLMNOPQRSTUVWXYZ
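The dollar sign works the same way at the other end of a line. For example, searching small.txt for lines that end with the letter "z" should return:
egrep "z$" small.txt
## abcdefghijklmnopqrstuvwxyz
## xyz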
There’s a mnemonic for remembering which metacharacter to use for each anchor:
“First you get the power, then you get the money.”
- the pipe character (|)
Now let's talk about the "or" metacharacter (|), which is also called the "pipe" character. This metacharacter allows you to match either the regex on the left or the regex on the right side of the pipe. Let's take a look at a small example:
egrep "A|bc" small.txt
## abcdefghijklmnopqrstuvwxyz
## ABCDEFGHIJKLMNOPQRSTUVWXYZ
## abc
You can also use multiple pipe characters to, for example, search for lines that contain the words for all of the cardinal directions:
egrep "North|South|East|West" states.txt
## North Carolina
## North Dakota
## South Carolina
## South Dakota
## West Virginia
Just two more notes on grep: you can display the line number that a match occurs on using the -n flag:
egrep -n "t$" states.txt
## 7:Connecticut
## 45:Vermont
And you can also grep multiple files at once by providing multiple file arguments:
egrep "New" states.txt canada.txt
## states.txt:New Hampshire
## states.txt:New Jersey
## states.txt:New Mexico
## states.txt:New York
## canada.txt:Newfoundland and Labrador
## canada.txt:New Brunswick
Finally, if you wanted to search for all lines that begin with a vowel and end with one of the characters a, b, or c:
egrep "^[aeiou]{1}.+[a-c]{1}$" small.txt
## aa bb cc
## abc
Table of Metacharacters
Metacharacter | Meaning
---|---
. | Any Character
\w | A Word Character
\W | Not a Word Character
\d | A Digit
\D | Not a Digit
\s | Whitespace
\S | Not Whitespace
[def] | A Set of Characters
[^def] | Negation of Set
[e-q] | A Range of Characters
^ | Beginning of String
$ | End of String
\n | Newline
+ | One or More of Previous
* | Zero or More of Previous
? | Zero or One of Previous
\| | Either the Previous or the Following
{6} | Exactly 6 of Previous
{4,6} | Between 4 and 6 of Previous
{4,} | 4 or More of Previous
If you want to experiment with writing regular expressions before you use them I highly recommend playing around with http://regexr.com/.
2.3.4 Find
If you want to find the location of a file or the location of a group of files you can use the find command. This command has a specific structure where the first argument is the directory where you want to begin the search, and all directories contained within that directory will also be searched. The first argument is then followed by a flag that describes the method you want to use to search. In this case we'll only be searching for a file by its name, so we'll use the -name flag. The -name flag itself then takes an argument, the name of the file that you're looking for. Let's go back to the home directory and look for some files from there:
find . -name "small.txt"
## ./Documents/GitHub/Unix/files/small.txt
find ./Documents -name "*loop.sh"
## ./Documents/GitHub/Unix/files/foreverloop.sh
## ./Documents/GitHub/Unix/files/whileloop.sh
## ./Documents/GitHub/Unix/files/forloop.sh
## ./Documents/GitHub/Unix/files/ifloop.sh
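If you're unsure about the capitalization of a file's name, most versions of find also offer an -iname flag, which works like -name but ignores case (a sketch; the paths returned depend on your own directory layout):
find ./Documents -iname "SMALL.TXT"
## ./Documents/GitHub/Unix/files/small.txt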
2.3.5 Summary
- grep and egrep can be used along with regular expressions to search for patterns of text in a file.
- Metacharacters are used in regular expressions to describe patterns of characters.
- find can be used to search for the names of files in a directory.
2.4 Configure
2.4.1 History
Near the start of this book we discussed how you can browse the commands that you recently entered into the prompt using the Up and Down arrow keys. Bash keeps track of all of your recent commands, and you can browse your command history in two different ways. The commands that we've used since opening our terminal can be accessed via the history command. Let's try it out:
history
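In Bash you can also pass history a number to see only the most recent commands (the output will reflect whatever you've typed in this session):
history 10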
Whenever we close a terminal our recent commands are written to the ~/.bash_history file. Let's take a look at the beginning of this file:
head -n 5 ~/.bash_history
Searching your ~/.bash_history file can be particularly useful if you're trying to recall a command you've used in the past. The ~/.bash_history file is just a regular text file, so you can search it with grep. Here's a simple example:
grep "canada" ~/.bash_history
## egrep "New" states.txt canada.txt
2.4.2 Customizing Bash
Besides ~/.bash_history, another text file in our home directory that we should be aware of is ~/.bash_profile. The ~/.bash_profile is a list of Unix commands that are run every time we open our terminal, usually with a different command on every line. One of the most common commands used in a ~/.bash_profile is the alias command, which creates a shorter name for a command. Let's take a look at an example ~/.bash_profile:
alias docs='cd ~/Documents'
alias edbp='nano ~/.bash_profile'
The first alias creates a new command docs. Now entering docs into the command line is equivalent to entering cd ~/Documents. Let's edit our ~/.bash_profile with nano. If there's anything in your ~/.bash_profile already, start adding lines at the end of the file. Add the line alias docs='cd ~/Documents', then save the file and quit nano. In order to make the changes to our ~/.bash_profile take effect we need to run source ~/.bash_profile in the console:
source ~/.bash_profile
Now let’s try using docs:
docs
pwd
## /Users/sean/Documents
It works! Setting up aliases saves you time on long commands that you use often. In the example ~/.bash_profile above, the second line, alias edbp='nano ~/.bash_profile', creates the command edbp ("edit bash profile") so that you can quickly add new aliases. Try adding it to your ~/.bash_profile and take your new command for a spin!
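Aliases are also handy for bundling up flags you always use. For example (just a sketch, not part of the profile above), many people add something like the following so that one short command produces a long, human-readable listing:
alias ll='ls -alh'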
There are a few other details about the ~/.bash_profile that are important when you're writing software, which we'll discuss in the Bash Programming chapter.
2.4.3 Summary
- history displays the commands we've entered into the console since opening the current terminal.
- The ~/.bash_history file lists commands we've used in the past.
- alias creates a command that can be used as a substitute for a longer command that we use often.
- The ~/.bash_profile is a text file that is run every time we start a shell, and it's the best place to assign aliases.
2.5 Differentiate
It’s important to be able to examine differences between files. First let’s make two small simple text files in the Documents/GitHub/Unix/files/
directory.
cd ./Documents/GitHub/Unix/files/
head -n 4 small.txt > four.txt
head -n 6 small.txt > six.txt
If we want to look at which lines in these files are different we can use the diff command:
diff four.txt six.txt
## 4a5,6
## > rhythms
## > xyz
Only the differing lines are printed to the console. We could also compare differing lines in a side-by-side view using sdiff:
sdiff four.txt six.txt
## abcdefghijklmnopqrstuvwxyz abcdefghijklmnopqrstuvwxyz
## ABCDEFGHIJKLMNOPQRSTUVWXYZ ABCDEFGHIJKLMNOPQRSTUVWXYZ
## 0123456789 0123456789
## aa bb cc aa bb cc
## > rhythms
## > xyz
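diff also has a -u flag that prints a "unified" view, the format used by tools like git, in which added lines are marked with a + and removed lines with a -. A sketch with the same two files:
diff -u four.txt six.txt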
In a common situation you might be sent a file, or you might download a file from the internet, that comes with a code known as a checksum or a hash. Hashing programs generate a unique code based on the contents of a file.1 People distribute hashes with files so that we can be sure that the file we think we've downloaded is the genuine file: one way to guard against malicious or corrupted files is to check that the computed hash matches the provided hash. There are several commonly used file hashes, but we'll talk about two called MD5 and SHA-1.
Since hashes are generated from file contents, two identical files should have the same hash. Let's test this by making a copy of small.txt.
cp small.txt small_copy.txt
To compute the MD5 hash of a file we can use the md5 command:
md5 small.txt
## MD5 (small.txt) = 45ea2935d11b9d6057b8c8f901a37822
md5 small_copy.txt
## MD5 (small_copy.txt) = 45ea2935d11b9d6057b8c8f901a37822
As we expected, they are the same! We can compute the SHA-1 hash using the shasum command:
shasum small.txt
## 542a81f4556b79a0e1f8b6639657609437835fad small.txt
shasum small_copy.txt
## 542a81f4556b79a0e1f8b6639657609437835fad small_copy.txt
Once again, both copies produce the same hash. Now let's change one of the files, just to illustrate that the hash changes when the file contents differ. (Be aware that redirecting a file into itself like this truncates it before head gets to read it, so this command actually leaves small_copy.txt empty; that's still a change in contents, which is all we need here.)
head -n 5 small_copy.txt > small_copy.txt
shasum small_copy.txt
## da39a3ee5e6b4b0d3255bfef95601890afd80709 small_copy.txt
2.5.1 Summary
The md5 and shasum commands use different algorithms to create codes (called hashes or checksums) that are unique to the contents of a file. These hashes can be used to ensure that a file is genuine.
2.6 Pipes
The pipe (|) is one of the most powerful features of the command line. It allows us to take the output of a command, which would normally be printed to the console, and use it as the input to another command. It's like fitting an actual pipe between the end of one program and the beginning of another! Let's take a look at a basic example. We know the cat command takes the contents of a text file and prints it to the console:
cd Documents/GitHub/Unix/files/
cat canada.txt
## Nunavut
## Quebec
## Northwest Territories
## Ontario
## British Columbia
## Alberta
## Saskatchewan
## Manitoba
## Yukon
## Newfoundland and Labrador
## New Brunswick
## Nova Scotia
## Prince Edward Island
This output from cat canada.txt will go into our pipe, and we'll use head to look at just the first few lines:
cat canada.txt | head -n 5
## Nunavut
## Quebec
## Northwest Territories
## Ontario
## British Columbia
Notice that this is the same result we would get from head -n 5 canada.txt; we just used cat to illustrate how the pipe works. The general syntax of the pipe is [program that produces output] | [program that uses the piped output as its input instead of a file].
A more common and useful example of the pipe is answering the question: "How many US states end in a vowel?" We could use grep and a regular expression to list all of the state names that end with a vowel, then use wc to count the matching lines:
grep "[aeiou]$" states.txt | wc -l
## 32
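As another example, we could combine the "or" metacharacter from earlier with wc to count the states that have "North" or "South" in their names:
egrep "North|South" states.txt | wc -l
## 4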
The pipe can also be used multiple times in one command, taking the output from one piped command and using it as the input to yet another program! For example, we could chain ls, grep, and less together so that we can scroll through the files in the current directory that were last modified in October:
ls -al | grep "Oct" | less
## drwxr-xr-x 2 happyrabbit staff 64 Oct 31 17:00 .ipynb_checkpoints
## -rw-r--r-- 1 happyrabbit staff 74 Oct 29 16:28 four.txt
## -rw-r--r--@ 1 happyrabbit staff 106 Oct 29 20:19 iago.txt
## -rw-r--r-- 1 happyrabbit staff 82 Oct 29 22:39 math.sh
## -rw-r--r-- 1 happyrabbit staff 86 Oct 29 16:28 six.txt
## -rw-r--r-- 1 happyrabbit staff 149 Oct 28 07:46 small.txt
## -rw-r--r-- 1 happyrabbit staff 0 Oct 29 16:54 small_copy.txt
## -rw-r--r--@ 1 happyrabbit staff 472 Oct 13 18:19 states.txt
## -rw-r--r-- 1 happyrabbit staff 944 Oct 29 20:32 states2.txt
## -rw-r--r-- 1 happyrabbit staff 0 Oct 30 06:53 var_exe1.sh
## -rw-r--r-- 1 happyrabbit staff 111 Oct 30 06:51 vars.sh
Remember, you can use the q key to quit less and return to the prompt.
2.6.1 Summary
The pipe (|) takes the output of the program on its left side and uses it as the input for the program on its right side.
2.7 Make
Once upon a time there were no web browsers, file browsers, start menus, or search bars. When somebody booted up a computer all they got was a shell prompt, and all of the work they did started from that prompt. Back then there was always the problem of how software should be installed. The make program was one of the best attempts at solving this problem, and it is still in wide use today. The guiding design goal of make is that in order to install some new piece of software one would:
- Download all of the files required for installation into a directory.
- cd into that directory.
- Run make.
This is accomplished by writing a file called a makefile, which describes the relationships between different files and programs. In addition to installing programs, make is also useful for creating documents automatically. Let's build up a makefile that creates a readme.txt file, automatically populated with some information about our current directory.
Let’s start by creating a very basic makefile
with nano
:
cd Documents/GitHub/Unix/Journal/
nano makefile
draft_journal_entry.txt:
	touch draft_journal_entry.txt
The simple makefile above shows a rule with the following general format:
[target]: [dependencies...]
	[commands...]
In the simple example, draft_journal_entry.txt is the target: a file which is created as the result of the command(s).
It’s very important to note that any commands under a target must be indented with a Tab. If we don’t use Tabs to indent the commands then make
will fail.
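If you indent with spaces instead, GNU make typically stops with an error along these lines (the exact wording varies by version):
## makefile:2: *** missing separator.  Stop.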
Let’s save and close the makefile
, then we can run the following in the console:
ls
## makefile
Let’s use the make
command with the target we want to be “made” as the only argument:
make draft_journal_entry.txt
## touch draft_journal_entry.txt
ls
## draft_journal_entry.txt
## makefile
The commands indented under our rule for the draft_journal_entry.txt target were executed, so now draft_journal_entry.txt exists! Let's try running the same make command again:
make draft_journal_entry.txt
## make: 'draft_journal_entry.txt' is up to date.
Since the target file already exists no action is taken, and instead we're informed that the rule for draft_journal_entry.txt is "up to date" (there's nothing to be done).
If we look at the general rule format we previously sketched out, we can see that we didn't specify any dependencies for this rule. A dependency is a file that the target depends on in order to be built. If a dependency has been updated since the last time make was run for a target, then the target is not "up to date." This means that the commands for that target will be run the next time make is run for that target. This way, the changes to the dependency are incorporated into the target. The commands are only run when the dependencies change, or when the target doesn't exist at all, in order to avoid running commands unnecessarily.
Let’s update our makefile
to include a readme.txt
that is built automatically. First, let’s add a table of contents for our journal:
echo "1. 2017-06-15-In-Boston" > toc.txt
Now let’s update our makefile
with nano
to automatically generate a readme.txt
:
nano makefile
draft_journal_entry.txt:
	touch draft_journal_entry.txt

readme.txt: toc.txt
	echo "This journal contains the following number of entries:" > readme.txt
	wc -l toc.txt | egrep -o "[0-9]+" >> readme.txt
Take note that the -o flag provided to egrep above extracts just the part of the line that matches the regular expression, so that only the number of lines is appended to readme.txt. Now let's run make with readme.txt as the target:
make readme.txt
## echo "This journal contains the following number of entries:" > readme.txt
## wc -l toc.txt | egrep -o "[0-9]+" >> readme.txt
Now let’s take a look at readme.txt
:
cat readme.txt
## This journal contains the following number of entries:
## 1
Looks like it worked! What do you think will happen if we run make readme.txt again?
make readme.txt
## make: 'readme.txt' is up to date.
You guessed it: nothing happened! Since the readme.txt file still exists and no changes were made to its dependencies (toc.txt is the only dependency), make doesn't run the commands for the readme.txt rule. Now let's modify toc.txt, then we'll try running make again.
echo "2. 2017-06-16-IQSS-Talk" >> toc.txt
make readme.txt
## echo "This journal contains the following number of entries:" > readme.txt
## wc -l toc.txt | egrep -o "[0-9]+" >> readme.txt
Looks like it ran! Let's check readme.txt to make sure.
cat readme.txt
## This journal contains the following number of entries:
## 2
It looks like make successfully updated readme.txt! With every change to toc.txt, running make readme.txt will programmatically update readme.txt.
In order to simplify the make experience, we can create a rule at the top of our makefile called "all", where we list all of the files that are built by the makefile. By adding the all target we can simply run make without any arguments in order to build all of the targets in the makefile. Let's open up nano and add this rule:
nano makefile
all: draft_journal_entry.txt readme.txt

draft_journal_entry.txt:
	touch draft_journal_entry.txt

readme.txt: toc.txt
	echo "This journal contains the following number of entries:" > readme.txt
	wc -l toc.txt | egrep -o "[0-9]+" >> readme.txt
While we have nano open, let's add another special rule at the end of our makefile called clean, which destroys the files created by our makefile:
all: draft_journal_entry.txt readme.txt

draft_journal_entry.txt:
	touch draft_journal_entry.txt

readme.txt: toc.txt
	echo "This journal contains the following number of entries:" > readme.txt
	wc -l toc.txt | egrep -o "[0-9]+" >> readme.txt

clean:
	rm draft_journal_entry.txt
	rm readme.txt
Let’s save and close our makefile
then let’s test it out.
First let’s clean up our repository:
make clean
## rm draft_journal_entry.txt
## rm readme.txt
ls
## makefile
## toc.txt
make
## touch draft_journal_entry.txt
## echo "This journal contains the following number of entries:" > readme.txt
## wc -l toc.txt | egrep -o "[0-9]+" >> readme.txt
ls
## draft_journal_entry.txt
## readme.txt
## makefile
## toc.txt
Looks like our makefile works! The make command is extremely powerful, and this section is meant to be just an introduction. For more in-depth reading about make I recommend Karl Broman's tutorial or Chase Lambert's makefiletutorial.com.
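One small refinement worth knowing about (not used in the makefile above, just a sketch): since all and clean don't name real files, makefiles usually declare them as "phony" targets so that make never confuses them with actual files of the same name:
.PHONY: all clean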
2.7.1 Summary
- make is a tool for creating relationships between files and programs, so that files that depend on other files can be automatically rebuilt.
- makefiles are text files that contain a list of rules.
- Rules are made up of targets (files to be built), commands (a list of bash commands that build the target), and dependencies (files that the target depends on to be built).
Hashing is the transformation of a string of characters into a usually shorter fixed-length value or key that represents the original string. Hashing is used to index and retrieve items in a database because it is faster to find the item using the shorter hashed key than to find it using the original value.↩