Introduction to Bash for beginners
2024-04-24
Source: EBAII training material from IFB
Material under CC-BY-SA licence
Open a terminal and write things in it!
Use a language it understands, e.g. BASH (Bourne Again SHell) (1):
the BASH language is one of many extremely similar Shell dialects (bsh, ksh, csh, zsh, …)
BASH is based on a set of modular commands, which perform specific tasks
Commands are written just after the prompt.
Here, the $ character symbolizes that the computer is waiting/ready for your commands.
__
(1) A pun on the first Shell language written by Stephen Bourne himself (bsh)
# below, the format [xxx] indicates that xxx is an optional element
command_name [argument_name [argument_value]] [file]
*there’s no standard term for ‘arguments’; you may also come across the terms ‘flags’ or ‘options’
cal command (part 1)
cal (short for calendar) is a handy command to view a certain date, month or year:
c.toffano-nioche@SSFA-18:$ cal
     Avril 2024
di lu ma me je ve sa
    1  2  3  4  5  6
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30
cal -3 displays 3 months, centered on the current month:
c.toffano-nioche@SSFA-18:$ cal -3
     Mars 2024            Avril 2024             Mai 2024
di lu ma me je ve sa  di lu ma me je ve sa  di lu ma me je ve sa
                1  2      1  2  3  4  5  6            1  2  3  4
 3  4  5  6  7  8  9   7  8  9 10 11 12 13   5  6  7  8  9 10 11
10 11 12 13 14 15 16  14 15 16 17 18 19 20  12 13 14 15 16 17 18
17 18 19 20 21 22 23  21 22 23 24 25 26 27  19 20 21 22 23 24 25
24 25 26 27 28 29 30  28 29 30              26 27 28 29 30 31
31
cal command (part 2)
c.toffano-nioche@SSFA-18:$ cal -m may
      Mai 2024
di lu ma me je ve sa
          1  2  3  4
 5  6  7  8  9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 31
Arguments can have a short and a long form: e.g. instead of cal -3, we can use cal --three
c.toffano-nioche@SSFA-18:$ cal --three
     Mars 2024            Avril 2024             Mai 2024
di lu ma me je ve sa  di lu ma me je ve sa  di lu ma me je ve sa
                1  2      1  2  3  4  5  6            1  2  3  4
 3  4  5  6  7  8  9   7  8  9 10 11 12 13   5  6  7  8  9 10 11
10 11 12 13 14 15 16  14 15 16 17 18 19 20  12 13 14 15 16 17 18
17 18 19 20 21 22 23  21 22 23 24 25 26 27  19 20 21 22 23 24 25
24 25 26 27 28 29 30  28 29 30              26 27 28 29 30 31
31
Call the police, call your colleagues, search the Internet… or use the man command (manual)
CAL(1)                          User Commands                          CAL(1)

NAME
       cal - display a calendar

SYNOPSIS
       cal [options] [[[day] month] year]

DESCRIPTION
       cal displays a simple calendar. If no arguments are specified, the current
       month is displayed.

OPTIONS
       -1, --one
              Display single month output. (This is the default.)

       -3, --three
              Display prev/current/next month output.
...
SYNOPSIS: explains how to write the command line; optional elements are written between [..]
DESCRIPTION: describes the result of the command
OPTIONS: lists the available arguments, with their short and long forms
man interface:
/color: to search for the term color
n: (next) to search for the next occurrence of the term searched for
N (Shift + n): (previous) to search for the previous occurrence of the term searched for
q: to quit help
__
*Custom programmes/commands or scripts often have the -h or --help arguments to print help messages on how to use them.
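As a small illustration of the --help convention mentioned above, the sketch below asks bash itself for its built-in help message. bash is used here only because it is certain to be installed; it is an illustrative choice, and other programs offering --help behave similarly.

```shell
# Store the built-in help message of bash, then show its first lines
# (bash is only an illustrative choice of program)
help_text=$(bash --help)
printf '%s\n' "$help_text" | head -n 3

# The manual page gives more detail; uncomment to open it (quit with q)
# man bash
```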
The ls command lists directory contents and can take a number of arguments.
-l (long/lots) gives a lot of information about files
-a (--all) shows all files, including hidden ones
-t (time) sorts by modification date
-h (--human-readable) displays file sizes in readable units
-r (--reverse) reverses sort order
Notes:
hidden files and directories have a name starting with a . (e.g. .git)
the . and .. directories are special (detailed later)
the space between the command and its arguments matters: the ls-l command does not exist!
ls -l --all for a detailed view (-l) of all files, including hidden ones (--all):
c.toffano-nioche@SSFA-18:~/formation_bash0$ ls -l --all
total 56
drwxr-xr-x 4 c.toffano-nioche tous 4096 avril 18 11:18 .
drwxr-xr-x 4 c.toffano-nioche tous 4096 avril 17 18:51 ..
-rw-r--r-- 1 c.toffano-nioche tous 587 avril 16 16:06 bash0_chatTerm.md
-rw-r--r-- 1 c.toffano-nioche tous 1131 avril 17 19:02 bash0_FindingHelp.md
-rw-r--r-- 1 c.toffano-nioche tous 833 avril 18 11:18 bash0_zoomLS.md
drwxr-xr-x 8 c.toffano-nioche tous 4096 avril 18 11:08 .git
drwxr-xr-x 2 c.toffano-nioche tous 4096 avril 16 15:14 images
ls -lahtr for a complete (-a) and detailed view (-l), sizes in KB, MB, GB, TB… i.e. human readable (-h), sorted by date/time (-t) from oldest to most recent (-r):
c.toffano-nioche@SSFA-18:~/formation_bash0$ ls -lahtr
total 56K
drwxr-xr-x 2 c.toffano-nioche tous 4.0K avril 16 15:14 images
-rw-r--r-- 1 c.toffano-nioche tous 587 avril 16 16:06 bash0_chatTerm.md
drwxr-xr-x 4 c.toffano-nioche tous 4.0K avril 17 18:51 ..
-rw-r--r-- 1 c.toffano-nioche tous 1.2K avril 17 19:02 bash0_FindingHelp.md
drwxr-xr-x 8 c.toffano-nioche tous 4.0K avril 18 11:08 .git
drwxr-xr-x 4 c.toffano-nioche tous 4.0K avril 18 11:18 .
-rw-r--r-- 1 c.toffano-nioche tous 833 avril 18 11:18 bash0_zoomLS.md
The / directory
When we go up in the tree (= down in the hierarchy) by following the branches, we can see that the / (root) contains multiple directories (e.g. shared).
The shared directory contains bank.
The bank directory contains homo_sapiens.
Thus, the path to the homo_sapiens directory from the root is: /shared/bank/homo_sapiens
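Our goal is to get to the homo_sapiens directory. For this, we are going to use the pwd (= print working directory) and cd (= change directory) commands. We have two choices:
start from the root (/): absolute path
start from the working directory (.): relative path
From any directory, cd .. moves to the parent directory and cd ../.. to the grandparent directory.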
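Relative and absolute paths give the same result. For example, if the working directory is shared, homo_sapiens can be reached with the absolute path /shared/bank/homo_sapiens (which starts at /) or with the relative path bank/homo_sapiens.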
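The equivalence of the two kinds of path can be checked with a minimal sketch. It assumes nothing about your machine: since /shared/bank/homo_sapiens may not exist there, it rebuilds the same three directories inside a temporary practice directory (the variable names root, via_absolute and via_relative are made up for the exercise).

```shell
# Build a small practice tree in a temporary directory
# (the real /shared/bank/homo_sapiens may not exist on your machine)
root=$(mktemp -d)
mkdir -p "$root/shared/bank/homo_sapiens"

# Absolute path: the full path from the top, valid from anywhere
cd "$root/shared/bank/homo_sapiens"
via_absolute=$(pwd)

# Relative path: start from the working directory (here: shared)
cd "$root/shared"
cd bank/homo_sapiens
via_relative=$(pwd)

# Both lines show the same directory
echo "$via_absolute"
echo "$via_relative"

# .. is the parent directory, ../.. the grandparent (back in shared)
cd ../..
pwd
```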
The tree command displays the directory hierarchy (-d: directories only):
emilie.drouineau@cluster-i2bc:~$ tree -d
.
├── GRCh38
│   ├── fasta
│   ├── gff3
├── hg19
│   ├── bwa
│   ├── fasta
│   ├── gtf
│   └── star-2.7.5a
├── hg38
│   ├── fasta
│   └── star-2.7.5a
└── latest_genome -> GRCh38
There’s no better place than home!
The home directory is written ~ (tilde)* and usually corresponds to /home/userName (but it may vary according to the infrastructure you’re on, e.g. on the IFB cluster, it’s /shared/home/userName).
__
*~ for Mac users: option + N or Alt + N
If you want to shine in society or with your family by giving the impression of typing quickly, use auto-completion!
More seriously:
E.g. try moving into the directory /usr/local/bin using <TAB><TAB>
Rest assured, you haven’t heard the last of
It’s important to organise files and directories to easily find data. 4 commands are useful:
mkdir (make directories): to create a directory
# read the documentation
man mkdir
# create a directory named my_new_dir
mkdir my_new_dir
# check if the directory was created
ls
cp (copy): to copy files and directories (/!\ to copy a directory, you need to add the -r option)
mv (move): to move a file to another directory or rename it
# read the documentation
man mv
# rename a file
mv my_file_with_a_long_useless_name_i_want_to_change.txt my_file.txt
# put my_file.txt in the directory called my_dir
mv my_file.txt my_dir/
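As a complement to the mkdir and mv examples, here is a minimal sketch of cp, including the -r (recursive) option needed for directories. It works in a temporary directory, and the file and directory names (my_copy.txt, my_dir_backup) are made up for the exercise.

```shell
# Work in a temporary directory so none of your own files are touched
cd "$(mktemp -d)"

# Create a file and a directory to practise on
echo "some text" > my_file.txt
mkdir my_dir

# Copy a file: cp SOURCE DESTINATION
cp my_file.txt my_copy.txt

# Put a copy of the file inside the directory
cp my_file.txt my_dir/

# Copy a whole directory: the -r (recursive) option is needed
cp -r my_dir my_dir_backup

# Check the result
ls
ls my_dir_backup
```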
rm (remove): to remove files or directories. Warning: it’s easy to remove more files/directories than planned. To be sure, you can first run the ls command with the same arguments to check that it lists exactly what you want to remove.
Sometimes an error message may appear saying that you are “not authorised” to perform an action…
In Linux, your rights are dictated by three letters r, w and x:
r: read, right to read the file and open it
w: write, right to write and modify a file
x: execute, right to execute the file (a script, for example)
Remember ls -l? => ls to list the files of a folder, -l argument to get more information on the files
c.toffano-nioche@SSFA-18:~/formation_bash0$ ls -l
total 56
-rw-r--r-- 1 c.toffano-nioche tous 587 avril 16 16:06 bash0_chatTerm.md
-rw-r--r-- 1 c.toffano-nioche tous 1131 avril 17 19:02 bash0_FindingHelp.md
-rw-r--r-- 1 c.toffano-nioche tous 833 avril 18 11:18 bash0_zoomLS.md
drwxr-xr-x 2 c.toffano-nioche tous 4096 avril 16 15:14 images
When you type the ls -l command, you may notice that some lines start with a d (= directory) and others with a - (= file).
Note also that the rwx letters appear up to three times, a - marking a right that is not granted. The first triplet corresponds to the rights held by the owner of the file, the second to the rights allocated to users in the same group as the owner, and the last to the rights of all other users.
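To make the reading of that first column concrete, the sketch below splits a permission string into its four parts. describe_mode is a hypothetical helper written for this illustration, not a standard command; it only cuts the string at fixed character positions.

```shell
# Split the first column of `ls -l` into type, owner, group and others.
# describe_mode is a made-up helper, not a standard command.
describe_mode() {
    mode=$1
    echo "type:   $(printf '%s\n' "$mode" | cut -c 1)"
    echo "owner:  $(printf '%s\n' "$mode" | cut -c 2-4)"
    echo "group:  $(printf '%s\n' "$mode" | cut -c 5-7)"
    echo "others: $(printf '%s\n' "$mode" | cut -c 8-10)"
}

# A regular file: the owner can read and write, everyone else can only read
describe_mode "-rw-r--r--"

# A directory: the owner has all rights, the others cannot write
describe_mode "drwxr-xr-x"
```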
We’re often interested in the content of files: reading files, counting the number of lines, extracting a part (lines, columns), sorting lines, etc. Warning: some commands cannot access compressed files.
wc (word count): counts the number of lines, words and bytes of file(s)
less or more: read a file line by line (pager tools)
head or tail: display the first n lines or the last n lines of a file
cat: prints the full contents of the file
In less, navigation is the same as in the man command:
commands | results |
---|---|
↑ or ↓ | Go up or down in the file |
< or > | Go to the first or last line |
/chr | Then press enter to find the word chr |
n or N | find the next or previous occurrence of chr |
q | quit |
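The non-interactive commands listed above can be tried on a small practice file. This is a minimal sketch: my_lines.txt is a made-up file name, created in a temporary directory.

```shell
cd "$(mktemp -d)"

# Create a 5-line practice file (my_lines.txt is a made-up name)
printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > my_lines.txt

# Count lines only (-l); without option wc prints lines, words and bytes
wc -l my_lines.txt

# First 2 lines, then last 2 lines
head -n 2 my_lines.txt
tail -n 2 my_lines.txt

# Whole file at once
cat my_lines.txt

# Page through the file interactively (quit with q); uncomment to try
# less my_lines.txt
```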
If you have a tabular file format (e.g. .csv, .tsv, …) as below:
# file with chromosome, start, stop, name, score, strand
chr1 7517 7517 NM_023732__Abcb6 . -
chr10 1826 1826 NM_019487__Hebp2 . -
chrX 1494 1494 NONE . -
chrY 3470 3470 NA . +
chr11 3054 3054 NA . -
chr2 1929 1929 SITE . +
cut: removes columns from each line of files
E.g. if you want to remove the name and the score columns of the file above, i.e. keep only fields 1-3 and 6 (option -f 1-3,6):
chr1 7517 7517 -
chr10 1826 1826 -
chrX 1494 1494 -
chrY 3470 3470 +
chr11 3054 3054 -
chr2 1929 1929 +
If the separator is not a \t (tabulation), you can change it with the option -d.
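A minimal sketch of both uses of cut follows. It rebuilds the first three lines of the example as a tab-separated file; annot.tsv is a made-up file name, not the one used in the original material.

```shell
cd "$(mktemp -d)"

# Recreate the first lines of the example as a tab-separated file
# (annot.tsv is a made-up file name; \t stands for a tabulation)
printf 'chr1\t7517\t7517\tNM_023732__Abcb6\t.\t-\nchr10\t1826\t1826\tNM_019487__Hebp2\t.\t-\nchrX\t1494\t1494\tNONE\t.\t-\n' > annot.tsv

# Keep fields 1 to 3 and 6, i.e. drop the name (4) and the score (5)
cut -f 1-3,6 annot.tsv

# Same idea on comma-separated text: set the delimiter with -d
printf 'chr1,7517,7517,NM_023732__Abcb6,.,-\n' | cut -d ',' -f 1-3,6
```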
sort: sorts lines of text files
E.g. if you want to sort the previous file by chromosome (first column in myf.tsv):
# numeric | # alphanumeric | # version
# sort -k1,1n myf.tsv | # sort -k1,1d myf.tsv | # sort -k1,1V myf.tsv
chr10 1826 1826 - | chr1 7517 7517 - | chr1 7517 7517 -
chr11 3054 3054 - | chr10 1826 1826 - | chr2 1929 1929 +
chr1 7517 7517 - | chr11 3054 3054 - | chr10 1826 1826 -
chr2 1929 1929 + | chr2 1929 1929 + | chr11 3054 3054 -
chrX 1494 1494 - | chrX 1494 1494 - | chrX 1494 1494 -
chrY 3470 3470 + | chrY 3470 3470 + | chrY 3470 3470 +
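The difference between plain text order and version order can be reproduced on chromosome names alone. This is a sketch under two assumptions: chroms.txt is a made-up file, and LC_ALL=C is set so that the result does not depend on the language settings of your machine (the -V option belongs to GNU sort).

```shell
cd "$(mktemp -d)"
printf 'chr10\nchr2\nchrX\nchr1\nchr11\n' > chroms.txt

# Plain text order, character by character: chr10 comes before chr2
LC_ALL=C sort chroms.txt

# Version order (-V, a GNU sort option): numbers inside the names are
# compared as numbers, so chr2 comes before chr10
LC_ALL=C sort -V chroms.txt
```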
Some commands work on data either typed in by the user or read from a file. This is referred to as the standard command input stream or stdin.
Similarly, some commands provide data either displayed on screen (e.g. the ls command) or written to a file. This is referred to as the standard command output stream or stdout.
There is a 3rd standard stream, the standard error stream or stderr, which carries error messages. Under Unix, a command that ends without an error returns the exit status 0; otherwise, it returns a non-zero number indicating the error code and usually writes an explanatory sentence to stderr. By default, the error stream is also displayed on the screen (like the output stream).
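The exit status of the last command is available in the special variable $?. The sketch below shows it after a success and after a failure; /no/such/dir is a made-up path assumed not to exist, and the exact non-zero value depends on the command.

```shell
# A command that succeeds: exit status 0
pwd
echo "exit status after pwd: $?"

# A command that fails: the message goes to stderr, the status is non-zero.
# The part after || only runs when the command before it fails.
ls /no/such/dir || echo "ls failed with exit status $?"
```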
When the result of a command is of interest to a subsequent question/command, it can be stored in a file rather than displayed on the screen, so that the next command can read this file as input data.
This is known as stream redirection.
With the pipe redirection operator (the | character in the command line*), the output of command 1 directly becomes the input of command 2.
> myfile.txt: stores the stdout stream by creating (or overwriting) the myfile.txt file
>> myfile.txt: stores the stdout stream by adding lines to the myfile.txt file
*|: Shift + \ on Mac keyboards, AltGr + 6 on Windows
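The difference between > and >> is easiest to see by running them one after the other. A minimal sketch, in a temporary directory so that no existing myfile.txt is overwritten:

```shell
cd "$(mktemp -d)"

# > creates myfile.txt (or overwrites it if it already exists)
echo "first line" > myfile.txt

# >> appends to the end of the existing file
echo "second line" >> myfile.txt
cat myfile.txt

# > again: the previous content is replaced
echo "only line" > myfile.txt
cat myfile.txt
```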
For example, to count the number of files (assuming the filenames all have an extension) you can use the ls command followed by the wc command:
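One plausible form of this pipeline is sketched below; the exact command of the original material is not shown here, so the *.* pattern (names containing a dot) and the practice files are assumptions made for the illustration.

```shell
cd "$(mktemp -d)"

# touch creates empty files; the names are made up for the exercise
touch a.txt b.txt c.csv
mkdir images

# *.* keeps only the names containing a dot (so not the images directory);
# ls writes one name per line into the pipe and wc -l counts the lines
ls *.* | wc -l
```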
There’s no limit (apart from human understanding) to the number of pipe redirections you can associate.
Here is how to count the number of files created by each user in a shared project: from the list of items (with ls -l), you can extract (with cut) the user column (that starts at around the 14th character), sort them (with sort) and then count them with the uniq command and its -c option (uniq merges identical adjacent lines, hence the sort beforehand):
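The chain can be sketched as follows. To keep the result predictable, the sketch runs on a fixed listing with made-up users (alice, bob) rather than on a real directory; count_by_owner is a hypothetical helper name, and the character position 14 only holds when the link count in the second column has a single digit.

```shell
# count_by_owner is a made-up helper: it reads `ls -l` lines on stdin,
# keeps the text from character 14 up to the next space (the owner),
# then sorts the names and counts the identical ones.
count_by_owner() {
    cut -c 14- | cut -d ' ' -f 1 | sort | uniq -c
}

# Fixed sample listing with made-up users: alice owns two files, bob one
printf '%s\n' \
    '-rw-r--r-- 1 alice team 587 avril 16 16:06 a.md' \
    '-rw-r--r-- 1 bob team 1131 avril 17 19:02 b.md' \
    '-rw-r--r-- 1 alice team 833 avril 18 11:18 c.md' |
    count_by_owner
```

On a real directory, the same helper could be fed with ls -l | tail -n +2 | count_by_owner, where tail -n +2 skips the "total" line.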
It is possible to create a new file (named my_txt_files.txt) containing a list of all files matching a given filter, for example, the extension .txt:
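A minimal sketch of such a filter combines the *.txt pattern with the > redirection. The practice files are made up, and the command shown is one plausible form rather than necessarily the one from the original material.

```shell
cd "$(mktemp -d)"

# Made-up practice files: two .txt files and one .csv file
touch notes.txt todo.txt data.csv

# *.txt matches every name ending in .txt; > stores the list in a file
ls *.txt > my_txt_files.txt
cat my_txt_files.txt
```

Note that my_txt_files.txt itself ends in .txt, so it will appear in the list if the command is run a second time in the same directory.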
Avoid spaces in file names (use “_” instead)
emiliedrouineau@is152868-2:~$ ls
'file name.txt'
emiliedrouineau@is152868-2:~$ cat file name.txt
cat: file: No such file or directory
cat: name.txt: No such file or directory
The terminal raised a cat error because it didn’t understand that “file name.txt” was a single argument. Spaces are commonly argument separators.
Thus, it reads your input as 2 separate files: file and name.txt, which don’t exist.
It’s possible to use spaces by escaping the character with a backslash (file\ name.txt) or using quotes ("file name.txt") but it’s messy and can be a source of future errors.
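The two workarounds, and the simpler fix of renaming the file, can be tried with the sketch below. It recreates a file called "file name.txt" in a temporary directory; file_name.txt is the made-up replacement name.

```shell
cd "$(mktemp -d)"

# Quotes are needed here too, to create the file with a space in its name
echo "hello" > "file name.txt"

# Both forms pass the name to cat as one single argument
cat "file name.txt"
cat file\ name.txt

# The simplest fix: rename the file with an underscore
mv "file name.txt" file_name.txt
cat file_name.txt
```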
Watch out for hidden characters when you edit files with Windows or
MacOS. Linux can be picky…
For example, although my script looks ok, I get an error when I run it:
# run the script on linux
emiliedrouineau@is152868-2:~$ ./my_script_windows.sh
bash: ./my_script_windows.sh: /bin/bash^M: bad interpreter: No such file or directory
What is /bin/bash^M? I don’t see it in my file??!
The -A option of cat will help you see all invisible characters:
emiliedrouineau@is152868-2:~$ cat -A ./my_script_windows.sh
#!/bin/bash^M$
# revomed score column from my file^M$
cut -f 1-3,6 exoBed.bed > filter_exoBed.bed^M$
# sorted file by chromosome name and start coordinates^M$
sort -k1,1V -k2,2n filter_exoBed.bed > sort_filter_exoBed.bed^M$
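Here, the end of line is not correctly encoded for Linux: each line ends with a Windows-style carriage return (displayed as ^M) before the line feed (displayed as $).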
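One common way to repair such a file is to delete the carriage returns with tr. The sketch below is an illustration under assumptions: it builds its own tiny script with Windows line endings instead of using the script shown above, and my_script_linux.sh is a made-up output name.

```shell
cd "$(mktemp -d)"

# Build a tiny script with Windows line endings (\r\n) to reproduce the problem
printf '#!/bin/bash\r\necho "it works"\r\n' > my_script_windows.sh

# tr -d '\r' deletes every carriage return; < feeds the file to tr on stdin
# and > writes the cleaned text to a new file
tr -d '\r' < my_script_windows.sh > my_script_linux.sh

# Check: no ^M left at the end of the lines, and the script now runs
cat -A my_script_linux.sh
bash my_script_linux.sh
```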
/!\ No Excel or Word formats!!!!!!! Stick to simple text formats
/!\ Avoid copy-pasting code from the web (hidden characters = danger)
Now you know:
It’s a good start.
But there’s more to the Unix world than that!
The door is open: welcome!