Introduction to bash

Last updated: 2025-02-25

ba$h0 training session

Introduction to Bash for beginners

Past & present contributions

  • Emilie Drouineau, I2BC/CEA
  • Chloe Quignot, BIOI2 - I2BC/CEA
  • Claire Toffano-Nioche, I2BC/CNRS
  • Fadwa El-Khaddar, BIOI2 - I2BC/Univ. Paris-Saclay



Source: EBAII training material from IFB

Material under CC-BY-SA licence

I2BC BIOI2 IFB

How do I chat with the computer?

Open a terminal and write things in it!

Can I speak with the computer in human language?

Use a language it understands, e.g. BASH (Bourne Again SHell) (1):

  • the BASH language is one of many closely related Shell dialects (sh, ksh, csh, zsh, …)

  • BASH is based on a set of modular commands, which perform specific tasks

  • commands are written just after the prompt. Here, the $ character symbolizes that the computer is waiting/ready for commands.

  • it’s you, the user, who will write commands after the prompt to make the computer perform tasks.

__

(1) A pun on the name of the Bourne shell (sh), an earlier Shell language written by Stephen Bourne

Prototype of a command

  • a command performs a task (sorting, selecting, opening, aligning reads, etc.)
  • a terminal instruction generally begins with the name of a command
  • it has a certain number of arguments*, which may be optional, and which can modify its mode of operation
# below, the format [xxx] indicates that xxx is an optional element
command_name [argument_name [argument_value]] [file]
  • argument names are not standardized
  • arguments may or may not take values
  • arguments may have short and/or long forms
  • /!\ spaces between the command and its arguments are seen as separators



*there’s no standard term for ‘arguments’, you may also come across the terms ‘flags’ or ‘options’

Example using the cal command (part 1)

cal (short for calendar) is a handy command to view a certain date, month or year:

⮕ without arguments

me@here:$ cal
     Avril 2024       
di lu ma me je ve sa  
    1  2  3  4  5  6  
 7  8  9 10 11 12 13  
14 15 16 17 18 19 20  
21 22 23 24 25 26 27  
28 29 30  

⮕ with a single argument with no value

cal -3 displays 3 months, centered on the current month:

me@here:$ cal -3
     Mars 2024             Avril 2024             Mai 2024        
di lu ma me je ve sa  di lu ma me je ve sa  di lu ma me je ve sa  
                1  2      1  2  3  4  5  6            1  2  3  4  
 3  4  5  6  7  8  9   7  8  9 10 11 12 13   5  6  7  8  9 10 11  
10 11 12 13 14 15 16  14 15 16 17 18 19 20  12 13 14 15 16 17 18  
17 18 19 20 21 22 23  21 22 23 24 25 26 27  19 20 21 22 23 24 25  
24 25 26 27 28 29 30  28 29 30              26 27 28 29 30 31     
31      

Example using the cal command (part 2)

⮕ with a single argument which requires a value (here, -m takes the month to display; the meaning of -m can differ between versions of cal)

me@here:$ cal -m may
      Mai 2024        
di lu ma me je ve sa  
          1  2  3  4  
 5  6  7  8  9 10 11  
12 13 14 15 16 17 18  
19 20 21 22 23 24 25  
26 27 28 29 30 31     

⮕ short vs long forms of an argument

  • often, arguments have short & long forms (more explicit/readable but longer to type…)
  • long forms are generally preceded by two dashes

e.g. instead of cal -3, we can use cal --three

me@here:$ cal --three
     Mars 2024             Avril 2024             Mai 2024        
di lu ma me je ve sa  di lu ma me je ve sa  di lu ma me je ve sa  
                1  2      1  2  3  4  5  6            1  2  3  4  
 3  4  5  6  7  8  9   7  8  9 10 11 12 13   5  6  7  8  9 10 11  
10 11 12 13 14 15 16  14 15 16 17 18 19 20  12 13 14 15 16 17 18  
17 18 19 20 21 22 23  21 22 23 24 25 26 27  19 20 21 22 23 24 25  
24 25 26 27 28 29 30  28 29 30              26 27 28 29 30 31     
31      

How to get help?

Call the police, call your colleagues, search the Internet… or use the man command (manual)

me@here:~$ man cal
CAL(1)                                     User Commands                       CAL(1)

NAME
       cal - display a calendar

SYNOPSIS
       cal [options] [[[day] month] year]

DESCRIPTION
       cal  displays  a  simple calendar.  If no arguments are specified, the current
       month is displayed.

OPTIONS
       -1, --one
              Display single month output.  (This is the default.)

       -3, --three
              Display prev/current/next month output.

...

SYNOPSIS explains how to write the command line; optional elements are written between [..]
DESCRIPTION describes the result of the command
OPTIONS lists the available arguments, with their short and long forms

Shortcuts for the man interface:

  • navigate with your keyboard arrows (↑ & ↓)
  • /color: to search for the term color
  • n: (next) to search for the next occurrence of the term searched for
  • N (Shift+n): (previous) to search for the previous occurrence of the term searched for
  • q: to quit help

__

*Custom programmes/commands or scripts often have the -h or --help arguments to print help messages on how to use them.

Focus on the ls command and its arguments

The tasks requested of the computer are often applied to data. This data is usually contained in files, and these files are in turn stored in folders. To find out how files and directories are arranged, we use the ls command that lists the contents of directories.

This ls command can take a number of arguments.

Among the main arguments:

  • -l (long/lots) gives a lot of information about files
  • -a (--all) shows all files, including hidden ones
  • -t (time) sorts by modification date
  • -h (--human-readable) displays file sizes in readable units
  • -r (--reverse) reverses sort order

Notes:

  • names of hidden files & folders begin with a dot (e.g. .git)
  • . and .. directories are special (detailed later)
  • pay attention to the spaces between the command and its arguments. The ls-l command does not exist!

⮕ Arguments can be combined: ls -l --all

me@here:~/formation_bash0$ ls -l --all
total 56
drwxr-xr-x 4 me tous 4096 avril 18 11:18 .
drwxr-xr-x 4 me tous 4096 avril 17 18:51 ..
-rw-r--r-- 1 me tous  587 avril 16 16:06 bash0_chatTerm.md
-rw-r--r-- 1 me tous 1131 avril 17 19:02 bash0_FindingHelp.md
-rw-r--r-- 1 me tous  833 avril 18 11:18 bash0_zoomLS.md
drwxr-xr-x 8 me tous 4096 avril 18 11:08 .git
drwxr-xr-x 2 me tous 4096 avril 16 15:14 images

⮕ Arguments can be merged (in short format): ls -lahtr

for a complete (-a) and detailed view (-l), sizes in KB,MB,GB,TB… i.e. human readable (-h), sorted by date/time (-t) from oldest to most recent (-r):

me@here:~/formation_bash0$ ls -lahtr
total 56K
drwxr-xr-x 2 me tous 4.0K avril 16 15:14 images
-rw-r--r-- 1 me tous  587 avril 16 16:06 bash0_chatTerm.md
drwxr-xr-x 4 me tous 4.0K avril 17 18:51 ..
-rw-r--r-- 1 me tous 1.2K avril 17 19:02 bash0_FindingHelp.md
drwxr-xr-x 8 me tous 4.0K avril 18 11:08 .git
drwxr-xr-x 4 me tous 4.0K avril 18 11:18 .
-rw-r--r-- 1 me tous  833 avril 18 11:18 bash0_zoomLS.md

Filesystems are like trees

General information

  • The filesystem can be compared to a tree where leaves are directories and files. We can go through it by following the branches.
  • The tree is anchored by the root: the / directory

When we go up in the tree (=down in the hierarchy) by following the branches, we can see that the / (root) contains multiple directories (e.g. shared)

The shared directory contains bank

The bank directory contains homo_sapiens

Thus, the path to the homo_sapiens directory from the root is: /shared/bank/homo_sapiens

Other useful command - tree:

login@mylaptop:~$ tree -d
.
├── GRCh38
│   ├── fasta
│   └── gff3
├── hg19
│   ├── bwa
│   ├── fasta
│   ├── gtf
│   └── star-2.7.5a
├── hg38
│   ├── fasta
│   └── star-2.7.5a
└── latest_genome -> GRCh38

Home sweet home

There’s no better place than home!

  • It’s the user’s directory & it stores all of the user’s documents
  • It’s symbolized by ~ (tilde)*
  • Most of the time it’s /home/userName (but it may vary according to the infrastructure you’re on e.g. on the IFB cluster, it’s: /shared/home/userName)
# absolute path
cd /home/userName
# short way for the same result
cd ~
# or
cd



__

*~ for Mac users: option + N or Alt + N

Autocompletion (<TAB><TAB>) - your new best friend!

If you want to shine in society or with your family by giving the impression of typing quickly, use auto-completion!

More seriously:

  • it’s essential for typing a path without making mistakes
  • it also saves time because you won’t need to type every single letter

E.g. try moving into the directory: /usr/local/bin using <TAB><TAB>

Where can I find the tab key?

Rest assured, you haven’t heard the last of the <TAB> key!

How to create, copy, remove files or directories

It’s important to organise files and directories to easily find data. 4 commands are useful:

  • mkdir (make directories) : to create a directory
# read the documentation
man mkdir
# create a directory named my_new_dir
mkdir my_new_dir
# check if the directory was created
ls
  • cp (copy) : to copy files and directories (/!\ to copy a directory, you need to add an option)
# read the documentation
man cp
# create a copy of a file
cp gameshell.sh copy_gameshell.sh
# create a copy of a directory
cp --recursive dir0/ dir1/

  • mv (move) : to move a file to another directory or rename it
# read the documentation
man mv
# rename a file
mv my_file_with_a_long_useless_name_i_want_to_change.txt my_file.txt
# put my_file.txt in the directory called my_dir
mv my_file.txt my_dir/
  • rm (remove) : to remove files or directories. Warning: it’s easy to remove more files/directories than planned. To be sure, you can run the ls command with the same arguments beforehand to check that it lists only what you want to remove.
# read the documentation
man rm
# remove a file
rm my_file_with_a_long_useless_name_i_want_to_change.txt
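
The four commands above can be tried safely in a throw-away directory. The following is only a sketch: the file and directory names are made up for illustration, and it uses two commands not introduced above, mktemp -d (creates a temporary directory) and touch (creates an empty file). Keep in mind that rm deletes for good: there is no recycle bin in the terminal.

```shell
# work in a throw-away directory so that nothing important can be deleted
sandbox=$(mktemp -d)
cd "$sandbox"

# create a directory and an empty file
mkdir my_dir
touch my_file.txt

# copy the file, then move the copy into my_dir
cp my_file.txt copy_my_file.txt
mv copy_my_file.txt my_dir/

# check what is about to be removed before removing it
ls my_dir
# -r (recursive) is needed to remove a directory and its contents
rm -r my_dir
# -i asks for confirmation before each removal; uncomment to try it
# rm -i my_file.txt
rm my_file.txt

# the sandbox should now be empty (only . and .. are listed)
ls -a
```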

Know your rights!

Sometimes an error message may appear saying that you are “not authorised” to perform an action…

In Linux, your rights are dictated by three letters r, w and x:

  • r: read, right to read the file and open it
  • w: write, right to write and modify a file
  • x: execute, right to execute the file (a script, for example)

How do I know the rights of a file?

Remember ls -l? => ls to list the files of a folder, -l argument to get more information on the files

me@here:~/formation_bash0$ ls -l
total 56
-rw-r--r-- 1 me tous  587 avril 16 16:06 bash0_chatTerm.md
-rw-r--r-- 1 me tous 1131 avril 17 19:02 bash0_FindingHelp.md
-rw-r--r-- 1 me tous  833 avril 18 11:18 bash0_zoomLS.md
drwxr-xr-x 2 me tous 4096 avril 16 15:14 images

When you type the ls -l command, you may notice that some lines start with a d (=directory) and others with a - (=file).

Note also that the rwx pattern is repeated three times, with a - replacing each right that is not granted. The first triplet corresponds to the rights held by the owner of the file, the second corresponds to the rights allocated to users in the same group as the owner of the file and the last corresponds to the rights of all other users.
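
To see the triplets change, you can experiment on a file of your own. This is a minimal sketch, assuming a made-up script named hello.sh; it relies on chmod (change mode), a command that is not covered in this session, and on mktemp -d to create a temporary directory. The exact rights shown by the first ls -l depend on your system's defaults.

```shell
workdir=$(mktemp -d)
cd "$workdir"
# create a tiny script
echo 'echo "hello"' > hello.sh
# look at the rights: typically -rw-r--r-- (no x, so not executable)
ls -l hello.sh
# give the owner (u = user) the execute right (x)
chmod u+x hello.sh
# the first triplet should now end with an x, e.g. -rwxr--r--
ls -l hello.sh
```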

Accessing file contents

We’re often interested in the content of files: reading files, counting the number of lines, extracting a part (lines, columns), sorting lines, etc. Warning: some commands cannot access compressed files.

Counting the number of lines

  • wc (word count): count the number of lines, words and bytes of file(s)
# -l : count the number of lines in my_file.txt
wc -l my_file.txt
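
A small test file makes the counts easy to check by eye. The sketch below builds one with printf (prints text, \n marking the end of a line); the file content and the temporary directory created with mktemp -d are only illustrative.

```shell
workdir=$(mktemp -d)
# write a small file of 3 lines and 6 words
printf 'first line\nsecond line\nthird line\n' > "$workdir/my_file.txt"
# -l : number of lines, followed by the file name
wc -l "$workdir/my_file.txt"
# -w : number of words
wc -w "$workdir/my_file.txt"
# without arguments: lines, words and bytes, in that order
wc "$workdir/my_file.txt"
```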

Read a file

  • With less or more you can read a file line by line (pager tool)
  • With head or tail you can visualize the first n lines or the last n lines of a file
  • With cat you print the full contents of the file
  • To navigate in the file when you are using less, the shortcuts are the same as for the man command:
commands    results
↑ or ↓      go up or down in the file
< or >      go to the first or last line
/chr        then press Enter to search for the term chr
n or N      go to the next or previous occurrence of chr
q           quit
# read a file 
# -N : display the line number
# -S : don't wrap lines even if they are longer than the screen (arrows to navigate)
less -S -N my_file.txt
# print the first five lines
head -n 5 my_file.txt
# print the last eleven lines
tail -n 11 my_file.txt
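
With a file of numbered lines, it is easy to see which lines head and tail select. A minimal sketch, assuming the seq command is available (it prints a sequence of numbers, one per line) and using a made-up file name:

```shell
workdir=$(mktemp -d)
# store the numbers 1 to 20, one per line, in a file
seq 1 20 > "$workdir/numbers.txt"
# first five lines: 1 to 5
head -n 5 "$workdir/numbers.txt"
# last three lines: 18 to 20
tail -n 3 "$workdir/numbers.txt"
# whole file at once
cat "$workdir/numbers.txt"
```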

Select/remove a column in a file

If you have a tabular file format (e.g. .csv, .tsv, …) as below:

# file with chromosome, start, stop, name, score, strand
chr1    7517    7517    NM_023732__Abcb6    .   -
chr10   1826    1826    NM_019487__Hebp2    .   -
chrX    1494    1494    NONE    .   -
chrY    3470    3470    NA  .   +
chr11   3054    3054    NA  .   -
chr2    1929    1929    SITE    .   +
  • cut : keeps only the selected columns of each line of files (and thereby removes the others)

E.g. if you want to remove the name and the score columns of the file above, i.e. keep columns 1 to 3 and 6:

# keep columns 1 to 3 and 6
cut -f 1-3,6 myf.tsv
chr1    7517    7517    -
chr10   1826    1826    -
chrX    1494    1494    -
chrY    3470    3470    +
chr11   3054    3054    -
chr2    1929    1929    +

If the separator is not a \t (tabulation), you can change it with the option -d.
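
For instance, with a comma-separated file the -d option could be used as sketched below. The file genes.csv and its content are invented for the illustration.

```shell
workdir=$(mktemp -d)
# a small comma-separated file with the columns: chromosome, start, name
printf 'chr1,7517,Abcb6\nchr10,1826,Hebp2\n' > "$workdir/genes.csv"
# -d ',' : use the comma as separator
# -f 1,3 : keep columns 1 and 3
cut -d ',' -f 1,3 "$workdir/genes.csv"
```

The selected columns are printed with the same separator as in the input file.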

Sort a file

  • sort : sort lines of text files
# for help on sort usage
man sort

E.g. if you want to sort the previous file by chromosome (first column in myf.tsv):

# numeric               |  # alphanumeric         |   # version
# sort -k1,1n myf.tsv   |  # sort -k1,1d myf.tsv  |   # sort -k1,1V myf.tsv
chr10    1826  1826 -   |  chr1   7517  7517  -   |   chr1   7517  7517  -
chr11    3054  3054 -   |  chr10  1826  1826  -   |   chr2   1929  1929  +
chr1     7517  7517 -   |  chr11  3054  3054  -   |   chr10  1826  1826  -
chr2     1929  1929 +   |  chr2   1929  1929  +   |   chr11  3054  3054  -
chrX     1494  1494 -   |  chrX   1494  1494  -   |   chrX   1494  1494  -
chrY     3470  3470 +   |  chrY   3470  3470  +   |   chrY   3470  3470  +
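
Several sort keys can be given one after the other: lines that are equal on the first key are then ordered by the second. The sketch below assumes a sort that supports the -V option (as GNU sort does) and uses a made-up two-column file:

```shell
workdir=$(mktemp -d)
# a small tab-separated file (\t) with two columns: chromosome and position
printf 'chr2\t100\nchr10\t5\nchr2\t20\nchr1\t7\n' > "$workdir/positions.tsv"
# -k1,1V : sort on column 1 only, in version order (chr2 before chr10)
# -k2,2n : lines with the same chromosome are then sorted on column 2, numerically
sort -k1,1V -k2,2n "$workdir/positions.tsv" > "$workdir/sorted_positions.tsv"
# expected order: chr1 7, chr2 20, chr2 100, chr10 5
cat "$workdir/sorted_positions.tsv"
```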

Input, Output, and Error Streams

Some commands work on data either typed in by the user or read from a file. This data is referred to as the standard input stream, or stdin.

Similarly, some commands produce data that is either displayed on screen (e.g. the ls command) or written to a file. This is referred to as the standard output stream, or stdout.

There is a 3rd standard stream, the standard error stream or stderr, which carries error messages. By default, it is also displayed on the screen (like the output stream). In addition, under Unix, every command returns a number when it ends: “0” if it ended without an error, otherwise a non-zero number indicating the error code.
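
The difference between the streams and the returned number can be observed with ls. This is only a sketch with made-up file names: it uses the special variable $? (the number returned by the last command), the || operator (runs what follows only if the previous command failed) and the > operator, which stores stdout in a file.

```shell
workdir=$(mktemp -d)
touch "$workdir/existing_file.txt"

# successful command: the listing goes to stdout and the returned number is 0
ls "$workdir"
echo "exit code: $?"

# failing command: stdout is stored in out.txt, yet the error message
# still appears on screen because it travels on stderr, a separate stream
status=0
ls "$workdir/missing_file.txt" > "$workdir/out.txt" || status=$?
# the returned number is not 0 (its exact value depends on the command)
echo "exit code: $status"
```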

Standard streams (arrows)

Stream redirections

When the result of a command is needed by a subsequent command, it can be sent to a file, or directly to the next command, rather than displayed on the screen.
This is known as stream redirection.

E.g. below, the output stream of command 1 becomes the input stream of command 2, thanks to the pipe redirection operator (the | character in the command line*):

redirection with pipe
  • Some other redirection operators:
    • > myfile.txt: stores the stdout stream by creating (or overwriting) the myfile.txt file
    • >> myfile.txt: stores the stdout stream by adding lines to the myfile.txt file

*|: Shift+\ on Mac keyboards, AltGr+6 on Windows
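
The difference between > and >> is easiest to see by writing to the same file several times. A minimal sketch, using a made-up file named log.txt in a temporary directory:

```shell
workdir=$(mktemp -d)
# > creates the file, or overwrites it if it already exists
echo "first run" > "$workdir/log.txt"
echo "second run" > "$workdir/log.txt"
# the file holds only "second run"
cat "$workdir/log.txt"
# >> adds lines at the end instead of overwriting
echo "third run" >> "$workdir/log.txt"
# the file now holds "second run" then "third run"
cat "$workdir/log.txt"
# | passes the output of cat directly to wc, without any intermediate file
cat "$workdir/log.txt" | wc -l
```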

Redirection examples

Example 1 - Count the number of files

For example, to count the number of files (assuming the filenames all have an extension) you can use the ls command followed by the wc command:

ls *.* | wc -l

There’s no limit (apart from human understanding) to the number of pipe redirections you can chain.

Example 2 - Count number of files per user

Here is how to count the number of files owned by each user in a shared project: from the list of items (with ls), you can extract (with cut) the user column (that starts at around the 14th character), sort them (with sort) and then count them with the uniq command and its -c option (uniq merges identical lines that follow each other, which is why the list is sorted first):

ls -lah *.* | cut -c 14-20 | sort | uniq -c
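
The character positions 14-20 depend on the width of the columns printed by ls, which varies with the length of the user names. One possible alternative, sketched here under the assumption that the owner is the 3rd column of the ls -l output, is to cut on fields instead of characters. It uses tr -s ' ' (a command not introduced above) to squeeze runs of spaces into a single one; the three files are created only for the demonstration.

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch a.txt b.txt c.tsv
# tr -s ' ' : replace each run of spaces by a single space
# cut -d ' ' -f 3 : keep the 3rd space-separated column (the owner)
ls -l *.* | tr -s ' ' | cut -d ' ' -f 3 | sort | uniq -c
```

Since the same user created the three files, a single line is expected, with the count 3 followed by the user name.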

Example 3 - File inventory

It is possible to create a new file (named my_txt_files.txt) containing a list of all files that match a given filter, for example the extension .txt:

ls *.txt > my_txt_files.txt

Best practices

File naming!

  1. Avoid spaces (they are allowed, but using “_” instead is encouraged)
  2. Keep names concise (<30 characters if possible)
  3. Use ISO 8601 formatted dates (YYYYMMDD or YYYY-MM-DD)
  4. Avoid special characters, such as: é è ç ~ ! @ # $ % ^ & * ( ) ; : < > ? . , { } ’ ” |

Example dealing with file names with spaces

me@here:~$ ls
'file name.txt'
me@here:~$ cat file name.txt
cat: file: No such file or directory
cat: name.txt: No such file or directory

The terminal raised a cat error because it didn’t understand that “file name.txt” was a single argument. Spaces are commonly argument separators.
Thus, it reads your input as 2 separate files: file and name.txt, which don’t exist.

It’s possible to use spaces by escaping the character with a backslash (file\ name.txt) or using quotes ("file name.txt") but it’s messy and can be a source of future errors.
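
The two workarounds, and the simpler option of renaming the file, can be tried in a temporary directory. A minimal sketch with made-up content:

```shell
workdir=$(mktemp -d)
cd "$workdir"
# create a file whose name contains a space (the quotes keep the name in one piece)
echo "some text" > "file name.txt"
# both forms below read the single file named: file name.txt
cat "file name.txt"
cat file\ name.txt
# renaming it with an underscore avoids the problem altogether
mv "file name.txt" file_name.txt
cat file_name.txt
```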

Warning with Windows/MacOS files

Watch out for hidden characters when you edit files with Windows or MacOS. Linux can be picky…

For example, although my script looks ok, I get an error when I run it:

# run the script on linux
me@here:~$ ./my_script_windows.sh
bash: ./my_script_windows.sh: /bin/bash^M: bad interpreter: No such file or directory

What is /bin/bash^M? I don’t see it in my file??!

The -A option of cat will help you see all invisible characters:

me@here:~$ cat -A ./my_script_windows.sh
#!/bin/bash^M$
# removed score column from my file^M$
cut -f 1-3,6 exoBed.bed > filter_exoBed.bed^M$
# sorted file by chromosome name and start coordinates^M$
sort -k1,1V -k2,2n filter_exoBed.bed > sort_filter_exoBed.bed^M$

Here, each line ends with a Windows-style carriage return (shown as ^M) before the Linux end of line (shown as $), and Linux does not expect it.
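
One possible way to repair such a file is to delete the carriage return characters (written \r) with tr. The sketch below first builds a small file with Windows line endings to experiment on; the name my_script_unix.sh is made up, and < (which feeds a file to the command as its stdin) is used because tr only reads from stdin. Where it is installed, the dos2unix tool does the same job.

```shell
workdir=$(mktemp -d)
cd "$workdir"
# build a small file with Windows line endings (\r\n)
printf '#!/bin/bash\r\necho "hello"\r\n' > my_script_windows.sh
# ^M$ at the end of each line reveals the carriage returns
cat -A my_script_windows.sh
# tr -d '\r' deletes every carriage return; the result goes to a new file
tr -d '\r' < my_script_windows.sh > my_script_unix.sh
# the lines now end with $ only
cat -A my_script_unix.sh
```

Writing to a new file rather than to the original one matters here: redirecting the output with > to the file being read would empty it before tr could read it.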

/!\ No Excel or Word formats!!!!!!! Stick to simple text formats
/!\ Avoid copy-pasting code from the web (hidden characters = danger)

Conclusion

Now you know:

  • how to navigate the file system in the unix world
  • several commands to access file contents
  • that a succession of bash commands can be used to compose more complex tasks

It’s a good start (see FAQ if you’re lost)

But there’s more to the Unix world than that!

  • we’ve only mentioned the most common bash commands, but there are many more (a useful cheat sheet of basic Unix commands)
  • you can also design your own commands: this is called programming. It can be done in bash, but there are many other programming languages (python, R, C, etc.)
  • you can also install programs written by others. These are often referred to as packages

The door is open: welcome!

End of session

Your sessions will stay accessible for another 7 days.

You can continue your gameshell session on:

  • your PC if you have Linux (not sure if it works on Windows or Mac)
  • the I2BC cluster

All you need is to copy your gameshell-save.sh file.

I2BC cluster option

To copy your file to the I2BC cluster from your laptop:

scp gameshell-save.sh your.login@passerelle.i2bc.paris-saclay.fr:/store/USERS/your.login/

To connect to the I2BC cluster from your laptop:

# connect to passerelle
ssh your.login@passerelle.i2bc.paris-saclay.fr
# then from passerelle to cluster
ssh your.login@cluster-i2bc.calcul.i2bc.paris-saclay.fr

To run gameshell on the cluster:

cd /store/USERS/your.login/
bash gameshell-save.sh

Up-coming training sessions

See the BIOI2 webpage

  • First steps with the I2BC cluster
  • First steps with Snakemake
  • Learning how to manipulate & interpret AlphaFold results