This is the common name for the application that gives you text-based access to the operating system of the computer. Basically, it allows you to type input into the computer so that you can receive output from the programs you call. On Unix based machines there is always a terminal
program.
Some brief tips before we move further:
You will need to frequently resize this window as we do the hands-on activities. This is done as with any other window.
You can change the size of the text by holding command + shift and hitting +, or by holding command and hitting the - key.
You can have multiple terminal windows open at the same time.
You can also have multiple tabs open in each terminal window.
Everything you type in the terminal is case sensitive. grep -F
is not the same as grep -f
. This will be a very important thing to remember when using the terminal and creating files or folders.
Often with bioinformatics analysis, you will be performing tasks on a remote network server, one much more powerful than your workstation. Let's use the terminal program you just learned about to connect with this server. We will use Secure SHell (SSH) protocol to talk to the remote server. In the following example, replace username
with the user name provided to you.
Two-Factor Authentication
To improve computer security, the University of Wyoming requires two-factor authentication to grant access to network servers. Two-factor means a password plus a second type of authentication. You all have YubiKeys (a device resembling a USB flash drive) that provide the token needed for this.
Type the command below and press the ENTER
key. This will bring up the password prompt.
$ ssh username@mtmoran.uwyo.edu
TWO-FACTOR AUTHENTICATION
=============================================================================
This system requires two-factor authentication.
The password requirement is your UWYO domain password.
The token can be generated by your registered YubiKey or manually input with
the Duo mobile app. If you have questions about using this implementation of
two-factor authentication, contact the ARCC team at arcc-info@uwyo.edu
Please enter the two-factor password the in the form:
<password>,<token>
=============================================================================
$ wyoinbre,<Press YubiKey Gold Button>
~
is a shortcut for your home directory (/home/username
) on the remote server. [username@mmmlog1 ~]$
control + shift + n
), new tabs (control + shift + t
), and resizing this window.
As you will learn, everything in Linux is relative to where you are in the file system. Therefore, knowing where you are before launching a command is valuable information. Luckily, there are built-in commands for this type of information. Understanding the location of files will be a key part of success.
To find out where you are in the file system type pwd
in the terminal window. This will return your current working directory (print working directory)
As you can see above the working directory is: /project/inbre-train/username
The present directory is represented as .
(dot) and the parent directory is represented as ..
(dot dot)
Directories are separated by forward slashes /
. Together they make up path
. /project/inbre-train/username/
is the path to your home directory.
Linux files are arranged in a hierarchical structure, or a directory tree. From the root directory (/
) there are many subdirectories. Each subdirectory can contain files or other subdirectories, etc., etc. Whenever you're using the terminal you will always be 'in' a directory. The default behavior of opening a terminal window, or logging into a remote computer, will place you in your ‘home’ directory. This is true when we logged into MtMoran. The home directory contains files and directories that only you can modify, we will get to those permissions later.
To see what files, or directories, you have in your home directory we will use the ls
command.
Type ls
and hit enter
You should see the list of files and directories in your current folder.
After the ls
command finishes it produces a new command prompt that is ready for your next command.
The ls
command can be used to list the contents of any directory not necessarily just the one you are currently in.
Type ls LearnLinux
To move between directories (folders) we use the cd
(change directory) command. We are currently in our home directory. Lets move to /project/inbre-train/username/LearnLinux
. The cd
command uses following syntax:
$ cd DIRECTORY
$ cd /project/inbre-train/username/LearnLinux
Type ls
Type pwd
You can see that using the cd
command moved us to a different directory.
By typing ls
we can see that there is different stuff in this directory.
Finally, using pwd
shows us what directory we are now located in.
We could have done the previous example in separate steps
Type cd /project
Type cd inbre-train/username
Type cd LearnLinux
Note that we needed to type /project
but not /username
. When using a /Directory
you are specifying a directory that is directly below the root directory. Without the leading /
the system looks below the current directory.
Type cd project
Type cd /project
What happened with the first command?
You will frequently need to move up a level to a parent directory. Remember that two dots ..
are used to represent the parent directory. Every directory has a parent except the root level.
Type cd ..
Type pwd
You can move multiple levels at the same time
Type cd /project/inbre-train/username/LearnLinux
Type cd ../..
Type pwd
When using cd
everything is relative to your current location. However you can always use the absolute location to change directories. Lets move into the Code
directory and look at two ways to switch to the Data
directory.
Type cd /project/inbre-train/username/LearnLinux/Code
Option 1: Type cd ../Data
Type pwd
Type cd /project/inbre-train/username/LearnLinux/Code
Option 2: Type cd /project/inbre-train/username/LearnLinux/Data
Type pwd
As you can see, both options get us to the same place, but Option 1 only works from a directory that sits next to Data, that is, one sharing the same parent directory (such as Code). Option 2 will work from any location on the machine.
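The two options can be tried safely in a scratch directory. This is a minimal sketch: the directory created by mktemp -d and the variable names are stand-ins chosen for illustration, not the course's /project layout.

```shell
# Build a throwaway LearnLinux/ tree (hypothetical stand-in for the course directory)
top=$(cd "$(mktemp -d)" && pwd -P)
mkdir -p "$top/LearnLinux/Code" "$top/LearnLinux/Data"

# Option 1: relative path, resolved from the current directory (Code)
cd "$top/LearnLinux/Code"
cd ../Data
relative_result=$(pwd -P)

# Option 2: absolute path, works from any starting point
cd /
cd "$top/LearnLinux/Data"
absolute_result=$(pwd -P)

echo "$relative_result"
echo "$absolute_result"
```

Both echo lines print the same path, which is the point: the relative form is shorter to type, the absolute form does not care where you start.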
Creating directories in Unix is done with the mkdir
(make directory) command.
$ mkdir DirectoryName
Using spaces when naming directories, like on your desktop, is not advised in the Unix file system. This is why you see the use of _
in place of spaces. You can escape a space in Unix but it creates unnecessary typing and can create issues executing certain programs. Generally, using spaces in file and directory names is something to avoid.
Type cd /project/inbre-train/username/LearnLinux
Type mkdir Work
Type ls
Type mkdir Temp1
Type cd Temp1
Type mkdir Temp2
Type cd Temp2
Type pwd
In the previous example we created two temporary directories, but it took two steps. We could have done this in one step by adding an option/flag to the mkdir command.
Type cd ..
Type mkdir -p Temp1.1/Temp2.1
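To see what the -p flag changes, the sketch below runs mkdir with and without it inside a scratch directory. The scratch directory and the plain_mkdir variable are assumptions made for this illustration.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# Without -p, mkdir refuses because the parent Temp1.1 does not exist yet
if mkdir Temp1.1/Temp2.1 2>/dev/null; then
    plain_mkdir="succeeded"
else
    plain_mkdir="failed"
fi
echo "mkdir without -p: $plain_mkdir"

# With -p, missing parent directories are created along the way
mkdir -p Temp1.1/Temp2.1
ls -R
```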
In this section you will learn the basics of making files and putting things into those files. There are a variety of ways we can accomplish this as Unix has built in multiple editors for these tasks. We will review a few here.
$ touch FILENAME
This will create a new, empty file.
$ nano FILENAME
This is a built in text editor that will allow us to put information into a file.
LearnLinux/
$ touch earth.txt
$ touch heaven.txt
Type ls
We have now created two empty files called earth.txt
and heaven.txt
Type cd Work
Type touch basic_info.txt
Type nano basic_info.txt
We are now using an internal text editor that we can use to alter the contents of this file.
Add your name, email address, and favorite food to this file on separate lines.
Press control + x (^x)
to exit and then y
to save the file
$ nano onestep.txt
To move a file or directory the mv
(move) command is used. This is the first command we have used that requires two arguments. You need to specify the source and the destination for the moving.
$ mv SOURCE DESTINATION
Lets move heaven.txt
and earth.txt
$ cd /project/inbre-train/username/LearnLinux
$ mv heaven.txt Work/
$ mv earth.txt Work/
$ ls
$ ls Work/
We could have moved these files all at once using wildcards. An asterisk *
means match anything.
$ mv *.txt Work/
#This will move any file that ends with .txt
$ mv *t Work/
# This moves any file or directory that ends with a t
$ mv *ea* Work/
#This works because only heaven and earth contain ‘ea’
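A handy habit before moving files with a wildcard is to preview what the pattern expands to with echo. The sketch below assumes a scratch directory and one extra, hypothetical file (notes.md) so that not everything matches.

```shell
scratch=$(mktemp -d)
cd "$scratch"
mkdir Work
touch earth.txt heaven.txt notes.md

# echo shows what each pattern expands to without moving anything
echo *.txt      # earth.txt heaven.txt
echo *ea*       # earth.txt heaven.txt
txt_matches=$(echo *.txt)

# Once the preview looks right, do the real move
mv *.txt Work/
ls Work/
```

The shell expands the wildcard before mv ever runs, so echo and mv see exactly the same list of names.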
$ mv
can also be used to rename files
$ touch rags
$ ls
$ mv rags Work/riches
$ ls Work/
Here we moved and renamed the file rags
to Work/riches
We can rename it without moving the file
$ mv Work/riches Work/rags
$ ls Work/
The mv
command is also used to rename files or directories.
To copy a file or directory the cp
(copy) command is used. Just like mv, you will need a source and a destination to copy something.
$ cp SOURCE DESTINATION
Copying files is similar to moving them
$ cd /project/inbre-train/username/LearnLinux/Work
$ mkdir Copy
$ cd Copy
$ touch file1
$ cp file1 file2
$ ls
Remember we do not have to be in a directory to make, move, or copy files.
$ touch /project/inbre-train/username/file3
$ ls
$ cp /project/inbre-train/username/file3 .
# here we represent the current directory with a .
(dot)
$ ls
The cp
command can also copy directories using a flag
$ mkdir Example
$ mv file* Example/
$ ls
$ cp -r Example/ Example2
$ ls Example Example2/
What happens if you do not use the -r
flag?
$ cp Example2/ Example3
The error occurs because the -r
flag means copy recursively. Without -r, cp refuses to copy a directory at all (whether or not it is empty), because it does not descend into Example2 to copy the files inside it.
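The difference can be checked in a scratch directory. In this sketch the directory names Example_plain and Example2, and the plain_cp variable, are chosen only for illustration.

```shell
scratch=$(mktemp -d)
cd "$scratch"
mkdir Example
touch Example/file1 Example/file2

# Plain cp refuses to copy a directory and reports an error
if cp Example Example_plain 2>/dev/null; then
    plain_cp="succeeded"
else
    plain_cp="failed"
fi
echo "cp without -r: $plain_cp"

# -r copies the directory and everything inside it
cp -r Example Example2
ls Example2
```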
To view the contents of directories we use the ls
(list segments) command.
$ ls DIRECTORY
If no directory is provided ls will list the contents of the current directory.
We have been using ls
frequently to check directory contents. However, there are many options for using ls
. As the previous example noted we can use ls on multiple directories at the same time.
$ ls -l /project/inbre-train/username/LearnLinux
$ ls -p /project/inbre-train/username/LearnLinux
$ ls -othr /project/inbre-train/username/LearnLinux
#Nic’s favorite!!
As you can see these flags/options change the way ls
displays the contents of the directory, giving us more or less information.
Notice the changes that the -p
or the combination of -o
, -t
, -h
, and -r
flags makes to the output.
Caution: This is a dangerous command. File/Folder deletion in Unix is permanent and nonreversible.
If you run ls
on your LearnLinux/Work/
directory, it is likely full of lots of empty files and directories by this point. Wouldn't it be nice if there were a way to clean that up? Of course there is a way, however it can be dangerous. To delete directories and files from the system we have two options: the rm
(remove) and rm -r
commands.
$ rm FILE
One more time just to be clear. It is possible to delete EVERY file you have ever created with the rm
command. Thankfully there is a way to make rm a bit safer, and on DT2, this is the default setting. Using the -i
flag, rm will ask for confirmation before deleting anything.
$ cd /project/inbre-train/username/LearnLinux/Temp1
$ ls
If you remember the Temp2
directory is empty therefore we can use rm -r
to delete it.
$ rm -r Temp2/
$ ls
We can now move up a level and remove Temp1
$ cd ..
$ rm -r Temp1/
$ ls
Now try $ rm Temp1.1
See the error message letting us know that we are trying to remove a directory not a file.
See? Linux is warning us. So back to the recursive flag:
$ rm -r Temp1.1
Note the default behavior of rm
is to simply delete without asking for confirmation of what you typed (except for directories, of course). This is why it is so dangerous.
You can have rm ask for confirmation before deleting anything using the -i flag (no one does this in practice though).
$ mkdir -p Temp1/Temp2/Temp3/Temp4
$ cp -r Temp1 Temp1.1
$ ls
$ rm -r Temp1
$ rm -ir Temp1.1
$ ls
There are various commands available to display/print the contents of a file. The default of all these commands is to display the contents of the file on the terminal. These commands are less
, cat
, head
, and tail
.
$ less FILENAME
Displays file contents on the screen with line scrolling (to scroll you can use 'arrow' keys, 'PgUp/PgDn' keys, 'space bar' or 'Enter' key). Press 'q' to exit.
$ cat FILENAME
Simplest form of displaying contents. It concatenates the named files and prints their entire contents to the screen. In the case of large files, the entire file will scroll past on the screen without pausing.
$ head FILENAME
Displays only the first 10 lines of a file by default. Any number of lines can be displayed with the -n
flag followed by the number of lines.
$ tail FILENAME
As the name implies, this is the opposite of head: it displays the last 10 lines. Again, the -n
option can be used to change this.
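The defaults and the -n flag are easy to verify on a small file whose contents you know. The sketch below assumes a hypothetical file numbers.txt holding the numbers 1 to 25, one per line.

```shell
scratch=$(mktemp -d)
cd "$scratch"
# Hypothetical sample file with 25 numbered lines
seq 1 25 > numbers.txt

head numbers.txt            # lines 1 to 10
tail numbers.txt            # lines 16 to 25

# -n changes how many lines are shown; paste joins them onto one line for display
first_three=$(head -n 3 numbers.txt | paste -sd ' ' -)
last_two=$(tail -n 2 numbers.txt | paste -sd ' ' -)
echo "$first_three"         # 1 2 3
echo "$last_two"            # 24 25
```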
$ cd /project/inbre-train/username/LearnLinux/
$ less Data/Arabidopsis/At_proteins.fasta
Try this: type =
. This is a large file. You can see at the bottom, less
displays that we are looking at lines 1-32 of 269,463 and we are 0% through the file.
We can use h
to get help commands for less.
Page forward using 'space', move a line at a time with j
(forward) or k
(backward) or N
lines.
Hit q
to exit the help
Navigate around using the various commands
Try hitting j
ENTER
100
ENTER
Press q
when ready to exit less.
Navigate the file using the more command, press q
to exit.
$ cat
is the simplest way of viewing a file. The cat command
prints all of the file to the screen from start to finish.
$ cat Data/Arabidopsis/At_genes.gff.short
Did you get all of that?
$ cat
is most useful when combined with other commands using |
(pipes). We will cover this later.
The last two commands head
and tail
are fantastic when you need to look at a file and make sure things are in order.
$ head Data/GenBank/E.coli.genbank
$ head Data/GenBank/Y.pestis.genbank
We can change how many lines we see using the -n
flag
$ head -n 1 Data/GenBank/E.coli.genbank Data/GenBank/Y.pestis.genbank
$ tail Data/GenBank/E.coli.genbank Data/GenBank/Y.pestis.genbank
This shows us the end of a file. This can be important when transferring files or data and needing to make sure everything transferred completely.
All files in any operating system have a set of permissions associated with the file that define what can be done with the file and by whom. What = read, write (modify), and/or execute a file. Whom = user, group, or public. These permissions are denoted with the following syntax:
Permissions
Read: r
Write: w
Execute: x
Relations
User: u
Group: g
Others: o
All users: a
Changing permissions is done via chmod
(change mode) command
$ chmod [Options] RELATIONS [+ or -] PERMISSIONS FILE
Lets make a new directory and add some files.
From the LearnLinux/
directory
$ mkdir Allow
$ cd Allow/
$ touch read.txt write.txt execute.go all.txt
$ ls
$ ls -l
We have created some files but we need to change the permission for these files in order to share these or execute them as programs.
Since you created these files you’re the owner and have the ability to change their permissions with chmod
.
From this you can see the default is for the user to have rw
access and the group and others to have r
access.
Lets add execute permissions for everyone on execute.go
$ chmod a+x execute.go
$ ls -l
We have now added the x
option to all three levels of permission for this file
If we want other members of the group to have write permission for write.txt
we can do that as well.
$ chmod g+w write.txt
$ ls -l
Others still cannot modify this file but now members of the group will be able to modify the contents.
$ chmod a+rwx all.txt
$ ls -l
Now the file all.txt
can be read, written, or executed by anyone on this system.
We can also remove permissions using this same command.
$ chmod a-rwx all.txt
$ ls -l
Now we have removed all access to the all.txt
file even the owner's access.
Finally we can change the permissions of all the files in a directory with the -R
flag.
$ cd ..
$ chmod -R a+rwx Allow/
$ ls -l Allow/
This made all of these files public in one step.
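To watch the permission string change step by step, the sketch below records the first ten characters of ls -l after each chmod. It assumes a umask of 022 (set explicitly here) and uses a small helper function, perms, whose name is made up for this example.

```shell
scratch=$(mktemp -d)
cd "$scratch"
umask 022                        # assumed default: new files start as rw-r--r--
touch write.txt all.txt

# Helper (hypothetical name) that prints only the ten permission characters
perms() { ls -l "$1" | cut -c1-10; }

before=$(perms write.txt)        # -rw-r--r--
chmod g+w write.txt              # add write for the group only
group_write=$(perms write.txt)   # -rw-rw-r--
chmod a+rwx all.txt              # everything for everyone
everyone=$(perms all.txt)        # -rwxrwxrwx
chmod a-rwx all.txt              # remove everything, even for the owner
nobody=$(perms all.txt)          # ----------
echo "$before $group_write $everyone $nobody"
```

Reading the string in groups of three after the first character gives the user, group, and others permissions in that order.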
Just like Perl, Python, R, C++ etc. BASH (Bourne Again Shell) is a programming language that works on Unix and Unix-like computers (Linux, Macintosh, BSD etc.). All the commands that you have been passing to the terminal, are in fact being executed by bash
, the command shell. A shell script is simply a collection of various bash commands that are executed sequentially. To make a script we simply write shell commands into a file and then treat that file like any other program or command.
Open a new shell script
$ cd LearnLinux/Code/
$ nano hello.sh
Type the following two lines in this file
# This is my first shell script.
echo "Hello World!"
Save the file and exit nano
At the command prompt, make the file executable
$ chmod u+x hello.sh
$ ./hello.sh
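The same steps can also be done without opening an editor. This is a sketch under the assumption that you are working in a scratch directory; the here-document (the text between <<'EOF' and EOF) simply writes the two lines into hello.sh.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# Write the two-line script without opening an editor
cat > hello.sh <<'EOF'
# This is my first shell script.
echo "Hello World!"
EOF

# Make it executable for the owner, then run it
chmod u+x hello.sh
./hello.sh

# Passing the file to bash also works, even without the execute bit
greeting=$(bash hello.sh)
echo "$greeting"
```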
The commands that you have learned so far are essential for doing any work in Unix, but they don’t really let you do anything that is very useful. The following section will introduce new commands that will start to show you the power of Unix.
Everything we have done so far has sent the result of the command to the screen. This is feasible when the data being displayed is small enough to fit the screen or if it is the endpoint of your analysis. But for large data outputs, or if you need a new file, printing to the screen isn't very useful. Unix has built-in methods to handle output from commands using the > (greater than), < (less than), or >> signs.
< redirects the data to the command for processing
> redirects the data from the command’s output to a file. The file will be created if it is non-existing and if present it will overwrite the contents with the new output data (you will lose the original file).
>> unlike > this redirection lets user append the data to an already existing file or a new file
Another special operator | (called pipe) is used to pass the output from a command to another command (as input) before sending it to an output file or display.
Some examples:
# Creates a new file (file2) with same contents as old file (file1)
$ cat FILE1 > FILE2
# Appends the contents for file1 to file2, equivalent to opening file1,
# copying all the contents, pasting the copied contents to the end of
# the file2 and saving it!
$ cat FILE1 >> FILE2
$ cat FILE1 | less
Here, cat command displays the contents of the file1, but instead of sending it to standard output (screen) it sends it through the pipe to the next command less so that contents of the file are now displayed on the screen with line scrolling.
From the LearnLinux/Data/
directory
$ cat seq.fasta
$ head seq.fasta > new.txt
$ cat new.txt
$ tail seq.fasta > new.txt
$ cat new.txt
Now lets try that with the append option.
$ head -n 1 seq.fasta > new.txt
$ tail -n 1 seq.fasta >> new.txt
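Counting lines after each step makes the difference between > and >> visible. The sketch below uses a tiny, made-up FASTA file in a scratch directory rather than the course's seq.fasta.

```shell
scratch=$(mktemp -d)
cd "$scratch"
# Hypothetical two-record FASTA file standing in for seq.fasta
printf '>seq1\nACGT\n>seq2\nTTGA\n' > seq.fasta

head -n 1 seq.fasta > new.txt     # new.txt now holds 1 line
tail -n 1 seq.fasta > new.txt     # > overwrote it: still 1 line
after_overwrite=$(wc -l < new.txt)

head -n 1 seq.fasta >> new.txt    # >> appended: 2 lines
after_append=$(wc -l < new.txt)

echo "$after_overwrite line after >, $after_append lines after >>"
cat new.txt
```

Note that wc -l < new.txt also uses the < operator: the file is fed to wc as input, so wc prints only the number and not the file name.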
The grep (globally search a regular expression and print) is one of the most useful commands in Unix and it is commonly used to filter a file/input, line by line, against a pattern.
$ grep [OPTIONS] PATTERN FILENAME
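Like any other command, grep has many options; see man grep for the full list. The options used in the examples below are -c (count the matching lines), -v (print the lines that do not match) and --color (highlight the matches).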
Some typical scenarios to use grep
Counting number of sequences in a multi-fasta sequence file
Get the header lines of fasta sequence file
Find a matching motif in a sequence file
Find restriction sites in sequence(s)
Get all the Gene IDs from a multi-fasta sequence files and many more.
You might already know that fasta files header must start with a > character, followed by a DNA or protein sequence on subsequent lines. To find only those header lines in a fasta file, we can use grep.
Move to LearnLinux/Data/Arabidopsis
$ grep ">" intron_IME_data.fasta
Did you get that?
Remember the default for a program is to output to the screen.
We can fix this with a redirect or a pipe.
$ grep ">" intron_IME_data.fasta | less
This takes the output from grep and sends it as input to less
What if we want to know how many sequences are in a file?
$ grep -c ">" intron_IME_data.fasta
We can also get lines that don't match our string.
$ grep -v ">" intron_IME_data.fasta | less
Given the FASTA file structure, we can use grep to separate this information
$ grep ">" intron_IME_data.fasta > intron_headers.txt
$ grep -v ">" intron_IME_data.fasta > intron_sequences.txt
We can even get some biological information from grep
$ grep --color "GAATTC" chr1.fasta
GAATTC is the EcoRI cut site. The --color
option highlights the matches in this sequence.
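The grep examples above can be rehearsed on a file small enough to check by eye. The three-record FASTA file below is invented for this sketch; only the commands mirror the ones used in this section.

```shell
scratch=$(mktemp -d)
cd "$scratch"
# Hypothetical three-record FASTA file
printf '>gene1\nATGGAATTCTAG\n>gene2\nATGCCCTAA\n>gene3\nGAATTCGAATTC\n' > sample.fasta

n_seqs=$(grep -c ">" sample.fasta)           # count header lines
grep ">" sample.fasta > headers.txt          # headers only
grep -v ">" sample.fasta > sequences.txt     # everything except headers
n_ecori=$(grep -c "GAATTC" sequences.txt)    # lines containing an EcoRI site
echo "$n_seqs sequences, $n_ecori with GAATTC"
```

Keep in mind that -c counts matching lines, not matches: gene3 contains GAATTC twice but is counted once.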
grep + regular expressions (also called regex) = power! Before we get into this let's start with a task.
TASK
The '.' and '*' characters are also special characters that form part of the regular expression. Try to understand how the following patterns all differ. Try using each of these patterns with grep -c
against any one of the sequence files. Can you predict which of the four patterns will generate the most matches?
ACGT
AC.GT
AC*GT
AC.*GT
The asterisk in a regular expression is similar to, but NOT the same as, the other asterisks that we have seen so far. An asterisk in a regular expression means: match zero or more of the preceding character or pattern. Try searching for the following patterns to ensure you understand what '.' and '*' are doing:
A...T
AG*T
A*C*G*T*
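If the results on a real sequence file are hard to interpret, a toy file makes the behavior of '.' and '*' easier to see. The six lines in toy.txt below are made up for this sketch.

```shell
scratch=$(mktemp -d)
cd "$scratch"
printf 'ACGT\nACTGT\nAGT\nACCCGT\nACTTTGT\nTTTT\n' > toy.txt

# Count the matching lines for each pattern and save pattern<TAB>count
for pattern in 'ACGT' 'AC.GT' 'AC*GT' 'AC.*GT'; do
    printf '%s\t%s\n' "$pattern" "$(grep -c "$pattern" toy.txt)"
done > counts.txt
cat counts.txt
```

On this toy file the counts are 1, 1, 3 and 4. AGT matches AC*GT because C* allows zero C's, while AC.*GT needs a literal AC followed by anything (including nothing) before GT.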
When working with sequences (protein or DNA) we are often interested in whether a particular feature is present or not. This could be something like a start codon, a restriction site, or a motif. In Unix, any string of text that follows some pattern can be searched for using a formula called a regular expression. As you learned above, regular expressions consist of normal characters and metacharacters. Commonly used characters include:
Expression | Function |
---|---|
`.` | matches any single character |
`$` | matches the end of a line |
`^` | matches the beginning of a line |
`*` | matches zero or more occurrences of the preceding character or pattern |
`\` | quoting character: treats the character that follows it as an ordinary character |
`[]` | matches any one of the characters between the brackets |
`[range]` | matches any one character in the range |
`[^range]` | matches any one character except those in the range |
`\{N\}` | matches exactly N occurrences of the preceding character, where N is a number |
`\{N1,N2\}` | matches at least N1 but not more than N2 occurrences of the preceding character |
`?` | matches zero or one occurrence of the preceding character (an extended regex operator; use grep -E) |
`\|` | alternation: (this\|that) matches either this or that in the text (an extended regex operator; use grep -E) |
Here are some common regex patterns for Nucleotide/Protein searches:
Patterns | Matches |
---|---|
`^ATG` | lines starting with ATG |
`TAG$` | lines ending with TAG |
`^A[TGC]G` | lines starting with ATG, AGG or ACG |
`TA[GA]$` | lines ending with TAG or TAA |
`^A[TGC]G.*TGTGAACT.*TA[GA]$` | a gene containing a specific motif (here TGTGAACT) between a start and a stop codon |
`[YXN][MPR]_[0-9]\{4,9\}` | NCBI RefSeq accessions (e.g. XM_012345) |
`\(NP\|XP\)_[0-9]\{4,9\}` | NCBI RefSeq protein accessions |
Let's use grep to find a zinc finger motif. For simplicity let’s assume zinc finger motif to be CXXCXXXXXXXXXXXXHXXXH. Either you can use dots to represent any amino acids or use complex regular expressions to come up with a more representative pattern.
$ grep --color "C..C............H...H" At_proteins.fasta
$ grep --color "C.\{2\}C.\{12\}H.\{3\}H" At_proteins.fasta
$ grep --color "C[A-Z][A-Z]C[A-Z]\{12\}H[A-Z][A-Z][A-Z]H" At_proteins.fasta
These all do the exact same thing. As you can see, regular expressions can be very useful for finding patterns of all kinds.
UNIX Tip: You can use regular expressions in grep, sed, awk, less, perl, python, and certain text editors. Almost any programming language or tool can utilize the power of regex.
sed is a stream editor that reads one or more text files and makes changes or edits then writes the results to standard output. The simple syntax for sed is:
$ sed 'OPERATION/REGEXP/REPLACEMENT/FLAGS' FILENAME
Above, /
is the delimiter but you can use _
|
or :
as well.
OPERATION
= the action to be performed, the most common being s
which is for substitution.
REGEXP
and REPLACEMENT
= the search term and the substitution for the operation to be executed.
FLAGS
= additional parameters that control the operation.
Common FLAGS include:
g replace all the instances of REGEXP with REPLACEMENT (globally)
n (n=any number) replace nth instance of the REGEXP with REPLACEMENT
p If substitution was made, then prints the new pattern space
i ignores case for matching REGEXP
w If substitution was made, write out the result to the given file
d delete: used as an operation in place of s, with no REPLACEMENT (/REGEXP/d), it deletes every line that matches REGEXP
From LearnLinux/Data/Arabidopsis
$ head -n 1 chr1.fasta
$ sed "s/Chr1/Chromosome_1/g" chr1.fasta | head -n 1
$ sed "s:Chr1:Chromosome_1:g" chr1.fasta | head -n 1
As you can see, these two commands do the same thing with different delimiters. We changed "Chr1" to "Chromosome_1" in the output. However, this was not done permanently: the file itself is unchanged. To change it, we would have to write to a new file or use a flag within sed.
$ touch greetings.txt
$ echo "Hello there" >> greetings.txt
$ head greetings.txt
Now we have our file to manipulate with sed. We have three options for altering and saving the file:
Option 1: Make a new file
$ sed 's/Hello/Hi/g' greetings.txt > greetings_short.txt
$ head greetings*
Option 2: Edit in place but make a backup of the original with the given extension
$ sed -i.bak 's/Hello/Hi/g' greetings.txt
$ head greetings*
Option 3: Edit in place but without a backup. NOTE: if you run out of system memory or hit an error, the original file will still have been rewritten. You will not get that file back.
$ sed -i 's/Hello/Hi/g' greetings.txt.bak
$ head greetings*
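The effect of the g flag, the choice of delimiter, and the backup made by -i.bak can all be checked on a one-line file. The sentence written to greetings.txt below is a made-up example with two occurrences of Hello.

```shell
scratch=$(mktemp -d)
cd "$scratch"
echo "Hello there, Hello again" > greetings.txt

first_only=$(sed 's/Hello/Hi/' greetings.txt)     # Hi there, Hello again
every_match=$(sed 's/Hello/Hi/g' greetings.txt)   # Hi there, Hi again
other_delim=$(sed 's:Hello:Hi:g' greetings.txt)   # same result with ':' as delimiter
echo "$first_only"
echo "$every_match"

# Edit in place, keeping the original as greetings.txt.bak
sed -i.bak 's/Hello/Hi/g' greetings.txt
cat greetings.txt greetings.txt.bak
```

Until the -i.bak line runs, greetings.txt is untouched: the first three sed commands only print the edited text.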
wc
(word count) is a useful command in bioinformatics because it can quickly identify how many lines or words are in a file.
$ wc FILENAME
From Data/Arabidopsis
$ wc At_genes.gff
Here we have the total number of lines, words, and bytes in this file
$ wc -l At_genes.gff
This prints out just the line count for the input file.
$ wc
is best used with pipes, but it can be useful for counting things on its own as well
$ ls /project/inbre-train/username/LearnLinux/Data/Sequences | grep ".fa" | wc -l
This command tells you how many .fa
files there are in the Sequences directory.
sort
command can be used to arrange things in a file. Simplest way to use this command is:
$ sort FILE1 > SORTED_FILE1
sort
has these commonly used flags:
-n numerical sort
-r reverse sort
-k N,N sort the Nth field (column), where N is a number. Sorting can also be done on the exact character on a particular field e.g. -k 4.3,4.4 sorts based on 3rd and 4th character of the 4th field. Additionally you can supply additional -k for resolving ties.
-t specify the delimiter used to separate fields (by default, fields are separated by runs of blanks), e.g. -t ':' to use ':' as the delimiter
TASK
The LearnLinux/Data/Sequences
directory consists of numerically labeled files. Unix can sort either alphabetically or numerically (not both) and hence they are arranged in Seq1.fa
, Seq10.fa
, Seq11.fa
etc. In order to sort them in an easy to read way, try using:
$ ls | sort -t 'q' -k 2n
This command lists all the files in Sequences/
directory and then passes the list to the sort command. sort splits each file name at the letter q (the -t 'q' delimiter) and orders the names numerically by the second field, which is the part after Seq (-k 2n).
Try using sort on Data/Arabidopsis/At_genes.gff
$ sort -r -k 1 At_genes.gff
$ sort -r -k 4 At_genes.gff
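The numeric sort from the task can be tried on a handful of empty files. This sketch assumes four hypothetical file names in a scratch directory rather than the real Sequences/ directory.

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch Seq1.fa Seq2.fa Seq10.fa Seq11.fa

# ls sorts alphabetically, so Seq10.fa and Seq11.fa come before Seq2.fa
ls

# Split each name at 'q' and sort numerically on what follows
numeric_order=$(ls | sort -t 'q' -k 2n | paste -sd ' ' -)
echo "$numeric_order"                # Seq1.fa Seq2.fa Seq10.fa Seq11.fa
```

With 'q' as the delimiter, the second field of Seq10.fa is 10.fa, and the n modifier tells sort to compare the leading number instead of the characters.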
uniq
(unique) command removes duplicate lines from a sorted file, retaining only one instance of the running matching lines. Optionally, it can show only lines that appear exactly once, or lines that appear more than once. uniq requires sorted input since it compares only consecutive lines.
$ uniq [OPTIONS] INFILE OUTFILE
Useful options include:
-c count; prints lines by the number of occurrences
-d only print duplicate lines
-u only print unique lines
-i ignore differences in case when comparing
-s N skip comparing the first N characters (N=number)
TASK
From Data/
$ cat uniq.txt
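Why uniq needs sorted input becomes clear on a small list where the repeats are not next to each other. The file names.txt and its contents are invented for this sketch.

```shell
scratch=$(mktemp -d)
cd "$scratch"
# Hypothetical input: gene names with repeats that are not adjacent
printf 'geneB\ngeneA\ngeneB\ngeneC\ngeneA\ngeneB\n' > names.txt

unsorted=$(uniq names.txt | wc -l)                      # 6: no adjacent repeats to drop
deduped=$(sort names.txt | uniq | paste -sd ' ' -)      # geneA geneB geneC
repeated=$(sort names.txt | uniq -d | paste -sd ' ' -)  # geneA geneB
singles=$(sort names.txt | uniq -u)                     # geneC
echo "$deduped"

# -c prefixes each distinct line with the number of times it occurred
sort names.txt | uniq -c
```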
### 15.8 Dividing files by Columns ###
cut
extracts entire columns of data from files. By default, it assumes that columns are tab
delimited, but this is not always the case. If your data file contains columns (called 'fields' here) that are separated by other delimiters e.g. space
or comma
, then you will need to tell cut
about it.
The following example assumes that the fields are separated by tabs. This will print the first column from the input file to the screen.
$ cut -f1 FILE
Here is an example of a .csv
file (comma separated values). The following command will display columns 2 through 4 from this file.
$ cut -d ',' -f2-4 FILE
Another example where the delimiter is a pipe (|
) and the command will display 1st and 9th column.
$ cut -d '|' -f1,9 FILE
TASK
Display only first column of the At_genes.gff
file using cut
Now can you display that in a way so you can actually see what the output looks like?
What if you want column 1, 4, and 5?
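Before running cut on a large file, it can help to test the -f and -d options on a two-line file. Both toy files below, and their column names, are assumptions made for this sketch and are not part of the course data.

```shell
scratch=$(mktemp -d)
cd "$scratch"
# Hypothetical tab-separated and comma-separated files with a header row
printf 'id\tname\tlength\nAT1\tNAC001\t429\n' > toy.tsv
printf 'id,name,length,strand,chrom\nAT1,NAC001,429,+,Chr1\n' > toy.csv

# Tab is the default delimiter: pick columns 1 and 3
cut -f1,3 toy.tsv
tab_cols=$(cut -f1,3 toy.tsv | tail -n 1)

# For a .csv file the delimiter must be given with -d: pick columns 2 through 4
cut -d ',' -f2-4 toy.csv
csv_cols=$(cut -d ',' -f2-4 toy.csv | tail -n 1)
echo "$csv_cols"                                   # NAC001,429,+
```

The output keeps the input delimiter, so the tab-separated result is still tab-separated and the CSV result is still comma-separated.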
i. On Mt.Moran Remote Server
Make sure you are in /project/inbre-train/username/week1
After you are finished with all exercises, use the history
command to direct all your activity to a text file as follows:
$ history > netid_week1_history.sh
ii. Homework
$ mkdir -p /Users/username/molb4485/LastName_week1
$ history > /project/inbre-train/username/week1/netid_week1_history.sh
$ mkdir -p /home/username/molb4485/week1
$ history > ~/molb4485/week1/netid_week1_history.sh