FASTA is among the most popular formats for storing biological sequence data. Here is what it is and how it is used.
The FASTA format is a text-based file format for storing protein and nucleic acid sequences. FASTA files commonly have the .fasta or .fa file extensions.
A FASTA file can contain one or more sequences. A sequence "block" in FASTA starts with a single-line description. The description line always begins with a greater-than character (>).
The sequence description line is immediately followed by the lines of sequence data. The residues in the sequence are represented using single-letter codes.
Here is an example of one protein sequence in the FASTA format:
>FER1_ARATH
MASTALSSAIVSTSFLRRQQTPISLRSLPFANTQSLFGLKSSTARGGRVTAMATYKVKFI
TPEGEQEVECEEDVYVLDAAEEAGLDLPYSCRAGSCSSCAGKVVSGSIDQSDQSFLDDEQ
MSEGYVLTCVAYPTSDVVIETHKEEAIM
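To make the structure described above concrete, here is a minimal Python sketch of reading FASTA text. It is an illustration rather than a reference parser: the function name parse_fasta is hypothetical, and the sketch assumes that blank lines can be skipped and that any text before the first description line can be ignored.

```python
from typing import Iterable, List, Tuple


def parse_fasta(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (description, sequence) pairs in file order.

    Hypothetical helper for illustration: the description is the text
    after the leading '>', and the sequence is the concatenation of all
    data lines that follow it.
    """
    records: List[Tuple[str, str]] = []
    description = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines carry no data
        if line.startswith(">"):
            # A new description line closes the previous block, if any.
            if description is not None:
                records.append((description, "".join(chunks)))
            description = line[1:].strip()
            chunks = []
        elif description is not None:
            chunks.append(line)
    # Close the final block once the input is exhausted.
    if description is not None:
        records.append((description, "".join(chunks)))
    return records


example = """>FER1_ARATH
MASTALSSAIVSTSFLRRQQTPISLRSLPFANTQSLFGLKSSTARGGRVTAMATYKVKFI
TPEGEQEVECEEDVYVLDAAEEAGLDLPYSCRAGSCSSCAGKVVSGSIDQSDQSFLDDEQ
MSEGYVLTCVAYPTSDVVIETHKEEAIM
"""

records = parse_fasta(example.splitlines())
for description, sequence in records:
    print(description, len(sequence))
```

To read from disk instead, an open file object can be passed directly, for example parse_fasta(open("sequences.fa")), where the file name is a placeholder. Returning a list of pairs rather than a dictionary keeps records whose descriptions happen to repeat.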
Whether you perform sequence analysis on a daily basis or only occasionally run BLAST, it is important that you are familiar with the FASTA format and can easily open FASTA files and navigate their contents. To open and view FASTA files, you can use the web-based Eckher Sequence Alignment Viewer, which is free to use and does not require downloading or installation.