EMBL is a nucleic acid sequence database created by the European Bioinformatics Institute (EBI). Its data come mainly from two sources: sequences submitted directly over computer networks by researchers and genome sequencing institutions, and sequences taken from the scientific literature and patents (Stoesser et al., 1998). EMBL cooperates with DDBJ and GenBank: the three databases collect nucleic acid sequence information from around the world and exchange new and updated records with one another every day.
DNA databases are growing exponentially, doubling in size in less than 9 months on average. In January 1998, the number of sequences in EMBL exceeded one million, covering 15,500 species; more than 50% of the sequences came from model organisms, including human (Homo sapiens), nematode (Caenorhabditis elegans), brewer's yeast (Saccharomyces cerevisiae), mouse (Mus musculus), and Arabidopsis thaliana.
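To make the doubling arithmetic concrete, the short Python sketch below projects database size under a constant doubling time. It is only an illustration: the function names are hypothetical, the starting count of one million is the January 1998 figure quoted above, and the 9-month doubling time is an upper bound taken from the text, so the projected sizes should be read as conservative estimates rather than actual EMBL statistics.

```python
import math


def projected_size(initial: float, months: float, doubling_months: float = 9.0) -> float:
    """Size after `months`, assuming the size doubles every `doubling_months`."""
    return initial * 2 ** (months / doubling_months)


def months_to_reach(initial: float, target: float, doubling_months: float = 9.0) -> float:
    """Months needed to grow from `initial` to `target` at a constant doubling time."""
    return doubling_months * math.log2(target / initial)


# Assumption: one million sequences as the starting point (January 1998 figure).
start = 1_000_000

# 18 months is two doubling periods, so the size quadruples.
print(round(projected_size(start, 18)))             # 4000000

# Reaching eight million takes three doublings, i.e. 27 months.
print(round(months_to_reach(start, 8_000_000), 1))  # 27.0
```

Because the text gives the doubling time as "less than 9 months", real growth would be at least as fast as these figures suggest.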
Relevant information can be extracted from the EMBL database with the Sequence Retrieval System (SRS) (Etzold et al., 1996). SRS uses hypertext links to connect DNA sequence databases with a variety of other databases, covering protein sequences, functional sites, structures, gene maps, and MEDLINE literature abstracts. To find sequences homologous to an unknown sequence, the EMBL database can be searched with the BLAST or FastA programs provided on the EBI website.
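The idea of searching a database for sequences similar to a query can be sketched with a toy example. The Python code below is not BLAST or FastA and does not contact any EBI service; it is a deliberately simplified stand-in that ranks entries of a small in-memory "database" by the number of short subsequences (k-mers) they share with the query. The function names, the entry names, and the sequences are all hypothetical and chosen purely for illustration.

```python
def kmers(seq: str, k: int = 4) -> set[str]:
    """Return the set of all length-k substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_by_shared_kmers(query: str, database: dict[str, str], k: int = 4) -> list[tuple[str, int]]:
    """Rank database entries by how many distinct k-mers they share with the query."""
    query_kmers = kmers(query, k)
    scores = [(name, len(query_kmers & kmers(seq, k))) for name, seq in database.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy database; real EMBL entries are far longer and more numerous.
toy_db = {
    "entry_A": "ATGGCGTACGTTAGC",
    "entry_B": "TTTTTTTTTTTTTTT",
    "entry_C": "GGCGTACCTTAGCAA",
}

hits = rank_by_shared_kmers("GCGTACGTTA", toy_db)
for name, score in hits:
    print(name, score)
```

Real homology search programs go much further, for example by extending matches into alignments and assessing their statistical significance, so this sketch only conveys the basic notion of ranking database entries by similarity to a query.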