genbankread
Read data from GenBank file
Description
reads a GenBank-formatted file, GenBankData = genbankread(File)File, and creates
GenBankData, a structure or array of structures, with the fields
corresponding to the GenBank® keywords.
sets the connection timeout (in seconds)
to read data from a remote file or URL.GenBankData = genbankread(File,
TimeOut=TimeOutValue)
Examples
Retrieve sequence information for the HEXA gene, store the data in a file, and then read into the MATLAB® software.
getgenbank("nm_000520",ToFile="TaySachs_Gene.txt") s = genbankread("TaySachs_Gene.txt")
s = struct with fields:
LocusName: 'NM_000520'
LocusSequenceLength: '4785'
LocusNumberofStrands: ''
LocusTopology: 'linear'
LocusMoleculeType: 'mRNA'
LocusGenBankDivision: 'PRI'
LocusModificationDate: '05-OCT-2024'
Definition: 'Homo sapiens hexosaminidase subunit alpha (HEXA), transcript variant 2, mRNA.'
Accession: 'NM_000520'
Version: 'NM_000520.6'
GI: ''
Project: []
DBLink: []
Keywords: 'RefSeq; MANE Select.'
Segment: []
Source: 'Homo sapiens (human)'
SourceOrganism: [4×65 char]
Reference: {[1×1 struct] [1×1 struct] [1×1 struct] [1×1 struct] [1×1 struct] [1×1 struct] [1×1 struct] [1×1 struct] [1×1 struct] [1×1 struct]}
Comment: [46×66 char]
Features: [160×74 char]
CDS: [1×1 struct]
Sequence: 'ctcacgtggccagccccctccgagaggggagaccagcgggccatgacaagctccaggctttggttttcgctgctgctggcggcagcgttcgcaggacgggcgacggccctctggccctggcctcagaacttccaaacctccgaccagcgctacgtcctttacccgaacaactttcaattccagtacgatgtcagctcggccgcgcagcccggctgctcagtcctcgacgaggccttccagcgctatcgtgacctgcttttcggttccgggtcttggccccgtccttacctcacagggaaacggcatacactggagaagaatgtgttggttgtctctgtagtcacacctggatgtaaccagcttcctactttggagtcagtggagaattataccctgaccataaatgatgaccagtgtttactcctctctgagactgtctggggagctctccgaggtctggagacttttagccagcttgtttggaaatctgctgagggcacattctttatcaacaagactgagattgaggactttccccgctttcctcaccggggcttgctgttggatacatctcgccattacctgccactctctagcatcctggacactctggatgtcatggcgtacaataaattgaacgtgttccactggcatctggtagatgatccttccttcccatatgagagcttcacttttccagagctcatgagaaaggggtcctacaaccctgtcacccacatctacacagcacaggatgtgaaggaggtcattgaatacgcacggctccggggtatccgtgtgcttgcagagtttgacactcctggccacactttgtcctggggaccaggtatccctggattactgactccttgctactctgggtctgagccctctggcacctttggaccagtgaatcccagtctcaataatacctatgagttcatgagcacattcttcttagaagtcagctctgtcttcccagatttttatcttcatcttggaggagatgaggttgatttcacctgctggaagtccaacccagagatccaggactttatgaggaagaaaggcttcggtgaggacttcaagcagctggagtccttctacatccagacgctgctggacatcgtctcttcttatggcaagggctatgtggtgtggcaggaggtgtttgataataaagtaaagattcagccagacacaatcatacaggtgtggcgagaggatattccagtgaactatatgaaggagctggaactggtcaccaaggccggcttccgggcccttctctctgccccctggtacctgaaccgtatatcctatggccctgactggaaggatttctacatagtggaacccctggcatttgaaggtacccctgagcagaaggctctggtgattggtggagaggcttgtatgtggggagaatatgtggacaacacaaacctggtccccaggctctggcccagagcaggggctgttgccgaaaggctgtggagcaacaagttgacatctgacctgacatttgcctatgaacgtttgtcacacttccgctgtgaattgctgaggcgaggtgtccaggcccaacccctcaatgtaggcttctgtgagcaggagtttgaacagacctgagccccaggcaccgaggagggtgctggctgtaggtgaatggtagtggagccaggcttccactgcatcctggccaggggacggagccccttgccttcgtgccccttgcctgcgtgcccctgtgcttggagagaaaggggccggtgctggcgctcgcattcaataaagagtaatgtggcatttttctataataaacatggattacctgtgtttaaaaaaaaaagtgtgaatggcgttagggtaagggcacagccaggctggagtcagtgtctgcccctgaggtcttttaagttgagggctgggaatgaaacctatagcctttgtgctgttctgccttgcctgtgagctatgtcactcccctcccactcctgaccatattccagacacctgccctaatcctcagcctgctcacttcacttctgcattatatctccaaggcgttggtatatggaaaaagatgtaggggcttggaggtgttctggacagtggggagggctccagacccaacctggtcacagaagagcctctcccccatgcatactcatccacctccctcccctagagctattctcctttgggtttcttgctgcttcaattttatacaaccattatttaaatattattaaacacatattgttctctaggcactgtggtagtgggtttttttgttgtttttgtttttgagactgtctcaaaaactctgtcgcccaggctgacagtgcagtggcacaatcttggctcactgcagcctctgcctcctgggttcaagcgattctcgtgcatcagcctcctgagtaactggaattaataggcacgtgccaccatgtccatctaattcatatatatatattttttttttctgagacggagtctcactgtcacacaggctggagtgcagtggcacgatctcgactcactgcaagctccacctcctgggttcacgccattctcctgcctcagcctccccagtagctgggactacaggcgcccgccaccacgcccggctaattttttgtatttttagtagagatggggtttctccgtgttagccaggatggtctcgatctcctgacctcgtgatccgcccgccttggcctcccaaagtgctgggattacaggcgtgagccaccgcgcccggccgaattcatctatttttagtagagatggggtttcactatattggccaggctggtcttgaactcatgacctcagatgttcacttgtcttggcctcccaaagtgctgggattagaggcgtgagccaccgcacgcgggcctgtggtaaattgttgaatttgaaggactcagaggccctggtcaattccaaaataacgtaggcgacttccatccccctcctcccaaccattttcagcccaaagcatcttcgcagggaatggatggctgcgcggaggtgggcggtggctctggagagggtctttgcaggtgtgattttctctagaaggaaatgtctcgtcgtggacccagactgccccctcctggtttcagatgcagaagtgatactgtaagccagaggcgggggcagtaatgcatcgcagccattttaggtgaggatttccttggcggttatttgttaagttctttggctgggccctgggctggggtaacaatggacaggttccaggcatttttttcagaaagcttccagtgtagtggatacagaaacttcaggaaggcagggctgagaaggatctgagtaaaactcggtccttcaacaccatccttcagcccctgggtcatgttccttcgaggtcctggtgggaggtagacaagcctagccttgtgctgttcctgtaaggacagggtgggcattttctaccaacagaattcttggaattttcacacagcccagcctagccaagtccagggctatagcccagatacacaagttaaggtcccagcactggcacccaccacaggagcccccttacctctattacccagaagcttgtaggaggggtggtccgcagacaaggaccctgcacaggtgcgaccctgcttccctcctggtcataactttcatgttactattgcttgggataatgttaagtaaaaatagcagacactgagttttaagtctcaagtggatgaaggcagagatcgtgatacacttgagttaaagcagtagggttctgtcattttctattcctgttgtaaacattttctttaatgttattatttttaccactaaactaacgtggcctggtcacgactttcattggtaaagtgtgctgttcctcaccctccaccgttgctcctttggtccactgatcataagagcatttacctgaaggtcgtcagacctcgaatgccaacaggtcaactgcagtggcctgcagttaccacccagtctgttccaatgaacagaatcgctgttgccccaactcatctcccttcacctaggctgtaaattgaaagtcccacccctgagcggaacacaggccatcttgtgtgctgtgcaccaccagggggtggggaagtttccagactgacttcctggctccagtcatcctaggaaaagagttctccagtcgctccccacccccaccccttcccattccaggagtctattaaggaggcaaagcaggcctaacgggtatcaaagcaaaggagtgaatggagactgggagagtcttcaacctctcctctccttggtaggagctgaggctgcatgccaggtaccttcccttcgaggaatctaataaagctaggtcactggtgttttcaggtgcttctcaaaggattgccgtaggggtaggatatcaggatgtgggagcacaggtgccaccacagcactagtgatggagagtcattgcccctagacttctgggacagtgagactgtgaggaaagctgaaatgatactgggaaagggtgaaagaaaggatgtaggtggaatttatttagtattaatgtaggtacacataccttatggcaacattcctagcactctaaattctagatttgtatagtttctgtcaatatcttttgtaagcttaatcaatacagggcatgacaagtatgtgtcacatacttttttttccacgaagaaaaaaaataagtaggaattgggtgctttgtttatcaaaatttgtatttcctttataaataaactttgaaataaaggttgaaaattagta'
Display the source organism for this sequence.
s.SourceOrganism
ans = 4×65 char array
'Homo sapiens '
'Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;'
'Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini; '
'Catarrhini; Hominidae; Homo. '
Input Arguments
GenBank-formatted file, specified as one of these values:
Character vector or string specifying a file name, a path and file name, or a URL pointing to a file. The referenced file is a GenBank-formatted file (ASCII text file). If you specify only a file name, that file must be on the MATLAB® search path or in the MATLAB Current Folder.
MATLAB character array or string vector that contains the text of a GenBank-formatted file.
You can use the getgenbank function with the
ToFile argument to retrieve sequence information from the
GenBank database and create an GenBank-formatted file.
Data Types: char | string
Connection timeout in seconds, specified as a positive number. The default value is 5. For details, see here.
Data Types: double
Output Arguments
Resulting data, returned as a MATLAB structure or array of structures containing fields corresponding to GenBank keywords.
When File contains multiple entries, each entry is stored as a
separate element in GenBankData. For a list of the GenBank keywords, see https://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html.
Version History
Introduced before R2006a
MATLAB Command
You clicked a link that corresponds to this MATLAB command:
Run the command by entering it in the MATLAB Command Window. Web browsers do not support MATLAB commands.
Sélectionner un site web
Choisissez un site web pour accéder au contenu traduit dans votre langue (lorsqu'il est disponible) et voir les événements et les offres locales. D’après votre position, nous vous recommandons de sélectionner la région suivante : .
Vous pouvez également sélectionner un site web dans la liste suivante :
Comment optimiser les performances du site
Pour optimiser les performances du site, sélectionnez la région Chine (en chinois ou en anglais). Les sites de MathWorks pour les autres pays ne sont pas optimisés pour les visites provenant de votre région.
Amériques
- América Latina (Español)
- Canada (English)
- United States (English)
Europe
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)