LOCUS       X62625                  3318 bp    DNA     linear   PLN 14-NOV-2006
DEFINITION  T.cacao csv gene for seed vicilin.
ACCESSION   X62625 S38078
VERSION     X62625.1
KEYWORDS    csv gene; seed protein; vicilin.
SOURCE      Theobroma cacao (cacao)
  ORGANISM  Theobroma cacao
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliopsida; eudicotyledons; Gunneridae;
            Pentapetalae; rosids; malvids; Malvales; Malvaceae; Byttnerioideae;
            Theobroma.
REFERENCE   1  (bases 1 to 3318)
  AUTHORS   McHenry L.
  JOURNAL   Submitted (15-OCT-1991) to the INSDC. L. McHenry, Pennsylvania
            State University, 111 Borland Lab, University Park, PA 16802, USA
REFERENCE   2  (bases 1 to 3318)
  AUTHORS   McHenry L., Fritz P.J.
  TITLE     Comparison of the structure and nucleotide sequences of vicilin
            genes of cocoa and cotton raise questions about vicilin evolution
  JOURNAL   Plant Mol. Biol. 18(6), 1173-1176(1992).
   PUBMED   1600151
COMMENT     See also x62626
            Overlap of sequenced fragments
FEATURES             Location/Qualifiers
     source          1..3318
                     /organism="Theobroma cacao"
                     /mol_type="genomic DNA"
                     /dev_stage="mature tree"
                     /clone_lib="cocoa genomic library in lambda GEM II"
                     /clone="pHD5P1.7, pHD5P1.2, pHD5E4.5"
                     /tissue_type="leaves"
                     /db_xref="taxon:3641"
     regulatory      543..549
                     /note="csv gene"
                     /regulatory_class="TATA_box"
     sig_peptide     610..681
                     /gene="csv"
     exon            <610..1240
                     /number=1
     CDS             join(610..1240,1331..1506,1597..1677,1780..2064,
                     2177..2581)
                     /gene="csv"
                     /product="vicilin"
                     /db_xref="GOA:Q43358"
                     /db_xref="InterPro:IPR006045"
                     /db_xref="InterPro:IPR006792"
                     /db_xref="InterPro:IPR011051"
                     /db_xref="InterPro:IPR014710"
                     /db_xref="UniProtKB/Swiss-Prot:Q43358"
                     /protein_id="CAA44493.1"
                     /translation="MVISKSPFIVLIFSLLLSFALLCSGVSAYGRKQYERDPRQQYEQ
                     CQRRCESEATEEREQEQCEQRCEREYKEQQRQQEEELQRQYQQCQGRCQEQQQGQREQ
                     QQCQRKCWEQYKEQERGEHENYHNHKKNRSEEEEGQQRNNPYYFPKRRSFQTRFRDEE
                     GNFKILQRFAENSPPLKGINDYRLAMFEANPNTFILPHHCDAEAIYFVTNGKGTITFV
                     THENKESYNVQRGTVVSVPAGSTVYVVSQDNQEKLTIAVLALPVNSPGKYELFFPAGN
                     NKPESYYGAFSYEVLETVFNTQREKLEEILEEQRGQKRQQGQQGMFRKAKPEQIRAIS
                     QQATSPRHRGGERLAINLLSQSPVYSNQNGRFFEACPEDFSQFQNMDVAVSAFKLNQG
                     AIFVPHYNSKATFVVFVTDGYGYAQMACPHLSRQSQGSQSGRQDRREQEEESEEETFG
                     EFQQVKAPLSPGDVFVAPAGHAVTFFASKDQPLNAVAFGLNAQNNQRIFLAGRPFFLN
                     HKQNTNVIKFTVKASAY"
     mat_peptide     join(682..1240,1331..1506,1597..1677,1780..2064,
                     2177..2578)
                     /product="vicilin"
     intron          1241..1330
                     /number=1
     exon            1331..1506
                     /number=2
     intron          1507..1596
                     /number=2
     exon            1597..1677
                     /number=3
     intron          1678..1779
                     /number=3
     exon            1780..2064
                     /number=4
     intron          2065..2176
                     /number=4
     exon            2177..>2582
                     /number=5
     regulatory      2856..2861
                     /regulatory_class="polyA_signal_sequence"
BASE COUNT     1017 a    701 c    663 g    937 t
ORIGIN
        1 tcactttatc cagagattat ttttcacaat tttctccatt taaatcgagg aaaataaaaa
       61 aaaattacgt caaaatttgt tcatatcata tccttgcagc tcatcgccat gcacgccaac
      121 aggtgtacaa catgagcggt agattctgca gagtgcagat cattatcaac tcaatcttaa
      181 ctcgtgttac gtccaatcca actcaatgaa acgcattcct aattcgcctt aacacacaca
      241 attcccactc tttaactcaa ctaagttcac gcacaagaac aaaattaatc gacgagctct
      301 gctgccaagc accatcactt cgctacatct agttgcagat ccgaccaaca atttcatggt
      361 accaatgctt gtccggaccc tgcccgtacc acgtgaaggg atgttgcgtc tcgatagttt
      421 ccccatctta agaaaaggat tcaaagtatt tgtcctgtta atttgcacac tcgtcacctt
      481 gcatgtcaat gtcttctaca cgtagatgga gatttgcatg caaaagctta gcctccgcct
      541 tctataaata cgttgcctct ctttgctctt atcaccaaga agaaaacaca gatcaaaagc
      601 atagcaaata tggtgatcag taagtctcct ttcatagttt tgatcttctc tcttctcctt
      661 tcttttgcgt tgctttgttc tggtgtcagc gcctatggca gaaaacaata tgagcgtgat
      721 cctcgacagc aatacgagca atgccagagg cgatgcgagt cggaagcgac agaagaaagg
      781 gagcaagagc agtgtgaaca acgctgtgaa agggagtaca aggagcagca gagacagcaa
      841 gaagaagagc ttcaaaggca ataccagcaa tgtcaagggc gttgtcaaga gcaacaacag
      901 gggcagagag agcagcagca gtgccagaga aaatgctggg agcaatataa ggaacaagag
      961 agaggcgagc acgagaatta ccataatcac aaaaaaaata ggagcgaaga agaagaaggg
     1021 caacaaagaa acaatcctta ctattttcct aaaagaagat cattccaaac tcgattcagg
     1081 gatgaagagg gcaacttcaa gatcctccag aggtttgctg agaactctcc tccactcaag
     1141 ggcatcaacg attaccgctt ggccatgttc gaagcaaatc ccaacacttt tattcttccg
     1201 caccactgtg atgctgaggc aatttacttc gtgacaaacg gtaaattctc ttccctttcg
     1261 aacaaaattt tgcggctttt atcaccaaat caccattgat tgaaaacata gtaaattatt
     1321 tctgtgtcag gaaaggggac aattacgttt gtgactcatg aaaacaaaga gtcctataat
     1381 gtacagcgtg gaacagtagt cagcgttcct gcaggaagca ctgtttacgt ggttagccaa
     1441 gacaaccaag agaagctaac catagctgtg ctcgccctgc ctgttaattc tcctggcaaa
     1501 tatgaggttt tctacttaac atttatttat tagaagtttc tacataacct tgatgtcttg
     1561 gttagtctat tgaccctgta cttataattg ctgcagttat tcttccccgc tggaaataat
     1621 aaacctgaat catattacgg agccttcagc tatgaagttc ttgagaccgt cttcaatgta
     1681 tatggttctt tgactagcta tgaatcaatt ttttgctttt tacctttctc ttgtctttta
     1741 cttatttatt tatttattca tttttatctt gtttgttaga cacaaagaga gaagctggag
     1801 gagatcttgg aggaacagag agggcagaag aggcagcagg ggcagcaggg tatgttccgg
     1861 aaagccaaac cagagcagat aagagcaata agccaacaag ctacttctcc aaggcacaga
     1921 ggcggggaga gacttgccat caatctattg agccaatcgc ctgtctactc caaccaaaac
     1981 ggacgcttct ttgaggcttg tcctgaggac ttcagtcaat ttcagaacat ggatgtcgct
     2041 gtttcagcct tcaaactgaa tcaggtactt aaaatcaaat gatttttttc aataatttca
     2101 ttaaaatttt tctcttgtct actcatacga ctactagatt agcttcatga atataagctc
     2161 atgaatcctt ttgcagggag ccatatttgt gccacactac aattctaagg ctacattcgt
     2221 ggtgtttgtc acggacggat atgggtacgc tcaaatggct tgcccgcatc tctccagaca
     2281 gagccaggga tcccaaagtg gaaggcaaga cagaagagaa caagaagaag agtcagaaga
     2341 ggagacattt ggagaattcc agcaggtcaa agccccattg tcacctggtg acgtctttgt
     2401 agccccggca ggccatgcag ttacattctt tgcatccaaa gaccagcccc tgaatgcagt
     2461 tgcgtttgga ctcaacgccc agaacaacca gagaattttc cttgcaggta ggcctttttt
     2521 tctgaatcac aagcaaaaca caaatgtcat caaattcact gtaaaagcct ccgcttatta
     2581 attgatcaat tggttcttgt ttgattctat agggaaaaag aacttggtca gacaaatgga
     2641 tagcgaggca aaggagttat catttggggt accatcgaaa ttggtagata atatattcaa
     2701 caacccggat gagtcgtatt tcatgtcttt ctctcaacag aggcagcgtg gagatgaaag
     2761 gaggggcaat cccttggcct caattctgga ctttgcccgc ttgttctaag cagctgcttc
     2821 cacttttgta tcagacatgc agaggcatgt aatgcaataa ataagttggc ctatgtaaag
     2881 aggagagagt ttgcttttgt cttgttctaa ccttgttttt gaactagtaa actttcaatg
     2941 taatgagagt tgttatcttt ctaagttaat gaataaaaga ccagggaatc tccgttttcc
     3001 taatacaagt tctccaatta taaaacatct tttgtccaac gttcgatggc tagtgtcaca
     3061 atgtttagcc aaagtacccg tttagattgc ttaatttttg taatcctttc tttatgaatt
     3121 agaatttcct ttttctttgt tcacttgaac aaagatgtca tacataatgc atgaacaact
     3181 acatcagcaa ttcaatagat gaaaggccta ttagacgcaa atcatttcat ttggatgaga
     3241 aagttaatca attattaaat gaaaatgccc tttcaattgt aaagaatcaa aagttgtgta
     3301 tttaaatgta tcctctgc
//