LOCUS       CCP46336.1    1489 aa    PRT    BCT    27-FEB-2015
DEFINITION  Mycobacterium tuberculosis H37Rv PE-PGRS family protein PE_PGRS57
            protein.
ACCESSION   AL123456-3614
PROTEIN_ID  CCP46336.1
SOURCE      Mycobacterium tuberculosis H37Rv
  ORGANISM  Mycobacterium tuberculosis H37Rv
            Bacteria; Actinobacteria; Corynebacteriales; Mycobacteriaceae;
            Mycobacterium; Mycobacterium tuberculosis complex.
REFERENCE   1
  AUTHORS   Cole S.T., Brosch R., Parkhill J., Garnier T., Churcher C.,
            Harris D., Gordon S.V., Eiglmeier K., Gas S., Barry C.E.III.,
            Tekaia F., Badcock K., Basham D., Brown D., Chillingworth T.,
            Connor R., Davies R., Devlin K., Feltwell T., Gentles S.,
            Hamlin N., Holroyd S., Hornsby T., Jagels K., Krogh A.,
            McLean J., Moule S., Murphy L., Oliver K., Osborne J.,
            Quail M.A., Rajandream M.A., Rogers J., Rutter S., Seeger K.,
            Skelton J., Squares R., Squares S., Sulston J.E., Taylor K.,
            Whitehead S., Barrell B.G.
  TITLE     Deciphering the biology of Mycobacterium tuberculosis from the
            complete genome sequence
  JOURNAL   Nature 393(6685), 537-544(1998).
  PUBMED    9634230
  REMARK    Erratum:[Nature 1998 Nov 12;396(6707):190]
REFERENCE   2
  AUTHORS   Camus J.C., Pryor M.J., Medigue C., Cole S.T.
  TITLE     Re-annotation of the genome sequence of Mycobacterium
            tuberculosis H37Rv
  JOURNAL   Microbiology (Reading, Engl.) 148(Pt 10), 2967-2973(2002).
  PUBMED    12368430
REFERENCE   3
  AUTHORS   Lew J.M., Kapopoulou A., Jones L.M., Cole S.T.
  TITLE     TubercuList--10 years after
  JOURNAL   Tuberculosis (Edinb) 91(1), 1-7(2011).
  PUBMED    20980199
REFERENCE   4  (bases 1 to 4411529)
  AUTHORS   Parkhill J.
  JOURNAL   Submitted (11-JUN-1998) to the INSDC.
            Submitted on behalf of the Mycobacterium tuberculosis sequencing
            and mapping teams, Sanger Centre, Wellcome Trust Genome Campus,
            Hinxton, Cambridge CB10 1SA
            Unite de Genetique Moleculaire Bacterienne, Institut Pasteur,
            28 rue du Docteur Roux, 75724 Paris Cedex 15, France
            E-mail: parkhill@sanger.ac.uk
REFERENCE   5  (bases 1 to 4411532)
  AUTHORS   Lew J.M.
  JOURNAL   Submitted (18-DEC-2012) to the INSDC.
            Lew J., Ecole Polytechnique Federale de Lausanne, CH-1015,
            Lausanne, Switzerland, and the Swiss Institute of Bioinformatics,
            CMU - Rue Michel-Servet 1, 1211 Geneva 4, SWITZERLAND
COMMENT     On or before Feb 1, 2013 this sequence version replaced
            gi:41352722, gi:38490165, gi:38490207, gi:41353619, gi:38490250,
            gi:38684030, gi:38490288, gi:41353667, gi:41353422, gi:41352756,
            gi:38490319, gi:41352785, gi:38490370, gi:41353971.
            Note: This annotation is from the TubercuList website, Release
            26, Dec 2012 (URL: http://tuberculist.epfl.ch) (email:
            tuberculist@epfl.ch).
FEATURES             Qualifiers
     source          /organism="Mycobacterium tuberculosis H37Rv"
                     /strain="H37Rv"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:83332"
     protein         /transl_table=11
                     /gene="PE_PGRS57"
                     /locus_tag="Rv3514"
                     /note="Rv3514, (MTV023.21), len: 1489 aa. PE_PGRS57,
                     Member of the Mycobacterium tuberculosis PE family, PGRS
                     subfamily of gly-rich proteins (see citation below),
                     similar to others from Mycobacterium tuberculosis strains
                     H37Rv and CDC1551 e.g. AAK47971 (1715 aa) FASTA scores:
                     opt: 6940,E(): 0, (67.0% identity in 1713 aa overlap);
                     and upstream O53553|YZ08_MYCTU|Rv3508|MTV023.15 (1901
                     aa), FASTA scores: opt: 6598,E(): 0, (71.05% identity in
                     1533 aa overlap). Contains two PS00583 pfkB family of
                     carbohydrate kinases signatures 1."
                     /db_xref="EnsemblGenomes-Gn:Rv3514"
                     /db_xref="EnsemblGenomes-Tr:CCP46336"
                     /db_xref="InterPro:IPR000084"
                     /db_xref="UniProtKB/TrEMBL:Q6MWW6"
                     /inference="protein motif:PROSITE:PS00583"
                     /experiment="EXISTENCE: identified in proteomics study"
BEGIN
        1 MSFVLIAPEF VTAAAGDLTN LGSSISAANA SAASATTQVL AAGADEVSAR IAALFGGFGL
       61 EYQAISAQVA AYHQRFVQAL STGAGAYASA EAAAAEQIVL GVINAPTQAL LGRPLIGDGA
      121 NATTPGGAGG AGGLLFGNGG AGAAGAPGQA GGPGGPAGLW GNGGPGGAGG SGGGTGGAGG
      181 AGGWLFGVGG AGGVGGAGGG TGGAGGPGGL IWGGGGAGGV GGAGGGTGGA GGRAELLFGA
      241 GGAGGAGTDG GPGATGGTGG HGGVGGDGGW LAPGGAGGAG GQGGAGGAGS DGGALGGTGG
      301 TGGTGGAGGA GGRGALLLGA GGQGGLGGAG GQGGTGGAGG DGVLGGVGGT GGKGGVGGVA
      361 GLGGAGGAAG QLFSASGAAG NAGVGGAGGQ GGDGGAGGAG ADADQPGATG GTGFAGGAGG
      421 AGGAGGSSGA GGTNGSGGAG GQGGAGGAGG AGADNPTGIG GTGGDGGTGG AAGAGGAGGA
      481 AGTGGTGGMI GTTGNAGVGG AGGQGGDGGA GGAGADADQP GATGGTGFAG GAGGAGGAGG
      541 SSGAGGTNGS GGAGGTGGQG GAGGAGGAGA DNPTGIGGTG GDGGTGGAAG AGGAGGAAGT
      601 GGTGGMIGTT GNAGVGGAGG QGGDGGAGGA GADADQPGAT GGTGFAGGAG GAGKAGGSSS
      661 AGGTNSSGSA GGTGRQSGTG GAGGAGADNP TGIGGTGGDG GTGGAAGAGG AGGAAGTGGT
      721 GGMIGTTGNA GVGGAGGSSG AGGTNGSGGA GGTDGQGGAG GAGGAGADNP TGIGGTGGDG
      781 GTGGAAGAGG AGGAAGTGGT GGMIGTTGNA GVGGAGGQGG DGGAGGAGAD ADQPGATGGT
      841 GFAGGAGGAG GSGGSSCAGG TNGSGGAGGT CGQVVAGGAG ISFSNGSNGG TGGTGGVGGT
      901 GGDGGNAGTG AGDPGKGGTG GTGGTGGSGG AGGSGGANFN GGTGGTGGTG GKGGLNTDGL
      961 SSATSGTGGT GGTGGKGGTG GAGDDSAGGT GGTGGAGGNA GAGGLANTGG TAGNAGIGGD
     1021 GGQGGNGGQG DSGSGLGGQP GFAGGAGGKG GAGGSSGAGG TNGSGGAGGA GGQGGAGGAG
     1081 ISFSNGSNGG TGGTGGVGGT GGDGGNAGTG AGDPGKGGTG GTGGTGGSGG AGGSGGANFN
     1141 GGTGGTGGTG GTGGKGGMGG IAGDGGPGGD GGNAGVGGKG GTNGNGGSGG TGGTGGAGGN
     1201 AGAGGLANTG GTAGNAGIGG DGGQGGNGGQ GDSGSGLGGQ PGFAGGPGGK GGAGGNAGTG
     1261 GTNGSGAGGA GGQGGAGGAG ISFSNGSNGG TGGTGGVGGT GGDGGNAGTG AGDPGKGGTG
     1321 GTGGTGGSGG AGGSGGANFN GGTGGTGGTG GTGGKGGMGG IAGDGGPGGD GGNAGVGGKG
     1381 GTNGNGGSGG TGGTGGPGGS GGAPTGSGTG GKGGAGGDGG DGADGGAATG VGDGGDGGNG
     1441 GNGGNGGTGV GSPGGLGGAG GTGGLGGAGA GGGADGDDGD DGQPGNNGS
//