LOCUS       CCP46329.1 1381 aa PRT BCT 27-FEB-2015
DEFINITION  Mycobacterium tuberculosis H37Rv PE-PGRS family protein PE_PGRS53 protein.
ACCESSION   AL123456-3607
PROTEIN_ID  CCP46329.1
SOURCE      Mycobacterium tuberculosis H37Rv
  ORGANISM  Mycobacterium tuberculosis H37Rv
            Bacteria; Actinobacteria; Corynebacteriales; Mycobacteriaceae;
            Mycobacterium; Mycobacterium tuberculosis complex.
REFERENCE   1
  AUTHORS   Cole S.T., Brosch R., Parkhill J., Garnier T., Churcher C.,
            Harris D., Gordon S.V., Eiglmeier K., Gas S., Barry C.E.III.,
            Tekaia F., Badcock K., Basham D., Brown D., Chillingworth T.,
            Connor R., Davies R., Devlin K., Feltwell T., Gentles S.,
            Hamlin N., Holroyd S., Hornsby T., Jagels K., Krogh A., McLean J.,
            Moule S., Murphy L., Oliver K., Osborne J., Quail M.A.,
            Rajandream M.A., Rogers J., Rutter S., Seeger K., Skelton J.,
            Squares R., Squares S., Sulston J.E., Taylor K., Whitehead S.,
            Barrell B.G.
  TITLE     Deciphering the biology of Mycobacterium tuberculosis from the
            complete genome sequence
  JOURNAL   Nature 393(6685), 537-544(1998).
  PUBMED    9634230
  REMARK    Erratum:[Nature 1998 Nov 12;396(6707):190]
REFERENCE   2
  AUTHORS   Camus J.C., Pryor M.J., Medigue C., Cole S.T.
  TITLE     Re-annotation of the genome sequence of Mycobacterium
            tuberculosis H37Rv
  JOURNAL   Microbiology (Reading, Engl.) 148(Pt 10), 2967-2973(2002).
  PUBMED    12368430
REFERENCE   3
  AUTHORS   Lew J.M., Kapopoulou A., Jones L.M., Cole S.T.
  TITLE     TubercuList--10 years after
  JOURNAL   Tuberculosis (Edinb) 91(1), 1-7(2011).
  PUBMED    20980199
REFERENCE   4 (bases 1 to 4411529)
  AUTHORS   Parkhill J.
  JOURNAL   Submitted (11-JUN-1998) to the INSDC.
            Submitted on behalf of the Mycobacterium tuberculosis sequencing
            and mapping teams, Sanger Centre, Wellcome Trust Genome Campus,
            Hinxton, Cambridge CB10 1SA Unite de Genetique Moleculaire
            Bacterienne, Institut Pasteur, 28 rue du Docteur Roux, 75724
            Paris Cedex 15, France E-mail: parkhill@sanger.ac.uk
REFERENCE   5 (bases 1 to 4411532)
  AUTHORS   Lew J.M.
  JOURNAL   Submitted (18-DEC-2012) to the INSDC.
            Lew J., Ecole Polytechnique Federale de Lausanne, CH-1015,
            Lausanne, Switzerland, and the Swiss Institute of Bioinformatics,
            CMU - Rue Michel-Servet 1, 1211 Geneva 4, SWITZERLAND
COMMENT     On or before Feb 1, 2013 this sequence version replaced
            gi:41352722, gi:38490165, gi:38490207, gi:41353619, gi:38490250,
            gi:38684030, gi:38490288, gi:41353667, gi:41353422, gi:41352756,
            gi:38490319, gi:41352785, gi:38490370, gi:41353971.
            Note: This annotation is from the TubercuList website, Release
            26, Dec 2012 (URL: http://tuberculist.epfl.ch) (email:
            tuberculist@epfl.ch).
FEATURES             Qualifiers
     source          /organism="Mycobacterium tuberculosis H37Rv"
                     /strain="H37Rv"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:83332"
     protein         /transl_table=11
                     /gene="PE_PGRS53"
                     /locus_tag="Rv3507"
                     /note="Rv3507, (MTV023.14), len: 1381 aa. PE_PGRS53,
                     Member of the Mycobacterium tuberculosis PE protein
                     family, PGRS subfamily of gly-rich proteins (see citation
                     below),similar to others from Mycobacterium tuberculosis
                     strains H37Rv and CDC1551 e.g. O06810|Rv1450c|MTCY493.04
                     (1329 aa),FASTA scores: opt: 2173, E(): 1.4e-135, (51.15%
                     identity in 1412 aa overlap). Equivalent to AAK47970 from
                     Mycobacterium tuberculosis strain CDC1551 (1384 aa) but
                     with some minor differences between the proteins.
                     Contains two PS00583 pfkB family of carbohydrate kinases
                     signatures 1."
                     /db_xref="EnsemblGenomes-Gn:Rv3507"
                     /db_xref="EnsemblGenomes-Tr:CCP46329"
                     /db_xref="InterPro:IPR000084"
                     /db_xref="UniProtKB/TrEMBL:Q6MWW9"
                     /inference="protein motif:PROSITE:PS00583"
                     /experiment="EXISTENCE: identified in proteomics study"
BEGIN
        1 MSFVLVSPET VAAVATDLKR IGASLAHENA SAAASTTAVV SAAADEVSTA VAALFSQHAQ
       61 GYQAAAAQVA AFHSRFVQAL TAGAGAYAFA EAANASPLQS AMGAVSASAQ TLLSRPLIGN
      121 GANATTPGGN GGDGGWLFGS GGNGAPGAAG QSGGNGGSAG LWGNGGAGGA GGSGGAAGGN
      181 GGNGGWLFGA GGTGGIGGTG APGAMGGTGG NGGNGALLIG GGGLGGAGGM GGTGGGTGGT
      241 GGNGGNGALL IGAGGVGGAG GIGGQGTGAG GAAGAGGTGG NGGAGGLFMN GGDGGAGGQG
      301 GDGAAGDAAA SAGGTGGKGG QGGDGGTGGA GGAGPVLFGH GGAGGMGGQG GTGGMGGAGG
      361 DGTTVIAAGT GGEGGTGGAA GAGGAAGARG ALTSGGLAGG VGAGGTGGTG GTGGNGADAA
      421 AVVGFGANGD PGFAGGKGGN GGIGGAAVTG GVAGDGGTGG KGGTGGAGGA GNDAGSTGNP
      481 GGKGGDGGIG GAGGAGGAAG TGNGGHAGNT GDGGDGGTGG NGGNGTGGVN GADNTLNPDT
      541 PGGAGEPGGA GGAGGAGGAA GGPGGTGGTG GNGGNGGNGG NGGNGGNGGN GGNAGNNSTN
      601 APVGGEGGAG GDGGAGGAGG AANGGTAGSQ GTGGVGGDGG AGGNGGGGKA GTGNSGNFGV
      661 DGEAGFSGGA GGNGGVGGAA GANGGTGGSG GNGGDGGAGG IGGAGGNGIP GTGTEPAGGT
      721 GAKGGDGGDG GAGGAGGNAG GAGGQGGNAG QGGAGGAGGN AVIPGDGVGK APHGDAGGSG
      781 GDGGKGGQGG SGGTGGSGAP IGGGAGGTGG SGGHAGKGGA GGIGAQGTTI TVPGNGGNAG
      841 DGGNGGNAGA GGNGGSGDFG GNTTSGASGS GGNGGNAGTA GSGGAGGTGG TGLSGGNGGN
      901 GGNGGNGGDG GNGAHGTVGA QFVPATSLPT PNGGAGGNGG TGSNGGAPGP AGAPGPTTGG
      961 NAGSQGIGGD GGNGGDGGKG GDGADAVNVV FMPTEPQAAT GTAGSAGDPT GGNGGPGTPG
     1021 SPMVAPPPPT PITQVQQGGD GGAGGTGSTN ANDGTATGGK GGEGGVGSIL GGPGGNGGTG
     1081 GNASATGTNG VANAGNGGKG GDGGQFGAGG NGGAGGSVTD GSAGSTAGNG GNGGNATNGT
     1141 IAGQPAGGNG SAGGKGGDGG NIAAGATGTA GNGGNGGNGN DGAVNAGTGG SGGNGGNAGG
     1201 GGANGGDGGA GGAGGAGGRG GKGIDGGFGG DGGNGGSNNG TGAGGNGGNG GTGGVGSVGA
     1261 AGGDGGNGGT GGFAGFGGTA GNGGSGGTGG AGGDGGTGGD GGNGVIAGGG GTGGNGGASG
     1321 AGGAGGTGGF AGNGNAGGNG GTGGASEDGD NGNAGSGATG GTGGNGGTGG DGGAAGLGGV
     1381 A
//