LOCUS       SIU02005.1 1337 aa PRT BCT 25-MAY-2020
DEFINITION  Mycobacterium tuberculosis variant bovis AF2122/97 pe-pgrs family protein pe_pgrs49 protein.
ACCESSION   LT708304-3419
PROTEIN_ID  SIU02005.1
SOURCE      Mycobacterium tuberculosis variant bovis AF2122/97
  ORGANISM  Mycobacterium tuberculosis variant bovis AF2122/97
            Bacteria; Actinobacteria; Corynebacteriales; Mycobacteriaceae; Mycobacterium; Mycobacterium tuberculosis complex.
REFERENCE   1
  AUTHORS   Malone K.M.
  JOURNAL   Submitted (06-DEC-2016) to the INSDC. Tuberculosis Molecular Microbiology Research Group, School of Veterinary Medicine, University College Dublin, D4, Ireland
REFERENCE   2
  AUTHORS   Malone M K., Farrell D., Malone K.
  JOURNAL   Submitted (15-APR-2020) to the INSDC. Tuberculosis Molecular Microbiology Research Group, School of Veterinary Medicine, University College Dublin, D4, Ireland
FEATURES    Qualifiers
source      /organism="Mycobacterium tuberculosis variant bovis AF2122/97"
            /chromosome="Mycobacterium_bovis_AF212297"
            /isolate="AF2122/97"
            /mol_type="genomic DNA"
            /isolation_source="Mycobacterium bovis subsp. bovis strain AF2122/97. This strain is a fully virulent strain that was isolated in 1997 in the UK from a cow suffering necrotic lesions in lung and bronchomediastinal lymph nodes. The strain was also reported to infect and persist in badgers that are considered to be a significant source of bovine infection."
            /db_xref="taxon:233413"
protein     /transl_table=11
            /gene="PE_PGRS50b"
            /locus_tag="BQ2027_MB3376C"
            /note="Mb3376c, PE_PGRS50b, len: 1337 aa. Equivalent to middle part of Rv3345c (PE_PGRS50) and Rv3344c (PE_PGRS49), len: 1538 aa and 484 aa, from Mycobacterium tuberculosis strain H37Rv, (81.45% identity in 992 aa overlap and 100.000% identity in 477 aa overlap).
Rv3345c: Member of the Mycobacterium tuberculosis PE family, PGRS subfamily of gly-rich proteins. Similar to AAK47791 from strain CDC1551 but with some big gaps (after residues 501 and 1419; and for AAK47791 after residue 991). Similar to many from Mycobacterium tuberculosis strains H37Rv and CDC1551 e.g. O53559|Rv3514|MTV023.21 (1489 aa), FASTA scores: opt: 4508, E(): 7e-161, (52.1% identity in 1529 aa overlap); MTV004_1, MTV023_21, MTV023_15, MTCY493_4, MTV039_16, MTV008_46, MTV023_14, MTV023_19, MTV043_26, MTCY493_2, MTCY441_4; etc. Rv3344c: Member of the Mycobacterium tuberculosis PE family, PGRS subfamily of gly-, ala-rich proteins. Appears to be a gene fragment, should be in-frame with following ORF, MTV016.45c, frameshift required around 49595 but could not be found on checking BAC and cosmid clones. Similar to many from M. tuberculosis strains H37Rv and CDC1551 e.g. O53557|Rv3512|MTV023.19 (1079 aa), FASTA scores: opt: 1595, E(): 1.8e-54, (52.0% identity in 544 aa overlap). REMARK-M.bovis-M.tuberculosis: In Mycobacterium tuberculosis strain H37Rv, PE_PGRS50 exists as a single gene. In Mycobacterium bovis, a single base deletion (g-*) splits PE_PGRS50 into 2 parts, PE_PGRS50a and PE_PGRS50b. Also in Mycobacterium tuberculosis strain H37Rv, PE_PGRS49 and PE_PGRS50 exist as 2 genes. In Mycobacterium bovis, a single base deletion (c-*) leads to PE_PGRS49 and PE_PGRS50b existing as a single product." 
            /db_xref="UniProtKB/TrEMBL:A0A1R3Y5N3"
BEGIN
   1 MTLAVNQGAG GDGGNGGEVG VGGKGGAGGV SANPALNGSA GANGTAPTSG GNGGNGGAGA
  61 TPTVAGENGG AGGNGGHGGS VGNGGAGGAG GNGVAGTGLA LNGGNGGIGG NGGSAAGTGG
 121 DGGKGGNGGA GANGQDFSAS ANGANGGQGG NGGNGGIGGK GGDAFATFAK AGNGGAGGNG
 181 GAAGNGGGGA AGDVTLAINQ GAGGAGGNGG NVGVAGQGGA GGKGAIPAMK GATGADGTAP
 241 TSGGDGGNGG NGASPTVAGG NGGDGGKGGS GGNVGNGGNG GAGGNGAAGQ AGTPGPTSGD
 301 SGTSGTDGGA GGNGGAGGAG GTLAGHGGNG GKGVNGGQGG IGGAGERGAD GAGPNANGAN
 361 GENGGSGGNG GDGGAGGNGG AGGKAQAAGY TDGATGTGGD GGNGGDGGKA GDGGAGANGL
 421 NSGAMLPGGG TVGNPGTGGN GGNGGNAGVG GTGGKAGTGS LTGLDGTDGI TPNGGNGGNG
 481 GNGGKGGTAG NGSGAAGGNG GNGGSGLNGG DAGNGGNGGG ALNQAGFFGT GGKGGNGGNG
 541 GAGMINGGLG GFGGAGGGGA VDVAATTGGA GGNGGAGGFA STGLGGPGGA GGPGGAGDFA
 601 SGVGGVGGAG GDGGAGGVGG FGGQGGIGGE GRTGGNGGSG GDGGGGISLG GQGGNGGFGG
 661 AGGNGGIGTD AGGAGGAGGA GGNGGSSKST TTGNAGSGGA GGNGGTGLNG AGGAGGAGGN
 721 AGVAGVSFGN AVGGDGGNGG NGGHGGDGTT GGAGGKGGNG SSGAASGSGV VNVTAGHGGN
 781 GGNGGNSGNS TGVAGLAGGA AGAGGNGGGT SSAAGHGGSG GNGGSGGSGG SGTTGGAGAA
 841 GGNGGAGAGG GSLSTGQSGG HGGSGGAGGN GGAGSAGNGG AGGAGGNGGA GGNGGGGDAG
 901 NAGSGGNGGK GGDGVGPGST GGAGGKGGAG ANGGSSNGNA RGGNAGNGGH GGAGGSGDTG
 961 GAGGAGGQGG FGGTGGSGSG IGGGAGGNGG NGGAGGTGVV LGGKGGDGGN GDHGGPATNP
1021 GSGSRGGAGG SGGNGGAGGN ATGSGGKGGA GGNGGDGSFG ATSGPASIGV TGAPGGNGGK
1081 GGAGGSNPNG SGGDGGKGGN GGAGGNGGSI GANSGIVGGS GGAGGAGGAG GNGSLSSGEG
1141 GKGGDGGHGG DGVGGNSSVT QGGSGGGGGA GGAGGSGFFG GKGGFGGDGG QGGPNGGGTV
1201 GTVAGGGGNG GVGGRGGDGV FAGAGGQGGL GGQGGNGGGS TGGNGGLGGA GGGGGNAPDG
1261 GFGGNGGKGG QGGIGGGTQS ATGLGGDGGD GGDGGNGGNS GAKAGGAGGK GQAGQPNSGT
1321 EPGFGGDGGL GGAGATP
//