LOCUS       CCP45750.1 1616 aa PRT BCT 27-FEB-2015
DEFINITION  Mycobacterium tuberculosis H37Rv Probable polyketide synthase Pks1
            protein.
ACCESSION   AL123456-3028
PROTEIN_ID  CCP45750.1
SOURCE      Mycobacterium tuberculosis H37Rv
  ORGANISM  Mycobacterium tuberculosis H37Rv
            Bacteria; Actinobacteria; Corynebacteriales; Mycobacteriaceae;
            Mycobacterium; Mycobacterium tuberculosis complex.
REFERENCE   1
  AUTHORS   Cole S.T., Brosch R., Parkhill J., Garnier T., Churcher C.,
            Harris D., Gordon S.V., Eiglmeier K., Gas S., Barry C.E.III.,
            Tekaia F., Badcock K., Basham D., Brown D., Chillingworth T.,
            Connor R., Davies R., Devlin K., Feltwell T., Gentles S.,
            Hamlin N., Holroyd S., Hornsby T., Jagels K., Krogh A., McLean J.,
            Moule S., Murphy L., Oliver K., Osborne J., Quail M.A.,
            Rajandream M.A., Rogers J., Rutter S., Seeger K., Skelton J.,
            Squares R., Squares S., Sulston J.E., Taylor K., Whitehead S.,
            Barrell B.G.
  TITLE     Deciphering the biology of Mycobacterium tuberculosis from the
            complete genome sequence
  JOURNAL   Nature 393(6685), 537-544(1998).
  PUBMED    9634230
  REMARK    Erratum:[Nature 1998 Nov 12;396(6707):190]
REFERENCE   2
  AUTHORS   Camus J.C., Pryor M.J., Medigue C., Cole S.T.
  TITLE     Re-annotation of the genome sequence of Mycobacterium tuberculosis
            H37Rv
  JOURNAL   Microbiology (Reading, Engl.) 148(Pt 10), 2967-2973(2002).
  PUBMED    12368430
REFERENCE   3
  AUTHORS   Lew J.M., Kapopoulou A., Jones L.M., Cole S.T.
  TITLE     TubercuList--10 years after
  JOURNAL   Tuberculosis (Edinb) 91(1), 1-7(2011).
  PUBMED    20980199
REFERENCE   4 (bases 1 to 4411529)
  AUTHORS   Parkhill J.
  JOURNAL   Submitted (11-JUN-1998) to the INSDC.
            Submitted on behalf of the Mycobacterium tuberculosis sequencing
            and mapping teams, Sanger Centre, Wellcome Trust Genome Campus,
            Hinxton, Cambridge CB10 1SA
            Unite de Genetique Moleculaire Bacterienne, Institut Pasteur,
            28 rue du Docteur Roux, 75724 Paris Cedex 15, France
            E-mail: parkhill@sanger.ac.uk
REFERENCE   5 (bases 1 to 4411532)
  AUTHORS   Lew J.M.
  JOURNAL   Submitted (18-DEC-2012) to the INSDC.
            Lew J., Ecole Polytechnique Federale de Lausanne, CH-1015,
            Lausanne, Switzerland, and the Swiss Institute of Bioinformatics,
            CMU - Rue Michel-Servet 1, 1211 Geneva 4, SWITZERLAND
COMMENT     On or before Feb 1, 2013 this sequence version replaced
            gi:41352722, gi:38490165, gi:38490207, gi:41353619, gi:38490250,
            gi:38684030, gi:38490288, gi:41353667, gi:41353422, gi:41352756,
            gi:38490319, gi:41352785, gi:38490370, gi:41353971.
            Note: This annotation is from the TubercuList website, Release 26,
            Dec 2012 (URL: http://tuberculist.epfl.ch)
            (email: tuberculist@epfl.ch).
FEATURES             Qualifiers
     source          /organism="Mycobacterium tuberculosis H37Rv"
                     /strain="H37Rv"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:83332"
     protein         /transl_table=11
                     /gene="pks1"
                     /locus_tag="Rv2946c"
                     /note="Rv2946c, (MTCY24G1.03), len: 1616 aa. Probable
                     pks1,polyketide synthase, similar to many e.g.
                     ML035|AL583917|Q9CD81 putative polyketide synthase from
                     Mycobacterium leprae (2103 aa), Fasta scores: opt: 8761,
                     E(): 0, (82.6% identity in 1620 aa overlap); etc. Almost
                     identical in part to G560507|Q50470 PKS002C protein from
                     Mycobacterium tuberculosis (fragment) (950 aa), Fasta
                     scores: opt: 5685, E(): 0, (95.3% identity in 927 aa
                     overlap). Also similar to Mycobacterium tuberculosis
                     polyketide synthases pks7|Rv1661|P94996 (2126 aa) (54.6%
                     identity in 1632 aa); pks12|Rv2048c|O53490 (4151 aa)
                     (58.0% identity in 1606 aa); pks8|rv1662|O65933 (1602 aa)
                     (59.7% identity in 1144 aa). Contains a PS00012
                     Phosphopantetheine attachment site. Note pks1 has been
                     shown to be involved in the biosynthesis of phthiocerol.
                     pks15/pks1 has been shown to be involved in the
                     biosynthesis of phenolphthiocerol glycolipids."
                     /db_xref="EnsemblGenomes-Gn:Rv2946c"
                     /db_xref="EnsemblGenomes-Tr:CCP45750"
                     /db_xref="GOA:P96285"
                     /db_xref="InterPro:IPR001227"
                     /db_xref="InterPro:IPR006162"
                     /db_xref="InterPro:IPR009081"
                     /db_xref="InterPro:IPR011032"
                     /db_xref="InterPro:IPR013154"
                     /db_xref="InterPro:IPR013968"
                     /db_xref="InterPro:IPR014043"
                     /db_xref="InterPro:IPR016035"
                     /db_xref="InterPro:IPR016036"
                     /db_xref="InterPro:IPR020801"
                     /db_xref="InterPro:IPR020806"
                     /db_xref="InterPro:IPR020807"
                     /db_xref="InterPro:IPR020843"
                     /db_xref="InterPro:IPR036291"
                     /db_xref="InterPro:IPR036736"
                     /db_xref="InterPro:IPR042104"
                     /db_xref="UniProtKB/Swiss-Prot:P96285"
                     /inference="protein motif:PROSITE:PS00012"
                     /experiment="EXISTENCE: identified in proteomics study"
BEGIN
        1 MISARSAEAL TAQAGRLMAH VQANPGLDPI DVGCSLASRS VFEHRAVVVG ASREQLIAGL
       61 AGLAAGEPGA GVAVGQPGSV GKTVVVFPGQ GAQRIGMGRE LYGELPVFAQ AFDAVADELD
      121 RHLRLPLRDV IWGADADLLD STEFAQPALF AVEVASFAVL RDWGVLPDFV MGHSVGELAA
      181 AHAAGVLTLA DAAMLVVARG RLMQALPAGG AMVAVAASED EVEPLLGEGV GIAAINAPES
      241 VVISGAQAAA NAIADRFAAQ GRRVHQLAVS HAFHSPLMEP MLEEFARVAA RVQAREPQLG
      301 LVSNVTGELA GPDFGSAQYW VDHVRRPVRF ADSARHLQTL GATHFIEAGP GSGLTGSIEQ
      361 SLAPAEAMVV SMLGKDRPEL ASALGAAGQV FTTGVPVQWS AVFAGSGGRR VQLPTYAFQR
      421 RRFWETPGAD GPADAAGLGL GATEHALLGA VVERPDSDEV VLTGRLSLAD QPWLADHVVN
      481 GVVLFPGAGF VELVIRAGDE VGCALIEELV LAAPLVMHPG VGVQVQVVVG AADESGHRAV
      541 SVYSRGDQSQ GWLLNAEGML GVAAAETPMD LSVWPPEGAE SVDISDGYAQ LAERGYAYGP
      601 AFQGLVAIWR RGSELFAEVV APGEAGVAVD RMGMHPAVLD AVLHALGLAV EKTQASTETR
      661 LPFCWRGVSL HAGGAGRVRA RFASAGADAI SVDVCDATGL PVLTVRSLVT RPITAEQLRA
      721 AVTAAGGASD QGPLEVVWSP ISVVSGGANG SAPPAPVSWA DFCAGSDGDA SVVVWELESA
      781 GGQASSVVGS VYAATHTALE VLQSWLGADR AATLVVLTHG GVGLAGEDIS DLAAAAVWGM
      841 ARSAQAENPG RIVLIDTDAA VDASVLAGVG EPQLLVRGGT VHAPRLSPAP ALLALPAAES
      901 AWRLAAGGGG TLEDLVIQPC PEVQAPLQAG QVRVAVAAVG VNFRDVVAAL GMYPGQAPPL
      961 GAEGAGVVLE TGPEVTDLAV GDAVMGFLGG AGPLAVVDQQ LVTRVPQGWS FAQAAAVPVV
     1021 FLTAWYGLAD LAEIKAGESV LIHAGTGGVG MAAVQLARQW GVEVFVTASR GKWDTLRAMG
     1081 FDDDHIGDSR TCEFEEKFLA VTEGRGVDVV LDSLAGEFVD ASLRLLVRGG RFLEMGKTDI
     1141 RDAQEIAANY PGVQYRAFDL SEAGPARMQE MLAEVRELFD TRELHRLPVT TWDVRCAPAA
     1201 FRFMSQARHI GKVVLTMPSA LADRLADGTV VITGATGAVG GVLARHLVGA YGVRHLVLAS
     1261 RRGDRAEGAA ELAADLTEAG AKVQVVACDV ADRAAVAGLF AQLSREYPPV RGVIHAAGVL
     1321 DDAVITSLTP DRIDTVLRAK VDAAWNLHQA TSDLDLSMFA LCSSIAATVG SPGQGNYSAA
     1381 NAFLDGLAAH RQAAGLAGIS LAWGLWEQPG GMTAHLSSRD LARMSRSGLA PMSPAEAVEL
     1441 FDAALAIDHP LAVATLLDRA ALDARAQAGA LPALFSGLAR RPRRRQIDDT GDATSSKSAL
     1501 AQRLHGLAAD EQLELLVGLV CLQAAAVLGR PSAEDVDPDT EFGDLGFDSL TAVELRNRLK
     1561 TATGLTLPPT VIFDHPTPTA VAEYVAQQMS GSRPTESGDP TSQVVEPAAA EVSVHA
//