LOCUS       CAA19143.1   336 aa   PRT   BCT   16-APR-2005
DEFINITION  Mycobacterium leprae hypothetical protein MLCB2407.03c protein.
ACCESSION   AL023596-3
PROTEIN_ID  CAA19143.1
SOURCE      Mycobacterium leprae
  ORGANISM  Mycobacterium leprae
            Bacteria; Actinobacteria; Corynebacteriales; Mycobacteriaceae;
            Mycobacterium.
REFERENCE   1  (bases 1 to 35615)
  AUTHORS   Seeger K., Harris D.
  JOURNAL   Unpublished.
REFERENCE   2  (bases 1 to 35615)
  AUTHORS   Parkhill J., Barrell B.G., Rajandream M.A.
  JOURNAL   Submitted (18-MAY-1998) to the INSDC.
            Mycobacterium leprae sequencing project, Sanger Centre, Wellcome
            Trust Genome Campus, Hinxton, Cambridge CB10 1SA
            E-mail: barrell@sanger.ac.uk
            Cosmids supplied by Dr. Stewart T. Cole, [3] Unite de Genetique
            Moleculaire Bacterienne, Institut Pasteur, 28 rue du Docteur Roux,
            75724 Paris Cedex 15, France
            Requests for cosmids should be sent to Karin Eiglmeier
            (kei@pasteur.fr).
REFERENCE   3  (bases 1 to 35615)
  AUTHORS   Eiglmeier K., Honore N., Woods S.A., Caudron B., Cole S.T.
  TITLE     Use of an ordered cosmid library to deduce the genomic
            organization of Mycobacterium leprae
  JOURNAL   Mol. Microbiol. 7(2), 197-206(1993).
  PUBMED    8446027
COMMENT     Notes:
            The Sanger Centre is funded to complete the sequence of M. leprae
            by the Heiser Program for Research in Leprosy and Tuberculosis of
            The New York Community Trust. Work in Paris is supported by the
            Heiser Trust, the Association Francaise Raoul Follereau and the
            Groupement de Recherches et des Etudes des Genomes (GIP-GREG).
            Details of M. leprae sequencing at the Sanger Centre are available
            on the World Wide Web. (URL, http://www.sanger.ac.uk/Projects/)
            CDS are numbered using the following system eg MLCB33.01c. ML (M.
            leprae), cB33 (cosmid name), .01 (first CDS), c (complementary
            strand).
            The more significant matches with motifs in the PROSITE database
            are also included but some of these may be fortuitous.
            The length in codons is given for each CDS.
            Usually the highest scoring match found by fasta -o is given for
            CDS which show significant similarity to other CDS in the
            database.
            The position of possible ribosome binding site sequences are given
            where these have been used to deduce the initiation codon.
            All CDS over 100 codons have been analysed.
            Gene prediction is based on positional base preference in codons
            especially where there is an increase in the observed/expected
            third position G + C.
            CAUTION: We may not have predicted the correct initiation codon.
            Where possible we choose an initiation codon (atg, gtg, or ttg)
            which is preceded by an upstream ribosome binding site sequence
            (optimally 5-13bp before the initiation codon). If this cannot be
            identified we choose the most upstream initiation codon.
            IMPORTANT: This sequence MAY NOT be the entire insert of the
            sequenced clone. It may be shorter because we only sequence
            overlapping sections once, or longer, because we arrange for a
            small overlap between neighbouring submissions.
            Cosmid B2407 is overlapped by EMBL:ML023 L518 at the 5' end and
            EMBL:MLB577COS B577 at the 3' end.
FEATURES             Qualifiers
     source          /organism="Mycobacterium leprae"
                     /mol_type="genomic DNA"
                     /clone="cosmid B2407"
                     /db_xref="taxon:1769"
     protein         /transl_table=11
                     /gene="MLCB2407.03c"
                     /note="MLCB2407.03c, unknown, len: 336 aa; identical to
                     ORF's TR:Q49930 L518_C3_195 (213 aa) and TR:Q49928
                     L518_C1_89 (122 aa) (EMBL;U00023). This sequence has a
                     single ORF, due an additional base in EMBL:U00023"
                     /db_xref="InterPro:IPR029063"
                     /db_xref="UniProtKB/TrEMBL:O69510"
BEGIN
        1 MYSIELRKMG TLGCVKIISK ENVIIVAAKI SWTLRCLVFL LRFGFAFLFY SGIVNPIIIW
       61 HGFAALLRYF LVKQGIPSGT AKHLAQIRAE FEEQSVQGQF KELYFDMTDR NIVVWSDIFS
      121 RVFDRKAPVR ILEIGSWEGR STLFLLTYFT QGHLTAVDTW AGSDEHQHNA PSDLRSLEAR
      181 FDSNLTPCAA RLTKRKGSSL HVLPQLLGEE QKFDVIYVDG SHFGDDVLAD GITAWRLLEK
      241 GGILIFDDFL WPGYLRARAN PAWAINLFLK YHAGEYKILN VSYQVILQKK LVFNDRVLTS
      301 LATVQYDDRR RFTKEFIKNL SSASQVMSGN MRPLER
//