LOCUS       CAA19143.1               336 aa    PRT              BCT 16-APR-2005
DEFINITION  Mycobacterium leprae hypothetical protein MLCB2407.03c.
ACCESSION   AL023596-3
PROTEIN_ID  CAA19143.1
SOURCE      Mycobacterium leprae
  ORGANISM  Mycobacterium leprae
            Bacteria; Actinobacteria; Corynebacteriales; Mycobacteriaceae;
            Mycobacterium.
REFERENCE   1  (bases 1 to 35615)
  AUTHORS   Seeger K., Harris D.
  JOURNAL   Unpublished.
REFERENCE   2  (bases 1 to 35615)
  AUTHORS   Parkhill J., Barrell B.G., Rajandream M.A.
  JOURNAL   Submitted (18-MAY-1998) to the INSDC. Mycobacterium leprae
            sequencing project, Sanger Centre, Wellcome Trust Genome Campus,
            Hinxton, Cambridge CB10 1SA. E-mail: barrell@sanger.ac.uk. Cosmids
            supplied by Dr. Stewart T. Cole, [3] Unite de Genetique Moleculaire
            Bacterienne, Institut Pasteur, 28 rue du Docteur Roux, 75724 Paris
            Cedex 15, France. Requests for cosmids should be sent to Karin
            Eiglmeier (kei@pasteur.fr).
REFERENCE   3  (bases 1 to 35615)
  AUTHORS   Eiglmeier K., Honore N., Woods S.A., Caudron B., Cole S.T.
  TITLE     Use of an ordered cosmid library to deduce the genomic organization
            of Mycobacterium leprae
  JOURNAL   Mol. Microbiol. 7(2), 197-206(1993).
   PUBMED   8446027
COMMENT     Notes:
            
            The Sanger Centre is funded to complete the sequence of M. leprae
            by the Heiser Program for Research in Leprosy and Tuberculosis of
            The New York Community Trust.
            
            Work in Paris is supported by the Heiser Trust, the Association
            Francaise Raoul Follereau and the Groupement de Recherches et des
            Etudes des Genomes (GIP-GREG).
            
            Details of M. leprae sequencing at the Sanger Centre
            are available on the World Wide Web.
            (URL, http://www.sanger.ac.uk/Projects/)
            
            CDS are numbered using the following system, e.g. MLCB33.01c:
            ML (M. leprae), cB33 (cosmid name), .01 (first CDS),
            c (complementary strand).
            
            The more significant matches with motifs in the PROSITE
            database are also included but some of these may be fortuitous.
            
            The length in codons is given for each CDS.
            
            Usually the highest scoring match found by fasta -o is given for
            CDS which show significant similarity to other CDS in the database.
            The positions of possible ribosome binding site sequences are
            given where these have been used to deduce the initiation codon.
            
            All CDS over 100 codons have been analysed.  Gene prediction
            is based on positional base preference in codons especially
            where there is an increase in the observed/expected third
            position G + C.  CAUTION:  We may not have predicted the
            correct initiation codon.  Where possible we choose an
            initiation codon (atg, gtg, or ttg) which is preceded by an
            upstream ribosome binding site sequence (optimally 5-13 bp
            before the initiation codon).  If this cannot be identified
            we choose the most upstream initiation codon.
            
            IMPORTANT: This sequence MAY NOT be the entire insert of
            the sequenced clone.  It may be shorter because we only
            sequence overlapping sections once, or longer, because we
            arrange for a small overlap between neighbouring submissions.
            
            Cosmid B2407 is overlapped by EMBL:ML023 L518 at the 5' end
            and EMBL:MLB577COS B577 at the 3' end.
FEATURES             Qualifiers
     source          /organism="Mycobacterium leprae"
                     /mol_type="genomic DNA"
                     /clone="cosmid B2407"
                     /db_xref="taxon:1769"
     protein         /transl_table=11
                     /gene="MLCB2407.03c"
                     /note="MLCB2407.03c, unknown, len: 336 aa; identical to
                     ORFs TR:Q49930 L518_C3_195 (213 aa) and TR:Q49928
                     L518_C1_89 (122 aa) (EMBL:U00023). This sequence has a
                     single ORF, due to an additional base in EMBL:U00023"
                     /db_xref="InterPro:IPR029063"
                     /db_xref="UniProtKB/TrEMBL:O69510"
BEGIN
        1 MYSIELRKMG TLGCVKIISK ENVIIVAAKI SWTLRCLVFL LRFGFAFLFY SGIVNPIIIW
       61 HGFAALLRYF LVKQGIPSGT AKHLAQIRAE FEEQSVQGQF KELYFDMTDR NIVVWSDIFS
      121 RVFDRKAPVR ILEIGSWEGR STLFLLTYFT QGHLTAVDTW AGSDEHQHNA PSDLRSLEAR
      181 FDSNLTPCAA RLTKRKGSSL HVLPQLLGEE QKFDVIYVDG SHFGDDVLAD GITAWRLLEK
      241 GGILIFDDFL WPGYLRARAN PAWAINLFLK YHAGEYKILN VSYQVILQKK LVFNDRVLTS
      301 LATVQYDDRR RFTKEFIKNL SSASQVMSGN MRPLER
//