LOCUS       CCD68906.1              1758 aa    PRT              CON 06-FEB-2024
DEFINITION  Caenorhabditis elegans Collagen alpha-2(IV) chain protein.
ACCESSION   BX284606-3864
PROTEIN_ID  CCD68906.1
SOURCE      Caenorhabditis elegans
  ORGANISM  Caenorhabditis elegans
            Eukaryota; Metazoa; Ecdysozoa; Nematoda; Chromadorea; Rhabditida;
            Rhabditina; Rhabditomorpha; Rhabditoidea; Rhabditidae; Peloderinae;
            Caenorhabditis.
REFERENCE   1  (bases 1 to 17718942)
  AUTHORS   WormBase.
  CONSRTM   WormBase Consortium
  JOURNAL   Submitted (04-FEB-2024) to the INSDC. WormBase Group, European
            Bioinformatics Institute, Cambridge, CB10 1SA, UK. Email:
            help@wormbase.org
REFERENCE   2  (bases 1 to 17718942)
  AUTHORS   Sulston J.E., Waterston R.
  JOURNAL   Submitted (03-MAR-2003) to the INSDC. Nematode Sequencing Project:
            Sanger Institute, Hinxton, Cambridge CB10 1SA, UK and The Genome
            Institute at Washington University, St. Louis, MO 63110, USA.
REFERENCE   3  (bases 1 to 17718942)
  AUTHORS   Sulston J.E., Waterston R.
  CONSRTM   Caenorhabditis elegans Sequencing Consortium
  TITLE     Genome sequence of the nematode C. elegans: a platform for
            investigating biology
  JOURNAL   Science 282(5396), 2012-2018(1998).
COMMENT     Annotated features correspond to WormBase release WS292.
            Protein-coding gene structures below are the result of integration
            and manual review of the following types of data: ab initio
            predictions by Genefinder (P. Green and L. Hillier, pers. comm.);
            alignments to published proteins and cDNAs; genome sequence
            conservation with other nematodes (e.g. to C. briggsae using WABA:
            Genome Res. 2000. 10:1115-1125); sequence features (such as
            trans-splice and polyA sites).
            Sources of data: large-scale EST projects of Yuji Kohara
            (http://www.ddbj.nig.ac.jp/c-elegans/html/CE_INDEX.html); ORFeome
            cloning project (http://worfdb.dfci.harvard.edu); RST large-scale
            sequencing project (Genome Res. 2009. 19:2334-2342); IST library
            (Science. 2004. 303:540-3); RT-PCR EST set (Ewing B. Green P. 2010
            Unpublished); UTRome EST data submission (UTRome v1 Mangone M.
            Piano F. 2009); TEC-RED data (PNAS 2004. 101:1650-1655); RNA Deep
            sequencing data (454 read clusters - Makedonka Mitreva,
            unpublished; Illumina sequence data, Genome Res. 2009. 19:657-66);
            Numerous data sets from the modENCODE project (Science. 2010.
            330:1775-87); Individual C. elegans Nucleotide Database
            submissions; Personal communications with C. elegans researchers.
            Non-coding gene structures below are derived using the following
            methods and data: ab initio prediction of tRNAs by tRNAscan-SE
            (Nucl. Acids. Res., 25, 955-964); integration and appraisal of
            miRNAs from miRBase (http://www.mirbase.org); integration and
            appraisal of RFAM predictions (rfam.sanger.ac.uk); 21U-RNAs (Cell.
            2006. 127:1193-1207); modENCODE data (Science. 2010. 330:1775-87);
            manual curation of novel published ncRNAs from the literature.
FEATURES             Qualifiers
     source          /organism="Caenorhabditis elegans"
                     /chromosome="X"
                     /strain="Bristol N2"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:6239"
     protein         /transl_table=1
                     /gene="let-2"
                     /locus_tag="CELE_F01G12.5"
                     /standard_name="F01G12.5a"
                     /note="Confirmed by transcript evidence"
                     /db_xref="WormBase:WBGene00002280"
     intron_pos      8:2 (1/17)
     intron_pos      27:0 (2/17)
     intron_pos      48:0 (3/17)
     intron_pos      81:1 (4/17)
     intron_pos      144:1 (5/17)
     intron_pos      203:0 (6/17)
     intron_pos      229:0 (7/17)
     intron_pos      265:0 (8/17)
     intron_pos      304:0 (9/17)
     intron_pos      359:0 (10/17)
     intron_pos      449:2 (11/17)
     intron_pos      737:0 (12/17)
     intron_pos      961:1 (13/17)
     intron_pos      1233:0 (14/17)
     intron_pos      1310:1 (15/17)
     intron_pos      1574:1 (16/17)
     intron_pos      1729:0 (17/17)
BEGIN
        1 MKQRAALGPV LRLAILALLA VSYVQSQATC RDCSNRGCFC VGEKGSMGAP GPQGPPGTQG
       61 IRGFPGPEGL AGPKGLKGAQ GPPGPVGIKG DRGAVGVPGF PGNDGGNGRP GEPGPPGAPG
      121 WDGCNGTDGA PGIPGRPGPP GMPGFPGPPG MDGLKGEPAI GYAGAPGEKG DGGMPGMPGL
      181 PGPSGRDGYP GEKGDRGDTG NAGPRGPPGE AGSPGNPGIG SIGPKGDPGD LGSVGPPGPP
      241 GPREFTGSGS IVGPRGNPGE KGDKGEPGEG GQRGYPGNGG LSGQPGLPGM KGEKGLSGPA
      301 GPRGKEGRPG NAGPPGFKGD RGLDGLGGIP GLPGQKGEAG YPGRDGPKGN SGPPGPPGGG
      361 TFNDGAPGPP GLPGRPGNPG PPGTDGYPGA PGPAGPIGNT GGPGLPGYPG NEGLPGPKGD
      421 KGDGGIPGAP GVSGPSGIPG LPGPKGEPGY RGTPGQSIPG LPGKDGKPGL DGAPGRKGEN
      481 GLPGVRGPPG DSLNGLPGAP GQRGAPGPNG YDGRDGVNGL PGAPGTKGDR GGTCSACAPG
      541 TKGEKGLPGY SGQPGPQGDR GLPGMPGPVG DAGDDGLPGP AGRPGSPGPP GQDGFPGLPG
      601 QKGEPTQLTL RPGPPGYPGL KGENGFPGQP GVDGLPGPSG PVGPPGAPGY PGEKGDAGLP
      661 GLSGKPGQDG LPGLPGNKGE AGYGQPGQPG FPGAKGDGGL PGLPGTPGLQ GMPGEPAPEN
      721 QVNPAPPGQP GLPGLPGTKG EGGYPGRPGE VGQPGFPGLP GMKGDSGLPG PPGLPGHPGV
      781 PGDKGFGGVP GLPGIPGPKG DVGNPGLPGL NGQKGEPGVG VPGQPGSPGF PGLKGDAGLP
      841 GLPGTPGLEG QRGFPGAPGL KGGDGLPGLS GQPGYPGEKG DAGLPGVPGR EGSPGFPGQD
      901 GLPGVPGMKG EDGLPGLPGV TGLKGDLGAP GQSGAPGLPG APGYPGMKGN AGIPGVPGFK
      961 GDGGLPGLPG LNGPKGEPGV PGMPGTPGMK GNGGLPGLPG RDGLSGVPGM KGDRGFNGLP
     1021 GEKGEAGPAA RDGQKGDAGL PGQPGLRGPQ GPSGLPGVPG FKGETGLPGY GQPGQPGEKG
     1081 LPGIPGKAGR QGAPGSPGQD GLPGFPGMKG ESGYPGQDGL PGRDGLPGVP GQKGDLGQSG
     1141 QPGLSGAPGL DGQPGVPGIR GDKGQGGLPG IPGDRGMDGY PGQKGENGYP GQPGLPGLGG
     1201 EKGFAGTPGF PGLKGSPGYP GQDGLPGIPG LKGDSGFPGQ PGQEGLPGLS GEKGMGGLPG
     1261 MPGQPGQSIA GPVGPPGAPG LQGKDGFPGL PGQKGESGLS GLPGAPGLKG ESGMPGFPGA
     1321 KGDLGANGIP GKRGEDGLPG VPGRDGQPGI PGLKGEVGGA GLPGQPGFPG IPGLKGEGGL
     1381 PGFPGAKGEA GFPGTPGVPG YAGEKGDGGL PGLPGRDGLP GADGPVGPPG PSGPQNLVEP
     1441 GEKGLPGLPG APGLRGEKGM PGLDGPPGND GPPGLPGQRG NDGYPGAPGL SGEKGMGGLP
     1501 GFPGLDGQPG GPGAPGLPGA PGAAGPAYRD GFVLVKHSQT TEVPRCPEGQ TKLWDGYSLL
     1561 YIEGNEKSHN QDLGHAGSCL QRFSTMPFLF CDFNNVCNYA SRNDKSYWLS TSEAIPMMPV
     1621 NEREIEPYIS RCAVCEAPAN TIAVHSQTIQ IPNCPAGWSS LWIGYSFAMH TGAGAEGGGQ
     1681 SLSSPGSCLE DFRATPFIEC NGARGSCHYF ANKFSFWLTT IDNDSEFKVP ESQTLKSGNL
     1741 RTRVSRCQVC VKSTDGRH
//