LOCUS CCD72623.1 2314 aa PRT CON 06-FEB-2024 DEFINITION Caenorhabditis elegans Carboxypeptidase protein. ACCESSION BX284606-1476 PROTEIN_ID CCD72623.1 SOURCE Caenorhabditis elegans ORGANISM Caenorhabditis elegans Eukaryota; Metazoa; Ecdysozoa; Nematoda; Chromadorea; Rhabditida; Rhabditina; Rhabditomorpha; Rhabditoidea; Rhabditidae; Peloderinae; Caenorhabditis. REFERENCE 1 (bases 1 to 17718942) AUTHORS WormBase. CONSRTM WormBase Consortium JOURNAL Submitted (04-FEB-2024) to the INSDC. WormBase Group, European Bioinformatics Institute, Cambridge, CB10 1SA, UK. Email: help@wormbase.org REFERENCE 2 (bases 1 to 17718942) AUTHORS Sulston J.E., Waterston R. JOURNAL Submitted (03-MAR-2003) to the INSDC. Nematode Sequencing Project: Sanger Institute, Hinxton, Cambridge CB10 1SA, UK and The Genome Institute at Washington University, St. Louis, MO 63110, USA. REFERENCE 3 (bases 1 to 17718942) AUTHORS Sulston J.E., Waterston R. CONSRTM Caenorhabditis elegans Sequencing Consortium TITLE Genome sequence of the nematode C. elegans: a platform for investigating biology JOURNAL Science 282(5396), 2012-2018(1998). COMMENT Annotated features correspond to WormBase release WS292. Protein-coding gene structures below are the result of integration and manual review of the following types of data: ab initio predictions by Genefinder (P. Green and L. Hillier, pers. comm.); alignments to published proteins and cDNAs; genome sequence conservation with other nematodes (e.g. to C. briggsae using WABA: Genome Res. 2000. 10:1115-1125); sequence features (such as trans-splice and polyA sites). Sources of data: large-scale EST projects of Yuji Kohara (http://www.ddbj.nig.ac.jp/c-elegans/html/CE_INDEX.html); ORFeome cloning project (http://worfdb.dfci.harvard.edu); RST large-scale sequencing project (Genome Res. 2009. 19:2334-2342); IST library (Science. 2004. 303:540-3); RT-PCR EST set (Ewing B. Green P. 2010 Unpublished); UTRome EST data submission (UTRome v1 Mangone M. Piano F.
2009); TEC-RED data (PNAS 2004. 101:1650-1655); RNA Deep sequencing data (454 read clusters - Makedonka Mitreva, unpublished; Illumina sequence data, Genome Res. 2009. 19:657-66); Numerous data sets from the modENCODE project (Science. 2010. 330:1775-87); Individual C. elegans Nucleotide Database submissions; Personal communications with C. elegans researchers; Non-Coding gene structures below are derived using the following methods and data: ab initio prediction of tRNAs by tRNAscan-SE (Nucl. Acids. Res., 25, 955-964); integration and appraisal of miRNAs from miRBase (http://www.mirbase.org); integration and appraisal of RFAM predictions (rfam.sanger.ac.uk); 21U-RNAs (Cell. 2006. 127:1193-1207); modENCODE data (Science. 2010. 330:1775-87); manual curation of novel published ncRNAs from the literature. FEATURES Qualifiers source /organism="Caenorhabditis elegans" /chromosome="X" /strain="Bristol N2" /mol_type="genomic DNA" /db_xref="taxon:6239" protein /transl_table=1 /gene="ctsa-2" /locus_tag="CELE_K10C2.1" /standard_name="K10C2.1" /note="Confirmed by transcript evidence" /db_xref="EnsemblGenomes-Gn:WBGene00019617" /db_xref="EnsemblGenomes-Tr:K10C2.1" /db_xref="GOA:Q94269" /db_xref="InterPro:IPR001563" /db_xref="InterPro:IPR018202" /db_xref="InterPro:IPR029058" /db_xref="InterPro:IPR033124" /db_xref="UniProtKB/TrEMBL:Q94269" /db_xref="WormBase:WBGene00019617" intron_pos 59:2 (1/20) intron_pos 234:2 (2/20) intron_pos 259:0 (3/20) intron_pos 298:2 (4/20) intron_pos 363:0 (5/20) intron_pos 569:1 (6/20) intron_pos 694:0 (7/20) intron_pos 742:0 (8/20) intron_pos 782:2 (9/20) intron_pos 928:2 (10/20) intron_pos 1252:0 (11/20) intron_pos 1300:0 (12/20) intron_pos 1340:2 (13/20) intron_pos 1401:2 (14/20) intron_pos 1457:0 (15/20) intron_pos 1872:2 (16/20) intron_pos 1981:0 (17/20) intron_pos 2029:2 (18/20) intron_pos 2055:2 (19/20) intron_pos 2167:0 (20/20) BEGIN 1 MLRSLGILCL LGAALAAPSR IASKDTDLVN DLPGLSFTPT FKQYSGYLDG SQGNHLHYWL 61 VESQTNPQTA PIVLWLNGGP GCSSLLGLLS 
ENGPYRIQKD GVTVIENVNS WNKAANVLFL 121 ESPRDVGFSY REKSATPDLL YNDDKTATDN ALALVQFFQR FPEYQGRDFY ITGESYGGVY 181 VPTLTKLVVQ MIQNNTTPYI NLKGFAVGNG ALSRKHLTNS GIDLLYYRGM LGTTQWENLR 241 QCCPDTLNNP LVDCDYSKYV VFDNFGNPSP RNDTNDAQAI ACGKMVINLS LNSIWETYND 301 VYNSYQDCYN FDSSVFGAAE ERHAKVHQQT MRKIMRTTLS TTGANDAYNL FSNGFNPFID 361 QGSLYNKMST DALNNYPCYI DDATTAWLGR TDVRSALHIP AAAPVWQECS DDINAKYYIQ 421 QYPDTTPVFQ FLVDSGYPLK VLIYNGDVDL ACNYLGDQWF VENLATVSYQ MTLTTPRQQW 481 NFTRAGTQNK YIPTLAGYLK SWNYQQFSID LLTVKGAGHM VPMDRPGPAL QIFYNYLYNT 541 NGGYSNQVPY DLTATPLRSQ FLAPPQKTWT RKQADRVWNL PGITYGLNFK QYSGYLNGVT 601 GNYLHYWFVE SQGNPTTDPL VLWLTGGPGC SGLMAMLTEL GPFHPNPDGK TLFENVYSWN 661 KAANVIFLES PRGVGFSVQD PSLNNDTIWD DQRTATDTYL ALKDFLTVYP EYINRPFFVT 721 GESYGGVYVP TITSLLIDKI QSGDFAQLNL VGMSIGNGEL SAIQQFNSAI MMSYFHGLFS 781 KDDFDSLQPC CNQTKTSSQW FEYCNFAQYI HLGPDGTAIP NDKSFCANKV ADLGQQRFWN 841 SLNDVYNIYQ DCYQQADRAF GSRMSIKQKK EHMRGFIDQG AKISTSSTDN QGGLACYGTT 901 QAANWINLPD VRSALHVSSA AGAWSACNDT INGLYVQQHN DTTSVFQHIL DSKYPLRVLI 961 YNGDVDQACN YLGDQWFIEA FALKNQLPVT KPRADWRYMT QIAGYAKKFD NNAGFSVDLI 1021 TVKGAGHLVP TDRPGPALQM IANFFRNQDY SNPTVIDTTL HPLKNTYVVA EQLAASLNRS 1081 STGVAHNGNR VHVKVHKVVR AGNFAKTGEQ DTVRQPTERD APPPPPTQTK AQDEVTNLPG 1141 LTFTPNFKQY SGYLNASAGN YLHYWLVESQ LNATYDPLIL WLNGGPGCSS IGGFLEELGP 1201 FHVNADGKTL FENTFSWNKA GNVLFLEAPR DVGYSFRSNE FAPDTMYNDT YTASDTVLAL 1261 ASFFNKFPEY QNRPFYITGE SYGGIYVPTL TRALINAIQT GTIKNVNLVG VAIGNGELSG 1321 IQQINSAVSL LYFRGERDKS DWDAISKCCD TSVPQAYCDY IKYVNIDTSG NVWPKVNDNS 1381 LAGQCGQLVT QQGFLDVWTT DNDVYNTFAD CYTAPGAGDS KLNELASGIR RVQNRRSKRA 1441 ADVSPFLPST LFVDQAKKIN YQSTDANGGF TCFSGASSEN YMNLPEVRTA LHIPTSLPYW 1501 TDCNDNMNEN YIQQHNDTSS VFTDIFATGY PLRFLIYNGD VDMACQFLGD QWFLEKLAKD 1561 NGLAVTRQHG PWNYTQGQFL PRVGGYWKQF TYTNTAKNTK VVFDQLTVKG AGHFVPQDRP 1621 GPALQMIYNF VNQLDYNRNL TLDYSRKQLL PQYQPAPVTV PRRKADHIFS LPGVTWNVNF 1681 MQHSGYLQAT RGNKLFYWFV ESQSGNEGDP IILWLQGGPG CASTGGLFSE IGPFFVNPDG 1741 ETLFENIYSW NKAAHILIID SPRGVGFSYQ DKNVNNDTTW DDDKTALDTY 
TALEDFFVTY 1801 PPHRNSELYI TGESYGGVYV PTLTRLLIQK IQAGQSNIQL RGMGIGNGMV SAVNDVRTLP 1861 DFLYFHGIYD KPMWEKLRAC CPSADSSGDC NYDYYITIDS GVNVIAKQFP NNQTLQDCAN 1921 LVENLSYDRN WKALYDQYNL YQDCYVTPRD QANPFAMKEK FSRLDVDHKL KTSIPQAITK 1981 TAPQDPLSTD ATGGYSCWSL GAINNYLSLS HVRDALHIPD SVPRWGFCNK INYANLYNDT 2041 TQVFTDILNS GYNLKVLIYN GDVDSVCSMF EAESMINNFA AAQTFVSNQP RGSWMYGGQI 2101 GGYVQKFQKN NMTIDLLTVK GAGHMSPTDR PGPVLQMMNN FVHGQGNYNT SIAVSMVRQP 2161 LLAQFLEQGI GPVGTSAPVA STSTTNTPSP TNQSPVTQAP PVTLPPPSVA TAGPTGPILT 2221 VVPVSSAPTS GAVSSTTNTP SPTNQSPVTL PPPSVATAGP TGPILTVVPV SSAPTSGAVS 2281 STATAAPVIT TTKTSSVLSV SFSMFIVLIT KFLL //