LOCUS CAB60334.3 1343 aa PRT CON 06-FEB-2024 DEFINITION Caenorhabditis elegans Homeobox domain-containing protein. ACCESSION BX284602-4406 PROTEIN_ID CAB60334.3 SOURCE Caenorhabditis elegans ORGANISM Caenorhabditis elegans Eukaryota; Metazoa; Ecdysozoa; Nematoda; Chromadorea; Rhabditida; Rhabditina; Rhabditomorpha; Rhabditoidea; Rhabditidae; Peloderinae; Caenorhabditis. REFERENCE 1 (bases 1 to 15279421) AUTHORS WormBase. CONSRTM WormBase Consortium JOURNAL Submitted (04-FEB-2024) to the INSDC. WormBase Group, European Bioinformatics Institute, Cambridge, CB10 1SA, UK. Email: help@wormbase.org REFERENCE 2 (bases 1 to 15279421) AUTHORS Sulston J.E., Waterston R. JOURNAL Submitted (03-MAR-2003) to the INSDC. Nematode Sequencing Project: Sanger Institute, Hinxton, Cambridge CB10 1SA, UK and The Genome Institute at Washington University, St. Louis, MO 63110, USA. REFERENCE 3 (bases 1 to 15279421) AUTHORS Sulston J.E., Waterston R. CONSRTM Caenorhabditis elegans Sequencing Consortium TITLE Genome sequence of the nematode C. elegans: a platform for investigating biology JOURNAL Science 282(5396), 2012-2018(1998). COMMENT Annotated features correspond to WormBase release WS292. Protein-coding gene structures below are the result of integration and manual review of the following types of data: ab initio predictions by Genefinder (P. Green and L. Hillier, pers. comm.); alignments to published proteins and cDNAs; genome sequence conservation with other nematodes (e.g. to C. briggsae using WABA: Genome Res. 2000. 10:1115-1125); sequence features (such as trans-splice and polyA sites). Sources of data: large-scale EST projects of Yuji Kohara (http://www.ddbj.nig.ac.jp/c-elegans/html/CE_INDEX.html); ORFeome cloning project (http://worfdb.dfci.harvard.edu); RST large-scale sequencing project (Genome Res. 2009. 19:2334-2342); IST library (Science. 2004. 303:540-3); RT-PCR EST set (Ewing B. Green P. 2010 Unpublished); UTRome EST data submission (UTRome v1 Mangone M. Piano F. 
2009); TEC-RED data (PNAS 2004. 101:1650-1655); RNA Deep sequencing data (454 read clusters - Makedonka Mitreva, unpublished; Illumina sequence data, Genome Res. 2009. 19:657-66); Numerous data sets from the modENCODE project (Science. 2010. 330:1775-87); Individual C. elegans Nucleotide Database submissions; Personal communications with C. elegans researchers; Non-Coding gene structures below are derived using the following methods and data: ab initio prediction of tRNAs by tRNAscan-SE (Nucl. Acids. Res., 25, 955-964); integration and appraisal of miRNAs from miRBase (http://www.mirbase.org); integration and appraisal of RFAM predictions (rfam.sanger.ac.uk); 21U-RNAs (Cell. 2006. 127:1193-1207); modENCODE data (Science. 2010. 330:1775-87); manual curation of novel published ncRNAs from the literature. FEATURES Qualifiers source /organism="Caenorhabditis elegans" /chromosome="II" /strain="Bristol N2" /mol_type="genomic DNA" /db_xref="taxon:6239" protein /transl_table=1 /gene="ceh-100" /locus_tag="CELE_Y38E10A.6" /standard_name="Y38E10A.6a" /note="Confirmed by transcript evidence" /db_xref="EnsemblGenomes-Gn:WBGene00012584" /db_xref="EnsemblGenomes-Tr:Y38E10A.6a" /db_xref="GOA:Q9U2M6" /db_xref="InterPro:IPR001356" /db_xref="InterPro:IPR009057" /db_xref="UniProtKB/TrEMBL:Q9U2M6" /db_xref="WormBase:WBGene00012584" intron_pos 85:0 (1/17) intron_pos 147:0 (2/17) intron_pos 205:0 (3/17) intron_pos 285:0 (4/17) intron_pos 325:0 (5/17) intron_pos 382:0 (6/17) intron_pos 440:0 (7/17) intron_pos 485:0 (8/17) intron_pos 525:0 (9/17) intron_pos 579:0 (10/17) intron_pos 637:0 (11/17) intron_pos 745:0 (12/17) intron_pos 803:0 (13/17) intron_pos 1057:0 (14/17) intron_pos 1115:0 (15/17) intron_pos 1292:0 (16/17) intron_pos 1317:0 (17/17) BEGIN 1 MFHNRPSTSG PPPQKRKYVR KQLPQPPQFV SSAVTSGPMH HRMHEQHQQL DDEREFLRDH 61 LARNDPFFPR ELEPQQDVVE QQLQDADWTP SPAKKKGARA GIPIGSHPPW PEYITRVLDD 121 VFINCQFINE ETKKELAKEL DLTTTQVKDW FNRRRRHALE AHQKNSVELP EQMRILNEAY 181 ERNPILDTDT KNRLLDVTKL 
SANLISNYFI KRYKKEQKEA GIPYEAGHYV RRRRVYEEEE 241 EFENNGIGTP HYDEDGNYVG DYVEVVEEIP DWDQEVYNQA ERKRRAELKE EEDKLLDEQF 301 AKNPFTTGEE LTEFAKKLNR PRPFVLKWFW KKRKRLGIER EVPPQQDGKN ELLENIFEKH 361 QFVNQKIKEK IAERTNMKPL AVQMWFKKRR DAAVYAHEHH GYELPCQMKL LDEAYNKCRI 421 ISDKDRRKLM KATKLTSNYI SSYFMHRDKI ASRQETENDT TIQMDDLNEE DDGEGADGGS 481 PKKKRVLTKL EEERILEEFF QKNAWCEGQD LIDMAKKINR PTVFLSKWFS KARRKSGVKR 541 KDPEPYGIHL EPYFQKHQFV DGAKYNELAE IAGVNPTTVK SWYMRRRLRA QKAHDEHGEE 601 LPCQMKELQE AFAGHRKLSK GHFQYIGDKI GLTANFVSNY FTIRKRMSKG AKRGERLSDS 661 DDDDDEPEDD GEEKNEEKNE EEARKAAEKA EKAQKMTEML EAMTEEERQK YQDNVLYTIF 721 QKTQFVDDRR YQEISCETGM PAEEIRDWFQ KKRVSSVEEY RKDGKELPKQ MKLLHESYER 781 CPMLGDDVRY NLALKTFLPP KSITNFFINR NRKTAIKMRA ESSSVDGEAE PPNNDGFYEF 841 NLEEPAPILR KSLRKPVPRI LGDFGSYPEK NGSETAEKAQ ILKKNGSETA EIAEKAQISE 901 KKPVNRLVID EVLKSPQKSE KIPEKAQEIE EIEESPKKSE KAPEKPQEIQ EIPKKSEKAP 961 EKPQEIEKSP KKSEKRQEIQ EIPQKSEKTS EKRPEIEELP TFFKSSAPAQ TPEIISDEDI 1021 FAQYESLLNA VFSEFQFVDD RTNYQLSKQI RIGMPQIREW FRKKRESSVE EHRTNGTELP 1081 KQMKLLHEAY QRCPLLDEDA RRDLVEKTQL LPKWVTNFFI NRSRKAQKAA ENAAEPSTSD 1141 ATTTSDDGFF DFNIENPVAL TQTRKSARRP APKNYDDFFG EDADLDELLR ATTTNPPAAP 1201 VATAFTKIGS HIIISPSQKH KSTQTTDFGP KEAPKEAPKA VVETLEPDTS DEEFVADELF 1261 EFNLDDFEPR PDRKRPAGPP IYRPPPKQPK LSISKTRILE EAYANLLKDG KLTAHEIATF 1321 FANRNNNTEI DQKPRIHMVP KKE //