LOCUS       QJW38016.1 1297 aa PRT BCT 29-DEC-2022
DEFINITION  Cellulosimicrobium protaetiae HAMP domain-containing protein.
ACCESSION   CP052757-3825
PROTEIN_ID  QJW38016.1
SOURCE      Cellulosimicrobium protaetiae
  ORGANISM  Cellulosimicrobium protaetiae
            Bacteria; Actinobacteria; Micrococcales; Promicromonosporaceae;
            Cellulosimicrobium.
REFERENCE   1 (bases 1 to 4631595)
  AUTHORS   Le Han,H., Nguyen,T.T.H., Li,Z., Shin,N.R. and Kim,S.G.
  TITLE     Cellulosimicrobium protaetiae sp. nov., isolated from the gut of
            the larva of Protaetia brevitarsis seulensis
  JOURNAL   Int J Syst Evol Microbiol 72 (3) (2022)
  PUBMED    35348452
REFERENCE   2 (bases 1 to 4631595)
  AUTHORS   Le Ho,H. and Kim,S.-G.
  TITLE     Direct Submission
  JOURNAL   Submitted (06-NOV-2019) Korean Collection for Type Cultures
            (KCTC), Korea Research Institute of Bioscience & Biotechnology
            (KRIBB), 181 Ipsin-gil, Jeongeup-si, Jeollabuk-do 56212, Republic
            of Korea
REFERENCE   3 (bases 1 to 4631595)
  AUTHORS   Ho,H. and Kim,S.-G.
  TITLE     Direct Submission
  JOURNAL   Submitted (17-APR-2020) Korean Collection for Type Cultures
            (KCTC), Korea Research Institute of Bioscience & Biotechnology
            (KRIBB), 181 Ipsin-gil, Jeongeup-si, Jeollabuk-do 56212, Korea,
            Republic of
COMMENT     The annotation was added by the NCBI Prokaryotic Genome Annotation
            Pipeline (PGAP). Information about PGAP can be found here:
            https://www.ncbi.nlm.nih.gov/genome/annotation_prok/

            ##Genome-Assembly-Data-START##
            Assembly Date          :: JUL-2019
            Assembly Method        :: HGAP v. 3.0
            Genome Representation  :: Full
            Expected Final Version :: Yes
            Genome Coverage        :: 178.0x
            Sequencing Technology  :: PacBio RSII; Illumina HiSeq
            ##Genome-Assembly-Data-END##

            ##Genome-Annotation-Data-START##
            Annotation Provider               :: NCBI
            Annotation Date                   :: 04/28/2020 06:45:21
            Annotation Pipeline               :: NCBI Prokaryotic Genome Annotation Pipeline (PGAP)
            Annotation Method                 :: Best-placed reference protein set; GeneMarkS-2+
            Annotation Software revision      :: 4.11
            Features Annotated                :: Gene; CDS; rRNA; tRNA; ncRNA; repeat_region
            Genes (total)                     :: 4,140
            CDSs (total)                      :: 4,077
            Genes (coding)                    :: 4,001
            CDSs (with protein)               :: 4,001
            Genes (RNA)                       :: 63
            rRNAs                             :: 3, 3, 3 (5S, 16S, 23S)
            complete rRNAs                    :: 3, 3, 3 (5S, 16S, 23S)
            tRNAs                             :: 51
            ncRNAs                            :: 3
            Pseudo Genes (total)              :: 76
            CDSs (without protein)            :: 76
            Pseudo Genes (ambiguous residues) :: 0 of 76
            Pseudo Genes (frameshifted)       :: 11 of 76
            Pseudo Genes (incomplete)         :: 66 of 76
            Pseudo Genes (internal stop)      :: 2 of 76
            Pseudo Genes (multiple problems)  :: 3 of 76
            CRISPR Arrays                     :: 1
            ##Genome-Annotation-Data-END##
FEATURES             Qualifiers
     source          /organism="Cellulosimicrobium protaetiae"
                     /mol_type="genomic DNA"
                     /strain="BI34"
                     /isolation_source="intestine from larvae"
                     /host="wax moth"
                     /type_material="type strain of Cellulosimicrobium protaetiae"
                     /db_xref="taxon:2587808"
                     /country="South Korea: Jeongeup"
                     /collection_date="2019-04"
     protein         /locus_tag="FIC82_019440"
                     /inference="COORDINATES: protein motif:HMM:NF012876.1,HMM:NF014567.1,HMM:NF019974.1"
                     /note="Derived by automated computational analysis using gene prediction method: Protein Homology."
                     /transl_table=11
BEGIN
        1 MSVRGKILAA LAVPVLVLFV AAAIISAQAI GSARDASQTS ALVGALAAQD AAGTEIAAER
       61 TYSFLDAFAG SEDSEAQMMA QREKTDKALD ARDRAYEKLN TSALDPRVRD AVADTIADRS
      121 DLQSVREAID RSSIGQLQRN SRYNTLIDDA LNVPRVLADT TPDRSLAQYL STYVLLDELL
      181 SQLALEQPFA GAVLSAAQVG QESTTTSQQA AVLVTTGDTL AERTRTAVRQ LPGDYVLATP
      241 TAAYQQVRQN LIGSRPGATP ANLAAQWPDL SQADRDQTTP VRDGIRNDTA EKASDLASAA
      301 TTRAVVTILA TLAAVFASIL VAGFIARAIV NPLRRLTDAA EDVRDQLPRL VEQVAVPGQG
      361 PGIDLTPITV ESTDEVGQLA TAFNDVNQTT IKVAREQAAL RGSIAEMFVN VARRDQVLLN
      421 RQLAFLDDLE RSEEDAGTLS NLFRLDHLAT RMRRNAESLL VLAGIDSGRR VRQPMPASDV
      481 IRTASSEIEL YDRVRLNLVV DPLMLGHNAL NAAHLLAELL ENATMFSEPH TPVEVTTGRD
      541 EHFVYVTVRD HGLGMTPEEI AEANRKVATH AASDVVGAQR LGLFVVGRLA DRLGAKVRFS
      601 TNGDDQGTEV VVSFPGVLFV PDSNVPLPQP TDPLDTSTQA AAQQLAGTGV LPALPAAGAA
      661 GPAAPSLPAP AATASFPTVE PEAPVAVPVD LDALTDGTTQ TGMPRRRSRT VDPAAAAPSA
      721 SFASGPQTGA IVLPPLATPS LPDQLPAADE AWTPPAEVAA AGSALPSRSR PAAGPVEPVS
      781 AEIPVLDVST RSALFSSFRP VGDRPAGENP VELPTAPDVT ATDIPLVAET PTDHVAAQGV
      841 WTPQETVEPA ASTWADPASE PEAWSPEPAW APEQPAEQTW NPEPAASSWE PSSVSDAPLD
      901 ATRVVPAVPA EPVDESTVAR VPLARRTPVA PEPAAPSAPV APAAPTDVPV VASGPEAGTT
      961 AEEIPEELTF EALPRFEELM ADLPTRRSLR ESQARKRGIF GRRPRTTATP QAARPTGPSP
     1021 EALASRPAGA PSAPSPSVPA PGAAPLAPTP SALRREPEAP ARASAFAPRA AEQPSSAASP
     1081 FAPEAPVAPQ PEAHAPAWAA EHRPVGDVHA VPGAGFAAPE PTYPPVETRP EQVAQPVGHA
     1141 AAGSGPEGYG PPSPLVRRQV SDAIEPLEAG YIADSVEARS DWMASAVLYE EMSTLLQGST
     1201 DFQEATLADP NDGIYQPLTV DATTTSGLAR RARGEEREGY VDRFTASIDR DPEQLRARLS
     1261 AFQSATARGR VEGQDETSST WNPQAMDYVP DSAPQAR
//