LOCUS UJP41675.1 1130 aa PRT BCT 25-JAN-2022 DEFINITION Cellulomonas palmilytica UvrD-helicase domain-containing protein. ACCESSION CP062221-1741 PROTEIN_ID UJP41675.1 SOURCE Cellulomonas palmilytica ORGANISM Cellulomonas palmilytica Bacteria; Actinobacteria; Micrococcales; Cellulomonadaceae; Cellulomonas. REFERENCE 1 (bases 1 to 3834012) AUTHORS Siriatcharanon,A.-k., Sutheeworapong,S., Pason,P., Waeonukul,R., Kosugi,A., Ratanakhanokchai,K. and Tachaapaikoon,C. TITLE Cellulomonas coenopalmateriei, sp. nov., a novel species of genus Cellulomonas for degradation of raw lignocellulose biomass; oil palm empty fruit bunch JOURNAL Unpublished REFERENCE 2 (bases 1 to 3834012) AUTHORS Siriatcharanon,A.-k., Sutheeworapong,S., Pason,P., Waeonukul,R., Kosugi,A., Ratanakhanokchai,K. and Tachaapaikoon,C. TITLE Direct Submission JOURNAL Submitted (16-SEP-2020) Pilot Plant Development and Training Institute, King Mongkut's University of Technology Thonburi, Bangkuntien-Chaitalay, Bangkok 10150, Thailand COMMENT Bacteria and source DNA available from Enzyme Technology Laboratory, Pilot Plant Development and Training Institute, King Mongkut's University of Technology Thonburi. The annotation was added by the NCBI Prokaryotic Genome Annotation Pipeline (PGAP). Information about PGAP can be found here: https://www.ncbi.nlm.nih.gov/genome/annotation_prok/ ##Genome-Assembly-Data-START## Assembly Date :: 19-MAY-2020 Assembly Method :: unicycler v. 
v0.4.8 Assembly Name :: EW123_hybrid Genome Representation :: Full Expected Final Version :: Yes Genome Coverage :: 10.0x Sequencing Technology :: Oxford Nanopore MinION; Illumina HiSeq ##Genome-Assembly-Data-END## ##Genome-Annotation-Data-START## Annotation Provider :: NCBI Annotation Date :: 09/30/2020 19:03:08 Annotation Pipeline :: NCBI Prokaryotic Genome Annotation Pipeline (PGAP) Annotation Method :: Best-placed reference protein set; GeneMarkS-2+ Annotation Software revision :: 4.13 Features Annotated :: Gene; CDS; rRNA; tRNA; ncRNA; repeat_region Genes (total) :: 3,454 CDSs (total) :: 3,400 Genes (coding) :: 3,351 CDSs (with protein) :: 3,351 Genes (RNA) :: 54 rRNAs :: 2, 2, 2 (5S, 16S, 23S) complete rRNAs :: 2, 2, 2 (5S, 16S, 23S) tRNAs :: 45 ncRNAs :: 3 Pseudo Genes (total) :: 49 CDSs (without protein) :: 49 Pseudo Genes (ambiguous residues) :: 0 of 49 Pseudo Genes (frameshifted) :: 8 of 49 Pseudo Genes (incomplete) :: 41 of 49 Pseudo Genes (internal stop) :: 2 of 49 Pseudo Genes (multiple problems) :: 2 of 49 ##Genome-Annotation-Data-END## FEATURES Qualifiers source /organism="Cellulomonas palmilytica" /mol_type="genomic DNA" /strain="EW123" /isolation_source="Earthworm bio-fertilizer soil" /type_material="type strain of Cellulomonas palmilytica" /db_xref="taxon:2608402" /country="Thailand: Bangkok" /lat_lon="13.6771 N 100.4591 E" /collection_date="Nov-2015" /collected_by="Enzyme Technology Laboratory, KMUTT" protein /locus_tag="F1D97_08835" /inference="COORDINATES: similar to AA sequence:RefSeq:WP_013884283.1" /note="Derived by automated computational analysis using gene prediction method: Protein Homology." 
/transl_table=11 BEGIN 1 MFDVPVFDVR GPLPGVETVV LEASAGTGKT WTIASLAVRY VAELDVRLPE LMLVTFGRAA 61 TTELRDRVRS RLVEVERALR APATARSSSD EVVALLADVP PGEAERRRAR VARAVAEFDA 121 ATITTTHGFC QQMLAMLGTT ADMDHDAVLV PDVADLEADV VDDLYLRTYA AQAEPLVTVP 181 EARELMHAAI SDHAAALEPT DAEPGTRAFH RFKLADVARA EMRRRKRASG VVDYDDLLTL 241 LRDALTDPVT GPAARERVRG AYRVVMVDEF QDTDPVQWQI LEAAFAGERT LVLIGDPKQA 301 IYAFRGADVA TYLVAAQHAR RATLAHNHRA DPDVLAGLRP VLEGAALGDP RIVVRPVQAR 361 RTGRRLRGGP PVVLRQVTRE ALRTSPTKAP PVGAVRDFLY ADVAAQVVET LRTHRVVEDD 421 GERPVRPGDV AVLVRRNRDA QAVQRTLAEA GVPAVISGLA SVFGTRSAQH WLTLLSALAR 481 PDDARLAAGA ALTPFLGWTA ADLALADDEA RDRLAERVRT WSATLASHGV AALLEAAAAA 541 GIRERVMRRA DGERELTDVR HVGEALHAVA VQEGLGASAL LEWLRARIDE AARDYAEERS 601 RRLQTDAAAV QVITVHASKG LQFPVTLVPF AWDTWVPSDP PVLRFHRDDA RTLHVGGPGS 661 PGYAEGLRAH QDDEAGEQLR LTYVALTRAH SQVVVWWAPS RNSERSPLSR LLLATRDADG 721 TPAAVVPVPS DDDAAQAFAG LAATTGGALV HELVTQARPR ARWSPPPVPA PRLELARFDR 781 ALDTAWRRTS YSGLTAAAYD AHHHAPSSEP ETTGVQDEPE GAPLASSDTA QADDGSEEAA 841 LRGLGSPMAD LPAGTAFGTL VHHVLENVDT RAPDLTAEVA LRCAEADPGR ALPVGPDELA 901 ARLLPALRTP LGTLADGLTL ADVAPADRLP ELDFELPLAG GDRVRAATDA PAAASLRDVA 961 ALWRAHVADD DALAGYADRL ADPVLAASAL RGYLTGSIDA VLRVPSDDAP GGHRYLVVDY 1021 KTNRLGPWEE PLSAWHYRPA SMLAAMVEHH YPLQLVLYSV ALHRLLRWRV PGYDPDAHLG 1081 GGLYLFVRGM LGPETPVLDG PAGPAPTGVL AWRAPTPLVL ALSDLLAGSR //