LOCUS       UJP41565.1 1460 aa PRT BCT 25-JAN-2022
DEFINITION  Cellulomonas palmilytica ATP-dependent RNA helicase HrpA protein.
ACCESSION   CP062221-839
PROTEIN_ID  UJP41565.1
SOURCE      Cellulomonas palmilytica
  ORGANISM  Cellulomonas palmilytica
            Bacteria; Actinobacteria; Micrococcales; Cellulomonadaceae;
            Cellulomonas.
REFERENCE   1 (bases 1 to 3834012)
  AUTHORS   Siriatcharanon,A.-k., Sutheeworapong,S., Pason,P., Waeonukul,R.,
            Kosugi,A., Ratanakhanokchai,K. and Tachaapaikoon,C.
  TITLE     Cellulomonas coenopalmateriei, sp. nov., a novel species of genus
            Cellulomonas for degradation of raw lignocellulose biomass; oil
            palm empty fruit bunch
  JOURNAL   Unpublished
REFERENCE   2 (bases 1 to 3834012)
  AUTHORS   Siriatcharanon,A.-k., Sutheeworapong,S., Pason,P., Waeonukul,R.,
            Kosugi,A., Ratanakhanokchai,K. and Tachaapaikoon,C.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-SEP-2020) Pilot Plant Development and Training
            Institute, King Mongkut's University of Technology Thonburi,
            Bangkuntien-Chaitalay, Bangkok 10150, Thailand
COMMENT     Bacteria and source DNA available from Enzyme Technology
            Laboratory, Pilot Plant Development and Training Institute, King
            Mongkut's University of Technology Thonburi.
            The annotation was added by the NCBI Prokaryotic Genome Annotation
            Pipeline (PGAP). Information about PGAP can be found here:
            https://www.ncbi.nlm.nih.gov/genome/annotation_prok/

            ##Genome-Assembly-Data-START##
            Assembly Date :: 19-MAY-2020
            Assembly Method :: unicycler v. v0.4.8
            Assembly Name :: EW123_hybrid
            Genome Representation :: Full
            Expected Final Version :: Yes
            Genome Coverage :: 10.0x
            Sequencing Technology :: Oxford Nanopore MinION; Illumina HiSeq
            ##Genome-Assembly-Data-END##

            ##Genome-Annotation-Data-START##
            Annotation Provider :: NCBI
            Annotation Date :: 09/30/2020 19:03:08
            Annotation Pipeline :: NCBI Prokaryotic Genome Annotation Pipeline (PGAP)
            Annotation Method :: Best-placed reference protein set; GeneMarkS-2+
            Annotation Software revision :: 4.13
            Features Annotated :: Gene; CDS; rRNA; tRNA; ncRNA; repeat_region
            Genes (total) :: 3,454
            CDSs (total) :: 3,400
            Genes (coding) :: 3,351
            CDSs (with protein) :: 3,351
            Genes (RNA) :: 54
            rRNAs :: 2, 2, 2 (5S, 16S, 23S)
            complete rRNAs :: 2, 2, 2 (5S, 16S, 23S)
            tRNAs :: 45
            ncRNAs :: 3
            Pseudo Genes (total) :: 49
            CDSs (without protein) :: 49
            Pseudo Genes (ambiguous residues) :: 0 of 49
            Pseudo Genes (frameshifted) :: 8 of 49
            Pseudo Genes (incomplete) :: 41 of 49
            Pseudo Genes (internal stop) :: 2 of 49
            Pseudo Genes (multiple problems) :: 2 of 49
            ##Genome-Annotation-Data-END##
FEATURES             Qualifiers
     source          /organism="Cellulomonas palmilytica"
                     /mol_type="genomic DNA"
                     /strain="EW123"
                     /isolation_source="Earthworm bio-fertilizer soil"
                     /type_material="type strain of Cellulomonas palmilytica"
                     /db_xref="taxon:2608402"
                     /country="Thailand: Bangkok"
                     /lat_lon="13.6771 N 100.4591 E"
                     /collection_date="Nov-2015"
                     /collected_by="Enzyme Technology Laboratory, KMUTT"
     protein         /gene="hrpA"
                     /locus_tag="F1D97_04255"
                     /EC_number="3.6.4.13"
                     /inference="COORDINATES: similar to AA sequence:RefSeq:WP_013772843.1"
                     /note="Derived by automated computational analysis using gene prediction method: Protein Homology."
                     /transl_table=11
BEGIN
        1 MFRGDLGGWD PVTTDQRDER RPPRRGRDRA RGTGEARGTG EARRARAAAA REAAVLPPLT
       61 YPEELPVSAR RADIAAAIRD HQVVVVAGET GSGKTTQLPK IALELGRGRA GQIGHTQPRR
      121 IAARSVAERI AQEIGTPLGE LVGYQVRFTD TSSEHTLVKV MTDGILLAAI QRDPLLRQYD
      181 TLIIDEAHER SLNIDFLLGY LTRLLPRRPD LKLVITSATI DSQRFARHFA GPPTPEHPDG
      241 VPAPVVEVTG RTYPVEIRYR PLSPDALGVD DDELDDERPA RRTPPPARQP VRERDLMTGI
      301 TEAVDELCAE GPGDVLVFLS GEREIRDATD ALRASLGARV SDPRHPQAVE LLPLYSRLSA
      361 AEQHRVFSSH PGRRIVLATN VAETSLTVPG IRYVVDPGTA RISRWSKATK VQRLPIEPIS
      421 QASANQRSGR CGRVADGIAI RLYSQEDFER RDRFTEPEIL RTSLASVILQ MIAVGVAASP
      481 QDVVDFPFVD PPDVRAVRDG VQLLTELGAI ESGPDGTRLT DTGRALAQLP VDPRLARMVV
      541 EASRRGVLRE VVVVAAALSI QDPRERPAEE REKADALHAR FADPSSDLLS YLNLWTYLRE
      601 QRRELSGNAF RRACRAEHLN YLRVREWQDV VTQLKELAKG LPGTKGRGPD ENRAHADDGG
      661 GWPALAQGPD EDAEAGGGPR RGELRRDWDS DAIHQSVLSG LLSHVGMQEA TEVAAPARRG
      721 DKRPARPDRR GRNEYLGARG ARFAVFPGSG LARKPPAWVM AAELVETSRL WARDAARIDP
      781 AWAEELGAHL VKRTYSDPAW STKQGAATCT ERVLLYGLPV VAGRRVLYAK VDPEAARELF
      841 LRHALVQGEW TTHHQFFHEN RRLLAEASEL AARARRADLL ADEDALFDFY DERVPADVVS
      901 ARHFDQWWKS ARRRDPDLLT YSRELLVSSD ADAIDESSFP SRWPQGDLSL PLTYQFEPGT
      961 EADGVTVHIP LPQLPRLTPD GFGWMVPGLL DELVVATIRA LPKPVRVQLV PAPDVGRAVA
     1021 AWLRSNTASW ADTVRAGDAA PSFHDAFRSA VRAVRDMDVP ADAFDDDRLP AHLRMTFRVV
     1081 GDRGGVVDEG KDLAALQHRL ADRAQDAVDT AVKAALRQAM REAGLAADEP ARPGPRARQD
     1141 RRGARSPGGS SDARSSGQAA TEPAESVTRG GLTAWPDDLT LPEVVEVRSG ATVVRAYPTL
     1201 VDARTSVSVR TLADASARPA AARAGLRRLL LLDVGLAPAR VTSRWTGAQA LALAANPYPS
     1261 REALVEDVQL AAVDALVDEH LRTRPADAGR TPADVRTPED YAALRAYLRD RLEDRVHALV
     1321 GVLVQVLTAW RELEVELKGT SSLALLATAS DVREQSARLV HAGFVVETGA ERLVHLVRYL
     1381 RAARHRLAKA AENPARDADL AWQVHDLENA LAAAPGADPA QVAAVRWQLE ELRVSLFAQQ
     1441 LGTPTPVSAQ RIRKTLATLT
//