LOCUS WQZ30520.1 1614 aa PRT BCT 29-DEC-2023
DEFINITION Helicobacter pylori DEAD/DEAH box helicase protein.
ACCESSION CP079244-640
PROTEIN_ID WQZ30520.1
SOURCE Helicobacter pylori
ORGANISM Helicobacter pylori
Bacteria; Campylobacterota; Epsilonproteobacteria;
Campylobacterales; Helicobacteraceae; Helicobacter.
REFERENCE 1 (bases 1 to 1570870)
AUTHORS Thorell,K., Munoz-Ramirez,Z.Y., Wang,D., Sandoval-Motta,S., Boscolo
Agostini,R., Ghirotto,S., Torres,R.C., Falush,D., Camargo,M.C. and
Rabkin,C.S.
CONSRTM HpGP Research Network
TITLE The Helicobacter pylori Genome Project: insights into H. pylori
population structure from analysis of a worldwide collection of
complete genomes
JOURNAL Nat Commun 14 (1), 8184 (2023)
PUBMED 38081806
REMARK Publication Status: Online-Only
REFERENCE 2 (bases 1 to 1570870)
AUTHORS Camargo,M.C. and Rabkin,C.S.
TITLE Direct Submission
JOURNAL Submitted (15-JUL-2021) IIB, National Cancer Institute, 9609
Medical Center Dr., Rm. 6E110, Bethesda, MD 20892, USA
COMMENT The annotation was added by the NCBI Prokaryotic Genome Annotation
Pipeline (PGAP). Information about PGAP can be found here:
https://www.ncbi.nlm.nih.gov/genome/annotation_prok/
##Genome-Assembly-Data-START##
Assembly Method :: HGAP v. 4
Assembly Name :: HpGP-TWN-021
Genome Representation :: Full
Expected Final Version :: Yes
Genome Coverage :: 2661x
Sequencing Technology :: PacBio Sequel II
##Genome-Assembly-Data-END##
##Genome-Annotation-Data-START##
Annotation Provider :: NCBI
Annotation Date :: 07/16/2021 08:12:09
Annotation Pipeline :: NCBI Prokaryotic Genome
Annotation Pipeline (PGAP)
Annotation Method :: Best-placed reference protein
set; GeneMarkS-2+
Annotation Software revision :: 5.2
Features Annotated :: Gene; CDS; rRNA; tRNA; ncRNA;
repeat_region
Genes (total) :: 1,497
CDSs (total) :: 1,452
Genes (coding) :: 1,391
CDSs (with protein) :: 1,391
Genes (RNA) :: 45
rRNAs :: 2, 2, 2 (5S, 16S, 23S)
complete rRNAs :: 2, 2, 2 (5S, 16S, 23S)
tRNAs :: 36
ncRNAs :: 3
Pseudo Genes (total) :: 61
CDSs (without protein) :: 61
Pseudo Genes (ambiguous residues) :: 0 of 61
Pseudo Genes (frameshifted) :: 44 of 61
Pseudo Genes (incomplete) :: 11 of 61
Pseudo Genes (internal stop) :: 18 of 61
Pseudo Genes (multiple problems) :: 12 of 61
##Genome-Annotation-Data-END##
FEATURES Qualifiers
source /organism="Helicobacter pylori"
/mol_type="genomic DNA"
/strain="HpGP-TWN-021"
/isolation_source="Biopsy"
/host="Homo sapiens"
/db_xref="taxon:210"
/geo_loc_name="Taiwan"
/lat_lon="23.30 N 121.00 E"
/collected_by="Maria Camargo and Charles Rabkin"
protein /locus_tag="E5P95_03270"
/inference="COORDINATES: similar to AA
sequence:RefSeq:WP_001290156.1"
/note="Derived by automated computational analysis using
gene prediction method: Protein Homology."
/transl_table=11
BEGIN
1 MSEISTYKLI KEKLQAIPNQ RLKGSWFEKV SKRFLKEHDS ADEYESIDLW SDWKLRGNEG
61 DRGIDMVITT ASKEYIAVQC KFHQDSVSLN DLSTFLLKLQ SGVGEVGFKK GIIISTSNLS
121 SNALEEIEQI RKSKGIDIVE ISEEDFIYSQ IDWEKFDPTQ TQGELPLCDK KKPRPHQIEA
181 IKATKEYFSN PKNTRGKLIM ACGTGKTYTS LKIMEALEPK ITLFLAPSIA LLSQTFREYA
241 QEKSDPFYAS IVCSDDKVGK GKKNKNDDDA DDINFSELPN KPSTRPEDIL SVCEKVQKEN
301 KRFIIFSTYQ SALRIKEAQE VGLGEIDLII CDEAHRTVGA MYSSNERDDK NAFTLCHSDG
361 NIKAKKRLYM TATPKVYSES SKARAKESDN AIYSMDDEGI FGEEIYTLNF TRAIALDLLT
421 DYKVMILAVR KENLSGVTNS VNEKISRLEA KGTKLDKKLI NNEFVCKIIG THKGLAKQDL
481 IALDDENKKD HNLQNKNDTT PSQRAISFCK SINTSKRIKD SFETIMECYN EELKKKSFKN
541 LTISIDHIDG TMNCKVRLDK LEELNAFKPN TCKVLSNARC LSEGVDVPAL DSIVFFDGKS
601 AMVDIIQAVG RVMRKAKHKK RGYIILPIAL EESEIQNLDE AVNNTNFKNI WKVIKALRSH
661 DPSLVDEATF REKIKIFGSD DNNNDETNQD DEEPTKDKTD KTDKTEQDPK QAQKTLFDAI
721 LLQDLANAVY NVMPTKLGDR NYWENFTKKT GNIARTLNNR LKDIFEKNPE FFHGFLDSLK
781 GNIHSNIKED EALDMITSHI ITKPIFDAIF GDNIKNPISK ALDKMVEKLS TLGLQGETKD
841 LKNLYESVKT EAMHAKSQKS QQELIKNLYN TFFKVAFRKQ SEKLGIVYTP IEVVDFILRA
901 TNGILKKHFN TDFNDKNITI FDPFTGTGSF IARLLSKEND LISDEALKEK FLNNLFAFDI
961 VLLAYYIALI NITQAAQSRD SSLKNFKNIA LTDSLDYLEE KTNKGALPLY EDLEENKEIK
1021 STIEKQNIRV IIGNPPYSAG AKSENDNNQN LSHPKLEKRV YEKYGQNSTA KVGATTRDTL
1081 IQSIYMASEL LKDRGVLGFV VNGGFIDSKS GDGFRKCVAK DFAHLYVLNL RGNARTSGET
1141 RKKEGDGIFD SGSRATIAII FFVKDTSVKN SMIHYYDIGD YLKREEKLNR LAHFTDLDKI
1201 PFETIIPNNK GDWINQREDG FEKLIPLKRD KNSKSVFDIN SGGVASGRDS WVYNFSKDAL
1261 MLSVQKCIDA YNADLKRFNT HFREAFKQRT KGVKSGQLYK QLNDKEITTD KTKIAWTGGL
1321 KNHLIKNKNL QESHKDRIRL ALYRPFNKQW LYWDKDWINR QREFSKIFPD KDAYNVVINT
1381 GVGNGKNFSA LVSDCISSCD LIMHNQAYPL YYYDDLGNRH YAISGYALNL FRRHYEDSSI
1441 AEEEIFYYIY AILHHKGYLE KYKNSLTKEE PRIALSKDFK ELSVLGKELG ELHLNYESEE
1501 MHISVEYKTL MNAEEKGYYD VETMKKIGDR IHYNNHIAIT KIPKKAFDYA LNGKSAIDWV
1561 IERYKKTRDK ESLIENNPND YKGGKYVFEL LCRIIKLSEK SVDLIEKISI KRFE
//