LOCUS       WQZ30623.1              3165 aa    PRT              BCT 29-DEC-2023
DEFINITION  Helicobacter pylori vacuolating cytotoxin domain-containing
            protein.
ACCESSION   CP079244-582
PROTEIN_ID  WQZ30623.1
SOURCE      Helicobacter pylori
  ORGANISM  Helicobacter pylori
            Bacteria; Campylobacterota; Epsilonproteobacteria;
            Campylobacterales; Helicobacteraceae; Helicobacter.
REFERENCE   1  (bases 1 to 1570870)
  AUTHORS   Thorell,K., Munoz-Ramirez,Z.Y., Wang,D., Sandoval-Motta,S., Boscolo
            Agostini,R., Ghirotto,S., Torres,R.C., Falush,D., Camargo,M.C. and
            Rabkin,C.S.
  CONSRTM   HpGP Research Network
  TITLE     The Helicobacter pylori Genome Project: insights into H. pylori
            population structure from analysis of a worldwide collection of
            complete genomes
  JOURNAL   Nat Commun 14 (1), 8184 (2023)
   PUBMED   38081806
  REMARK    Publication Status: Online-Only
REFERENCE   2  (bases 1 to 1570870)
  AUTHORS   Camargo,M.C. and Rabkin,C.S.
  TITLE     Direct Submission
  JOURNAL   Submitted (15-JUL-2021) IIB, National Cancer Institute, 9609
            Medical Center Dr., Rm. 6E110, Bethesda, MD 20892, USA
COMMENT     The annotation was added by the NCBI Prokaryotic Genome Annotation
            Pipeline (PGAP). Information about PGAP can be found here:
            https://www.ncbi.nlm.nih.gov/genome/annotation_prok/
            
            ##Genome-Assembly-Data-START##
            Assembly Method        :: HGAP v. 4
            Assembly Name          :: HpGP-TWN-021
            Genome Representation  :: Full
            Expected Final Version :: Yes
            Genome Coverage        :: 2661x
            Sequencing Technology  :: PacBio Sequel II
            ##Genome-Assembly-Data-END##
            
            ##Genome-Annotation-Data-START##
            Annotation Provider               :: NCBI
            Annotation Date                   :: 07/16/2021 08:12:09
            Annotation Pipeline               :: NCBI Prokaryotic Genome
                                                 Annotation Pipeline (PGAP)
            Annotation Method                 :: Best-placed reference protein
                                                 set; GeneMarkS-2+
            Annotation Software revision      :: 5.2
            Features Annotated                :: Gene; CDS; rRNA; tRNA; ncRNA;
                                                 repeat_region
            Genes (total)                     :: 1,497
            CDSs (total)                      :: 1,452
            Genes (coding)                    :: 1,391
            CDSs (with protein)               :: 1,391
            Genes (RNA)                       :: 45
            rRNAs                             :: 2, 2, 2 (5S, 16S, 23S)
            complete rRNAs                    :: 2, 2, 2 (5S, 16S, 23S)
            tRNAs                             :: 36
            ncRNAs                            :: 3
            Pseudo Genes (total)              :: 61
            CDSs (without protein)            :: 61
            Pseudo Genes (ambiguous residues) :: 0 of 61
            Pseudo Genes (frameshifted)       :: 44 of 61
            Pseudo Genes (incomplete)         :: 11 of 61
            Pseudo Genes (internal stop)      :: 18 of 61
            Pseudo Genes (multiple problems)  :: 12 of 61
            ##Genome-Annotation-Data-END##
FEATURES             Qualifiers
     source          /organism="Helicobacter pylori"
                     /mol_type="genomic DNA"
                     /strain="HpGP-TWN-021"
                     /isolation_source="Biopsy"
                     /host="Homo sapiens"
                     /db_xref="taxon:210"
                     /geo_loc_name="Taiwan"
                     /lat_lon="23.30 N 121.00 E"
                     /collected_by="Maria Camargo and Charles Rabkin"
     protein         /locus_tag="E5P95_02980"
                     /inference="COORDINATES: similar to AA
                     sequence:RefSeq:WP_001919785.1"
                     /note="Derived by automated computational analysis using
                     gene prediction method: Protein Homology."
                     /transl_table=11
BEGIN
        1 MMDKNDKTDL KNKRLKNRSF KGVKKKIAKK YKIKNSSLTI YPLKTRSNFS ASFNKKIFLG
       61 LGFVSALSAE DYNSSVYWLN SVNENNSNKS YYVSPLRTWA GGNRSFTQNY NNSKLYIGTK
      121 NASATPNNSS VWFGEKGYIG FITGVFKARD IFITGAVGSG NEFKTGGGAI LVFESSNDLT
      181 TDGAHFKNDK AGTQTSWINL ISNNSVNLTN TDFGNQTPNG GFNVMGREIT YNGGIVNGGN
      241 FGFDNVDSNG TTTISGVTFN NNGALTYKGG NGIGGSITFT NSNINHYKLN LNANSVTFNN
      301 SALGSMPNGS ANTVGNAYIL NASNITFNNL TFNGGWFVFM RPDSKIDFQG TTTINNPTSP
      361 FVNMSAKVTI NPNAIFNIQN YTPTIGSTYT LFSMKNGSIT YNDANNLWNI IRLKNTQATK
      421 DNSKNATSNN NTHTYYVTYN LGGTLYNFRQ IFSPDSIVLQ SVYYGANNIY YTNSVNIYDN
      481 VFNLKNINDD RADAIFYLNG LNTWNYTNVR FSQTYGGKNS ALVFNATTPW ANGSIPKSNS
      541 TVRFGGYEGV NWGKTGYITG TFTADRVYIT GNMMSGNGAQ TGGGATLNFV GATEINIAGA
      601 TFKNLKTTSQ NSYMTFMALG DSSRSGKINV SQSDFYDWTG GGYDFTGNGA FDSVNFNKAY
      661 YKFQGAKNSY TFKNTNFLAG NFKFQGKTTI EKSVLDDASY TFDGVNNAFN EDKFNGGSFS
      721 FNAKQVDFSG NSFNGGVFDF NNTPKVSFTD DTFNVNNQFK INGAQTTFTF NKGVVFNMQG
      781 LLNSLSVGTT YQLLNAKSVD YKNNNALYQM LHWTSGENPS GKLVDENKTA PSSAKIYNVQ
      841 FIDNGLTYYI KENFNNGITL TRLCTLGYTH CVNINNDVFH LKNINNNASN TVFYLNGMTT
      901 WKNAGTGVFT QDYSGANSVL VFNQTTPFLN GANPTSNSVV SFGKTSGAEW GLVGYIKGVF
      961 KANQIDITGT IRSGNGAQTG GGATLVFNAQ KRLNIANASL NNDKAGLQDS WMNFIVNNGN
     1021 LNATNANFSN QTPHGGFNLK ANNITWDKGS VNGGGNFGVD NANSNGTTTI SGVTFNNNGT
     1081 LIYKGGENSA GNSLTLENNT FNSYNINAKV QNLIFNNNSF SGGSYSFNDT KNTTFKGTNT
     1141 LINSDPFSRL QGSIAIDNNS IFNIERDLTD KTTYTLLSGN NIKYNNQALA DNAFSKNLWN
     1201 LIHYGGERGT LLRTEKNTYF VQFTQSNGQK FVFEETFNSG SITYKYLTLN SSPFHTDADS
     1261 KDIWSQVRKQ FDFIPGKTPV CVGVCYIAPY KNQDLIGSSA FAWSLNFGAT VVGTLLLGNA
     1321 QEKANNNGGS IWFGKNNLLY LHGNFKATNI FLTNNFNVGN PNAGGGATIN FNADETLNAD
     1381 GLNYTNFQTV AMGLQTSASQ HSWANFNSKF SMDIKNSNFR DFTWGGFNFN SGRITFENTT
     1441 FSGWTNINGA TESGSSYVNM VANTDLIFTN SILGGGIRYD LKANNIIFNN SQMVIDVSKN
     1501 VNQSSLNGNV TFNNSRLSIK PNAAINIGDS QTQTTLENAS SLSFYNNSVA NFNGTTAFNG
     1561 VSYLNLNPNA QLSFNQANFN NANVTFYGIP LFGKTPDFGN SVRLINFKGN TNFNQATLNL
     1621 RAKNIHINFQ GASTFENNST MNLAESSQAS FNTLIVEGET DFNLNGSSLL NFNGDSVFNA
     1681 PVSFYANNSQ ISFTKLATFN ADASFDLGNN STLNFQSVLL NGTLNLLGNS ANALSVNASG
     1741 NFSFGSKGVL NLSNVNLFDA KNKPLVYNIL QAQNIQGLMG NNGYEKIRFY GIQIDKADYS
     1801 FNNGVYSWSF TNPLNTTETI TETLHNNRLK VQISQNGSSN NEMFNLAPSL YDYQKNPYDE
     1861 SANSYNYTSG KAGTYYLTSN IKGFSQNNEI LGTYNAQNQP LQALHIYNQA ITKQDLSIIA
     1921 NLGKEFLPKI ANLLSSGALD SLNLNSPNSF ETILGIFEKY GITLNQENWK SLLKIINGFS
     1981 NTANYHFSQG NLVVGAIKEG QTNTNSVVWF GGDGYKEPCA VGNNTCQMFR QTNLGQLLHS
     2041 TSPYLGYINA NFRAKNIYIT GTIGSGNAWG SGGSANVSFE SGTNLVLNQA NIDAQGTDKI
     2101 FSYLGQGGIE KLFGEKGLGN ILSNIIYEES LNDNAIPKDL ASMIPKDFGY KTLSSLLSPT
     2161 EVNNLLGVNA FKNAIMEILN SKTVGDVFGE NGLLNALDPI KRKEIDQMLL EQIQAHSSGF
     2221 EKFIVKTLGI ENVENFINNW YGKQSLSSFA NNFVPGGLNQ ALDKIGSSSD AKDLQSFLDK
     2281 TTFGDILNQM ISQAPLINKL ISWLGPQDLS VLVNIALNSI TNPSKELTST ISSIGEKVLN
     2341 DLLGEGVVNK IMSNQVLGQM INKIIADKGF GGVYNQGLGS ILPKSLQKEL EQFGLGSLLG
     2401 SRGLHNLWQK GNFNFLAKDY VFVNNSSFSN ATGGELNFVA GKSIIFNGKN TINFTQYQGR
     2461 LSFISQDFSN ISLDTLNATN GLTLNAPRND ISVQKGQICV NVLNCMGEKK ANPSNTSAPT
     2521 DETLEVNANN FAFLGTIKAN GLVDFSKVLQ NTTIGTLDLG SNATFKANNL IVNSAFNNNS
     2581 NYRVNISGNF NVVKGATLGT NENGLNVGGD FKSEGPLIFN LNNPTHQTII NVTGASTIMS
     2641 YNNQALINLN TQLKQGAYTL INAKRMVYGY DNQMILGGSL SDYLKLYTLI DFNGKRMQLN
     2701 GDSLSYDNQP VNIKDGGLVV SFKDNQGQMV YSSILYDKVQ VTVSDKPINI QAPSLEYYIK
     2761 YIQGSAGLNA IKSAGINSLM WLNALFVAKG GNPLFAPYYL QDNSTEHIVT LMKDITSALG
     2821 MLSNSHLKNN STDVLQLNTY TQQMGRLAKL SNFASFDSTD FSERLSSLKN QRFADAIPNA
     2881 MDVILKYSQR DKLKNNLWAT GVGGVSFVEN GTGTLYGINV GYDRFIKGVI VGGYAAYGYS
     2941 GFYERITSSK SDNVDVGLYA RAFIKKSELT FSVNETWGAN KTQISSNDAL LSMINQSYQY
     3001 STWTTNARVN YGYDFMFKNK SVIVKPQIGL RYYYIGMTGL DGVMNNALYN QFKANADPSK
     3061 KSVLMIDFAF ENRHYFNKNS YFYAIGGIGR DLLVRSMGDK LVRFIGDNIL SYRKGELYNT
     3121 FANITTGGEI RLFKSFYVNA GVGARFGLDY KMINITGNIG MRLAF
//