LOCUS       DVAA01000082            6417 bp    DNA     linear   ENV 30-MAR-2021
DEFINITION  TPA_asm: Candidatus Naiadarchaeales archaeon SRR2090153.bin461
            isolate MAG_bin461 SRR2090153.bin461_genomic_k141_1654249, whole
            genome shotgun sequence.
ACCESSION   DVAA01000082 DVAA01000000
VERSION     DVAA01000082.1
DBLINK      BioProject: PRJNA609027
            BioSample: SAMN14218520
            Sequence Read Archive: SRR2090153
KEYWORDS    WGS; Third Party Data; TPA; TPA:assembly.
SOURCE      Candidatus Naiadarchaeales archaeon SRR2090153.bin461 (groundwater
            metagenome)
  ORGANISM  Candidatus Naiadarchaeales archaeon SRR2090153.bin461
            Archaea; Candidatus Undinarchaeota; Candidatus Undinarchaeia;
            Candidatus Naiadarchaeales.
REFERENCE   1  (bases 1 to 6417)
  AUTHORS   Dombrowski,N., Williams,T.A., Sun,J., Woodcroft,B.J., Lee,J.H.,
            Minh,B.Q., Rinke,C. and Spang,A.
  TITLE     Undinarchaeota illuminate DPANN phylogeny and the impact of gene
            transfer on archaeal evolution
  JOURNAL   Nat Commun 11 (1), 3939 (2020)
   PUBMED   32770105
  REMARK    Publication Status: Online-Only
REFERENCE   2  (bases 1 to 6417)
  AUTHORS   Rinke,C.
  TITLE     Direct Submission
  JOURNAL   Submitted (27-FEB-2020) Australian Centre for Ecogenomics, School
            of Chemistry and Molecular Biosciences, The University of
            Queensland, Cooper Road, Brisbane, QLD 4072, Australia
COMMENT     Organism name changed from archaeon to Candidatus Naiadarchaeales
            archaeon SRR2090153.bin461 (MAR-2021).
            The annotation was added by the NCBI Prokaryotic Genome Annotation
            Pipeline (PGAP). Information about PGAP can be found here:
            https://www.ncbi.nlm.nih.gov/genome/annotation_prok/
            ##Genome-Assembly-Data-START##
            Assembly Method        :: megahit v. 1.1.3
            Expected Final Version :: Yes
            Genome Coverage        :: Not Applicable
            Sequencing Technology  :: Illumina
            ##Genome-Assembly-Data-END##
            ##Genome-Annotation-Data-START##
            Annotation Provider               :: NCBI
            Annotation Date                   :: 07/22/2020 17:21:44
            Annotation Pipeline               :: NCBI Prokaryotic Genome
                                                 Annotation Pipeline (PGAP)
            Annotation Method                 :: Best-placed reference protein
                                                 set; GeneMarkS-2+
            Annotation Software revision      :: 4.12
            Features Annotated                :: Gene; CDS; rRNA; tRNA; ncRNA;
                                                 repeat_region
            Genes (total)                     :: 894
            CDSs (total)                      :: 865
            Genes (coding)                    :: 862
            CDSs (with protein)               :: 862
            Genes (RNA)                       :: 29
            rRNAs                             :: 1 (16S)
            complete rRNAs                    :: 1 (16S)
            tRNAs                             :: 28
            ncRNAs                            :: 0
            Pseudo Genes (total)              :: 3
            CDSs (without protein)            :: 3
            Pseudo Genes (ambiguous residues) :: 0 of 3
            Pseudo Genes (frameshifted)       :: 0 of 3
            Pseudo Genes (incomplete)         :: 3 of 3
            Pseudo Genes (internal stop)      :: 0 of 3
            ##Genome-Annotation-Data-END##
FEATURES             Location/Qualifiers
     source          1..6417
                     /organism="Candidatus Naiadarchaeales archaeon
                     SRR2090153.bin461"
                     /mol_type="genomic DNA"
                     /submitter_seqid="SRR2090153.bin461_genomic_k141_1654249"
                     /isolate="MAG_bin461"
                     /isolation_source="Rifle well FP-101 under high O2
                     conditions; 0.1 micron filter"
                     /db_xref="taxon:2756138"
                     /environmental_sample
                     /geo_loc_name="USA: Rifle, CO"
                     /lat_lon="39.5369 N 107.7828 W"
                     /collection_date="2013-07-07"
                     /metagenome_source="groundwater metagenome"
                     /note="metagenomic"
     gene            complement(27..1703)
                     /locus_tag="H1009_02240"
     CDS             complement(27..1703)
                     /locus_tag="H1009_02240"
                     /EC_number="6.5.1.1"
                     /inference="COORDINATES: protein motif:HMM:TIGR00574.1"
                     /note="Derived by automated computational analysis using
                     gene prediction method: Protein Homology."
                     /codon_start=1
                     /transl_table=11
                     /product="ATP-dependent DNA ligase"
                     /protein_id="HIJ97887.1"
                     /translation="MDFKKLAKCYEALVSTAKKLEKRDIVAKFIKETPTDLLAVVVTL
                     LEGSVFPAWMETELGVAENIMFKALAKVTGLTEDEVKNEAKKAGDIGTAAETILKKKK
                     QKTLFAKPLTVESVYKNLTKIPTLTGGGSTGQKVSLIAELVSNATPEEGKYVVRLILE
                     EMRIGVGEGTIREALAVAYEIEPDLIDKAYSVRNDYGEVAKLIRSGGKKALEKVELEV
                     GRPLKPMLAQKVDTAAEGLEEMKGNAAFQYKYDGMRVQIHKDRNKISVFTRRLDNITK
                     QFPELIEAAKKLIRADTAIIEGEAVGIDPKSRKPQPFQKLSQRIKRKYGIEEMQKQIP
                     VEMNIFDLLYVNGINLLDTPYEKRWKKLTEIVKETPDFHLAENLVTADSKKADEFYKK
                     ALNLGNEGLMIKNLGAKYMPGSRVKYMYKLKQERETLDLAIIGAIWGEGRRAKWLGSF
                     VLGVRDSDSGEFLEVGKVATGLTDEDLANLTNLIKPLITKEHIKDVEVRPKIVVEIGF
                     EEIQKSPHYRSGYALRFPRVKRIRDDKGVEDADDLERLSRLYEGQRKAKK"
     gene            1765..1935
                     /locus_tag="H1009_02245"
     CDS             1765..1935
                     /locus_tag="H1009_02245"
                     /inference="COORDINATES: ab initio prediction:GeneMarkS-2+"
                     /note="Derived by automated computational analysis using
                     gene prediction method: GeneMarkS-2+."
                     /codon_start=1
                     /transl_table=11
                     /product="hypothetical protein"
                     /protein_id="HIJ97888.1"
                     /translation="MGENMNEYSYMGQGNAPSYAKKQVELRTCRDCGYATFSKYVACP
                     KCKSEKWMTEYK"
     gene            complement(1962..2327)
                     /locus_tag="H1009_02250"
     CDS             complement(1962..2327)
                     /locus_tag="H1009_02250"
                     /inference="COORDINATES: similar to AA
                     sequence:RefSeq:WP_012960648.1"
                     /note="Derived by automated computational analysis using
                     gene prediction method: Protein Homology."
                     /codon_start=1
                     /transl_table=11
                     /product="four helix bundle protein"
                     /protein_id="HIJ97889.1"
                     /translation="MQSFRNFKIMVEIDNLVLEIYAITKNYPREEIYGVVQQMRRAAT
                     SIGANIAEGAGRKTDADFQRFLFNSMGSLKELEYFVELSKKLGYLKEEEYSKLCKKAE
                     IVGRMLNNFIKSLSSANGQ"
     gene            complement(2489..2779)
                     /locus_tag="H1009_02255"
     CDS             complement(2489..2779)
                     /locus_tag="H1009_02255"
                     /inference="COORDINATES: ab initio prediction:GeneMarkS-2+"
                     /note="Derived by automated computational analysis using
                     gene prediction method: GeneMarkS-2+."
                     /codon_start=1
                     /transl_table=11
                     /product="hypothetical protein"
                     /protein_id="HIJ97890.1"
                     /translation="MVTIGSTEPLWVATFILVGIVLLKYSGYKDKLNRSLGFAVASVL
                     FMYLGSVTAMGFWNRPELLGAQEALTIIWQVISWILLLVSAFLAASDLAKVK"
     gene            complement(2821..3459)
                     /locus_tag="H1009_02260"
     CDS             complement(2821..3459)
                     /locus_tag="H1009_02260"
                     /inference="COORDINATES: ab initio prediction:GeneMarkS-2+"
                     /note="Derived by automated computational analysis using
                     gene prediction method: GeneMarkS-2+."
                     /codon_start=1
                     /transl_table=11
                     /product="hypothetical protein"
                     /protein_id="HIJ97891.1"
                     /translation="MAPTLEFVLDAHEYARKNEVNFTPNEFCLCFLYRNPERRIAADT
                     ISWEMQSALYEKIADRKADPEDKMWEGPKAFVHDFDHRLFKPAYKALPSYVRGKVSQP
                     DYEDNFTSIFFILELPESNHRFHLLFKPKTVNLGFGNFKDGGVRVELSLTDSTKALKI
                     VHYVSGRGWVDERMEALGTSMAETKTFKPAAKQKTKLKAKKLTPKKPKKRKK"
     gene            3815..4174
                     /locus_tag="H1009_02265"
     CDS             3815..4174
                     /locus_tag="H1009_02265"
                     /inference="COORDINATES: protein motif:HMM:TIGR02436.1"
                     /note="Derived by automated computational analysis using
                     gene prediction method: Protein Homology."
                     /codon_start=1
                     /transl_table=11
                     /product="four helix bundle protein"
                     /protein_id="HIJ97892.1"
                     /translation="MRDYTKYKIYQEAYALTKEIYKITQKWPPHELYSLTAQIRRSAH
                     SVNTNICEGLSRDSDADCRRFIFNAYASLKETENHLQMAYDVGYIAKDDYDIYFKKLD
                     LLCKMIYRFIAKLTADG"
     gene            complement(4179..4760)
                     /locus_tag="H1009_02270"
     CDS             complement(4179..4760)
                     /locus_tag="H1009_02270"
                     /inference="COORDINATES: protein motif:HMM:NF003142.0"
                     /note="Derived by automated computational analysis using
                     gene prediction method: Protein Homology."
                     /codon_start=1
                     /transl_table=11
                     /product="30S ribosomal protein S3ae"
                     /protein_id="HIJ97893.1"
                     /translation="MVKAAVDKWKQKKWFEIIAPAIFNDKKIGETMAADYRMILGRDC
                     EIPLSDVSGDPKQQRIKLLFKIRDVKGERALTDYLGHKITQDYERSLARRRVSKLYSN
                     QAVETKDGKKVAVKVIVVTFGKVNESIKGAIRKKLVEVINHTAKDEALVDFIHGVLQG
                     RLTAKLKKELHKVHPIRHAVIQKAEIIREMPAE"
     gene            complement(4845..5264)
                     /locus_tag="H1009_02275"
     CDS             complement(4845..5264)
                     /locus_tag="H1009_02275"
                     /inference="COORDINATES: protein motif:HMM:NF012533.1"
                     /note="Derived by automated computational analysis using
                     gene prediction method: Protein Homology."
                     /codon_start=1
                     /transl_table=11
                     /product="30S ribosomal protein S15"
                     /protein_id="HIJ97894.1"
                     /translation="MISARKPKAKKEKKSIELTLSQKEAEDAIIELGRSGIAAEKIGQ
                     ILKDEHGVQSVQELTGKKISKILSENGASPKLPADMDSLMKSALAIKKHLIRHKSDTA
                     ARYGLLLTESKIRKLSRYYRRSRVLPPDWKYESEVQV"
     gene            complement(5257..5817)
                     /gene="rdgB"
                     /locus_tag="H1009_02280"
     CDS             complement(5257..5817)
                     /gene="rdgB"
                     /locus_tag="H1009_02280"
                     /inference="COORDINATES: protein motif:HMM:TIGR00042.1"
                     /note="Derived by automated computational analysis using
                     gene prediction method: Protein Homology."
                     /codon_start=1
                     /transl_table=11
                     /product="RdgB/HAM1 family non-canonical purine NTP
                     pyrophosphatase"
                     /protein_id="HIJ97895.1"
                     /translation="MSKTFYFATSNKGKFAEAKEKFRAAGLKLKRKPVDLLEIQDDDL
                     VKISKISAVHLSKTFKKPFFVEDAGFFINALNGFPGPYTKYAHYHIKPKGILKLMQGK
                     KNRSAYFVSAIVFRDGSREKVFKGICRGNVTHSAKGTKGFGFDPIFVPRGERKTFASD
                     LSLKQRVSHRSLALKQLIKYLSKNYD"
     gene            complement(5814..6233)
                     /locus_tag="H1009_02285"
     CDS             complement(5814..6233)
                     /locus_tag="H1009_02285"
                     /inference="COORDINATES: protein motif:HMM:NF015444.1"
                     /note="Derived by automated computational analysis using
                     gene prediction method: Protein Homology."
                     /codon_start=1
                     /transl_table=11
                     /product="DNA-binding protein"
                     /protein_id="HIJ97896.1"
                     /translation="MKLSKQTDNDFFGRYEENEDLIEALLKFAEENGIKTGYFSIIGA
                     VKEVSISFYDQKTKKYLQMDLDEEAEILNCTGNIAQKDGKSIIHAHITLGDRDGRAFG
                     GHLVSGKVFGAEIYLKKFEKVISRKPDKSTGLNLLEI"
BASE COUNT     1663 a   1454 c   1224 g   2076 t
ORIGIN
        1 ccaggtacta tctaccatat actaatttac ttttttgcct tcctttgccc ctcgtatagg
       61 cgtgaaagac gttccaaatc gtctgcatcc tctacgcctt tgtcatcacg gatgcgtttt
      121 acgcgcggaa accgcaatgc gtagccactc cggtaatgag gtgatttttg gatctcctcg
      181 aaaccaattt ccacaacgat tttcggccga acttcaacat cttttatgtg ttcttttgta
      241 attaaaggtt ttatcaaatt agtcaagttc gccaaatcct cgtctgtgag gcctgtcgcc
      301 acttttccca cttccagaaa ttcgcctgaa tcagaatctc ttacgcccaa cacaaaactt
      361 ccgagccatt ttgcacgcct tccctcgccc caaatcgcgc cgataattgc caaatccaga
      421 gtttcccgct cctgttttaa tttatacata tatttcacac ggctgccggg catgtatttt
      481 gcccccaaat ttttaatcat caggccctcg ttccccaaat tcagggcttt tttgtaaaat
      541 tcatccgctt tttttgaatc cgctgtgacc aaattttccg caagatgaaa atcaggtgtt
      601 tctttaacta tttcagtgag ctttttccag cgtttttcat aaggagtgtc aagcaaattt
      661 attccattca catacaacag gtcaaaaata ttcatttcaa ccggaatttg cttttgcatt
      721 tcctcaatcc catattttct cttgattctt tgcgacaatt tttggaacgg ctgcggtttt
      781 cttgattttg ggtcaattcc aacagcttcg ccttctataa ttgcggtgtc ggccctgatt
      841 aattttttag cagcctcaat taattcaggg aattgttttg taatgttgtc cagcctgcgc
      901 gtaaacacag aaattttatt tctgtctttg tgaatctgca cgcgcattcc gtcgtatttg
      961 tactggaaag ctgcgtttcc cttcatttcc tcaagccctt cggcagctgt atcaactttt
     1021 tgcgcaagca tcggcttcag cggccttccg acctccagct caactttctc aagcgctttt
     1081 tttccgccgc tgcgaataag ttttgcaacc tctccgtaat catttcggac gctgtaagcc
     1141 ttgtcaatta aatcaggttc aatttcatag gcaactgcaa gcgcctcccg tatagttcct
     1201 tctccaacgc cgattcgcat ttcctctaaa atcaggcgca caacatattt cccctcttcg
     1261 ggcgttgcgt ttgaaacaag ttctgcaatc aaagaaactt tctgaccggt tgagccgccg
     1321 cctgtcaagg tcgggatttt tgtaaggttt ttgtaaacgc tttcaactgt gagcggcttt
     1381 gcaaaaagag ttttctgctt tttctttttc aaaattgttt cagctgcagt tccaatatcg
     1441 cctgctttct ttgcctcgtt tttgacttcg tcttccgtta agcctgtaac ttttgcaagc
     1501 gccttgaaca ttatattttc agcaacaccg agttcggttt ccatccacgc cggaaaaaca
     1561 ctgccctcca gcaaagtgac aacaactgcc agaaggtctg tgggcgtttc cttgatgaat
     1621 ttggctacaa tatcgcgctt ttcgagtttt ttagcggttg atacaagggc ttcgtagcac
     1681 ttagccagct ttttaaagtc catattttcc cttcttttat taaatccagc aacttatata
     1741 tatccgccta atttacacaa attagtggga gagaatatga atgagtattc atacatggga
     1801 cagggcaatg ccccgtctta tgcaaaaaaa caggttgaat taaggacttg cagggactgc
     1861 ggatatgcaa ctttttcaaa gtatgtcgca tgcccgaaat gcaagagcga aaagtggatg
     1921 actgaataca aataaacgct gtaatagcag ttagcttttg gctattggcc gttagcgcta
     1981 cttaaacttt ttatgaaatt attgagcata cgaccgacga tttctgcctt tttgcataat
     2041 ttgctgtatt cctcttcttt caaatagccg agtttttttg aaagctcaac aaaatattcc
     2101 aactctttta gtgagcccat cgaattaaat aaaaaccttt gaaaatccgc atcagttttt
     2161 cggccagctc cctctgcaat attagcgcct attgatgttg ctgcgcgcct catttgttgc
     2221 actacgccat aaatttcttc tcgcggatag tttttagtga ttgcataaat ttctaaaact
     2281 aaattatcaa tttccaccat tattttaaaa ttcctaaagc tttgcataag agaaaaactg
     2341 aattcagcct ttaaatattt tatttaaccc tataaccctc actatggtta ataaaaacgc
     2401 taatagctaa cagccaacag ctaatagccg ttcattaaaa gcttagctat ttcagacaaa
     2461 aaagaaaaat tcaaaatgtc tatagtttct attttacctt agcaagatcg cttgctgcaa
     2521 ggaatgctga caccagcagc agaatccagg aaattacctg ccagattatt gtcagggctt
     2581 cctgcgcccc gagcagctct ggcctgttcc agaatcccat tgcagttacc gagccaaggt
     2641 acatgaaaag cactgatgcg actgcgaatc cgaggctcct gttcagcttg tccttgtagc
     2701 cactgtattt cagcaataca atgcctacta aaatgaaggt tgcgacccac aaaggctctg
     2761 ttgatccgat ggttaccatt gtgatttcaa aattagataa tgcgccttgc tttatataag
     2821 ttattttttg cgcttttttg gttttttagg agttaatttt ttagctttta gctttgtctt
     2881 ttgtttggca gcgggtttga aagttttggt ttcagccata gaagtgccga gcgcttccat
     2941 tctctcatca acccagcctc ttccggaaac ataatggaca attttcaatg cttttgtgct
     3001 gtctgtcaag cttagctcaa cccgcacccc gccatccttg aaattgccaa agcccaaatt
     3061 gacagttttt ggcttgaaaa gcaaatggaa cctgtggttg ctttcaggca gctcaagaat
     3121 aaagaaaatc gaagtgaaat tatcctcgta atccggctgc gaaactttgc cgcgcacgta
     3181 agaaggcaat gccttgtatg caggcttgaa caatctgtgg tcaaaatcat ggacaaaggc
     3241 ctttggcccc tcccacatct tgtcttcagg gtccgccttc ctgtctgcaa ttttttcgta
     3301 gagtgcggac tgcatttccc agcttattgt atctgctgca attctgcgct ccggattcct
     3361 atacaagaaa cataggcaaa actcgttcgg tgtaaaatta acctcgtttt ttcgggcata
     3421 ttcgtgcgcg tcaagaacaa attccagcgt tggagccatt aacagattgt attacaaagg
     3481 gcggatttaa ataggttatt aaattggctt gaggcttgtg gtttgaggct tgaggtttaa
     3541 aaccacatgc cccacgcctc atgccctaat ttactataca gaaaaattgc tccgaagcta
     3601 ccacctatca tcatatcggg cggcgatcag tgggtgttca cttcgtcatt gctaaaggat
     3661 agtgattggg aaggtataat taactatcga aagatgccaa agtatgacta atggcggtcg
     3721 gctttcggcg atcagcaatc agccatgttt actactcact actacccagt tgggttaaat
     3781 taaatttata tatagcaaaa acagcgctct ctgtatgaga gattatacaa agtacaaaat
     3841 ctatcaagaa gcttatgctc taaccaaaga aatttataaa atcacgcaaa aatggccacc
     3901 acatgaactt tatagcctta ccgcacagat taggagaagt gcccatagcg tcaatacaaa
     3961 tatatgtgaa ggattgtcaa gagatagcga cgcagattgc aggcgattta tctttaatgc
     4021 atatgctagt ttgaaagaaa ctgaaaatca tttacaaatg gcttatgacg tagggtatat
     4081 cgcaaaagat gactacgata tttattttaa aaagttagat ttgctttgca aaatgattta
     4141 tagatttatt gctaaactga ccgctgatgg ctgatcgctt attcagctgg catttccctt
     4201 ataatttcag ccttctgtat gacagcgtgc cttatcggat gcaccttgtg cagctccttt
     4261 ttgagctttg cagtgagcct gccctgcaga acgccgtgga taaaatcaac aagagcctca
     4321 tctttcgcag tgtgatttat tacttcaaca agtttctttc tgatcgcgcc ttttattgat
     4381 tcgtttacct tcccgaatgt gacaacaata accttgaccg cgactttctt tccgtctttt
     4441 gtttcaactg cctggtttga gtagagtttt gatacgcgcc ttcttgcaag gctgcgctcg
     4501 tagtcctgtg taattttgtg gccaagataa tccgtcaatg cgcgctcgcc ttttacatcg
     4561 cggattttga aaagcagttt tatgcgctgc tgtttcgggt cgccgcttac atccgaaagc
     4621 ggaatttcgc agtcgcggcc gagaatcatc ctgtaatcag cggccattgt ctcgccgatt
     4681 tttttgtcgt taaagattgc cggagctatg atttcaaacc attttttctg cttccatttg
     4741 tcaactgcgg ctttaaccat ttatgaaatt ggttcggttc gtagtttgtg gtttgtagtt
     4801 cgtagtgcaa aaccacgaac caccaactac taactacaaa ctaactatac ttgcacctcg
     4861 ctttcgtatt tccagtcagg cggcaaaacc cttgagcgcc tgtaatagcg ggagagtttt
     4921 cgtatttttg attctgtcaa aagcaatccg tagcgcgctg cggtatctga cttgtgccta
     4981 atcaggtgct tttttattgc aagcgcactt ttcattaaag agtccatgtc agccggcaat
     5041 tttggagatg cgccgttttc agaaagtatt tttgagattt ttttccctgt caattcctgc
     5101 acgctctgaa ccccgtgttc gtcctttaaa atctgcccta ttttttcagc tgcgattccg
     5161 cttcgaccaa gttcaataat tgcgtcctct gcctcttttt gagaaagcgt cagctcaata
     5221 gatttctttt ctttctttgc ttttggtttt ctagcgctaa tcatagttct tactaaggta
     5281 ttttatgagt tgttttaaag ccagcgaccg atgcgaaacc ctttgcttca aactgagatc
     5341 ggaagcgaaa gttttccgct cgcctctagg gacaaatatc ggatcaaacc cgaaaccttt
     5401 tgtgcccttt gcgctgtgcg tgacattacc tcggcatatg cctttaaata ctttttcgcg
     5461 gctgccatcg cggaaaacaa tggctgaaac aaaataagcc gagcggttct tttttccctg
     5521 cattaatttc agaatgcctt ttggctttat gtgataatgg gcatattttg tgtaggggcc
     5581 tgggaatccg tttaaggcat ttatgaaaaa gcccgcatcc tcaacaaaaa acggcttttt
     5641 aaatgtttta gaaagatgca cagcgcttat ttttgaaatt ttaactaaat catcatcctg
     5701 aatttccagc aaatcaactg gctttcgctt taattttaag ccggctgcgc ggaatttctc
     5761 cttggcttct gcgaattttc ctttgtttga ggttgcgaag tagaaagttt tgctcatatc
     5821 tctaaaagat tcaaccccgt acttttatca ggctttctgc ttattacttt ctcaaacttt
     5881 ttcaaataaa tctctgcgcc gaaaactttt ccgcttacta aatgcccgcc gaatgctctt
     5941 ccatcgcggt cgcctaaagt tatgtgcgcg tggattattg acttgccgtc cttttgtgca
     6001 atgttccctg tgcagttcag aatttccgct tcctcatcca agtccatttg cagatatttt
     6061 tttgtttttt ggtcgtaaaa tgaaatgcta acttccttga ctgcgccgat tattgaaaaa
     6121 tagccggttt taattccgtt ttcttctgcg aatttcaaaa gcgcctctat caaatcttcg
     6181 ttttcttcat atcggccgaa aaaatcattg tcggtttgtt ttgagagttt catcattcac
     6241 atttgtctgg aaaggatagc agaaatatac gagaggaaag gggataggga atttgcacaa
     6301 gttcttcggg gtgctataaa agagactgaa acgaaaatga acgatacaga gaaagaacag
     6361 aaaaaaccca aaagagatta tctttcccgt ggtaaatttt aatgataaaa ctctcaa
//