LOCUS       QHXC01000359            4480 bp    DNA     linear   ENV 12-JUN-2018
DEFINITION  Acidobacteria bacterium isolate gp5 AA64
            14_0929_02_30cm_scaffold_22904_curated, whole genome shotgun
            sequence.
ACCESSION   QHXC01000359 QHXC01000000
VERSION     QHXC01000359.1
DBLINK      BioProject: PRJNA449266
            BioSample: SAMN08912149
KEYWORDS    WGS.
SOURCE      Acidobacteria bacterium (soil metagenome)
  ORGANISM  Acidobacteria bacterium
            Bacteria; Acidobacteria.
REFERENCE   1  (bases 1 to 4480)
  AUTHORS   Crits-Christoph,A., Diamond,S., Butterfield,C.N., Thomas,B.C. and
            Banfield,J.F.
  TITLE     Novel soil bacteria possess diverse genes for secondary metabolite
            biosynthesis
  JOURNAL   Nature (2018) In press
   PUBMED   29899444
  REMARK    Publication Status: Available-Online prior to print
REFERENCE   2  (bases 1 to 4480)
  AUTHORS   Diamond,S. and Banfield,J.
  TITLE     Direct Submission
  JOURNAL   Submitted (29-MAY-2018) Earth and Planetary Science, University of
            California, Berkeley, University of California, Berkeley, CA 94720,
            USA
COMMENT     Annotation was added by the NCBI Prokaryotic Genome Annotation
            Pipeline (released 2013). Information about the Pipeline can be
            found here: https://www.ncbi.nlm.nih.gov/genome/annotation_prok/
            ##Genome-Assembly-Data-START##
            Assembly Method :: IDBA_UD v. 1.1.1
            Genome Representation :: Full
            Expected Final Version :: Yes
            Genome Coverage :: 6x
            Sequencing Technology :: Illumina HiSeq
            ##Genome-Assembly-Data-END##
            ##Genome-Annotation-Data-START##
            Annotation Provider :: NCBI
            Annotation Date :: 05/31/2018 15:17:02
            Annotation Pipeline :: NCBI Prokaryotic Genome Annotation Pipeline
            Annotation Method :: Best-placed reference protein set; GeneMarkS+
            Annotation Software revision :: 4.5
            Features Annotated :: Gene; CDS; rRNA; tRNA; ncRNA; repeat_region
            Genes (total) :: 5,618
            CDS (total) :: 5,576
            Genes (coding) :: 5,390
            CDS (coding) :: 5,390
            Genes (RNA) :: 42
            rRNAs :: 1, 1 (5S, 23S)
            complete rRNAs :: 1 (5S)
            partial rRNAs :: 1 (23S)
            tRNAs :: 37
            ncRNAs :: 3
            Pseudo Genes (total) :: 186
            Pseudo Genes (ambiguous residues) :: 20 of 186
            Pseudo Genes (frameshifted) :: 112 of 186
            Pseudo Genes (incomplete) :: 51 of 186
            Pseudo Genes (internal stop) :: 8 of 186
            Pseudo Genes (multiple problems) :: 5 of 186
            ##Genome-Annotation-Data-END##
FEATURES             Location/Qualifiers
     source          1..4480
                     /organism="Acidobacteria bacterium"
                     /mol_type="genomic DNA"
                     /isolate="gp5 AA64"
                     /isolation_source="meadow soil"
                     /db_xref="taxon:1978231"
                     /environmental_sample
                     /geo_loc_name="USA: Angelo Coast Range Reserve, CA"
                     /lat_lon="39.74 N 123.63 W"
                     /collection_date="2014-09-29"
                     /note="metagenomic; derived from metagenome: soil
                     metagenome"
     gene            <1..212
                     /locus_tag="DMG12_15650"
     CDS             <1..212
                     /locus_tag="DMG12_15650"
                     /inference="COORDINATES: ab initio prediction:GeneMarkS+"
                     /note="Derived by automated computational analysis using
                     gene prediction method: GeneMarkS+."
                     /codon_start=3
                     /transl_table=11
                     /product="IS91 family transposase"
                     /protein_id="PYS01299.1"
                     /translation="YSWSELLKRLFEFDILVCERCGGPVRVIAAIQEPETAQQILNYL
                     GLPSRPPPITPARYERRFHPDDFAS"
     gene            240..935
                     /locus_tag="DMG12_15655"
     CDS             240..935
                     /locus_tag="DMG12_15655"
                     /inference="COORDINATES: protein motif:HMM:PF07969.9"
                     /note="Derived by automated computational analysis using
                     gene prediction method: Protein Homology."
                     /codon_start=1
                     /transl_table=11
                     /product="hypothetical protein"
                     /protein_id="PYS01300.1"
                     /translation="MSLVPTELLVKPAGCPPKTLEFSCRPPRKSSRPLESLQKTRYAT
                     SRNPRNGAAILAVPRLNLLCSPGGFRVRDDAVILQRPVLRPAYRCLPQRNDDAQRSRS
                     AGKDAEQRGSPNHQAGAAAGQPQLKVLADAGAPIALGTDTGTNLGQWQGYFEHVELEM
                     MVKAGLTPMQALVAATGGAARVMKLDQQLGTVHPGKWADLLVLNANPLGDIRNTRQID
                     SVWIAGRRLPNLP"
     gene            complement(1563..2583)
                     /locus_tag="DMG12_15660"
                     /pseudo
     CDS             complement(1563..2583)
                     /locus_tag="DMG12_15660"
                     /inference="COORDINATES: similar to AA
                     sequence:RefSeq:WP_012536382.1"
                     /note="frameshifted; Derived by automated computational
                     analysis using gene prediction method: Protein Homology."
                     /pseudo
                     /codon_start=1
                     /transl_table=11
                     /product="hypothetical protein"
     gene            2789..3073
                     /locus_tag="DMG12_15665"
     CDS             2789..3073
                     /locus_tag="DMG12_15665"
                     /inference="COORDINATES: ab initio prediction:GeneMarkS+"
                     /note="Derived by automated computational analysis using
                     gene prediction method: GeneMarkS+."
                     /codon_start=1
                     /transl_table=11
                     /product="hypothetical protein"
                     /protein_id="PYS01301.1"
                     /translation="MYCSSIAIADSRDSLYGECMARLISTTIRLDEEDVRALKRARAA
                     GHSASDLVRKGLRVVASRYYTGRRPPSTRLFESVDTKLGDESELFRNLED"
     gene            3070..3501
                     /locus_tag="DMG12_15670"
     CDS             3070..3501
                     /locus_tag="DMG12_15670"
                     /inference="COORDINATES: ab initio prediction:GeneMarkS+"
                     /note="Derived by automated computational analysis using
                     gene prediction method: GeneMarkS+."
                     /codon_start=1
                     /transl_table=11
                     /product="VapC toxin family PIN domain ribonuclease"
                     /protein_id="PYS01302.1"
                     /translation="MSGPIIADTGGLLRALARTAEGKSSFPEYESALTAARLVLVPGL
                     VLAEVDYFLRERRSAMRKLIAEIFDPATRYQYELPLPSDLVRALELDAKFKDLSLGLV
                     DGTVAALAERRQVYRVLTTDRRDFGALRVGPRLTRALELLP"
     gene            complement(3858..>4480)
                     /locus_tag="DMG12_15675"
     CDS             complement(3858..>4480)
                     /locus_tag="DMG12_15675"
                     /inference="COORDINATES: ab initio prediction:GeneMarkS+"
                     /note="Derived by automated computational analysis using
                     gene prediction method: GeneMarkS+."
                     /codon_start=3
                     /transl_table=11
                     /product="hypothetical protein"
                     /protein_id="PYS01303.1"
                     /translation="FRIGQQVDWSLTPESDAQWRVYSTVVKDPKTGRNTAAEQEGVKV
                     RTTPRTLLSLLWAASLTPPEDACFLGQVEYMSQERAAQRVANEVARERLNAFGGGLGH
                     AETLLVKRDPFKHETEVRLVYVEHRDGCGSDHIFAVPFDPNSTFEEVMLDPRLHPDDV
                     KEREAEFSSLGFKHPVKKSNLYQRVLYEIVLDERGWSNALTLPVAA"
BASE COUNT      898 a   1290 c   1305 g    987 t
ORIGIN
        1 attattcatg gtcagagctt ctgaaacggc tgttcgaatt tgacattttg gtatgtgagc
       61 gatgtggcgg acccgtccgt gtgatcgccg cgatccagga accggaaacg gcgcaacaaa
      121 ttctaaatta cctcggcctt ccatcaaggc cgcctccgat cacgccggcc cgttacgaac
      181 gccgtttcca tcccgacgac ttcgcttcct aaagccccgc caggtcgtgg tgtgcctcat
      241 tgagcctcgt tccaacggaa ttgctcgtga aacccgccgg ctgccccccg aaaacgcttg
      301 aattctcgtg ccgtcccccc agaaaatcgt cccgtcctct agaaagttta cagaagaccc
      361 gttatgcgac gtccaggaac cctcggaacg gagccgccat acttgcggta cctcgcttga
      421 atcttctatg ctcacctggc ggttttcgtg tacgagacga cgccgtcatt cttcaaagac
      481 ccgttcttcg tccggcatat cgatgcctac cgcagagaaa tgacgatgct cagcgatccc
      541 gctctgcagg aaaagacgcg gaacagcgag gaagcccgaa ccatcaagca ggcgctgcag
      601 cagggcagcc gcaactgaag gtcctggcgg atgcgggcgc tccgatcgct ctgggaaccg
      661 acacgggcac gaatctggga caatggcagg gatactttga gcatgtggag ctcgagatga
      721 tggtcaaggc tggcttgacg ccgatgcagg cactcgtggc cgcaacaggc ggggccgcgc
      781 gtgtcatgaa gctcgatcaa caattgggta cggttcaccc gggaaaatgg gctgaccttc
      841 tcgtgttgaa tgccaatccg ctcggggaca tccgcaacac gcggcagatt gactcggtgt
      901 ggattgccgg acggcggctg ccgaatctgc cctagagctg agaggggaag ccgctgaaca
      961 aaaaacttcc gggtcgcgga aaaggtgttg atggctgcgg gaactttccg cggatgtttt
     1021 aaaaggcata cacgcctgcg gaaactttct taggacggcg aaaatgcttc gcaggtctgc
     1081 agggactttt ttcgaacggc gcaaaaggca tgcatgtctg cggaaacatc tttcggacgg
     1141 caaaatcttg tagcatcttt cccggcgcta aagatagcgt gggctatgtt cgccgatcgg
     1201 ccatcgccga tcgccactac ccaacaaaga tccggtagtg gtctagtttg gaacaaggtt
     1261 gcgaaaacgt ttctggcaag ctcgtagccg cgggctttat gcccgcgctt aaattccagc
     1321 aggataattc ttggattgtg gcttggacgc gggcataaag cccgcggcta cgagcttacc
     1381 tttggatttc tccaaactag accactacca aagatccgca atatccttag cgatagtccc
     1441 cgaattattg ggcaaagctc ctcgtgaggt gagcgcgggg gccatctgca attggaccgt
     1501 gtgagcttgc tgcgtcgcct tgattcctgg ctcaaagcga atttcaaact ccgcccgtca
     1561 cctcatttga gcgtcttggt gatctcgtaa gtgcgggacg cgataggttg cgcggcggcg
     1621 agggaactcg ggtgcgtggc gtgtctgcgt ggtagttgct ccgcaggaca gttccggcca
     1681 cgattaccgc caacggcggt gctttcatgg cgaatcttcg agatgtgggt ggggcgtaga
     1741 agatcgtggc cgagattctc aatgcagaga atcgcctgtt tgctctggca cgtcacgacc
     1801 agcgcctgcc ttccgcagtg acggcgattg ccgtcgtgct ttgcaccggg aaaaccagcg
     1861 cccttcgaga agcggcacag tccgtcggcg tcgtagtttc tttgaaccag tcctgattcc
     1921 acatgggatt ggacacgacg cggtcgaacg tctggaggcg gttgcccacg cggaacttcg
     1981 gattcttgaa cgtgtcgccg atctcgatcg tcccttccaa gtcgtggatg atcatgttca
     2041 tgttcgccat cgcccacgtg gtggcgacgt actcctgtcc gtgcatcttg agcggcgcgt
     2101 acttgatctt gctgcgcagc ttcatctggt catccagcac gtgctcgcca agcgatgagc
     2161 aggcgaggcc gatccgcagc acggatcgta gacgctcatg cccggctcag gatccatgat
     2221 cctggccatg attcgagcga cttccggagg cgtaaagtac tcgccggcgg attggcagct
     2281 cccttcggcg aacttgcgag gtactcgtag ctgcgaccga ttatatcggc ttcgacgtct
     2341 ttgaggccga gccgcttctc gctgaggatc ctgaggcttg agcggcaggt agaaggggac
     2401 gagcttgtga tccttctcgg cgagcttgaa tgccttcgcg cgcgagccga cctcttgagc
     2461 gatgcggttc agttcatcgt cgaaaacgtc acagagccgc ttagtgaaga tgagcgggag
     2521 gatgaagtcc ttgaacttcg gtgcctcctg cgccccgcga atgctgcatg cggcgtccca
     2581 aatccaggac tcgagtgatt tgcgctaccg ttgttcataa atcctgccgg gctcggatct
     2641 tacatcgagg ctatggattc ccgcacccga aaatatacgc tctccgccga agcgcacata
     2701 gacacatcag ccgtcactgg catgcagatc gtttccactt cggcgccttg aagttcgagc
     2761 atcacgacca acatagcgag gttatccgat gtattgttct tccatagcga ttgcggattc
     2821 gcgcgattca ctgtatggtg agtgtatggc acgactgatt tccactacga ttcgcttgga
     2881 cgaagaggat gtacgcgcgt tgaagcgggc gcgcgcggca ggccattccg cctctgacct
     2941 cgttcgcaag ggtttgcggg tagttgcttc acgctactat acgggtcggc gtccaccgtc
     3001 gacgcgtctc tttgagtccg tggacacgaa gctcggagat gaatctgagt tgttccggaa
     3061 ccttgaagat tgagtggacc cataatcgcg gacacaggcg ggcttctccg ggcgcttgct
     3121 cgcactgcgg agggcaaatc cagtttcccg gagtacgaat cagctctgac ggcggcgcgc
     3181 ctcgtcctcg tgccaggatt ggtgctcgcc gaagtcgact acttcctgcg agagcgacgg
     3241 agcgcgatgc gcaagctcat tgccgagatc ttcgatccgg caacccgtta ccagtacgaa
     3301 ctgcccctcc cgtcggatct cgttcgcgca cttgaactgg acgccaaatt caaggatctc
     3361 agtcttggat tggtcgatgg cacggtggca gctctcgccg agcggcgtca ggtctaccgg
     3421 gtgctgacaa cagaccgccg tgacttcggc gcgctgcgcg tcggtccgcg tctgacacgg
     3481 gcgcttgaat tgctgccata gagttgcccg atccagacaa ggcttcaatg tcgcaatcgg
     3541 ggcgttatgt tggctgggga gcccgccggc ggaaccggct gggtggcgtg gccttggtcg
     3601 atattcacgc atggcaatct taccgaagtt gctgttatgg gctactccgt ggaggaacgt
     3661 ctgcacttat gtcgttcatc cttgaggacc agccaccatt ccttccacac gccaatgctg
     3721 aatggttctc gttgatcgaa ggcctccggg ccatttctgt cgatgcccgt ccgagagcgg
     3781 agcttccgag cggcgcgaga gacgcctcgt gacgacgtgc cgcacctata tgcggccatc
     3841 gtgggtgcta ggcagtgcta ggcggcgacc ggcaaggtca gcgcgttgga ccaacctcgc
     3901 tcgtcgagga cgatctcgta cagaaccctc tgatacaggt tcgacttctt cacaggatgt
     3961 ttgaaaccga gcgagctgaa ctcggcctct cgttccttga catcgtcagg atgcaaccgg
     4021 ggatcgagca tcacctcctc gaacgtggag ttcggatcga acggaacagc aaagatgtgg
     4081 tcactgccgc acccgtcacg atgttcaacg tatacgagcc tcacttcagt ttcatgcttg
     4141 aagggatcgc gtttgacgag caacgtttcc gcatggccca acccaccacc gaacgcattg
     4201 aggcgctcac gggcaacctc gttcgcgacc ctctgggctg cccgttcttg cgacatgtac
     4261 tcgacctgac ccagaaagca agcgtcttct ggcggagtta ggctcgccgc ccagagaagg
     4321 cttagaagtg tgcgcggcgt ggtccgaacc ttaaccccct cctgttccgc tgctgtgttt
     4381 cgaccagttt ttggatcctt cacgactgtg gaatagactc gccactgtgc gtctgactca
     4441 ggcgtcaaag accaatcgac ttgctgcccg attcgaaaac
//