LOCUS       SCJN01000450            2489 bp    DNA     linear   BCT 21-JAN-2019
DEFINITION  Escherichia coli strain 29_CAASB NODE_450_length_2489_cov_4.978856,
            whole genome shotgun sequence.
ACCESSION   SCJN01000450 SCJN01000000
VERSION     SCJN01000450.1
DBLINK      BioProject: PRJNA514354
            BioSample: SAMN10722944
KEYWORDS    WGS.
SOURCE      Escherichia coli
  ORGANISM  Escherichia coli
            Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacterales;
            Enterobacteriaceae; Escherichia.
REFERENCE   1  (bases 1 to 2489)
  AUTHORS   Potter,R., Zou,Z., Henderson,J. and Dantas,G.
  TITLE     Genomic analysis of febrile catheter-associated UTI E. coli isolates
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 2489)
  AUTHORS   Potter,R., Zou,Z., Henderson,J. and Dantas,G.
  TITLE     Direct Submission
  JOURNAL   Submitted (15-JAN-2019) Center for Genome Sciences & Stems Biology,
            Washington University, 5121 Robert Potter, 4523 Clayton Avenue, St.
            Louis, MO 63110, USA
COMMENT     Annotation was added by the NCBI Prokaryotic Genome Annotation
            Pipeline (released 2013). Information about the Pipeline can be
            found here: https://www.ncbi.nlm.nih.gov/genome/annotation_prok/
            ##Genome-Assembly-Data-START##
            Assembly Method        :: SPAdes v. 3.13.0
            Genome Representation  :: Full
            Expected Final Version :: Yes
            Genome Coverage        :: 130.3504x
            Sequencing Technology  :: Illumina NextSeq
            ##Genome-Assembly-Data-END##
            ##Genome-Annotation-Data-START##
            Annotation Provider               :: NCBI
            Annotation Date                   :: 01/16/2019 17:34:50
            Annotation Pipeline               :: NCBI Prokaryotic Genome
                                                 Annotation Pipeline
            Annotation Method                 :: Best-placed reference protein
                                                 set; GeneMarkS-2+
            Annotation Software revision      :: 4.7
            Features Annotated                :: Gene; CDS; rRNA; tRNA; ncRNA;
                                                 repeat_region
            Genes (total)                     :: 7,415
            CDSs (total)                      :: 7,332
            Genes (coding)                    :: 6,745
            CDSs (with protein)               :: 6,745
            Genes (RNA)                       :: 83
            rRNAs                             :: 1, 3, 4 (5S, 16S, 23S)
            complete rRNAs                    :: 1 (5S)
            partial rRNAs                     :: 3, 4 (16S, 23S)
            tRNAs                             :: 69
            ncRNAs                            :: 6
            Pseudo Genes (total)              :: 587
            CDSs (without protein)            :: 587
            Pseudo Genes (ambiguous residues) :: 331 of 587
            Pseudo Genes (frameshifted)       :: 178 of 587
            Pseudo Genes (incomplete)         :: 127 of 587
            Pseudo Genes (internal stop)      :: 65 of 587
            Pseudo Genes (multiple problems)  :: 109 of 587
            CRISPR Arrays                     :: 2
            ##Genome-Annotation-Data-END##
FEATURES             Location/Qualifiers
     source          1..2489
                     /organism="Escherichia coli"
                     /mol_type="genomic DNA"
                     /strain="29_CAASB"
                     /isolation_source="urine"
                     /host="Homo sapiens"
                     /db_xref="taxon:562"
                     /geo_loc_name="USA: St. Louis, MO"
                     /lat_lon="38.6357 N 90.2648 W"
                     /collection_date="2010"
                     /collected_by="Jonas Marshall"
     gene            complement(123..857)
                     /locus_tag="EPS76_26830"
     CDS             complement(123..857)
                     /locus_tag="EPS76_26830"
                     /inference="COORDINATES: similar to AA
                     sequence:RefSeq:WP_001217110.1"
                     /note="Derived by automated computational analysis using
                     gene prediction method: Protein Homology."
                     /codon_start=1
                     /transl_table=11
                     /product="colibactin biosynthesis phosphopantetheinyl
                     transferase ClbA"
                     /protein_id="RXD02788.1"
                     /translation="MRIDILIGHTSFFHQTSRDNFLHYLNEEEIKRYDQFHFVSDKEL
                     YILSRILLKTALKRYQPDVSLQSWQFSTCKYGKPFIVFPQLAKKIFFNLSHTIDTVAV
                     AISSHCELGVDIEQIRDLDNSYLNISQHFFTPQEATNIVSLPRYEGQLLFWKMWTLKE
                     AYIKYRGKGLSLGLDCIEFHLTNKKLTSKYRGSPVYFSQWKICNSFLALASPLITPKI
                     TIELFPMQSQLYHHDYQLIHSSNGQN"
     gene            complement(858..1070)
                     /locus_tag="EPS76_26835"
     CDS             complement(858..1070)
                     /locus_tag="EPS76_26835"
                     /inference="COORDINATES: similar to AA
                     sequence:RefSeq:WP_000357141.1"
                     /note="Derived by automated computational analysis using
                     gene prediction method: Protein Homology."
                     /codon_start=1
                     /transl_table=11
                     /product="colibactin biosynthesis LuxR family
                     transcriptional regulator ClbR"
                     /protein_id="RXD02789.1"
                     /translation="MDKFKEKNPLSLRERQVLRMLAQGDEYSQISHNLNISINTVKFH
                     VKNIKHKIQARNTNHAIHIANRNEII"
     gene            complement(1277..1465)
                     /locus_tag="EPS76_26840"
     CDS             complement(1277..1465)
                     /locus_tag="EPS76_26840"
                     /inference="COORDINATES: similar to AA
                     sequence:RefSeq:WP_001304252.1"
                     /note="Derived by automated computational analysis using
                     gene prediction method: Protein Homology."
                     /codon_start=1
                     /transl_table=11
                     /product="hypothetical protein"
                     /protein_id="RXD02790.1"
                     /translation="MIILNGTPIRKYSFLLRIHIIILHFITYRNTPKKSLYSDYMFFF
                     NELDSVCKQKHVSHAVHT"
     gene            1504..>2489
                     /locus_tag="EPS76_26845"
     CDS             1504..>2489
                     /locus_tag="EPS76_26845"
                     /inference="COORDINATES: similar to AA
                     sequence:RefSeq:WP_001518711.1"
                     /note="Derived by automated computational analysis using
                     gene prediction method: Protein Homology."
                     /codon_start=1
                     /transl_table=11
                     /product="hypothetical protein"
                     /protein_id="RXD02791.1"
                     /translation="MDNTSGDFPCNKMDTRKQLPLTPSQQGFLFHSLKDKKRSNYHEH
                     FTCIFSQHVDSAHFKWALETLFRKHECFRTDYNWEIDERPCQVVKTDVLPDIYVLDCE
                     QEEIRFLLANDDIIIPVPQDDGIDAIIPQLLQADLKYPFSLKTIPVRAYLIQSTKESA
                     FILSYHHIVMDGWSLSLFIKQLLQLCGAAVVSGVRDDSAIIPSSLKPLVDTLSARRHT
                     FQHDYWAAYLREGTPTCIVPLSQYHTDTEAENNSYVNQTNHVEINLSPDVCQKIQTLC
                     SDYRITPAVIFYVAWGILLQRWCYADDVLFGATISGRNIPIDGIEETLGLFI"
BASE COUNT      729 a    439 c    516 g    805 t
ORIGIN
        1 tgattatgcc ggatgatctc taaatgtgaa tggcacgatt atgcgggata cttacaccac
       61 cgacggaata tgaaaatcaa tattatcgac ggctcagaag tgtctagatt atccgtggcg
      121 attcaattct gcccatttga cgaatgaatt agctgatagt cgtggtgata aagttgggac
      181 tgcataggaa atagctcaat agttatttta ggggtgatga gtggagaggc taatgcgaga
      241 aatgagttac atattttcca ttgagagaaa taaacaggtg aacctctata ttttgaagtt
      301 agttttttat ttgttaaatg aaattcaata caatccagtc ctaaagatag gcctttacct
      361 cgatatttga tgtaagcttc tttgagcgtc cacattttcc aaaaaagtaa ttgaccttca
      421 taacgaggaa gtgaaactat gttagtagct tcctgtggag taaaaaaatg ctgactgata
      481 ttcagataag agttgtctaa atctcttatt tgttcaatat cgacaccaag ctcgcagtga
      541 gaactaatag caacggctac tgtatctata gtatgggaaa ggttaaaaaa aatctttttt
      601 gccaactgag gaaaaactat aaatggtttg ccatatttgc acgtactaaa ttgccatgat
      661 tgtaatgaga catcaggttg atatcttttt agtgctgttt tgagcaggat acggcttaaa
      721 atatagagtt ctttatcact cacaaaatga aactgatcat agcgttttat ttcttcctca
      781 ttgagatagt gaaggaagtt atctctactg gtttgatgaa aaaaactagt atgtccaatt
      841 aatatatcaa tcctcattta gataatctca ttcctgttag caatgtgtat agcgtgattc
      901 gtattccgag cttgtatttt atgtttgatg tttttcacat gaaactttac tgtgtttatt
      961 gatatgttaa gattatgtga tatttgagag tactcatcac cttgtgccag catgcgcaat
     1021 acttgtcttt cacgcagaga taacgggttt ttttctttga acttatccat gtttcccccc
     1081 atcctgaatg gtatctgtgt atctgtgtat ctgtgtatct gtgtatctgt gtatctgtgt
     1141 atctgtgtat ctgtgtatct gtgtatctgt tgttttggca gtatttaaga ggaatttacg
     1201 atacaacggt ttttatgtaa atgggaatta cgcattattt tctatgtggt ggctgtatca
     1261 attcataccc gctacatcat gtatgtactg catgacttac atgtttttgc ttacagacag
     1321 aatctaattc attgaagaaa aacatgtaat cagaatataa ggattttttg ggggtattcc
     1381 tgtaagtgat aaaatgcaat atgattatat gaatacgcaa taaaaaacta tacttgcgga
     1441 taggtgtgcc atttagaata atcatgttaa ataatctata aatccgataa taaggtgatg
     1501 gttatggata atacctctgg agattttcca tgtaataaga tggacacgcg taagcagtta
     1561 ccgctaacac caagtcaaca ggggttttta ttccattcct taaaggataa gaaaaggagt
     1621 aactaccatg agcattttac atgcattttt tctcagcatg tagatagcgc ccacttcaag
     1681 tgggcgctgg aaacgttatt tcgaaagcat gagtgttttc gcactgatta taactgggag
     1741 attgatgagc gcccttgtca ggtggtgaag accgatgtgt tgccggatat atatgtgtta
     1801 gactgtgagc aagaggaaat acgttttcta ctagctaatg atgacattat cattcctgtc
     1861 ccgcaggatg acggtattga tgctataatt cctcaactgc tacaggctga tttaaaatac
     1921 ccattttcct tgaaaacgat cccagtccgg gcctacctta ttcagtcaac gaaagaaagt
     1981 gcttttatac tatcatacca tcatattgtg atggatggct ggagcttatc ccttttcatt
     2041 aaacagttgc tccaactctg tggagcggct gtggtcagtg gtgtgaggga tgatagcgcc
     2101 attatcccct catctctgaa accccttgta gacacactgt cggcccgacg tcacaccttt
     2161 cagcacgact attgggctgc atatcttcgg gagggaacac caacttgtat cgtgccgctg
     2221 tcacaatatc acacagatac tgaagccgag aacaattctt acgttaatca aacaaatcat
     2281 gtggagatca atttgtctcc ggatgtgtgt cagaaaatac agacgctatg cagcgattat
     2341 cgtatcaccc ccgcagtaat cttctatgtg gcctggggca tcctgctaca acgttggtgc
     2401 tatgctgacg atgtgttatt cggcgcgaca atatcagggc gaaatatacc aattgatggt
     2461 atagaagaaa cactagggct atttattaa
//