LOCUS KN550008 65767 bp DNA linear CON 15-DEC-2014 DEFINITION Oesophagostomum dentatum strain OD-Hann unplaced genomic scaffold O_dentatum_1.0_Cont809, whole genome shotgun sequence. ACCESSION KN550008 JOOK01000000 VERSION KN550008.1 DBLINK BioProject: PRJNA72579 BioSample: SAMN02866218 KEYWORDS WGS; HIGH_QUALITY_DRAFT. SOURCE Oesophagostomum dentatum ORGANISM Oesophagostomum dentatum Eukaryota; Metazoa; Ecdysozoa; Nematoda; Chromadorea; Rhabditida; Rhabditina; Rhabditomorpha; Strongyloidea; Strongylidae; Oesophagostomum. REFERENCE 1 (bases 1 to 65767) AUTHORS Mitreva,M. TITLE Draft genome of the hookworm Oesophagostomum dentatum JOURNAL Unpublished REFERENCE 2 (bases 1 to 65767) AUTHORS Mitreva,M., Pepin,K.H., Abubucker,S., Martin,J., Minx,P., Warren,C., Palsikar,V.B., Zhang,X. and Wilson,R.K. TITLE Direct Submission JOURNAL Submitted (29-MAR-2014) The Genome Institute, Washington University School of Medicine, 4444 Forest Park, St. Louis, MO 63108, USA COMMENT Oesophagostomum dentatum, a nodule worm, infects various livestock species and can also be found in humans. The larval stages invade the large intestinal wall and cause granulomatous inflammatory reactions. O. dentatum infects pigs in all production systems worldwide. Due to the lack of effective immune reactions, worm burdens can be high and the parasite is found in pigs of all age groups. O. dentatum can be cultivated in vitro in all life cycle stages and serves as a model for strongylid nematodes, which constitute a group of parasites most closely related to C. elegans. The sequenced strain (OD-Hann) was collected in Lower Saxony, Germany, around 1990 by Prof. Dr. Arwid Daugschies and has since been maintained in pigs, currently at the Institute of Parasitology, Vetmeduni Vienna, Austria. Material for sequencing was obtained from Prof. Dr. Anja Joachim (Anja.Joachim@vetmeduni.ac.at) of the Vetmeduni Vienna, where the strain has been maintained since 2003. 
Worm isolation and extraction of nucleic acids were done by Prof. Joachim and coworkers at the Institute of Parasitology, Vetmeduni Vienna, or by the Genome Institute production team. This assembly was built from fragment, 3 kb insert, and 8 kb insert whole genome shotgun libraries. The sequences were generated on the Roche/454 platform and assembled using Newbler. To improve scaffolding, the in-house tools CIGA (Cdna tool for Improving Genome Assembly) and Pygap (Gap closure tool) were used to map 454 cDNA reads to the genomic assembly with BLAT and to link genomic contigs based on cDNA evidence. Only joins confirmed by additional independent data typing were accepted and used to close gaps; the Pyramid assembler was then run with Illumina paired reads to close gaps and extend contigs. The repeat library was generated using RepeatModeler (A. Smit, R. Hubley http://www.systemsbiology.org/). Ribosomal RNA genes were identified using RNAmmer (http://www.cbs.dtu.dk/cgi-bin/nph-sw_request?rnammer) and transfer RNAs were identified with tRNAscan-SE (Lowe and Eddy, 1997). Non-coding RNAs, such as microRNAs, were identified by sequence homology search of the Rfam database (http://selab.janelia.org/software.html). Repeats and predicted RNAs were then masked using RepeatMasker (A. Smit, R. Hubley & P. Green http://repeatmasker.org). Protein-coding genes were predicted using a combination of the ab initio programs Snap (I. Korf, 2004), Fgenesh (Softberry Corp.) and Augustus (M. Stanke et al., 2008) and the annotation pipeline tool Maker (M. Yandell et al., 2007), which aligns mRNA, EST and protein information from the same species or from other species to aid in gene structure determination and modification. A consensus gene set was generated from the above prediction algorithms using a logical, hierarchical approach developed at the Genome Institute. Gene product naming was determined by BER (JCVI: http://ber.sourceforge.net). 
Our goal is to explore this WGS draft sequence of Oesophagostomum dentatum to better define proteins involved in nematode parasitism that impact health and disease and are relevant to both host-parasite relationships and basic biological processes. For information regarding this assembly or project, or any other GSC genome project, please visit our Genome Groups web page (http://genome.wustl.edu/genome_group_index.cgi) and email the designated contact person. For specific questions regarding the Oesophagostomum dentatum genome project contact Makedonka Mitreva (mmitreva@genome.wustl.edu) at Washington University School of Medicine. The National Human Genome Research Institute (NHGRI) of the National Institutes of Health (NIH) provided funds for this project. ##Genome-Assembly-Data-START## Finishing Goal :: High-Quality Draft Current Finishing Status :: High-Quality Draft Assembly Method :: Newbler v. MapAsmResearch-10/14/2011 Assembly Name :: O_dentatum_10.0.ec.cg.pg Genome Coverage :: 15.0x Sequencing Technology :: LaRoche 454 ##Genome-Assembly-Data-END## FEATURES Location/Qualifiers source 1..65767 /organism="Oesophagostomum dentatum" /mol_type="genomic DNA" /submitter_seqid="O_dentatum_1.0_Cont809" /strain="OD-Hann" /db_xref="taxon:61180" /chromosome="Unknown" /lab_host="pig" /country="Germany: Lower Saxony" /collection_date="1990" /collected_by="Arwid Daugschies" assembly_gap 2628..2727 /estimated_length=100 /gap_type="within scaffold" /linkage_evidence="paired-ends" assembly_gap 4971..8316 /estimated_length=3346 /gap_type="within scaffold" /linkage_evidence="paired-ends" assembly_gap 9730..10049 /estimated_length=320 /gap_type="within scaffold" /linkage_evidence="paired-ends" assembly_gap 21131..21230 /estimated_length=100 /gap_type="within scaffold" /linkage_evidence="paired-ends" gene 22334..26039 /locus_tag="OESDEN_04737" mRNA join(22334..22380,22433..22733,22786..22871,24579..24668, 24736..24901,24978..25078,25741..25843,25899..26039) 
/locus_tag="OESDEN_04737" /product="HAD hydrolase, family IE" CDS join(22334..22380,22433..22733,22786..22871,24579..24668, 24736..24901,24978..25078,25741..25843,25899..26039) /locus_tag="OESDEN_04737" /inference="protein motif:HMMTigr:IPR018012" /note="KEGG: ecb:100055382 1.6e-52 hypothetical protein LOC100055382; K01081 5'-nucleotidase" /codon_start=1 /product="HAD hydrolase, family IE" /protein_id="KHJ95314.1" /db_xref="InterPro:IPR018012" /translation="MPTPVAFPLPVIDENRTIVEPNFNAIFNRPNVMMRDREAVERKL KIMVEGGKQKLMVGLENRRSIEVVKDKSSFQVISDFDYTLSRFEDSRGARCWTTHGVF DHCAMEVDPMLADKFQTLRAKYFPIEFDPKLSLEQKIPYMEEWWNKSHNHIVSARFSK PTIENFVRNSKIILRDQAEVMLQRLHHLGVPLVVFSAGIGNIIEMFLQQKFGQMPANV HIISNMMNFNDKGVVVSFSQPLIHTFCKNSSVIRKEAEFFHEVRGRNNVILLGDSMGD IHMDVGVEKQGPTLKIGFLNSDIDNLLEHYLDAYDVVLVRDQSMAIPDAIVQIIAEGY IKERESSLIS" assembly_gap 29803..30453 /estimated_length=651 /gap_type="within scaffold" /linkage_evidence="paired-ends" gene 31086..33143 /locus_tag="OESDEN_04738" mRNA join(31086..31235,31636..31720,32413..32541,32594..32692, 32746..32963,33030..33143) /locus_tag="OESDEN_04738" /product="collagen triple helix repeat protein" CDS join(31086..31235,31636..31720,32413..32541,32594..32692, 32746..32963,33030..33143) /locus_tag="OESDEN_04738" /inference="protein motif:HMMPfam:IPR008160" /note="KEGG: isc:IscW_ISCW014473 2.5e-36 basement membrane collagen, putative" /codon_start=1 /product="collagen triple helix repeat protein" /protein_id="KHJ95315.1" /db_xref="InterPro:IPR008160" /translation="MTENNADRLAWVVSAACLVFVAGTVAVVATLHSEISSVAERAER ELPKYNQTYNTYSTSQCECPGGPRGPPGLNGYDGVPGVPGEDGRNGNDDHTLRLHYSD SCATCPAGPPGPPGEPGPEGETGPKGFQGPPGSDGAPGIPGPKGPDGDKGVPGPPGLP GPPGNPGANGRRNTPVPGPQGPPGPAGEAGSIGEPGAPGLPGPEGPAGPGGWPGHPGS RGADGAYGPPGDPGASGEGGYCPCVSRNTRNSNDATDDYKKTDGIS" gene complement(33524..36643) /locus_tag="OESDEN_04739" mRNA complement(join(33524..33535,33981..34109,34633..34824, 34916..35096,36621..36643)) /locus_tag="OESDEN_04739" /product="hypothetical protein" CDS 
complement(join(33524..33535,33981..34109,34633..34824, 34916..35096,36621..36643)) /locus_tag="OESDEN_04739" /codon_start=1 /product="hypothetical protein" /protein_id="KHJ95316.1" /translation="MLRVITFLRELAGLWRAYVHSYSFQAKPTLCRMCVLQARANKLT SLAGVAISSHEHSGQKRKREEKKKKRFNEYFFLKTDKSQGEPAKVAKRDPFADSGENV LLVQQLRDQITKLHSLVAQKEAAMLEKDKKIATLQADLMSAERKHREKVEQLLKEKDE AIQMVIERQRQANKQVKK" assembly_gap 39300..40086 /estimated_length=787 /gap_type="within scaffold" /linkage_evidence="paired-ends" assembly_gap 40941..42756 /estimated_length=1816 /gap_type="within scaffold" /linkage_evidence="paired-ends" assembly_gap 43920..44124 /estimated_length=205 /gap_type="within scaffold" /linkage_evidence="paired-ends" assembly_gap 48242..49878 /estimated_length=1637 /gap_type="within scaffold" /linkage_evidence="paired-ends" gene 50284..50837 /locus_tag="OESDEN_04740" mRNA join(50284..50327,50432..50837) /locus_tag="OESDEN_04740" /product="hypothetical protein" CDS join(50284..50327,50432..50837) /locus_tag="OESDEN_04740" /codon_start=1 /product="hypothetical protein" /protein_id="KHJ95317.1" /translation="MKTTKSGNREETDVATVDRIYDTYAHAPRAAGYLNDDSYYETSM EEAIRFHMPAELCSFFSSLICFCLWERHKRDVSEDFINGGFRTGLAETLAFHEIAERA ALHSVKLNEVLNVNYPPVADSVNKRSMLRYIEIGCIDYKNQPSGVNV" assembly_gap 50943..51522 /estimated_length=580 /gap_type="within scaffold" /linkage_evidence="paired-ends" assembly_gap 53622..54905 /estimated_length=1284 /gap_type="within scaffold" /linkage_evidence="paired-ends" gene complement(61370..62011) /locus_tag="OESDEN_04741" mRNA complement(join(61370..61532,61854..62011)) /locus_tag="OESDEN_04741" /product="hypothetical protein" CDS complement(join(61370..61532,61854..62011)) /locus_tag="OESDEN_04741" /codon_start=1 /product="hypothetical protein" /protein_id="KHJ95318.1" /translation="MPWCAKGESHVLFNKQNDPEPSFHILIIEDSFTALGDENKLGKN FTFEAKHKTTGRSFIFAAEDFKTLEPWVELLMITTVDYVLLLKQSFGEQIDHIQYSEA EPGN" assembly_gap 62160..64142 /estimated_length=1983 /gap_type="within scaffold" 
/linkage_evidence="paired-ends" CONTIG join(JOOK01127775.1:1..2627,gap(100),JOOK01127780.1:1..2243, gap(3346),JOOK01127781.1:1..1413,gap(320),JOOK01127782.1:1..11081, gap(100),JOOK01127783.1:1..8572,gap(651),JOOK01127784.1:1..8846, gap(787),JOOK01127785.1:1..854,gap(1816),JOOK01127786.1:1..1163, gap(205),JOOK01127787.1:1..4117,gap(1637),JOOK01127776.1:1..1064, gap(580),JOOK01127777.1:1..2099,gap(1284),JOOK01127778.1:1..7254, gap(1983),JOOK01127779.1:1..1625) //
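The CONTIG line above describes how the scaffold is laid out: component contigs and fixed-length gaps, in order. As a minimal sketch (not part of the record, and not the submitters' tooling), the Python below walks that join() expression and converts each piece to 1-based scaffold coordinates, so the result can be compared with the LOCUS length and with the assembly_gap features. The function name scaffold_layout and the tuple layout are illustrative assumptions; only the CONTIG string itself comes from the record.

```python
import re

# CONTIG expression copied from the record; whitespace is not significant.
CONTIG = """
join(JOOK01127775.1:1..2627,gap(100),JOOK01127780.1:1..2243,
gap(3346),JOOK01127781.1:1..1413,gap(320),JOOK01127782.1:1..11081,
gap(100),JOOK01127783.1:1..8572,gap(651),JOOK01127784.1:1..8846,
gap(787),JOOK01127785.1:1..854,gap(1816),JOOK01127786.1:1..1163,
gap(205),JOOK01127787.1:1..4117,gap(1637),JOOK01127776.1:1..1064,
gap(580),JOOK01127777.1:1..2099,gap(1284),JOOK01127778.1:1..7254,
gap(1983),JOOK01127779.1:1..1625)
"""


def scaffold_layout(contig_join):
    """Return (kind, name, start, end) tuples in 1-based scaffold coordinates.

    Hypothetical helper: handles only the simple forms used in this record,
    i.e. 'accession:start..end' components and 'gap(N)' spacers.
    """
    text = re.sub(r"\s+", "", contig_join)
    if not (text.startswith("join(") and text.endswith(")")):
        raise ValueError("expected a join(...) expression")
    layout = []
    pos = 0  # last scaffold base already assigned
    for part in text[len("join("):-1].split(","):
        gap = re.fullmatch(r"gap\((\d+)\)", part)
        if gap:
            kind, name, length = "gap", None, int(gap.group(1))
        else:
            comp = re.fullmatch(r"([^:]+):(\d+)\.\.(\d+)", part)
            if comp is None:
                raise ValueError("unrecognised component: " + part)
            kind, name = "contig", comp.group(1)
            length = int(comp.group(3)) - int(comp.group(2)) + 1
        layout.append((kind, name, pos + 1, pos + length))
        pos += length
    return layout


layout = scaffold_layout(CONTIG)
total_length = layout[-1][3]
gaps = [(start, end) for kind, _, start, end in layout if kind == "gap"]
print(total_length)  # 65767, the length given on the LOCUS line
print(gaps[0])       # (2628, 2727), the first assembly_gap feature
```

The 13 contigs and 12 gaps sum to 65767 bp, and each computed gap interval coincides with an assembly_gap feature in the FEATURES table, which is a quick consistency check on the record.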