LOCUS       KN550008               65767 bp    DNA     linear   CON 15-DEC-2014
DEFINITION  Oesophagostomum dentatum strain OD-Hann unplaced genomic scaffold
            O_dentatum_1.0_Cont809, whole genome shotgun sequence.
ACCESSION   KN550008 JOOK01000000
VERSION     KN550008.1
DBLINK      BioProject: PRJNA72579
            BioSample: SAMN02866218
KEYWORDS    WGS; HIGH_QUALITY_DRAFT.
SOURCE      Oesophagostomum dentatum
  ORGANISM  Oesophagostomum dentatum
            Eukaryota; Metazoa; Ecdysozoa; Nematoda; Chromadorea; Rhabditida;
            Rhabditina; Rhabditomorpha; Strongyloidea; Strongylidae;
            Oesophagostomum.
REFERENCE   1  (bases 1 to 65767)
  AUTHORS   Mitreva,M.
  TITLE     Draft genome of the hookworm Oesophagostomum dentatum
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 65767)
  AUTHORS   Mitreva,M., Pepin,K.H., Abubucker,S., Martin,J., Minx,P.,
            Warren,C., Palsikar,V.B., Zhang,X. and Wilson,R.K.
  TITLE     Direct Submission
  JOURNAL   Submitted (29-MAR-2014) The Genome Institute, Washington University
            School of Medicine, 4444 Forest Park, St. Louis, MO 63108, USA
COMMENT     Oesophagostomum dentatum, a Nodule worm, infects various livestock
            species and can also be found in humans. The larval stages invade
            the large intestinal wall and cause granulomatous inflammatory
            reactions. O. dentatum infects pigs in all production systems
            worldwide. Due to the lack of effective immune reactions, worm
            burdens can be high and the parasite is found in pigs of all age
            groups. O. dentatum can be cultivated in vitro in all life cycle
            stages and serves as a model for strongylid nematodes, which
            constitute a group of parasites most closely related to C. elegans.
            
            The sequenced strain (OD-Hann) was collected in Lower Saxony,
            Germany, around 1990 by Prof. Dr. Arwid Daugschies, and has since
            been maintained in pigs, currently at the Institute of
            Parasitology, Vetmeduni Vienna, Austria. Material for sequencing
            was obtained from Prof. Dr. Anja Joachim
            (Anja.Joachim@vetmeduni.ac.at) of the Vetmeduni Vienna, where it
            has been maintained since 2003. Worm isolation and extraction of
            nucleic acids were done by Prof. Joachim and coworkers at the
            Institute of Parasitology, Vetmeduni Vienna, or by the Genome
            Institute production team.
            
            This assembly is based on fragment, 3 kb and 8 kb insert whole
            genome shotgun libraries. The sequences were generated on the
            Roche/454 platform and assembled using Newbler. To improve
            scaffolding, the in-house tools CIGA (cDNA tool for Improving
            Genome Assembly) and Pygap (gap closure tool) were used to map
            454 cDNA reads to the genomic assembly with BLAT and to link
            genomic contigs based on cDNA evidence. Only joins confirmed by
            additional independent data typing were accepted and used to
            close gaps. The Pyramid assembler was then run with Illumina
            paired reads to close gaps and extend contigs.
            
            The repeat library was generated using RepeatModeler (A. Smit, R.
            Hubley http://www.systemsbiology.org/). Ribosomal RNA genes
            were identified using RNAmmer
            (http://www.cbs.dtu.dk/cgi-bin/nph-sw_request?rnammer) and
            transfer RNAs were identified with tRNAscan-SE (Lowe and Eddy,
            1997). Non-coding RNAs, such as microRNAs, were identified by
            sequence homology search of the Rfam database
            (http://selab.janelia.org/software.html). Repeats and predicted
            RNAs were then masked using RepeatMasker (A. Smit, R. Hubley & P.
            Green http://repeatmasker.org). Protein-coding genes were predicted
            using a combination of the ab initio programs Snap (I. Korf, 2004),
            Fgenesh (Softberry, Corp) and Augustus (M. Stanke et al., 2008) and
            the annotation pipeline tool Maker (M. Yandell et al., 2007), which
            aligns mRNA, EST and protein information from the same species or
            cross-species to aid in gene structure determination and
            modifications. A consensus gene set from the above prediction
            algorithms was generated, using a logical, hierarchical approach
            developed at the Genome Institute. Gene product naming was
            determined by BER (JCVI: http://ber.sourceforge.net).
            
            Our goal is to explore this WGS draft sequence of Oesophagostomum
            dentatum to better define proteins involved in nematode parasitism
            that impact health and disease and are relevant to both
            host-parasite relationships and basic biological processes.
            
            For information regarding this assembly or project, or any other
            GSC genome project, please visit our Genome Groups web page
            (http://genome.wustl.edu/genome_group_index.cgi) and email the
            designated contact person. For specific questions regarding the
            Oesophagostomum dentatum genome project contact Makedonka Mitreva
            (mmitreva@genome.wustl.edu) at Washington University School of
            Medicine. The National Human Genome Research Institute (NHGRI) of
            the National Institutes of Health (NIH) provided funds for this
            project.
            
            ##Genome-Assembly-Data-START##
            Finishing Goal           :: High-Quality Draft
            Current Finishing Status :: High-Quality Draft
            Assembly Method          :: Newbler v. MapAsmResearch-10/14/2011
            Assembly Name            :: O_dentatum_10.0.ec.cg.pg
            Genome Coverage          :: 15.0x
            Sequencing Technology    :: LaRoche 454
            ##Genome-Assembly-Data-END##
FEATURES             Location/Qualifiers
     source          1..65767
                     /organism="Oesophagostomum dentatum"
                     /mol_type="genomic DNA"
                     /submitter_seqid="O_dentatum_1.0_Cont809"
                     /strain="OD-Hann"
                     /db_xref="taxon:61180"
                     /chromosome="Unknown"
                     /lab_host="pig"
                     /country="Germany: Lower Saxony"
                     /collection_date="1990"
                     /collected_by="Arwid Daugschies"
     assembly_gap    2628..2727
                     /estimated_length=100
                     /gap_type="within scaffold"
                     /linkage_evidence="paired-ends"
     assembly_gap    4971..8316
                     /estimated_length=3346
                     /gap_type="within scaffold"
                     /linkage_evidence="paired-ends"
     assembly_gap    9730..10049
                     /estimated_length=320
                     /gap_type="within scaffold"
                     /linkage_evidence="paired-ends"
     assembly_gap    21131..21230
                     /estimated_length=100
                     /gap_type="within scaffold"
                     /linkage_evidence="paired-ends"
     gene            22334..26039
                     /locus_tag="OESDEN_04737"
     mRNA            join(22334..22380,22433..22733,22786..22871,24579..24668,
                     24736..24901,24978..25078,25741..25843,25899..26039)
                     /locus_tag="OESDEN_04737"
                     /product="HAD hydrolase, family IE"
     CDS             join(22334..22380,22433..22733,22786..22871,24579..24668,
                     24736..24901,24978..25078,25741..25843,25899..26039)
                     /locus_tag="OESDEN_04737"
                     /inference="protein motif:HMMTigr:IPR018012"
                     /note="KEGG: ecb:100055382 1.6e-52 hypothetical protein
                     LOC100055382; K01081 5'-nucleotidase"
                     /codon_start=1
                     /product="HAD hydrolase, family IE"
                     /protein_id="KHJ95314.1"
                     /db_xref="InterPro:IPR018012"
                     /translation="MPTPVAFPLPVIDENRTIVEPNFNAIFNRPNVMMRDREAVERKL
                     KIMVEGGKQKLMVGLENRRSIEVVKDKSSFQVISDFDYTLSRFEDSRGARCWTTHGVF
                     DHCAMEVDPMLADKFQTLRAKYFPIEFDPKLSLEQKIPYMEEWWNKSHNHIVSARFSK
                     PTIENFVRNSKIILRDQAEVMLQRLHHLGVPLVVFSAGIGNIIEMFLQQKFGQMPANV
                     HIISNMMNFNDKGVVVSFSQPLIHTFCKNSSVIRKEAEFFHEVRGRNNVILLGDSMGD
                     IHMDVGVEKQGPTLKIGFLNSDIDNLLEHYLDAYDVVLVRDQSMAIPDAIVQIIAEGY
                     IKERESSLIS"
     assembly_gap    29803..30453
                     /estimated_length=651
                     /gap_type="within scaffold"
                     /linkage_evidence="paired-ends"
     gene            31086..33143
                     /locus_tag="OESDEN_04738"
     mRNA            join(31086..31235,31636..31720,32413..32541,32594..32692,
                     32746..32963,33030..33143)
                     /locus_tag="OESDEN_04738"
                     /product="collagen triple helix repeat protein"
     CDS             join(31086..31235,31636..31720,32413..32541,32594..32692,
                     32746..32963,33030..33143)
                     /locus_tag="OESDEN_04738"
                     /inference="protein motif:HMMPfam:IPR008160"
                     /note="KEGG: isc:IscW_ISCW014473 2.5e-36 basement membrane
                     collagen, putative"
                     /codon_start=1
                     /product="collagen triple helix repeat protein"
                     /protein_id="KHJ95315.1"
                     /db_xref="InterPro:IPR008160"
                     /translation="MTENNADRLAWVVSAACLVFVAGTVAVVATLHSEISSVAERAER
                     ELPKYNQTYNTYSTSQCECPGGPRGPPGLNGYDGVPGVPGEDGRNGNDDHTLRLHYSD
                     SCATCPAGPPGPPGEPGPEGETGPKGFQGPPGSDGAPGIPGPKGPDGDKGVPGPPGLP
                     GPPGNPGANGRRNTPVPGPQGPPGPAGEAGSIGEPGAPGLPGPEGPAGPGGWPGHPGS
                     RGADGAYGPPGDPGASGEGGYCPCVSRNTRNSNDATDDYKKTDGIS"
     gene            complement(33524..36643)
                     /locus_tag="OESDEN_04739"
     mRNA            complement(join(33524..33535,33981..34109,34633..34824,
                     34916..35096,36621..36643))
                     /locus_tag="OESDEN_04739"
                     /product="hypothetical protein"
     CDS             complement(join(33524..33535,33981..34109,34633..34824,
                     34916..35096,36621..36643))
                     /locus_tag="OESDEN_04739"
                     /codon_start=1
                     /product="hypothetical protein"
                     /protein_id="KHJ95316.1"
                     /translation="MLRVITFLRELAGLWRAYVHSYSFQAKPTLCRMCVLQARANKLT
                     SLAGVAISSHEHSGQKRKREEKKKKRFNEYFFLKTDKSQGEPAKVAKRDPFADSGENV
                     LLVQQLRDQITKLHSLVAQKEAAMLEKDKKIATLQADLMSAERKHREKVEQLLKEKDE
                     AIQMVIERQRQANKQVKK"
     assembly_gap    39300..40086
                     /estimated_length=787
                     /gap_type="within scaffold"
                     /linkage_evidence="paired-ends"
     assembly_gap    40941..42756
                     /estimated_length=1816
                     /gap_type="within scaffold"
                     /linkage_evidence="paired-ends"
     assembly_gap    43920..44124
                     /estimated_length=205
                     /gap_type="within scaffold"
                     /linkage_evidence="paired-ends"
     assembly_gap    48242..49878
                     /estimated_length=1637
                     /gap_type="within scaffold"
                     /linkage_evidence="paired-ends"
     gene            50284..50837
                     /locus_tag="OESDEN_04740"
     mRNA            join(50284..50327,50432..50837)
                     /locus_tag="OESDEN_04740"
                     /product="hypothetical protein"
     CDS             join(50284..50327,50432..50837)
                     /locus_tag="OESDEN_04740"
                     /codon_start=1
                     /product="hypothetical protein"
                     /protein_id="KHJ95317.1"
                     /translation="MKTTKSGNREETDVATVDRIYDTYAHAPRAAGYLNDDSYYETSM
                     EEAIRFHMPAELCSFFSSLICFCLWERHKRDVSEDFINGGFRTGLAETLAFHEIAERA
                     ALHSVKLNEVLNVNYPPVADSVNKRSMLRYIEIGCIDYKNQPSGVNV"
     assembly_gap    50943..51522
                     /estimated_length=580
                     /gap_type="within scaffold"
                     /linkage_evidence="paired-ends"
     assembly_gap    53622..54905
                     /estimated_length=1284
                     /gap_type="within scaffold"
                     /linkage_evidence="paired-ends"
     gene            complement(61370..62011)
                     /locus_tag="OESDEN_04741"
     mRNA            complement(join(61370..61532,61854..62011))
                     /locus_tag="OESDEN_04741"
                     /product="hypothetical protein"
     CDS             complement(join(61370..61532,61854..62011))
                     /locus_tag="OESDEN_04741"
                     /codon_start=1
                     /product="hypothetical protein"
                     /protein_id="KHJ95318.1"
                     /translation="MPWCAKGESHVLFNKQNDPEPSFHILIIEDSFTALGDENKLGKN
                     FTFEAKHKTTGRSFIFAAEDFKTLEPWVELLMITTVDYVLLLKQSFGEQIDHIQYSEA
                     EPGN"
     assembly_gap    62160..64142
                     /estimated_length=1983
                     /gap_type="within scaffold"
                     /linkage_evidence="paired-ends"
CONTIG      join(JOOK01127775.1:1..2627,gap(100),JOOK01127780.1:1..2243,
            gap(3346),JOOK01127781.1:1..1413,gap(320),JOOK01127782.1:1..11081,
            gap(100),JOOK01127783.1:1..8572,gap(651),JOOK01127784.1:1..8846,
            gap(787),JOOK01127785.1:1..854,gap(1816),JOOK01127786.1:1..1163,
            gap(205),JOOK01127787.1:1..4117,gap(1637),JOOK01127776.1:1..1064,
            gap(580),JOOK01127777.1:1..2099,gap(1284),JOOK01127778.1:1..7254,
            gap(1983),JOOK01127779.1:1..1625)
//