LOCUS       KZ271208                2950 bp    DNA     linear   CON 17-AUG-2017
DEFINITION  Onchocerca flexuosa isolate Red Deer unplaced genomic scaffold
            O_flexuosa-1.0_Cont1295, whole genome shotgun sequence.
ACCESSION   KZ271208 LQNL01000000
VERSION     KZ271208.1
DBLINK      BioProject: PRJNA230512
            BioSample: SAMN04226856
KEYWORDS    WGS; HIGH_QUALITY_DRAFT.
SOURCE      Onchocerca flexuosa
  ORGANISM  Onchocerca flexuosa
            Eukaryota; Metazoa; Ecdysozoa; Nematoda; Chromadorea; Rhabditida;
            Spirurina; Spiruromorpha; Filarioidea; Onchocercidae; Onchocerca.
REFERENCE   1  (bases 1 to 2950)
  AUTHORS   Mitreva,M.
  TITLE     Draft genome of the nematode, Onchocerca flexuosa
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 2950)
  AUTHORS   Mitreva,M., Pepin,K.H., Martin,J., Ozersky,P., Zhang,X. and
            Wilson,R.K.
  TITLE     Direct Submission
  JOURNAL   Submitted (17-DEC-2015) McDonnell Genome Institute, Washington
            University School of Medicine, 4444 Forest Park, St. Louis, MO
            63108, USA
COMMENT     Onchocerca flexuosa is a filarial nematode (Nematoda) related to
            Onchocerca volvulus, the agent of human African river blindness.
            In contrast to O. volvulus, O. flexuosa is free of obligatory
            Wolbachia endosymbionts. Adult worms live in subcutaneous nodules
            in red deer. Infective larvae are transmitted by blackflies of the
            genus Simulium. This species of filarial parasite is widespread in
            Europe and parts of Asia. Adult worms for genome sequencing were
            isolated from naturally infected red deer (Cervus elaphus) by
            Norbert W. Brattig, Samantha McNulty and Kerstin Fischer. DNA and
            RNA samples were provided by Peter U. Fischer
            (Pufische@dom.wustl.edu).
            This assembly was built from fragment, 3 kb insert and 8 kb insert
            whole genome shotgun libraries. The sequences were generated on
            the Illumina platform and assembled using AllPaths_LG. To improve
            scaffolding, our in-house tool Pygap (gap closure tool) was
            applied, with the Pyramid assembler using Illumina paired reads to
            close gaps and extend contigs.
            
            The repeat library was generated using RepeatModeler (A. Smit, R.
            Hubley http://www.systemsbiology.org/). Ribosomal RNA genes were
            identified using RNAmmer
            (http://www.cbs.dtu.dk/cgi-bin/nph-sw_request?rnammer) and
            transfer RNAs were identified with tRNAscan-SE (Lowe and Eddy,
            1997). Non-coding RNAs, such as microRNAs, were identified by
            sequence homology search of the Rfam database
            (http://selab.janelia.org/software.html). Repeats and predicted
            RNAs were then masked using RepeatMasker (A. Smit, R. Hubley & P.
            Green http://repeatmasker.org). Protein-coding genes were predicted
            using a combination of the ab initio programs Snap (I. Korf, 2004),
            Fgenesh (Softberry, Corp) and Augustus (M. Stanke et al., 2008) and
            the annotation pipeline tool Maker (M. Yandell et al., 2007), which
            aligns mRNA, EST and protein information from the same or other
            species to aid in gene structure determination and modification. A
            consensus gene set from the above prediction algorithms was
            generated using a logical, hierarchical approach developed at the
            Genome Institute. Gene product naming was determined by BER (JCVI:
            http://ber.sourceforge.net).
            
            Our goal is to explore this WGS draft sequence of O. flexuosa to
            better define proteins involved in nematode parasitism that impact
            health and disease and are relevant to both host-parasite
            relationships and basic biological processes.
            
            For information regarding this assembly or project, or any other
            GSC genome project, please visit our Genome Groups web page
            (http://genome.wustl.edu/genome_group_index.cgi) and email the
            designated contact person. For specific questions regarding the
            Onchocerca flexuosa genome project, contact Makedonka Mitreva
            (mmitreva@genome.wustl.edu) at Washington University School of
            Medicine. The National Human Genome Research Institute (NHGRI) of
            the National Institutes of Health (NIH) provided funds for this
            project.
            
            ##Genome-Assembly-Data-START##
            Finishing Goal           :: High-Quality Draft
            Current Finishing Status :: High-Quality Draft
            Assembly Method          :: AllPaths_LG v. 2012-12-28
            Assembly Name            :: O_flexuosa_1.0.allpaths.pg.lrna
            Genome Coverage          :: 66.0x
            Sequencing Technology    :: Illumina HiSeq 2000
            ##Genome-Assembly-Data-END##
FEATURES             Location/Qualifiers
     source          1..2950
                     /organism="Onchocerca flexuosa"
                     /mol_type="genomic DNA"
                     /submitter_seqid="O_flexuosa-1.0_Cont1295"
                     /isolate="Red Deer"
                     /isolation_source="Subcutaneous nodules"
                     /host="Cervus elaphus (red deer)"
                     /db_xref="taxon:387005"
                     /chromosome="Unknown"
                     /country="Germany"
                     /collection_date="2007"
                     /collected_by="Norbert W. Brattig, Samantha McNulty,
                     Kerstin Fischer"
     gene            complement(1481..1846)
                     /locus_tag="X798_07609"
     mRNA            complement(join(1481..1714,1787..1846))
                     /locus_tag="X798_07609"
                     /product="collagen triple helix repeat protein"
     CDS             complement(join(1481..1714,1787..1846))
                     /locus_tag="X798_07609"
                     /inference="protein motif:HMMPfam:IPR008160"
                     /note="KEGG: scl:sce7039 0.0012 putative 5'-nucleotidase
                     family protein; K01081 5'-nucleotidase"
                     /codon_start=1
                     /product="collagen triple helix repeat protein"
                     /protein_id="OZC05418.1"
                     /db_xref="InterPro:IPR008160"
                     /translation="MPMRLFIIFLLFVCAAFMAQSCGPPQRGPPGPPGQNGQDGSSGD
                     RGPQGPEGEMGEQGMTGQPGPRGPPGETGVIGEGGEDGDKGMKGIRGQDVAGG"
CONTIG      join(LQNL01004846.1:1..2950)
//