KHJ99694.1

LOCUS       KHJ99694.1               863 aa    PRT              CON 15-DEC-2014
DEFINITION  Oesophagostomum dentatum peptidase family M13 protein.
ACCESSION   KN549208-10
PROTEIN_ID  KHJ99694.1
SOURCE      Oesophagostomum dentatum
  ORGANISM  Oesophagostomum dentatum
            Eukaryota; Metazoa; Ecdysozoa; Nematoda; Chromadorea; Rhabditida;
            Rhabditina; Rhabditomorpha; Strongyloidea; Strongylidae;
            Oesophagostomum.
REFERENCE   1  (bases 1 to 574610)
  AUTHORS   Mitreva,M.
  TITLE     Draft genome of the hookworm Oesophagostomum dentatum
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 574610)
  AUTHORS   Mitreva,M., Pepin,K.H., Abubucker,S., Martin,J., Minx,P.,
            Warren,C., Palsikar,V.B., Zhang,X. and Wilson,R.K.
  TITLE     Direct Submission
  JOURNAL   Submitted (29-MAR-2014) The Genome Institute, Washington University
            School of Medicine, 4444 Forest Park, St. Louis, MO 63108, USA
COMMENT     Oesophagostomum dentatum, a nodule worm, infects various livestock
            species and can also be found in humans. The larval stages invade
            the large intestinal wall and cause granulomatous inflammatory
            reactions. O. dentatum infects pigs in all production systems
            worldwide. Due to the lack of effective immune reactions, worm
            burdens can be high and the parasite is found in pigs of all age
            groups. O. dentatum can be cultivated in vitro in all life cycle
            stages and serves as a model for strongylid nematodes which
            constitute a group of parasites most closely related to C. elegans.
            
            The sequenced strain (OD-Hann) was collected in Lower Saxony,
            Germany, around 1990 by Prof. Dr. Arwid Daugschies, and has since
            been maintained in pigs, currently at the Institute of
            Parasitology, Vetmeduni Vienna, Austria. Material for sequencing
            was obtained from Prof. Dr. Anja Joachim
            (Anja.Joachim@vetmeduni.ac.at) of the Vetmeduni Vienna, where it
            has been maintained since 2003. Worm isolation and extraction of
            nucleic acids were done by Prof. Joachim and coworkers at the
            Institute of Parasitology, Vetmeduni Vienna, or the Genome
            Institute production team.
            
            This assembly consists of fragments, 3 kb and 8 kb insert whole
            genome shotgun libraries. The sequences were generated on the
            Roche/454 platform and assembled using Newbler. To improve
            scaffolding, in-house tools CIGA (Cdna tool for Improving Genome
            Assembly) and Pygap (Gap closure tool) were used to map 454 cDNA
            reads using blat to the genomic assembly to link genomic contigs
            based on cDNA evidence. Only joins confirmed by additional
            independent data typing were accepted and used to close gaps,
            followed by the Pyramid assembler using Illumina paired reads to
            close gaps and extend contigs.
            
            The repeat library was generated using RepeatModeler (A. Smit, R.
            Hubley http://www.systemsbiology.org/). Ribosomal RNA genes
            were identified using RNAmmer
            (http://www.cbs.dtu.dk/cgi-bin/nph-sw_request?rnammer) and
            transfer RNAs were identified with tRNAscan-SE (Lowe and Eddy,
            1997). Non-coding RNAs, such as microRNAs, were identified by
            sequence homology search of the Rfam database
            (http://selab.janelia.org/software.html). Repeats and predicted
            RNAs were then masked using RepeatMasker (A. Smit, R. Hubley & P.
            Green http://repeatmasker.org). Protein-coding genes were predicted
            using a combination of the ab initio programs Snap (I. Korf, 2004),
            Fgenesh (Softberry, Corp) and Augustus (M. Stanke et al., 2008) and
            the annotation pipeline tool Maker (M. Yandell et al., 2007), which
            aligns mRNA, EST and protein information from the same species or
            cross-species to aid in gene structure determination and
            modifications. A consensus gene set from the above prediction
            algorithms was generated, using a logical, hierarchical approach
            developed at the Genome Institute. Gene product naming was
            determined by BER (JCVI: http://ber.sourceforge.net).
            
            Our goal is to explore this WGS draft sequence of Oesophagostomum
            dentatum to better define proteins involved in nematode parasitism
            that impact health and disease and are relevant to both
            host-parasite relationships and basic biological processes.
            
            For information regarding this assembly or project, or any other
            GSC genome project, please visit our Genome Groups web page
            (http://genome.wustl.edu/genome_group_index.cgi) and email the
            designated contact person. For specific questions regarding the
            Oesophagostomum dentatum genome project contact Makedonka Mitreva
            (mmitreva@genome.wustl.edu) at Washington University School of
            Medicine. The National Human Genome Research Institute (NHGRI) of
            the National Institutes of Health (NIH) provided funds for this
            project.
            
            ##Genome-Assembly-Data-START##
            Finishing Goal           :: High-Quality Draft
            Current Finishing Status :: High-Quality Draft
            Assembly Method          :: Newbler v. MapAsmResearch-10/14/2011
            Assembly Name            :: O_dentatum_10.0.ec.cg.pg
            Genome Coverage          :: 15.0x
            Sequencing Technology    :: LaRoche 454
            ##Genome-Assembly-Data-END##
FEATURES             Qualifiers
     source          /organism="Oesophagostomum dentatum"
                     /mol_type="genomic DNA"
                     /submitter_seqid="O_dentatum_1.0_Cont9"
                     /strain="OD-Hann"
                     /db_xref="taxon:61180"
                     /chromosome="Unknown"
                     /lab_host="pig"
                     /country="Germany: Lower Saxony"
                     /collection_date="1990"
                     /collected_by="Arwid Daugschies"
     protein         /locus_tag="OESDEN_00301"
                     /inference="protein motif:HMMPfam:IPR008753"
                     /inference="protein motif:HMMPfam:IPR018497"
                     /note="KEGG: cel:F18A12.8 1.5e-77 peptidase; hypothetical
                     protein; K01415 endothelin-converting enzyme"
                     /db_xref="InterPro:IPR008753"
                     /db_xref="InterPro:IPR018497"
     intron_pos      12:1 (1/22)
     intron_pos      124:0 (2/22)
     intron_pos      151:0 (3/22)
     intron_pos      203:0 (4/22)
     intron_pos      230:2 (5/22)
     intron_pos      285:0 (6/22)
     intron_pos      335:0 (7/22)
     intron_pos      357:2 (8/22)
     intron_pos      402:0 (9/22)
     intron_pos      438:0 (10/22)
     intron_pos      469:2 (11/22)
     intron_pos      503:1 (12/22)
     intron_pos      533:0 (13/22)
     intron_pos      553:1 (14/22)
     intron_pos      582:0 (15/22)
     intron_pos      617:2 (16/22)
     intron_pos      663:2 (17/22)
     intron_pos      691:1 (18/22)
     intron_pos      731:2 (19/22)
     intron_pos      760:1 (20/22)
     intron_pos      781:2 (21/22)
     intron_pos      815:0 (22/22)
BEGIN
        1 MLQAKLLGAN TGLVGAICAL AIASLVFNIL IWNKVNKDDD VTKSAPIPVE PIPLEEVRIR
       61 TDPVTTTKDK PRTIAFPMKT RLRTISSTFT AANPTNSKPL PVRVNDTIYC PSYGKPDTSD
      121 AYKEAASYLL SGLDQSVDPC EDFYAFSCNT YVKNHNASEI GVSRVAAYDE AQQQVDVEIV
      181 EALQAVDIGD SSQSLTERLT KAALLECVYH SRARTPVDNS KDVLIEMRDL FGGIPFLNHS
      241 LKEGLDFFSV MGELEQNHAM GSLLHAAVSV DFKNVQQHTL FISQPILPIP RDYYVLPQHT
      301 TVLEDRIKLV TKVLQSFAET VLDDASPYID LIKTSARDVV KLEMQIAMAS WPESAMRNYA
      361 QQYNPYKLEQ LEKAYPSIKW KSYFNAMLST VSSTFDITKK NIIIAQPSYF GWLNALFTGE
      421 TVDAKTIANY LLTHLIFEDA DFMGGNIKTH VMKSDYVRYA LRKGKGATRI GVQQFPRIFR
      481 DSKDDPNIEC LNTIMVYMPF GPGYVYVKSK KNRDDVAKDI QHQTELVFKS FMKMIKELEW
      541 MSTNAQKLAA EKATKMIRNY GWPKDLFGDF SNSQKVDAYH QTDYGDIINY YKTNSTHLYY
      601 KIRKTMLKGY SNRESFRTVV SFLNRSFSRD QFLMSPAMVN AWYAPERNSI TFPYAIWNPP
      661 YYNYGYPQAY NYGGQAGTGG HELVHGFDDQ GVQFGADGSL SGCTWIECGW MEPEVKASFN
      721 NMAQCVVTQY STQCCPAKSG NIRCANGETT QGENIADIGG QLAAYYAYRE YVKELGKEEM
      781 RLPGLEQYTP NQLFWISYGF TWCMSQTESK LISQLLTDLH APGSCRVNQV MQDIPEFAKD
      841 FGCTIGQNMY PLPEQRCAVW VSE
//