LOCUS       KHJ99715.1              1987 aa    PRT              CON 15-DEC-2014
DEFINITION  Oesophagostomum dentatum myosin head protein.
ACCESSION   KN549208-31
PROTEIN_ID  KHJ99715.1
SOURCE      Oesophagostomum dentatum
  ORGANISM  Oesophagostomum dentatum
            Eukaryota; Metazoa; Ecdysozoa; Nematoda; Chromadorea; Rhabditida;
            Rhabditina; Rhabditomorpha; Strongyloidea; Strongylidae;
            Oesophagostomum.
REFERENCE   1  (bases 1 to 574610)
  AUTHORS   Mitreva,M.
  TITLE     Draft genome of the hookworm Oesophagostomum dentatum
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 574610)
  AUTHORS   Mitreva,M., Pepin,K.H., Abubucker,S., Martin,J., Minx,P.,
            Warren,C., Palsikar,V.B., Zhang,X. and Wilson,R.K.
  TITLE     Direct Submission
  JOURNAL   Submitted (29-MAR-2014) The Genome Institute, Washington University
            School of Medicine, 4444 Forest Park, St. Louis, MO 63108, USA
COMMENT     Oesophagostomum dentatum, a nodule worm, infects various livestock
            species and can also be found in humans. The larval stages invade
            the large intestinal wall and cause granulomatous inflammatory
            reactions. O. dentatum infects pigs in all production systems
            worldwide. Due to the lack of effective immune reactions, worm
            burdens can be high and the parasite is found in pigs of all age
            groups. O. dentatum can be cultivated in vitro in all life cycle
            stages and serves as a model for strongylid nematodes which
            constitute a group of parasites most closely related to C. elegans.
            
            The sequenced strain (OD-Hann) was collected in Lower Saxony,
            Germany, around 1990 by Prof. Dr. Arwid Daugschies, and has since
            been maintained in pigs, currently at the Institute of Parasitology,
            Vetmeduni Vienna, Austria. Material for sequencing was obtained
            from Prof Dr Anja Joachim (Anja.Joachim@vetmeduni.ac.at) of the
            Vetmeduni Vienna where it has been maintained since 2003. Worm
            isolation and extraction of nucleic acids were done by Prof Joachim
            and coworkers at the Institute of Parasitology, Vetmeduni Vienna,
            or the Genome Institute production team.
            
            This assembly was built from fragment, 3 kb and 8 kb insert whole
            genome shotgun libraries. The sequences were generated on the
            Roche/454 platform and assembled using Newbler. To improve
            scaffolding, the in-house tools CIGA (cDNA tool for Improving Genome
            Assembly) and Pygap (gap closure tool) were used to map 454 cDNA
            reads using blat to the genomic assembly to link genomic contigs
            based on cDNA evidence. Only joins confirmed by additional
            independent data typing were accepted and used to close gaps,
            followed by the Pyramid assembler using Illumina paired reads to
            close gaps and extend contigs.
            
            The repeat library was generated using RepeatModeler (A. Smit, R.
            Hubley http://www.systemsbiology.org/). Ribosomal RNA genes
            were identified using RNAmmer
            (http://www.cbs.dtu.dk/cgi-bin/nph-sw_request?rnammer) and
            transfer RNAs were identified with tRNAscan-SE (Lowe and Eddy,
            1997). Non-coding RNAs, such as microRNAs, were identified by
            sequence homology search of the Rfam database
            (http://selab.janelia.org/software.html). Repeats and predicted
            RNAs were then masked using RepeatMasker (A. Smit, R. Hubley & P.
            Green http://repeatmasker.org). Protein-coding genes were predicted
            using a combination of the ab initio programs SNAP (I. Korf, 2004),
            Fgenesh (Softberry Corp.) and Augustus (M. Stanke et al., 2008) and
            the annotation pipeline tool MAKER (M. Yandell et al., 2007), which
            aligns mRNA, EST and protein information from the same or other
            species to aid in gene structure determination and
            modifications. A consensus gene set from the above prediction
            algorithms was generated using a logical, hierarchical approach
            developed at the Genome Institute. Gene product naming was
            determined by BER (JCVI: http://ber.sourceforge.net).
            
            Our goal is to explore this WGS draft sequence of Oesophagostomum
            dentatum to better define proteins involved in nematode parasitism
            that impact health and disease and are relevant to both
            host-parasite relationships and basic biological processes.
            
            For information regarding this assembly or project, or any other
            GSC genome project, please visit our Genome Groups web page
            (http://genome.wustl.edu/genome_group_index.cgi) and email the
            designated contact person. For specific questions regarding the
            Oesophagostomum dentatum genome project contact Makedonka Mitreva
            (mmitreva@genome.wustl.edu) at Washington University School of
            Medicine. The National Human Genome Research Institute (NHGRI) of
            the National Institutes of Health (NIH) provided funds for this
            project.
            
            ##Genome-Assembly-Data-START##
            Finishing Goal           :: High-Quality Draft
            Current Finishing Status :: High-Quality Draft
            Assembly Method          :: Newbler v. MapAsmResearch-10/14/2011
            Assembly Name            :: O_dentatum_10.0.ec.cg.pg
            Genome Coverage          :: 15.0x
            Sequencing Technology    :: LaRoche 454
            ##Genome-Assembly-Data-END##
FEATURES             Qualifiers
     source          /organism="Oesophagostomum dentatum"
                     /mol_type="genomic DNA"
                     /submitter_seqid="O_dentatum_1.0_Cont9"
                     /strain="OD-Hann"
                     /db_xref="taxon:61180"
                     /chromosome="Unknown"
                     /lab_host="pig"
                     /country="Germany: Lower Saxony"
                     /collection_date="1990"
                     /collected_by="Arwid Daugschies"
     protein         /locus_tag="OESDEN_00322"
                     /inference="protein motif:HMMPfam:IPR001609"
                     /inference="protein motif:HMMPfam:IPR002928"
                     /inference="protein motif:HMMPfam:IPR004009"
                     /note="KEGG: dme:Dmel_CG15792 0. zip; zipper K10352"
                     /db_xref="InterPro:IPR001609"
                     /db_xref="InterPro:IPR002928"
                     /db_xref="InterPro:IPR004009"
     intron_pos      46:2 (1/46)
     intron_pos      82:0 (2/46)
     intron_pos      111:0 (3/46)
     intron_pos      163:1 (4/46)
     intron_pos      172:2 (5/46)
     intron_pos      213:0 (6/46)
     intron_pos      244:0 (7/46)
     intron_pos      298:1 (8/46)
     intron_pos      346:1 (9/46)
     intron_pos      378:1 (10/46)
     intron_pos      404:0 (11/46)
     intron_pos      463:2 (12/46)
     intron_pos      504:1 (13/46)
     intron_pos      579:1 (14/46)
     intron_pos      619:1 (15/46)
     intron_pos      680:0 (16/46)
     intron_pos      741:2 (17/46)
     intron_pos      775:2 (18/46)
     intron_pos      803:0 (19/46)
     intron_pos      834:0 (20/46)
     intron_pos      878:0 (21/46)
     intron_pos      910:0 (22/46)
     intron_pos      943:0 (23/46)
     intron_pos      985:0 (24/46)
     intron_pos      1025:0 (25/46)
     intron_pos      1091:2 (26/46)
     intron_pos      1135:2 (27/46)
     intron_pos      1174:0 (28/46)
     intron_pos      1227:0 (29/46)
     intron_pos      1278:0 (30/46)
     intron_pos      1319:0 (31/46)
     intron_pos      1358:0 (32/46)
     intron_pos      1388:0 (33/46)
     intron_pos      1420:0 (34/46)
     intron_pos      1458:0 (35/46)
     intron_pos      1508:0 (36/46)
     intron_pos      1553:2 (37/46)
     intron_pos      1577:0 (38/46)
     intron_pos      1632:0 (39/46)
     intron_pos      1674:2 (40/46)
     intron_pos      1711:0 (41/46)
     intron_pos      1754:0 (42/46)
     intron_pos      1789:0 (43/46)
     intron_pos      1880:0 (44/46)
     intron_pos      1933:2 (45/46)
     intron_pos      1973:1 (46/46)
BEGIN
        1 MEESDLRFLQ VQRAAVADPA RASEWAGKKL CWVPHEKDGF VAGSIKKETN DEVIVEICDT
       61 GKTVTISKDD VQKANPPKFD KVEDMSELTY LNEASVLHNL KERYFSSLIY TYSGLFCVVI
      121 NPYKRLPIYS ESLIEEFKGK KRHEMPPHIF AIADSAYRSM LQDREDQSIL CTGESGAGKT
      181 ENTKKVIQYL AHVAGATRSK GGPQGPAASP AKGELEHQLL QANPILEAFG NSKTVKNDNS
      241 SRFGKFIRIN FDMSGYISGA NIEFYLLEKS RTLRQAPDER SFHIFYQFLR GTTPAEKANY
      301 LLEDIDKYRF LVNGNITLPN VDDAQEFQST LKSMRIMGFA EDEITSVLRV VSATILMGNF
      361 EFTQEKKSDQ AILPDDRVIQ KVCHLLGLQV IELTKAFLRP RIKVGREFVN KAQNKEQAEF
      421 AVEAIAKASY ERMFKWLVNR INKSLDRTRR QGASFIGILD IAGFEIFELN SFEQLCINYT
      481 NEKLQQLFNN TMFILEQEEY QREGIEWQFI DFGLDLQPTI DLIEKPMGLL ALLDEQCLFP
      541 KATDKTLVEK LQKTHSKHPK FIVPDMRAKS DFAVVHYAGR VDYSADQWLM KNMDPLNENV
      601 VALMQASTDP FVCGIWKDAE FAGICAAEMN ETAFGVRAKK GMFRTVSQLH KEQLTRLMTT
      661 LRNTSPHFVR CIIPNHEKKA GKINSMLVLE QLRCNGVLEG IRICRQGFPN RVPFQEFRHR
      721 YEILTPNVIP RGFMDGKESV KKMIEYLEID ANLYRIGQSK VFFRTGVLAH LEEERDLKLT
      781 DLIIQFQAQC RAFLARRLYV KRMQQSSAIR VLQRNGLAWM KLRNWQWWRL FTKVKPLLQV
      841 TNQEAAISAK EDELRVVREK LDKVESEYKE SLSKIDQVLA ERNVLQDQLQ QETDNNAELE
      901 EVKNRLQLKK NELEEMVNEM RDRLVDEEQR TEKMSQEKKK LVETVRDLEE QLEQEEQARQ
      961 KLQLDKANVD QRVKNVETKL VDITDAHDKL LKEKRILEDK MNQLNLQLSE EEERVKQVAR
     1021 QRGKVEGHVQ ELEQELLRER QIKSELEQQK RKLITELDDS RELLEEKRGK LEELNGQLMK
     1081 REEELSQVLT RSDEEAATIA LLQKQIRDMQ ATIDELREDI ETERAARNKA EMARREVVAQ
     1141 LEKVKGDMLD KVDETSVLQD IMRRKEDEVR DLKKALETST HTLENKLEEQ KLKYNRQIEE
     1201 LHEKIELQKK VNAQQEKYKH QADTERAELA QELANIQAQK AEADKRRKQQ EVQFLDIQSQ
     1261 LAECDEHRLQ VIEQLEKARE ELEHISRARE DEEQLVSNLN RKIAALEEQL HELSDQVQEE
     1321 TRAKLAQINR VRQLEEEKAA VAEERDEMDA ARQHLERDIG VLRQQLAEAR KKADEGVIQQ
     1381 MEELRKKAQR DLENTQHQLE ESEASKERLI QSKKKLQQEL EDANIELENI RTASREMEKR
     1441 QKKFDMQLAE ERANVQKAIL ERDAHAQESR DRETRILSLV NEIEQMKGTI DETERVRRML
     1501 QLELDESISS KDDVGKNVHE LEKAKRQLEQ TVQEQKATIE ELEDQLGFAE DARLRLEVNI
     1561 QALRAEQDRN LSAKDQEAED KRRSLVKQLR DLEQELENER RSKAGAISQK KKMEAHIAEI
     1621 EQQLDVANRL KDEYNKQLKK NQQMIKEYQH DSEEARQMKE EIASQLRDIE RRLRSAEAEN
     1681 QRLSEANEML TSQKRQLEQE KDELEELRGR GGSFSSEEKR RLEQKLAQLE EELEEEQNNA
     1741 EIAIDKQRKA QQQLEQLTTE LSMERSVAQK SEAERQGLER QNRELKAKIA ELESTAQSRA
     1801 RAQIAALEAK IQYLEEQNSV ESQERHNATR QFRFVRAHIS AHSKMMVDCV LFRRIEKRLH
     1861 DTILQLEDER RNVEQQKEIA EKCNLRAKQM RRQLDEQEEE MTRERAKSRN LQREIDDLTE
     1921 ANDTLTRENN TLRGGAARRN RENMRLRSAY QIPGSSDNLS RNDDEDGSIG TEVTGSDHTD
     1981 ELKKTSV
//