LOCUS KHJ99694.1 863 aa PRT CON 15-DEC-2014 DEFINITION Oesophagostomum dentatum peptidase family M13 protein. ACCESSION KN549208-10 PROTEIN_ID KHJ99694.1 SOURCE Oesophagostomum dentatum ORGANISM Oesophagostomum dentatum Eukaryota; Metazoa; Ecdysozoa; Nematoda; Chromadorea; Rhabditida; Rhabditina; Rhabditomorpha; Strongyloidea; Strongylidae; Oesophagostomum. REFERENCE 1 (bases 1 to 574610) AUTHORS Mitreva,M. TITLE Draft genome of the hookworm Oesophagostomum dentatum JOURNAL Unpublished REFERENCE 2 (bases 1 to 574610) AUTHORS Mitreva,M., Pepin,K.H., Abubucker,S., Martin,J., Minx,P., Warren,C., Palsikar,V.B., Zhang,X. and Wilson,R.K. TITLE Direct Submission JOURNAL Submitted (29-MAR-2014) The Genome Institute, Washington University School of Medicine, 4444 Forest Park, St. Louis, MO 63108, USA COMMENT Oesophagostomum dentatum, a nodule worm, infects various livestock species and can also be found in humans. The larval stages invade the large intestinal wall and cause granulomatous inflammatory reactions. O. dentatum infects pigs in all production systems worldwide. Due to the lack of effective immune reactions, worm burdens can be high and the parasite is found in pigs of all age groups. O. dentatum can be cultivated in vitro in all life cycle stages and serves as a model for strongylid nematodes, which constitute a group of parasites most closely related to C. elegans. The sequenced strain (OD-Hann) was collected in Lower Saxony, Germany, around 1990 by Prof. Dr. Arwid Daugschies, and has since been maintained in pigs, currently at the Institute of Parasitology, Vetmeduni Vienna, Austria. Material for sequencing was obtained from Prof. Dr. Anja Joachim (Anja.Joachim@vetmeduni.ac.at) of the Vetmeduni Vienna, where it has been maintained since 2003. Worm isolation and extraction of nucleic acids were done by Prof. Joachim and coworkers at the Institute of Parasitology, Vetmeduni Vienna, or the Genome Institute production team.
This assembly consists of fragments, 3kb and 8kb insert whole genome shotgun libraries. The sequences were generated on the Roche/454 platform and assembled using Newbler. To improve scaffolding, in-house tools CIGA (Cdna tool for Improving Genome Assembly) and Pygap (Gap closure tool) were used to map 454 cDNA reads using blat to the genomic assembly to link genomic contigs based on cDNA evidence. Only joins confirmed by additional independent data typing were accepted and used to close gaps, followed by the Pyramid assembler using Illumina paired reads to close gaps and extend contigs. The repeat library was generated using Repeatmodeler (A. Smit, R. Hubley http://www.systemsbiology.org/). The ribosomal RNA genes were identified using RNAmmer (http://www.cbs.dtu.dk/cgi-bin/nph-sw_request?rnammer) and transfer RNAs were identified with tRNAscan-SE (Lowe and Eddy, 1997). Non-coding RNAs, such as microRNAs, were identified by sequence homology search of the Rfam database (http://selab.janelia.org/software.html). Repeats and predicted RNAs were then masked using RepeatMasker (A. Smit, R. Hubley & P. Green http://repeatmasker.org). Protein-coding genes were predicted using a combination of ab initio programs Snap (I. Korf, 2004), Fgenesh (Softberry, Corp) and Augustus (M. Stanke et al., 2008) and the annotation pipeline tool Maker (M. Yandell et al., 2007), which aligns mRNA, EST and protein information from the same species or cross-species to aid in gene structure determination and modifications. A consensus gene set from the above prediction algorithms was generated, using a logical, hierarchical approach developed at the Genome Institute. Gene product naming was determined by BER (JCVI: http://ber.sourceforge.net). Our goal is to explore this WGS draft sequence of Oesophagostomum dentatum to better define proteins involved in nematode parasitism that impact health and disease and are relevant to both host-parasite relationships and basic biological processes.
For information regarding this assembly or project, or any other GSC genome project, please visit our Genome Groups web page (http://genome.wustl.edu/genome_group_index.cgi) and email the designated contact person. For specific questions regarding the Oesophagostomum dentatum genome project contact Makedonka Mitreva (mmitreva@genome.wustl.edu) at Washington University School of Medicine. The National Human Genome Research Institute (NHGRI) of the National Institutes of Health (NIH) provided funds for this project. ##Genome-Assembly-Data-START## Finishing Goal :: High-Quality Draft Current Finishing Status :: High-Quality Draft Assembly Method :: Newbler v. MapAsmResearch-10/14/2011 Assembly Name :: O_dentatum_10.0.ec.cg.pg Genome Coverage :: 15.0x Sequencing Technology :: LaRoche 454 ##Genome-Assembly-Data-END## FEATURES Qualifiers source /organism="Oesophagostomum dentatum" /mol_type="genomic DNA" /submitter_seqid="O_dentatum_1.0_Cont9" /strain="OD-Hann" /db_xref="taxon:61180" /chromosome="Unknown" /lab_host="pig" /country="Germany: Lower Saxony" /collection_date="1990" /collected_by="Arwid Daugschies" protein /locus_tag="OESDEN_00301" /inference="protein motif:HMMPfam:IPR008753" /inference="protein motif:HMMPfam:IPR018497" /note="KEGG: cel:F18A12.8 1.5e-77 peptidase; hypothetical protein; K01415 endothelin-converting enzyme" /db_xref="InterPro:IPR008753" /db_xref="InterPro:IPR018497" intron_pos 12:1 (1/22) intron_pos 124:0 (2/22) intron_pos 151:0 (3/22) intron_pos 203:0 (4/22) intron_pos 230:2 (5/22) intron_pos 285:0 (6/22) intron_pos 335:0 (7/22) intron_pos 357:2 (8/22) intron_pos 402:0 (9/22) intron_pos 438:0 (10/22) intron_pos 469:2 (11/22) intron_pos 503:1 (12/22) intron_pos 533:0 (13/22) intron_pos 553:1 (14/22) intron_pos 582:0 (15/22) intron_pos 617:2 (16/22) intron_pos 663:2 (17/22) intron_pos 691:1 (18/22) intron_pos 731:2 (19/22) intron_pos 760:1 (20/22) intron_pos 781:2 (21/22) intron_pos 815:0 (22/22) BEGIN 1 MLQAKLLGAN TGLVGAICAL AIASLVFNIL 
IWNKVNKDDD VTKSAPIPVE PIPLEEVRIR 61 TDPVTTTKDK PRTIAFPMKT RLRTISSTFT AANPTNSKPL PVRVNDTIYC PSYGKPDTSD 121 AYKEAASYLL SGLDQSVDPC EDFYAFSCNT YVKNHNASEI GVSRVAAYDE AQQQVDVEIV 181 EALQAVDIGD SSQSLTERLT KAALLECVYH SRARTPVDNS KDVLIEMRDL FGGIPFLNHS 241 LKEGLDFFSV MGELEQNHAM GSLLHAAVSV DFKNVQQHTL FISQPILPIP RDYYVLPQHT 301 TVLEDRIKLV TKVLQSFAET VLDDASPYID LIKTSARDVV KLEMQIAMAS WPESAMRNYA 361 QQYNPYKLEQ LEKAYPSIKW KSYFNAMLST VSSTFDITKK NIIIAQPSYF GWLNALFTGE 421 TVDAKTIANY LLTHLIFEDA DFMGGNIKTH VMKSDYVRYA LRKGKGATRI GVQQFPRIFR 481 DSKDDPNIEC LNTIMVYMPF GPGYVYVKSK KNRDDVAKDI QHQTELVFKS FMKMIKELEW 541 MSTNAQKLAA EKATKMIRNY GWPKDLFGDF SNSQKVDAYH QTDYGDIINY YKTNSTHLYY 601 KIRKTMLKGY SNRESFRTVV SFLNRSFSRD QFLMSPAMVN AWYAPERNSI TFPYAIWNPP 661 YYNYGYPQAY NYGGQAGTGG HELVHGFDDQ GVQFGADGSL SGCTWIECGW MEPEVKASFN 721 NMAQCVVTQY STQCCPAKSG NIRCANGETT QGENIADIGG QLAAYYAYRE YVKELGKEEM 781 RLPGLEQYTP NQLFWISYGF TWCMSQTESK LISQLLTDLH APGSCRVNQV MQDIPEFAKD 841 FGCTIGQNMY PLPEQRCAVW VSE //