LOCUS       KHJ47392.1 1705 aa PRT CON 17-DEC-2014
DEFINITION  Trichuris suis GTPase-activator protein protein.
ACCESSION   KN538379-148
PROTEIN_ID  KHJ47392.1
SOURCE      Trichuris suis (pig whipworm)
  ORGANISM  Trichuris suis
            Eukaryota; Metazoa; Ecdysozoa; Nematoda; Enoplea; Dorylaimia; Trichinellida; Trichuridae; Trichuris.
REFERENCE   1 (bases 1 to 2343098)
  AUTHORS   Mitreva,M.
  TITLE     Draft genome of Trichuris suis
  JOURNAL   Unpublished
REFERENCE   2 (bases 1 to 2343098)
  AUTHORS   Mitreva,M., Pepin,K.H., Abubucker,S., Martin,J., Minx,P., Warren,C., Palsikar,V.B., Zhang,X., Rosa,B.A. and Wilson,R.K.
  TITLE     Direct Submission
  JOURNAL   Submitted (27-JAN-2014) The Genome Institute, Washington University School of Medicine, 4444 Forest Park, St. Louis, MO 63108, USA
COMMENT     Trichuris suis has contaminated a dirt lot located at the USDA, Agricultural Research Service, Beltsville Agricultural Research Center, Animal Parasitic Disease Laboratory in Beltsville, MD since the early 1960s. Adult worms were isolated for passage from pigs that were placed on the lot and became naturally infected. The T. suis adults were manually removed from the cecum and proximal colon tissue and cultured in vitro to release fertilized eggs, which were removed after 24-48 hours and embryonated to an infective stage (Hill et al., Experimental Parasitology 77, 170-178, 1993). The strain has been actively passed in pigs one to two times per year since that time and characterized for pathogenesis in pigs (Mansfield and Urban, Veterinary Immunology and Immunopathology 50, 1-17, 1996). This strain of T. suis has also been used therapeutically in human subjects with inflammatory bowel disease (Trichuris suis therapy in Crohn's disease. Summers RW, Elliott DE, Urban JF Jr, Thompson R, Weinstock JV. Gut. 2005 Jan;54(1):87-90). The Genome Institute collaborators that provided material for the genome/transcriptome sequencing are: USDA - Urban, J.F. Jr. and Hill, D.E.; Michigan State University - Mansfield, L.S.
The repeat library was generated using RepeatModeler (A. Smit, R. Hubley http://www.systemsbiology.org/). Ribosomal RNA genes were identified using RNAmmer (http://www.cbs.dtu.dk/cgi-bin/nph-sw_request?rnammer) and transfer RNAs were identified with tRNAscan-SE (Lowe and Eddy, 1997). Non-coding RNAs, such as microRNAs, were identified by sequence homology search of the Rfam database (http://selab.janelia.org/software.html). Repeats and predicted RNAs were then masked using RepeatMasker (A. Smit, R. Hubley & P. Green http://repeatmasker.org). Protein-coding genes were predicted using a combination of the ab initio programs Snap (I. Korf, 2004), Fgenesh (Softberry, Corp.) and Augustus (M. Stanke et al., 2008) and the annotation pipeline tool Maker (M. Yandell et al., 2007), which aligns mRNA, EST and protein information from the same species or across species to aid in gene structure determination and modifications. A consensus gene set from the above prediction algorithms was generated using a logical, hierarchical approach developed at The Genome Institute. Gene product naming was determined by BER (JCVI: http://ber.sourceforge.net). Our goal is to explore this WGS draft sequence of Trichuris suis to better define proteins involved in nematode parasitism that impact health and disease and are relevant to both host-parasite relationships and basic biological processes. For information regarding this assembly or project, or any other GSC genome project, please visit our Genome Groups web page (http://genome.wustl.edu/genome_group_index.cgi) and email the designated contact person. For specific questions regarding the Trichuris suis genome project, contact Makedonka Mitreva (mmitreva@genome.wustl.edu) at Washington University School of Medicine. The National Human Genome Research Institute (NHGRI) of the National Institutes of Health (NIH) provided funds for this project.
##Genome-Assembly-Data-START##
Finishing Goal            :: High-Quality Draft
Current Finishing Status  :: High-Quality Draft
Assembly Method           :: ALLPATHS_LG v. 2012-11-02
Assembly Name             :: T_suis_1.0.allpaths
Genome Coverage           :: 392x
Sequencing Technology     :: Illumina
##Genome-Assembly-Data-END##
FEATURES    Qualifiers
source
            /organism="Trichuris suis"
            /mol_type="genomic DNA"
            /submitter_seqid="T_suis-1.0_Cont4"
            /isolation_source="cecum and proximal colon of infected animals which were naturally infected"
            /host="Sus scrofa (pig)"
            /db_xref="taxon:68888"
            /chromosome="Unknown"
            /dev_stage="adult"
            /country="USA: Beltsville, MD"
protein
            /locus_tag="D918_02252"
            /inference="protein motif:HMMPfam:IPR001936"
            /note="KEGG: edi:EDI_018350 2.0e-11 ras GTPase-activating protein with iq motif"
            /db_xref="InterPro:IPR001936"
intron_pos 55:2 (1/30)
intron_pos 235:2 (2/30)
intron_pos 268:0 (3/30)
intron_pos 353:2 (4/30)
intron_pos 384:2 (5/30)
intron_pos 426:1 (6/30)
intron_pos 484:2 (7/30)
intron_pos 562:2 (8/30)
intron_pos 618:2 (9/30)
intron_pos 660:0 (10/30)
intron_pos 706:0 (11/30)
intron_pos 761:0 (12/30)
intron_pos 786:0 (13/30)
intron_pos 853:0 (14/30)
intron_pos 892:1 (15/30)
intron_pos 928:0 (16/30)
intron_pos 985:0 (17/30)
intron_pos 1014:0 (18/30)
intron_pos 1054:0 (19/30)
intron_pos 1080:2 (20/30)
intron_pos 1106:1 (21/30)
intron_pos 1190:0 (22/30)
intron_pos 1239:0 (23/30)
intron_pos 1266:0 (24/30)
intron_pos 1314:0 (25/30)
intron_pos 1385:2 (26/30)
intron_pos 1497:2 (27/30)
intron_pos 1587:0 (28/30)
intron_pos 1615:1 (29/30)
intron_pos 1632:0 (30/30)
BEGIN
   1 MTHRDHVVFQ SEIRFRNKLV DLIASWIICN PSSCSAAVIM NCSPLFSSEA VVTLRELNLC
  61 CMIALATLVK DLPLQSEEPT NNVDTLGMNK SQLFQKFFNL FMNLLNDCTL YENMATNARS
 121 SYKVETVPLP KAACDSVVIE GNYPNSSPSC VSTSSSATAR RLDENKDRMD RLRSATIQAM
 181 SGLLAANIDI GLMHAIGMGY HKDTRARVAF VEVLTKVLQE GTEFETLNET VMADRFDELV
 241 KLTTIITSDG ELPIVNALAH VVQTEHMDGL ARLLTTLFAS KNLLHELLWK LFTKEVELAE
 301 APTTLFRGNT LASKVMGCCF KLYGYGYLRS LLREFVCATT QNADKCYEVD QSRLEEGANL
 361 AENTANLIDL VEDILNLILK SADQFPVQLK SMLNVLFHVV NARFPNTGLL AVGMIVFLRF
 421 FNPAIVSPVE YGITDVYPSA STKRGLMLVS KILQQIANQA QNVKENCLHP FSNFIHDKYE
 481 IVKQFSTSIA IDFKIHREEL STSSFAYGVD TPAYALHFLL WRYQDRIGEF FNTFRSHDKV
 541 SRRLAALLAQ LGPPDGDLEA KQWVNVGFTC SRFEELMEKQ SAADKDEYKF VKETNIFYQY
 601 GTSRAGNPVF YFIANRFKTG EVNGNMLIYH VILTLKQFSN QPCELVVDLT HTNTENRFRT
 661 DLLSKWFVVI SQSVQQSVAA VYLCNCNSWV REYTRFHDRL FSTLKGNRKL HFLDSVRRLN
 721 EFVCPENQRL PTATLLLDED AKTFNNVVRL SHKSIKCSVK IGPMALQIVS LEKQKVLGHL
 781 VNLNDVCVVD ENSITVTILN EPSPLSLIHS ECEQIASSVA SIRSKYELSQ PDQSPVHTKI
 841 RAKDVPGTLL NMALLNLGST DPSLRTAAYN LLCALAATFN FRMEGQLLET KGLLIPANNT
 901 IFIKTVSQKL AVNETHLTLE FLQECIQGFR QSSIEIKHLC LEYITPWLAN LPRFCRPTLG
 961 DESRRQAVSR VLDKLITLTI EEFEMYPSIQ AKIWGNVGQV SELLDLILDR FIRQSIAGGL
1021 GSVQAEVLAD TAVTLAAANT ELVSDILIRR LMHLLEKSCL NPQTCLEEHI LWDDIALLAS
1081 HNSPASAVPL CYCAREHRAA SAESIEGTKT NLRLALTEFF LPKFYRLFGV SNLTVKTVAA
1141 TAFRISPRMS LPVHASSEYY ASGIPVPRQD RLTLANLEVI TDSLIELIET CNQDIKNRDM
1201 IQEWAKLAHH FSLRFNPALQ GRALIVYSCL SKTTEEVTFV KRRDDSNLLS AVLMALRRLT
1261 PSLLSEHCNM FPSLFWISII ILQFEDPMLY EHGLALLEQV IMTLTSSNVF EYGSFEKIMM
1321 DARPPLEWEF KAMDQQVGIS FKHHFHFALV SYLLKGYRHP SQSVAPRTMR LLTLLLSTVA
1381 RCTKRERYEV TMDTVAYLIA LLPVSEEVRL RCPLHSSTIK ALSLREASEC SAKVSNSCQE
1441 NVDSPCDAAC KQQVSSLNGT HPPYHGTLGA WKLRHQQRSC DNLDAASPST STAVNPRQST
1501 GYSSQRSYSL PLGSTIVCPV KESSIVNRLY NYDKSLNWIH SAPQAAYTVG LSPRPSEPRS
1561 SFRENLLLDP EVLCEPSIQV LTLTVLSTLV RYSTDDKEMR ILFEYLAEAS VVFPNYGFGG
1621 LWRFAGPFPL KPTSPLAISP ETLTCLERII GNLGEVPPLT PPPLTPCKTN SGSALRFDAL
1681 ESQNSVSGDK RNSIVERLEH SSQAS
//