LOCUS KHJ47249.1 1292 aa PRT CON 17-DEC-2014 DEFINITION Trichuris suis ABC transporter transmembrane region protein. ACCESSION KN538379-5 PROTEIN_ID KHJ47249.1 SOURCE Trichuris suis (pig whipworm) ORGANISM Trichuris suis Eukaryota; Metazoa; Ecdysozoa; Nematoda; Enoplea; Dorylaimia; Trichinellida; Trichuridae; Trichuris. REFERENCE 1 (bases 1 to 2343098) AUTHORS Mitreva,M. TITLE Draft genome of Trichuris suis JOURNAL Unpublished REFERENCE 2 (bases 1 to 2343098) AUTHORS Mitreva,M., Pepin,K.H., Abubucker,S., Martin,J., Minx,P., Warren,C., Palsikar,V.B., Zhang,X., Rosa,B.A. and Wilson,R.K. TITLE Direct Submission JOURNAL Submitted (27-JAN-2014) The Genome Institute, Washington University School of Medicine, 4444 Forest Park, St. Louis, MO 63108, USA COMMENT Trichuris suis has contaminated a dirt lot located at the USDA, Agricultural Research Service, Beltsville Agricultural Research Center, Animal Parasitic Disease Laboratory in Beltsville, MD since the early 1960s. Adult worms were isolated for passage from pigs placed on the lot and naturally infected. The T. suis adults were manually removed from the cecum and proximal colon tissue and cultured in vitro to release fertilized eggs that were removed after 24-48 hours and embryonated to an infective stage (Hill et al., Experimental Parasitology 77, 170-178, 1993). The strain has been actively passaged in pigs one to two times per year since that time and characterized for pathogenesis in pigs (Mansfield and Urban, Veterinary Immunology and Immunopathology 50, 1-17, 1996). This strain of T. suis has also been used therapeutically in human subjects with inflammatory bowel disease (Trichuris suis therapy in Crohn's disease. Summers RW, Elliott DE, Urban JF Jr, Thompson R, Weinstock JV. Gut. 2005 Jan;54(1):87-90). The Genome Institute collaborators that provided material for the genome/transcriptome sequencing are: USDA - Urban, Jr., J.F., Hill, D.E. and Michigan State University - Mansfield, L.S. 
The repeat library was generated using RepeatModeler (A. Smit, R. Hubley http://www.systemsbiology.org/). Ribosomal RNA genes were identified using RNAmmer (http://www.cbs.dtu.dk/cgi-bin/nph-sw_request?rnammer) and transfer RNAs were identified with tRNAscan-SE (Lowe and Eddy, 1997). Non-coding RNAs, such as microRNAs, were identified by sequence homology search of the Rfam database (http://selab.janelia.org/software.html). Repeats and predicted RNAs were then masked using RepeatMasker (A. Smit, R. Hubley & P. Green http://repeatmasker.org). Protein-coding genes were predicted using a combination of the ab initio programs Snap (I. Korf, 2004), Fgenesh (Softberry, Corp) and Augustus (M. Stanke et al., 2008) and the annotation pipeline tool Maker (M. Yandell et al., 2007), which aligns mRNA, EST and protein information from the same species or cross-species to aid in gene structure determination and modifications. A consensus gene set from the above prediction algorithms was generated using a logical, hierarchical approach developed at the Genome Institute. Gene product naming was determined by BER (JCVI: http://ber.sourceforge.net). Our goal is to explore this WGS draft sequence of Trichuris suis to better define proteins involved in nematode parasitism that impact health and disease and are relevant to both host-parasite relationships and basic biological processes. For information regarding this assembly or project, or any other GSC genome project, please visit our Genome Groups web page (http://genome.wustl.edu/genome_group_index.cgi) and email the designated contact person. For specific questions regarding the Trichuris suis genome project, contact Makedonka Mitreva (mmitreva@genome.wustl.edu) at Washington University School of Medicine. The National Human Genome Research Institute (NHGRI) of the National Institutes of Health (NIH) provided funds for this project. 
##Genome-Assembly-Data-START## Finishing Goal :: High-Quality Draft Current Finishing Status :: High-Quality Draft Assembly Method :: ALLPATHS_LG v. 2012-11-02 Assembly Name :: T_suis_1.0.allpaths Genome Coverage :: 392x Sequencing Technology :: Illumina ##Genome-Assembly-Data-END## FEATURES Qualifiers source /organism="Trichuris suis" /mol_type="genomic DNA" /submitter_seqid="T_suis-1.0_Cont4" /isolation_source="cecum and proximal colon of infected animals which were naturally infected" /host="Sus scrofa (pig)" /db_xref="taxon:68888" /chromosome="Unknown" /dev_stage="adult" /country="USA: Beltsville, MD" protein /locus_tag="D918_02109" /inference="protein motif:HMMPfam:IPR001140" /inference="protein motif:HMMPfam:IPR003439" /note="KEGG: phu:Phum_PHUM351610 7.6e-218 multidrug resistance protein, putative" /db_xref="InterPro:IPR001140" /db_xref="InterPro:IPR003439" intron_pos 19:0 (1/19) intron_pos 61:0 (2/19) intron_pos 104:0 (3/19) intron_pos 151:0 (4/19) intron_pos 171:2 (5/19) intron_pos 283:2 (6/19) intron_pos 343:0 (7/19) intron_pos 428:0 (8/19) intron_pos 450:0 (9/19) intron_pos 496:0 (10/19) intron_pos 573:1 (11/19) intron_pos 614:0 (12/19) intron_pos 670:0 (13/19) intron_pos 728:1 (14/19) intron_pos 745:2 (15/19) intron_pos 1102:0 (16/19) intron_pos 1136:0 (17/19) intron_pos 1191:2 (18/19) intron_pos 1256:2 (19/19) BEGIN 1 MRKKRNPLPK TYLLLSRLAI SFLLVIASIT TCCLIPSVQA TGVPINSIDE LAYSLNALAM 61 IFLFLLTVLC SRRGIITSGS LFLSNTFYCV FESMEFSATV RRFNYEKWTS LDVSYAVYYC 121 LLLMQLLISC WADRMDVWDI PAKGKETNVR EENPEKTVSF FNRITYTFFD PIAWRGWRRG 181 VQKEDLWLLL PKDTVEELRQ KWENLWNAKS QRFLNAQRHR QPSISYDNKK PSISLEAAVH 241 KAKREAPSVF LTLVQCFKWQ ALAGLFIRTL SDILEFMKPL VLRKLIQFME IKELTLSYGI 301 FFSFMLLLIG VFQSLLLQTY HITTSRVGMN LKSILTSAIF EKALRLSNES RKQTTAGEMI 361 NMMSADIERV LHVCPYLLLV WSAPLEVILA LVFLWGNLGP SVFAGLGVMV LMIPINFAIS 421 NLELKCQVQQ MALKDTRMKM MNEVLNGIKA LKLHAWEPAF EKKLADIRKQ ELHMLRKAAI 481 YRSVITFAWI VTPPLIAVVS FATFLLSDPS NELTPEVTFV SLSLFHILHM PISLIPIVVA 541 YAIHAYVSLK 
RLSTFLRLEE LDESTVLRTE GSHSAISFEK ATFSWEADSE SSAILRNLNL 601 KVPRGACIAV IGRGSNLSGG QKQRLSLARA VYQDANIYLL DDPLSAVDSH VGRHIFDHVI 661 GKNGLLRKKT RIFVTNAVTY LKEVDIIVIL EDGGIKTIGT PNELLKEEDA LKQFLQEEIN 721 PDEENATEQG GKKSYEEISR VDARRLESQT SIISTKSLGA ISRRSSGSTS EQYRKEKAAA 781 EDKSAMLSEE ETMETGRVAW RVYGIYMRSI GLCICLTVLL AYIITGGLTI ASSDWLARWS 841 EDALLSKNDS RYVNTSTRIG GYAGFGIGQA FFMFLGALVM AFGMVRASAK LHANLLHSLL 901 RVPMSFYETT PLGRIVNRVG KDIDMVDNLL PNNIRGLMNT VTQVISTLVI IMINMSVFGV 961 VVIPLAVLYV FLQRFYVATS RQLQRMEAVS RSPIYSLFQE VVQGISSVRA YNAQQWFRKR 1021 FDMLINENQM NYFPKIISNR LTKGTSCLPQ VNGTLFRWLA VRIEFISTIM VFCAAIFAVC 1081 SRDSGIMSAG MIGLSITYAL TITQVLNWVM EIASFVETNI VSVERIDEYI QLKHEDPVLF 1141 FGTIRMNIDP PEQKTDDEIW LALEQANLKP FVSNLPGKLD FEVSEGGENL SVGQRQMLCL 1201 ARALLRRSRI LVLDEATAAV DLETDQLIQD TIRRYFADCT VLTIAHRLNT ILDSDRILVL 1261 SNGYLIENNS PENLLKDNES EFYSMARESN IV //