LOCUS       KHJ47492.1              4892 aa    PRT              CON 17-DEC-2014
DEFINITION  Trichuris suis hypothetical protein.
ACCESSION   KN538379-248
PROTEIN_ID  KHJ47492.1
SOURCE      Trichuris suis (pig whipworm)
  ORGANISM  Trichuris suis
            Eukaryota; Metazoa; Ecdysozoa; Nematoda; Enoplea; Dorylaimia;
            Trichinellida; Trichuridae; Trichuris.
REFERENCE   1  (bases 1 to 2343098)
  AUTHORS   Mitreva,M.
  TITLE     Draft genome of Trichuris suis
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 2343098)
  AUTHORS   Mitreva,M., Pepin,K.H., Abubucker,S., Martin,J., Minx,P.,
            Warren,C., Palsikar,V.B., Zhang,X., Rosa,B.A. and Wilson,R.K.
  TITLE     Direct Submission
  JOURNAL   Submitted (27-JAN-2014) The Genome Institute, Washington University
            School of Medicine, 4444 Forest Park, St. Louis, MO 63108, USA
COMMENT     Trichuris suis has contaminated a dirt lot located at the USDA,
            Agricultural Research Service, Beltsville Agricultural Research
            Center, Animal Parasitic Disease Laboratory in Beltsville, MD since
            the early 1960s.  Adult worms were isolated for passage from pigs
            placed on the lot and naturally infected.  The T. suis adults were
            manually removed from the cecum and proximal colon tissue and
            cultured in vitro to release fertilized eggs that were removed
            after 24-48 hours and embryonated to an infective stage (Hill et
            al., Experimental Parasitology 77, 170-178, 1993).  The strain has
            been actively passed in pigs one to two times per year since that
            time and characterized for pathogenesis in pigs (Mansfield and
            Urban, Veterinary Immunology and Immunopathology 50, 1-17, 1996).
            This strain of T. suis has also been used therapeutically in human
            subjects with inflammatory bowel disease (Trichuris suis therapy in
            Crohn's disease. Summers RW, Elliott DE, Urban JF Jr, Thompson R,
            Weinstock JV. Gut. 2005 Jan;54(1):87-90). The Genome Institute
            collaborators that provided material for the genome/transcriptome
            sequencing are: USDA - Urban, Jr., J.F., Hill, D. E. and Michigan
            State University - Mansfield L.S.
            
            The repeat library was generated using RepeatModeler (A. Smit, R.
            Hubley http://www.systemsbiology.org/). Ribosomal RNA genes were
            identified using RNAmmer
            (http://www.cbs.dtu.dk/cgi-bin/nph-sw_request?rnammer) and
            transfer RNAs were identified with tRNAscan-SE (Lowe and Eddy,
            1997). Non-coding RNAs, such as microRNAs, were identified by
            sequence homology search of the Rfam database
            (http://selab.janelia.org/software.html). Repeats and predicted
            RNAs were then masked using RepeatMasker (A. Smit, R. Hubley & P.
            Green http://repeatmasker.org). Protein-coding genes were predicted
            using a combination of ab initio programs Snap (I. Korf, 2004),
            Fgenesh (Softberry, Corp) and Augustus (M. Stanke et al., 2008) and
            the annotation pipeline tool Maker (M. Yandell et al., 2007), which
            aligns mRNA, EST and protein information from the same species or
            cross-species to aid in gene structure determination and
            modifications. A consensus gene set from the above prediction
            algorithms was generated using a logical, hierarchical approach
            developed at the Genome Institute. Gene product naming was
            determined by BER (JCVI: http://ber.sourceforge.net).
            
            Our goal is to explore this WGS draft sequence of Trichuris suis to
            better define proteins involved in nematode parasitism that impact
            health and disease and are relevant to both host-parasite
            relationships and basic biological processes.
            
            For information regarding this assembly or project, or any other
            GSC genome project, please visit our Genome Groups web page
            (http://genome.wustl.edu/genome_group_index.cgi) and email the
            designated contact person. For specific questions regarding the
            Trichuris suis genome project contact Makedonka Mitreva
            (mmitreva@genome.wustl.edu) at Washington University School of
            Medicine. The National Human Genome Research Institute (NHGRI) of
            the National Institutes of Health (NIH) provided funds for this
            project.
            
            ##Genome-Assembly-Data-START##
            Finishing Goal           :: High-Quality Draft
            Current Finishing Status :: High-Quality Draft
            Assembly Method          :: ALLPATHS_LG v. 2012-11-02
            Assembly Name            :: T_suis_1.0.allpaths
            Genome Coverage          :: 392x
            Sequencing Technology    :: Illumina
            ##Genome-Assembly-Data-END##
FEATURES             Qualifiers
     source          /organism="Trichuris suis"
                     /mol_type="genomic DNA"
                     /submitter_seqid="T_suis-1.0_Cont4"
                     /isolation_source="cecum and proximal colon of infected
                     animals which were naturally infected"
                     /host="Sus scrofa (pig)"
                     /db_xref="taxon:68888"
                     /chromosome="Unknown"
                     /dev_stage="adult"
                     /country="USA: Beltsville, MD"
     protein         /locus_tag="D918_02352"
                     /inference="protein motif:HMMPfam:IPR013098"
                     /note="KEGG: ecb:100053844 0. hypothetical LOC100053844;
                     K12567 titin"
                     /db_xref="InterPro:IPR013098"
     intron_pos      55:0 (1/56)
     intron_pos      105:0 (2/56)
     intron_pos      161:1 (3/56)
     intron_pos      199:1 (4/56)
     intron_pos      234:1 (5/56)
     intron_pos      278:2 (6/56)
     intron_pos      338:1 (7/56)
     intron_pos      522:2 (8/56)
     intron_pos      661:0 (9/56)
     intron_pos      1407:2 (10/56)
     intron_pos      1449:0 (11/56)
     intron_pos      1494:2 (12/56)
     intron_pos      1554:1 (13/56)
     intron_pos      1660:1 (14/56)
     intron_pos      1766:1 (15/56)
     intron_pos      1871:1 (16/56)
     intron_pos      1977:1 (17/56)
     intron_pos      2082:1 (18/56)
     intron_pos      2188:1 (19/56)
     intron_pos      2288:1 (20/56)
     intron_pos      2394:1 (21/56)
     intron_pos      2499:1 (22/56)
     intron_pos      2605:1 (23/56)
     intron_pos      2725:1 (24/56)
     intron_pos      2831:1 (25/56)
     intron_pos      2868:0 (26/56)
     intron_pos      2936:1 (27/56)
     intron_pos      2973:0 (28/56)
     intron_pos      3042:1 (29/56)
     intron_pos      3076:0 (30/56)
     intron_pos      3203:2 (31/56)
     intron_pos      3264:1 (32/56)
     intron_pos      3301:2 (33/56)
     intron_pos      3360:1 (34/56)
     intron_pos      3396:2 (35/56)
     intron_pos      3455:1 (36/56)
     intron_pos      3595:2 (37/56)
     intron_pos      3653:1 (38/56)
     intron_pos      3689:2 (39/56)
     intron_pos      3748:1 (40/56)
     intron_pos      3888:2 (41/56)
     intron_pos      3946:1 (42/56)
     intron_pos      3982:2 (43/56)
     intron_pos      4085:2 (44/56)
     intron_pos      4146:1 (45/56)
     intron_pos      4183:2 (46/56)
     intron_pos      4286:2 (47/56)
     intron_pos      4346:1 (48/56)
     intron_pos      4383:2 (49/56)
     intron_pos      4545:1 (50/56)
     intron_pos      4639:1 (51/56)
     intron_pos      4681:2 (52/56)
     intron_pos      4740:1 (53/56)
     intron_pos      4787:2 (54/56)
     intron_pos      4846:1 (55/56)
     intron_pos      4880:0 (56/56)
BEGIN
        1 MIRYSITANV PTDSLQRALE LMLWIPKRAS DLNLIHNILD LPTETAKLGR LLRHEQFEVW
       61 QDSLDRPKGQ LHVFLFKTKI VATEKVEPED PDEVPEFKHV FTVRLDKYDI REYLGNSNIV
      121 QLVPVDASLP TYFFKATAPD NAEIVKQAWI KDVQENKETT GELPESEVEV QGDFIDFSDI
      181 KSEFSEYSSV SRKSSEYGDG RDDESPPAKK PKTPPAISRS TSAQSVYSMN MESLTQTGSI
      241 EMEGSSVTRT QYGFRTVHET TAKMSLKVTG NPMPVITWYK DGVLLQEDER KKFYSDDDGY
      301 FALTIEPVQV EDTGRYTCVA TNEYGQARTS AFFRVVRVDR EPEQPKFLKV MRDLELHEGD
      361 TATFTCEVEG WPEPEIQFYL DGQPIHISRE HNIEYDGRTV RLTVREVQPE DGGSYVLKAV
      421 NGSGEVQCAA TLTVIKDLEK NKMPPYFQQQ LNDVVVVEDQ SVKFKTVCSG DPTPEVVWYI
      481 NGVQLTNSDK VHMIAEDGVY ILTIDNITQH FDGELTCCAF NRLGEISCCC RITVNKADYP
      541 PSFEQELRDQ VVTAGEAVKL QVVISAQPEA TVSWWFNDEQ ISEYHPSVRL SAKPQAGIYS
      601 IEITKATTDM CGVIKCQASN YLGTVFSSAN LSVEEAKSAP AFRNMLQDIV ALPREYLKIH
      661 VSTTGYPRPT VAWKLNGEEI KESPNVKISS SGNDYYLEIL SFTENDAGEL ECTAVNSLGI
      721 ASSKCQVSTS PAKMKSNFEK DLPKSTTAEE GKPISFSVQS KQSATFSWFL NGRELHDSDS
      781 GIRIMAIGDQ ESRLIIESFS ESLSGRLTCT ATSPLGVSET STDIKSTLGM KEGAGPLPPI
      841 VLAEYGGRVL LKVTIESDQA EVCWRLNGEP LKNSEKVHVG RKGADFFLEI EDIDDSASGE
      901 LLCEVRQGTR SDTFGTTVKV ERRALTTLLD GLKDTTVRVG DTVQFNLSLK DAKKYEVKWF
      961 LNDHELVESE KVHINVYPEQ AECCLKIESI TEEFNGELKC NIMTPSGDYV SSAQLHVVPR
     1021 AAPVVVKGLS DSFVQAGDTV KFSVFAENAV EGKVKWVLNG SELLPSDSVI IATEQQNECS
     1081 LTLKNINPQQ SGVLECLIST PYGVSRSSAK LTVKPLPEKL MKPEFRAIMA PVVIYEGDTL
     1141 ETRVVLEGEP QPEVKWYIND VPLVEDSNVE ITTEKGISKV EIKKVNFDLN GVLKCVAKNE
     1201 YGEVATSTSV SIRRQIPVEF EQFLCDTTCR EGDTLKLKAV LLGQPRPEVS WYLKGKKLVQ
     1261 TDKIHIYTEQ NTYVAIISDI TCDYSGEVLC KAVNEFGEAS SSAMLTVLPR GVPPDFLEWM
     1321 NSISAVDGAE VVHHVKFTGE PTPTIRWFIN NQEVHNSDNF AIHTDKDVCT LTIKHFSASL
     1381 VGEIICKAEN DAGEVSCTSQ MSLAPAGYIR EEVSERSELE AVALSTGEIE GSEAGTDFAV
     1441 SLPDEDMEEE TSRLESSVLA PKFITKIKDT RVTVGKQAVF ECIVPGTKGV CVKWLKDNKE
     1501 IELLARIRVG SHKEENLIIH RLTVDDVTKA DAGTYTCVVM NEHGQEICSA RLEVDEQWTV
     1561 AQLPETVPEI VEVLHSCVVE ENEEAIMHCT VTGCPDAQVR WYKDNVQLES SKRHQMVSEA
     1621 EGVFVLKIPN VTMKDAGEYK CEVFNASGVA SSTATLTVTV PTAKEPTVGE AMAPKFVEPL
     1681 MVSEDAHKKV TMFTCKIYGQ PRPQVRWFKN EKMLNSSYKY EIINEEQDSY VLKIHDTTTE
     1741 DNGEYRCEAF NENGIASSSV ALTMKFEQVT ESTLPESAPE FSKPLKSAVL NKGEALHLEC
     1801 TVVGQPAPSI QWFKSEEELK TTETVTITSL PNGVECLDMK ETMPSDSGDY RCIATNRLGY
     1861 SSSEATVAVH APEEMETAEL LESTTEFVES LHNQTVKENE TGILSCRVTG PPVTSVQWFK
     1921 EDVLLESSEK YEIISEANGV FALQVHDSTA EDSGEFKCEV TTEKGSSISK AYLTVEKSSV
     1981 KEEFAEEQPP EILKSLTPTS LREGESLTLE CTVTGKPRPT VQWFRNGEEI EVDESVKLET
     2041 SSGGVCRLTV HNVTQKDAGE YRCIATNTCG ASWSDASVTV NVAEQAVPTS LTEAAPLFVR
     2101 QLQSCTVKAN EQHILQCRVT GAPKPQVRWL KNGVELEPSA KYEIVCEDGE THILKVQNFN
     2161 KDDSAVYRCE ASNEKGFAAS EANIEIQRAE VATAVPTFSE TLVSTSVVEG EPLLLECTVS
     2221 GEQDSIVEWF KNGQKIEATQ NVKIESLAGG VQHLEVRNIA LSDSGVYRCV VANQLGDSST
     2281 EARVTVQTHE TVEEAELCEA VPVFIEVLKS QVAKENEVAS LSCKVYGLPT AEVQWFKDGV
     2341 ELKSSEKYEL ISEVSGVFLL QVHNIGKQDA GEYTCVATNK MGSVSSNAFL TVSVAEERVS
     2401 ASQQMAPAFL SAITEAFPMC GEKVHLECVV TGNPPPEVHW LKNGQELLST DMMKVSSFSD
     2461 GTQCLEIEHV AVKDAGDYCC VATNPLGEAS TKIAVVVQTA ETEGDFKLYG TVPQFVETLH
     2521 NCDVQENETA VMKCKVIGTP LPEVRWFKEN ASLESSAKFE LTSDVDGLFL LKIHNAKEED
     2581 VGEYRCEVFN CKGVASSKAQ LSVTVGETSE AVKALEAPLF VKTLTSCSLT EGEHLQLNCA
     2641 VSGQPTPTIQ WFKNGEEIKA TGLVKIESLP EGILTLVVQN AHVGDSGMYR CLATNEAGEC
     2701 STEATVSVQG KYFPSIIQFF PLNTACETME LEELAETAPE FVQVLHFCDV AETQEGILSC
     2761 KLTGFPKPQV RWFKDGVPIQ PSEKYEMVTD DNGLVLLKVH VAGREDSGEY RCEAFNSRGV
     2821 AWTEAPLNVK AAELMEYEEG EEVAPDFLEP VQACVVNKGE DAVLICRISG VPTPQVCWYK
     2881 NGVPLVPDER VEITSFGDRH TLVVHNAQQE DVADYRCEAK NDAGVVWSDA TVCVLSEDYL
     2941 MESVQDEVAP FFVREIQQES VNAGDRAVLT CQVSGNPTPE VHWYRDGKLL DLSKDVEIAS
     3001 TSDGTVTLVI QHAKVEDQGN YRCEAVNILG SACSKAPLSV FPTEEIMEVE ESASMQFIEP
     3061 LHHYMREEDT VAVFECKLKA QPVSTIVWYK DGAPILPDNF TVIESLPDGL QRLTLRCTTA
     3121 SAVGHYACLA KSDITEVKTE DDLLSSSACC FSSNEIYSSV SKVPLALEFL QGLKRQCVKK
     3181 GDTVTMRCQV NMKGRSKLPT VKWYKSGREL LPDKRVKMEA TMDGWFTLTV SKFEENDAGM
     3241 YTCIITENSS VIKSEAPVEL TVTEAGGEFT IVKELASQTI SAGEKLELEL VTSESWDDIR
     3301 WLHNNEVVLN DKRTKIEQPE ACVCRLTVAE TSPKDQGNYF VIATKGNKTV ESQAKVIVTE
     3361 GKHLEITKAL EDITVPAGAE VTLEVQFNEP IDHAQWYLDS KELVSDAKVS LEQLDERILR
     3421 LHLKNADKND AGVYGVVAHS GEQTTECKAN LSVQAQQFLT ITKGLDDVLV EPHKPLTLTV
     3481 NVDGLPEKVE WLKDGEPLKE QANLKIEVPA NGVYRLAISD TLPDSAGLYT FRAIDETGVI
     3541 ESSGTLSLMD MAEELPQKRI TKLEMVEGLM DQAVAEREPV CLRIKLNKKP KVVKWYKNGK
     3601 EIIPSNRFKA EVDETGASLE ISSLLAEDSG LYEVVASDER DSVASSGKLL VTSASQLDIT
     3661 RGLQDRTVMK GTELTLEIQL SKPADMATWY HGTEKLANGQ NLRLEEIDGR VYRLHVQHAD
     3721 IQDAGEYRVV VKGDGEKAES KANITVKTIP NLRISKELQD VSPTLHETVI MEVLLEGLPD
     3781 NVEWFKNGSK LSSVPGMRIE VGGDGWHRLI LEDVLPDSAG LYKFRASNPD GSVESSGTVI
     3841 LKQPVEEKPG DAEAALTLMK GLEDQTVDLG HPISLSVKLN KRPKDIKWYS NGKEIRPSMK
     3901 KKIKFDGLEA TLEVDKASED DGGIYEVVVS DENTTIRSSG NVRIAVPTAL KIISGLKPCK
     3961 VTVGEPLVLS VGLEGNADLI EWLKDGVKLT NVPNYSFTCT EGLYNLQVKQ AEMGNAGEYM
     4021 FIARKASDAV SSVGSVVVKS PPSEEAVEQP EKLTFVECLE DQQLQEGDEL KLKVQTNKKP
     4081 MTVKWTRNGI PVTSTGGTKL VDNGDRVFEL IVENASRGDA GSYKVIIGDE HASAESVADV
     4141 VVIEKAAEKP LQVLKGMADL TCDIGSKACF EVQISGKPKS YRWYKNGREV KQTPRISLKE
     4201 MDGGRYCLEI EKTVTDDGGE FSFEAENDVG KVHSQALLTV QSPAARKGPA AEPLVITRPL
     4261 DDQTAEEGAE ISFEAEFNRG PKEVHWYKGM DVVTGNEKVK ISSPAVNASK LHVTRVSPED
     4321 SSSYRVEAID DLGNIVTSTA RLTVNASPER LEFIKPMADV QVAKGDTATL EVQVKGIPQS
     4381 VKWYKNGRLL PSQGRQQEIG KGTYILKIPN ASDDDQATYK CELENQLGSI STEGTLVVLP
     4441 SVEEAGEKEI GALKVLKGLA DITLYVGDDL LLEVELSSKP EEVLWFNNGH PVLARNCEVE
     4501 VMPNGHAVCR CRLPNIDISC DGTFTVKARN PYGSADSSNK LTVKGKPPKI LTGLEDRRTS
     4561 PGNRVVFEVE VDRKPKLVKW YKNGRLVKEN ERTVLISVDD CTYQLVLNDV DKEDVGNYLV
     4621 EVSNDFGVAK SEAKLTIIEP VDEMWRSSPR IVKGLNDVEI FDGNHATFSV TVEGKPTSVR
     4681 WYRNGTELTS SQRVLPTQLD GFTYKLTLRE CHKDDMGVIK VLAMNDFGSD TSEGRLIVKE
     4741 VPTGRMPSGM EKAARFIIPL EDVAAEESKK AVLSCKVEGL PTPLITWFKD GKEIDKNERV
     4801 TYSMDEDKVC SLSIGAISPE DEGCYAALAK NSLGQDRTEC YLTIAAPKGA EELEHGVAPE
     4861 FIKPLRSKSI IEGEDLTLDA RIIGNPLPEI CW
//