LOCUS       KHJ47266.1              2365 aa    PRT              CON 17-DEC-2014
DEFINITION  Trichuris suis hypothetical protein.
ACCESSION   KN538379-22
PROTEIN_ID  KHJ47266.1
SOURCE      Trichuris suis (pig whipworm)
  ORGANISM  Trichuris suis
            Eukaryota; Metazoa; Ecdysozoa; Nematoda; Enoplea; Dorylaimia;
            Trichinellida; Trichuridae; Trichuris.
REFERENCE   1  (bases 1 to 2343098)
  AUTHORS   Mitreva,M.
  TITLE     Draft genome of Trichuris suis
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 2343098)
  AUTHORS   Mitreva,M., Pepin,K.H., Abubucker,S., Martin,J., Minx,P.,
            Warren,C., Palsikar,V.B., Zhang,X., Rosa,B.A. and Wilson,R.K.
  TITLE     Direct Submission
  JOURNAL   Submitted (27-JAN-2014) The Genome Institute, Washington University
            School of Medicine, 4444 Forest Park, St. Louis, MO 63108, USA
COMMENT     Trichuris suis has contaminated a dirt lot located at the USDA,
            Agricultural Research Service, Beltsville Agricultural Research
            Center, Animal Parasitic Disease Laboratory in Beltsville, MD since
            the early 1960s.  Adult worms were isolated for passage from pigs
            placed on the lot and naturally infected.  The T. suis adults were
            manually removed from the cecum and proximal colon tissue and
            cultured in vitro to release fertilized eggs that were removed
            after 24-48 hours and embryonated to an infective stage (Hill et
            al., Experimental Parasitology 77, 170-178, 1993).  The strain has
            been actively passed in pigs one to two times per year since that
            time and characterized for pathogenesis in pigs (Mansfield and
            Urban, Veterinary Immunology and Immunopathology 50, 1-17, 1996).
            This strain of T. suis has also been used therapeutically in human
            subjects with inflammatory bowel disease (Trichuris suis therapy in
            Crohn's disease. Summers RW, Elliott DE, Urban JF Jr, Thompson R,
            Weinstock JV. Gut. 2005 Jan;54(1):87-90). The Genome Institute
            collaborators that provided material for the genome/transcriptome
            sequencing are: USDA - Urban, J.F., Jr. and Hill, D.E.; Michigan
            State University - Mansfield, L.S.
            
            The repeat library was generated using RepeatModeler (A. Smit, R.
            Hubley http://www.systemsbiology.org/). The ribosomal RNA genes
            were identified using RNAmmer
            (http://www.cbs.dtu.dk/cgi-bin/nph-sw_request?rnammer) and
            transfer RNAs were identified with tRNAscan-SE (Lowe and Eddy,
            1997). Non-coding RNAs, such as microRNAs, were identified by
            sequence homology search of the Rfam database
            (http://selab.janelia.org/software.html). Repeats and predicted
            RNAs were then masked using RepeatMasker (A. Smit, R. Hubley & P.
            Green http://repeatmasker.org). Protein-coding genes were predicted
            using a combination of ab initio programs Snap (I. Korf, 2004),
            Fgenesh (Softberry Corp.) and Augustus (M. Stanke et al., 2008) and
            the annotation pipeline tool Maker (M. Yandell et al., 2007), which
            aligns mRNA, EST and protein information from the same species or
            cross-species to aid in gene structure determination and
            modifications. A consensus gene set from the above prediction
            algorithms was generated, using a logical, hierarchical approach
            developed at the Genome Institute. Gene product naming was
            determined by BER (JCVI: http://ber.sourceforge.net).
            
            Our goal is to explore this WGS draft sequence of Trichuris suis to
            better define proteins involved in nematode parasitism that impact
            health and disease and are relevant to both host-parasite
            relationships and basic biological processes.
            
            For information regarding this assembly or project, or any other
            GSC genome project, please visit our Genome Groups web page
            (http://genome.wustl.edu/genome_group_index.cgi) and email the
            designated contact person. For specific questions regarding the
            Trichuris suis genome project contact Makedonka Mitreva
            (mmitreva@genome.wustl.edu) at Washington University School of
            Medicine. The National Human Genome Research Institute (NHGRI) of
            the National Institutes of Health (NIH) provided funds for this
            project.
            
            ##Genome-Assembly-Data-START##
            Finishing Goal           :: High-Quality Draft
            Current Finishing Status :: High-Quality Draft
            Assembly Method          :: ALLPATHS_LG v. 2012-11-02
            Assembly Name            :: T_suis_1.0.allpaths
            Genome Coverage          :: 392x
            Sequencing Technology    :: Illumina
            ##Genome-Assembly-Data-END##
FEATURES             Qualifiers
     source          /organism="Trichuris suis"
                     /mol_type="genomic DNA"
                     /submitter_seqid="T_suis-1.0_Cont4"
                     /isolation_source="cecum and proximal colon of infected
                     animals which were naturally infected"
                     /host="Sus scrofa (pig)"
                     /db_xref="taxon:68888"
                     /chromosome="Unknown"
                     /dev_stage="adult"
                     /country="USA: Beltsville, MD"
     protein         /locus_tag="D918_02126"
                     /inference="protein motif:HMMPfam:IPR001747"
                     /inference="protein motif:HMMPfam:IPR015255"
                     /db_xref="InterPro:IPR001747"
                     /db_xref="InterPro:IPR015255"
     intron_pos      19:1 (1/28)
     intron_pos      35:1 (2/28)
     intron_pos      92:0 (3/28)
     intron_pos      164:0 (4/28)
     intron_pos      213:0 (5/28)
     intron_pos      278:1 (6/28)
     intron_pos      423:1 (7/28)
     intron_pos      473:0 (8/28)
     intron_pos      524:0 (9/28)
     intron_pos      572:1 (10/28)
     intron_pos      715:0 (11/28)
     intron_pos      771:0 (12/28)
     intron_pos      894:2 (13/28)
     intron_pos      982:0 (14/28)
     intron_pos      1053:0 (15/28)
     intron_pos      1102:1 (16/28)
     intron_pos      1208:0 (17/28)
     intron_pos      1333:0 (18/28)
     intron_pos      1466:0 (19/28)
     intron_pos      1529:2 (20/28)
     intron_pos      1762:2 (21/28)
     intron_pos      1941:0 (22/28)
     intron_pos      1985:2 (23/28)
     intron_pos      2095:2 (24/28)
     intron_pos      2196:1 (25/28)
     intron_pos      2259:0 (26/28)
     intron_pos      2310:2 (27/28)
     intron_pos      2359:0 (28/28)
BEGIN
        1 MKSHLASFVL FLFLCLHQEV YCKTHGDACF AECSGKNDKF HFDRGYRYAY DWHLKTKVAY
       61 NGTDRSPEVL TYNLNGKVHL YAMAPCEFTM KIISAEQTGG GTAQLDLEVL EKYTTQFSFD
      121 NGAIRSICPD PVEPIWSVNI KRGILSAFQN LYEETALESE VEEADVSGLC PTSYRGKASG
      181 SDLLVTKTKY LNSCRNRGHG SFGRMVPYKS RSNYQNVPLF SGQQRCTQEI RNKILHRASC
      241 EESNTFAPAY TDARGGLFVY AEMELSYRDS AAETVPAEQR NHQPLYFDYS DQEREMSSDA
      301 VSKMEALLPK FCAASSEVVR DTPDLFSELV HCLTRVTRTD IEAVWNRVEK QHECDAQLNY
      361 IADGLSACGS PSCVSFIIDH LSQRQQTEDQ KKRWLDSLIF IDNPTLDVIS SLTPLIKKSE
      421 KEELFRLSNV IHKYCAHDSH CASHREIIQA LDALSGHLTH GCNPRSPEDF DKMLFALRAA
      481 GNIDVPSTGL QNQLKCLSDK TAADELRLAA VQSVRRLPCH PSANTHLFAL LNDQSESVEI
      541 RISAFMVLMK CPTDSMVDRI LALLKTETSM QVASFIWSYL NSIIKGNSPS QEHLRHLLMF
      601 KTLEPKVDRD ARKFSKFSQF STYSTHGNMG LDVEKSLIFS PDEYIPRQAL LNLTMHIYGQ
      661 SVNFLEIGAR TVGLDSLVEH LFGPDGYFTN PEGHYFKEPI SAHSNPSIRT LTDKFQRHMK
      721 TKENFDLAVY ARIFGDDVFH YSGREMDAKH FVRESATIEN LMRQLAKERH FEKKRNVVFA
      781 DWKYTMPTVG GLPLQFDLNG VASVNVALTG KLDVKDLLKG KTDVDIKTNV RGSVAARTAA
      841 VVSILPGKSS FGMSIKTRFF ANRAVELAAL LKGGKSLHLK FLLPKEEMDL LRIESETFVV
      901 RGDTWAPLYP PDASGHLTHA CSNKELNEWL GFKVCYSQRL PEKLISPQAL LTGPTLFRLN
      961 LKNNVHGQQA YEMTVTWKMT PEEKRVHVKA EHSDYTADFV YYPKGELLME VKSDKRPMNL
     1021 RINAEYTHEK KLANFTFSGD KGQIFSGHAY LKRKVASPLE SSVEFDLLIS AHHERKLLLH
     1081 YDYNHKYPQA GRQTRSLGKR YGSRRRRDNS DEEAYLLNLN FESPWRNLHL NGKLNASPHV
     1141 SLMALTADYQ RGTKDDTAKL KAEYRRSVQG EHAFIVLASS LEFTKHPERR TEFSLKASNM
     1201 SNSLSVEAHL TSGERTMSFV GNAKKDPANG IYSAEGKIGG NVLSEHLLSG TYQNKLPNLF
     1261 KFGARVQGPN IGEAAYDVKY TKKNERLTRV EIESKAQYGK KTLDFVSTVS EIEPKIYDVA
     1321 AELKSPERNY IMIKGKAMAR IHSTKKFNCS LNSILRILDQ PDLTLVKSIQ RDGFNIYIDS
     1381 ILSDAASELY HGTFKTASKD AGVYDGHLSF KSKVGKPMDL VVDSHWTPTD NGHVSILKMR
     1441 NMDHEIFKTE LIIPKSFKQK FNQIKLQIIS NWRNAHKHLI MDYSLTRSDV ITHSLDLEGK
     1501 LMDHNHYLNV KLIKAKVNQL TVKLGKGGSV NLQIHSNYSV QQSPTKDGYN TVLTTALTSN
     1561 LGGQSCSWTG NLTFEKHPKG HLAHAEVKSA ENKNARFAFL YGSRMDVGER FRKTDFLVHV
     1621 KLHGLQFEVK DEFWRQMQKQ NFSNKLTVTW NKNSLVNSDQ RFTNDFHICL SPFAVSTSTE
     1681 FGSGHSSMLH SSVLFGVRPE HFDASIKLDL ANKSVLHFNG MYRLRGRHEV FEGMLASQMG
     1741 NSYNLGVHLS RDRTLYGTFS AELTFGDRNV IKINRRKETN SEGNHVERLS FEQHSKGELV
     1801 RYLTTVVEIQ KQTQLQKFEI AFKVGGIDYS AHGEYSTKDK SGRIQIKKTK EGTALTDGKF
     1861 TEQAPNEYAT VLTGGQLVGE VDIKINTHAD KQMLKIDVRH VDSPFTLEAE SHWQAQRASI
     1921 RILVIQRDSQ GKETSKSGFA FSVQTPKTHS ITIDCEFLRS NGDILIHLSY VNERPALLNF
     1981 KFLINPARPS REQWSGLSFV HRLESSEKRS LKISLIDPKI ADNVYLTTNW QLSGDCRLLK
     2041 PQSCRISVDS EASYSSDKNR LIKATFNFNR ENAADGVWAV DAGIQNAFTN LDFHCYHTYK
     2101 FRKCKNSDLW CERQVEWVIR RLDEQGQAKR YELSYNEHRG QSAQLKAKCN EGEWVLQVHK
     2161 EEQGYKVNIL RDGAFKYTLA CTYVREEKLA KCQVENPYTI LLHGYAQLVT NEWFTGSVWH
     2221 SKTNGERIND IDVSSKIEAN DILTTRVYIR PDIKSSVKSW LDGIDARSKL NEYLSGISRT
     2281 SGLLKKLMII ADDQVVQLAS TREQWQKEFS QLVNDHADAV RELTRVRQAA RSWLKGASME
     2341 LYEIFIGMGR RFVEATKHIY LLIAM
//