LOCUS       BC072668                2107 bp    mRNA    linear   HUM 25-JUN-2004
DEFINITION  Homo sapiens SRY (sex determining region Y)-box 4, mRNA (cDNA clone
            MGC:71240 IMAGE:6584346), complete cds.
ACCESSION   BC072668
VERSION     BC072668.1
KEYWORDS    MGC.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 2107)
  AUTHORS   Strausberg,R.L., Feingold,E.A., Grouse,L.H., Derge,J.G.,
            Klausner,R.D., Collins,F.S., Wagner,L., Shenmen,C.M., Schuler,G.D.,
            Altschul,S.F., Zeeberg,B., Buetow,K.H., Schaefer,C.F., Bhat,N.K.,
            Hopkins,R.F., Jordan,H., Moore,T., Max,S.I., Wang,J., Hsieh,F.,
            Diatchenko,L., Marusina,K., Farmer,A.A., Rubin,G.M., Hong,L.,
            Stapleton,M., Soares,M.B., Bonaldo,M.F., Casavant,T.L.,
            Scheetz,T.E., Brownstein,M.J., Usdin,T.B., Toshiyuki,S.,
            Carninci,P., Prange,C., Raha,S.S., Loquellano,N.A., Peters,G.J.,
            Abramson,R.D., Mullahy,S.J., Bosak,S.A., McEwan,P.J.,
            McKernan,K.J., Malek,J.A., Gunaratne,P.H., Richards,S.,
            Worley,K.C., Hale,S., Garcia,A.M., Gay,L.J., Hulyk,S.W.,
            Villalon,D.K., Muzny,D.M., Sodergren,E.J., Lu,X., Gibbs,R.A.,
            Fahey,J., Helton,E., Ketteman,M., Madan,A., Rodrigues,S.,
            Sanchez,A., Whiting,M., Madan,A., Young,A.C., Shevchenko,Y.,
            Bouffard,G.G., Blakesley,R.W., Touchman,J.W., Green,E.D.,
            Dickson,M.C., Rodriguez,A.C., Grimwood,J., Schmutz,J., Myers,R.M.,
            Butterfield,Y.S., Krzywinski,M.I., Skalska,U., Smailus,D.E.,
            Schnerch,A., Schein,J.E., Jones,S.J. and Marra,M.A.
  TITLE     Generation and initial analysis of more than 15,000 full-length
            human and mouse cDNA sequences
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 99 (26), 16899-16903 (2002)
   PUBMED   12477932
REFERENCE   2  (bases 1 to 2107)
  AUTHORS   Strausberg,R.
  TITLE     Direct Submission
  JOURNAL   Submitted (07-JUN-2004) National Institutes of Health, Mammalian
            Gene Collection (MGC), Cancer Genomics Office, National Cancer
            Institute, 31 Center Drive, Room 11A03, Bethesda, MD 20892-2590,
            USA
  REMARK    NIH-MGC Project URL: http://mgc.nci.nih.gov
COMMENT     Contact: MGC help desk
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: ATCC
            cDNA Library Preparation: Rubin Laboratory
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: National Institutes of Health Intramural
            Sequencing Center (NISC),
            Gaithersburg, Maryland;
            Web site: http://www.nisc.nih.gov/
            Contact: nisc_mgc@nhgri.nih.gov
            Akhter,N., Ayele,K., Beckstrom-Sternberg,S.M., Benjamin,B.,
            Blakesley,R.W., Bouffard,G.G., Breen,K., Brinkley,C., Brooks,S.,
            Dietrich,N.L., Granite,S., Guan,X., Gupta,J., Haghighi,P.,
            Hansen,N., Ho,S.-L., Karlins,E., Kwong,P., Laric,P., Legaspi,R.,
            Maduro,Q.L., Masiello,C., Maskeri,B., Mastrian,S.D., McCloskey,J.C.,
            McDowell,J., Pearson,R., Stantripop,S., Thomas,P.J., Touchman,J.W.,
            Tsurgeon,C., Vogt,J.L., Walker,M.A., Wetherby,K.D., Wiggins,L.,
            Young,A., Zhang,L.-H. and Green,E.D.
            
            Clone distribution: MGC clone distribution information can be found
            through the I.M.A.G.E. Consortium/LLNL at: http://image.llnl.gov
            Series: IRAL Plate: 50 Row: o Column: 18
            This clone was selected for full length sequencing because it
            passed the following selection criteria: matched mRNA gi: 4507162.
FEATURES             Location/Qualifiers
     source          1..2107
                     /db_xref="H-InvDB:HIT000264907"
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="MGC:71240 IMAGE:6584346"
                     /tissue_type="Ovary, teratocarcinoma"
                     /clone_lib="NIH_MGC_109"
                     /lab_host="DH10B-R"
                     /note="Vector: pOTB7"
     gene            1..2107
                     /gene="SOX4"
                     /gene_synonym="EVI16"
                     /db_xref="GeneID:6659"
                     /db_xref="MIM:184430"
     CDS             615..2039
                     /gene="SOX4"
                     /gene_synonym="EVI16"
                     /codon_start=1
                     /product="SOX4 protein"
                     /protein_id="AAH72668.1"
                     /db_xref="GeneID:6659"
                     /db_xref="MIM:184430"
                     /translation="MVQQTNNAENTEALLAGESSDSGAGLELGIASSPTPGSTASTGG
                     KADDPSWCKTPSGHIKRPMNAFMVWSQIERRKIMEQSPDMHNAEISKRLGKRWKLLKD
                     SDKIPFIREAERLRLKHMADYPDYKYRPRKKVKSGNANSSSSAAASSKPGEKGDKVGG
                     SGGGGHGGGGGGGSSNAGGGGGGASGGGANSKPAQKKSCGSKVAGGAGGGVSKPHAKL
                     ILAGGGGGGKAAAAAAASFAAEQAGAAALLPLGAAADHHSLYKARTPSASASASSAAS
                     ASAALAAPGKHLAEKKVKRVYLFGGLGTSSSPVGGVGAGADPSDPLGLYEEEGAGCSP
                     DAPSLSGRSSAASSPAAGRSPADHRGYASLRAASPAPSSAPSHASSSASSHSSSSSSS
                     GSSSSDDEFEDDLLDLNPSSNFESMSLGSFSSSSALDRDLDFNFEPGSGSHFEFPDYC
                     TPEVSEMISGDWLESSISNLVFTY"
BASE COUNT          424 a          699 c          669 g          315 t
ORIGIN      
        1 ctgcactgga ggaactcctg ccattaccag ctcccttctt gcagaaggga gggggaaaca
       61 tacatttatt catgccagtc tgttgcatgc aggctttttg gcttcctacc ttgcaacaaa
      121 ataattgcac caactcctta gtgccgattc cgcccacaga gagtcctgga gccacagtct
      181 tttttgcttt gcattgtagg agagggacta agtgctagag actatgtcgc tttcctgagc
      241 taccgagagc gctcgtgaac tggaatcaac tgcttcaggg aaaaagaaaa aaaaaaaaaa
      301 aagacttgcc tgggaggccg cgagaaactt gcattggaag cttcagcaac cagcattcga
      361 gaaactcctc tctactttag cacggtctcc agactcagcc gagagacagc aaactgcagc
      421 gcggtgagag agcgagagag agggagagag agactctcca gcctgggaac tataactcct
      481 ctgcgagagg cggagaactc cttccccaaa tcttttgggg acttttctct ctttacccac
      541 ctccgcccct gcgaggagtt gaggcggccg ccgcgagggt gtgagcgcgc gtgggcgccc
      601 gccgagccga ggccatggtg cagcaaacca acaatgccga gaacacggaa gcgctgctgg
      661 ccggcgagag ctcggactcg ggcgccggcc tcgagctggg aatcgcctcc tcccccacgc
      721 ccggctccac cgcctccacg ggcggcaagg ccgacgaccc gagctggtgc aagaccccga
      781 gtgggcacat caagcgaccc atgaacgcct tcatggtgtg gtcgcagatc gagcggcgca
      841 agatcatgga gcagtcgccc gacatgcaca acgccgagat ctccaagcgg ctgggcaaac
      901 gctggaagct gctcaaagac agcgacaaga tccctttcat tcgagaggcg gagcggctgc
      961 gcctcaagca catggctgac taccccgact acaagtaccg gcccaggaag aaggtgaagt
     1021 ccggcaacgc caactccagc tcctcggccg ccgcctcctc caagccgggg gagaagggag
     1081 acaaggtcgg tggcagtggc gggggcggcc atgggggcgg cggcggcggc gggagcagca
     1141 acgcgggggg aggaggcggc ggtgcgagtg gcggcggcgc caactccaaa ccggcgcaga
     1201 aaaagagctg cggctccaaa gtggcgggcg gcgcgggcgg tggggttagc aaaccgcacg
     1261 ccaagctcat cctggcaggc ggcggcggcg gcgggaaagc agcggctgcc gccgccgcct
     1321 ccttcgccgc cgaacaggcg ggggccgccg ccctgctgcc cctgggcgcc gccgccgacc
     1381 accactcgct gtacaaggcg cggactccca gcgcctcggc ctccgcctcc tcggcagcct
     1441 cggcctccgc agcgctcgcg gccccgggca agcacctggc ggagaagaag gtgaagcgcg
     1501 tctacctgtt cggcggcctg ggcacgtcgt cgtcgcccgt gggcggcgtg ggcgcgggag
     1561 ccgaccccag cgaccccctg ggcctgtacg aggaggaggg cgcgggctgc tcgcccgacg
     1621 cgcccagcct gagcggccgc agcagcgccg cctcgtcccc cgccgccggc cgctcgcccg
     1681 ccgaccaccg cggctacgcc agcctgcgcg ccgcctcgcc cgccccgtcc agcgcgccct
     1741 cgcacgcgtc ctcctcggcc tcgtcccact cctcctcttc ctcctcctcg ggctcctcgt
     1801 cctccgacga cgagttcgaa gacgacctgc tcgacctgaa ccccagctca aactttgaga
     1861 gcatgtccct gggcagcttc agttcgtcgt cggcgctcga ccgggacctg gattttaact
     1921 tcgagcccgg ctccggctcg cacttcgagt tcccggacta ctgcacgccc gaggtgagcg
     1981 agatgatctc gggagactgg ctcgagtcca gcatctccaa cctggttttc acctactgaa
     2041 gggcgcgcag gcagggagaa gggccggggg gggtaggaga ggagaaaaaa aaaaaaaaaa
     2101 aaaaaaa
//