LOCUS       Z74616                  5086 bp    mRNA    linear   HUM 21-OCT-2008
DEFINITION  H.sapiens mRNA for prepro-alpha2(I) collagen.
ACCESSION   Z74616
VERSION     Z74616.1
KEYWORDS    alpha2(I) collagen.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 54)
  AUTHORS   Dickson L.A., de Wet W., Di Liberto M., Weil D., Ramirez F.
  TITLE     Analysis of the promoter region and the N-propeptide domain of the
            human pro alpha 2(I) collagen gene
  JOURNAL   Nucleic Acids Res. 13(10), 3427-3438(1985).
   PUBMED   4011429
REFERENCE   2  (bases 55 to 64)
  AUTHORS   Sherwood A.L., Bottenus R.E., Martzen M.R., Bornstein P.
  TITLE     Structural and functional analysis of the first intron of the human
            alpha 2(I) collagen-encoding gene
  JOURNAL   Gene 89(2), 239-244(1990).
   PUBMED   2129528
REFERENCE   3  (bases 65 to 2434)
  AUTHORS   Kuivaniemi H., Tromp G., Chu M., Prockop D.J.
  TITLE     Structure of a full-length cDNA clone for the preproalpha2(I) chain
            of human type I procollagen
  JOURNAL   Biochem. J. 252(3), 633-640(1988).
   PUBMED   3421913
REFERENCE   4  (bases 2435 to 3015;4191 to 5086)
  AUTHORS   de Wet W., Bernard M., Benson-Chanda V., Chu M., Dickson L.,
            Weil D., Ramirez F.
  TITLE     Organization of the human pro-alpha 2(I) collagen gene
  JOURNAL   J Biol Chem 262(33), 16032-16036(1987).
   PUBMED   2824475
REFERENCE   5  (bases 3016 to 4190)
  AUTHORS   Makela J.K., Vuorio T., Vuorio E.
  TITLE     Growth-dependent modulation of type I collagen production and mRNA
            levels in cultured human skin fibroblasts
  JOURNAL   Biochim. Biophys. Acta 1049(2), 171-176(1990).
   PUBMED   2364107
REFERENCE   6  (bases 1 to 5086)
  AUTHORS   Dalgleish R.
  JOURNAL   Submitted (01-JUL-1996) to the INSDC. Raymond Dalgleish, Department
            of Genetics, University of Leicester, University Road, Leicester,
            LE1 7RH, United Kingdom
REFERENCE   7  (bases 1 to 5086)
  AUTHORS   Dalgleish R.
  TITLE     The human type I collagen mutation database
  JOURNAL   Nucleic Acids Res. 25(1), 181-187(1997).
   PUBMED   9016532
FEATURES             Location/Qualifiers
     source          1..5086
                     /db_xref="H-InvDB:HIT000328004"
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
     5'UTR           1..139
     exon            1..209
                     /number=1
     CDS             140..4240
                     /product="prepro-alpha2(I) collagen"
                     /db_xref="GOA:P08123"
                     /db_xref="H-InvDB:HIT000328004.13"
                     /db_xref="HGNC:HGNC:2198"
                     /db_xref="InterPro:IPR000885"
                     /db_xref="InterPro:IPR008160"
                     /db_xref="PDB:5CTD"
                     /db_xref="PDB:5CTI"
                     /db_xref="PDB:5CVA"
                     /db_xref="UniProtKB/Swiss-Prot:P08123"
                     /protein_id="CAA98969.1"
                     /translation="MLSFVDTRTLLLLAVTLCLATCQSLQEETVRKGPAGDRGPRGER
                     GPPGPPGRDGEDGPTGPPGPPGPPGPPGLGGNFAAQYDGKGVGLGPGPMGLMGPRGPP
                     GAAGAPGPQGFQGPAGEPGEPGQTGPAGARGPAGPPGKAGEDGHPGKPGRPGERGVVG
                     PQGARGFPGTPGLPGFKGIRGHNGLDGLKGQPGAPGVKGEPGAPGENGTPGQTGARGL
                     PGERGRVGAPGPAGARGSDGSVGPVGPAGPIGSAGPPGFPGAPGPKGEIGAVGNAGPA
                     GPAGPRGEVGLPGLSGPVGPPGNPGANGLTGAKGAAGLPGVAGAPGLPGPRGIPGPVG
                     AAGATGARGLVGEPGPAGSKGESGNKGEPGSAGPQGPPGPSGEEGKRGPNGEAGSAGP
                     PGPPGLRGSPGSRGLPGADGRAGVMGPPGSRGASGPAGVRGPNGDAGRPGEPGLMGPR
                     GLPGSPGNIGPAGKEGPVGLPGIDGRPGPIGPAGARGEPGNIGFPGPKGPTGDPGKNG
                     DKGHAGLAGARGAPGPDGNNGAQGPPGPQGVQGGKGEQGPAGPPGFQGLPGPSGPAGE
                     VGKPGERGLHGEFGLPGPAGPRGERGPPGESGAAGPTGPIGSRGPSGPPGPDGNKGEP
                     GVVGAVGTAGPSGPSGLPGERGAAGIPGGKGEKGEPGLRGEIGNPGRDGARGAHGAVG
                     APGPAGATGDRGEAGAAGPAGPAGPRGSPGERGEVGPAGPNGFAGPAGAAGQPGAKGE
                     RGAKGPKGENGVVGPTGPVGAAGPAGPNGPPGPAGSRGDGGPPGMTGFPGAAGRTGPP
                     GPSGISGPPGPPGPAGKEGLRGPRGDQGPVGRTGEVGAVGPPGFAGEKGPSGEAGTAG
                     PPGTPGPQGLLGAPGILGLPGSRGERGLPGVAGAVGEPGPLGIAGPPGARGPPGAVGS
                     PGVNGAPGEAGRDGNPGNDGPPGRDGQPGHKGERGYPGNIGPVGAAGAPGPHGPVGPA
                     GKHGNRGETGPSGPVGPAGAVGPRGPSGPQGIRGDKGEPGEKGPRGLPGLKGHNGLQG
                     LPGIAGHHGDQGAPGSVGPAGPRGPAGPSGPAGKDGRTGHPGTVGPAGIRGPQGHQGP
                     AGPPGPPGPPGPPGVSGGGYDFGYDGDFYRADQPRSAPSLRPKDYEVDATLKSLNNQI
                     ETLLTPEGSRKNPARTCRDLRLSHPEWSSGYYWIDPNQGCTMDAIKVYCDFSTGETCI
                     RAQPENIPAKNWYRSSKDKKHVWLGETINAGSQFEYNVEGVTSKEMATQLAFMRLLAN
                     YASQNITYHCKNSIAYMDEETGNLKKAVILQGSNDVELVAEGNSRFTYTVLVDGCSKK
                     TNEWGKTIIEYKTNKPSRLPFLDIAPLDIGGADHEFFVDIGPVCFK"
     sig_peptide     140..205
     misc_feature    206..376
                     /note="N_propeptide"
     exon            210..220
                     /number=2
     exon            221..235
                     /number=3
     exon            236..271
                     /number=4
     exon            272..364
                     /number=5
     exon            365..418
                     /number=6
     misc_feature    377..409
                     /note="N_telopeptide"
     misc_feature    410..3451
                     /note="triple_helix"
     exon            419..463
                     /number=7
     exon            464..517
                     /number=8
     exon            518..571
                     /number=9
     exon            572..625
                     /number=10
     exon            626..679
                     /number=11
     exon            680..733
                     /number=12
     exon            734..778
                     /number=13
     exon            779..832
                     /number=14
     exon            833..877
                     /number=15
     exon            878..931
                     /number=16
     exon            932..1030
                     /number=17
     exon            1031..1075
                     /number=18
     exon            1076..1174
                     /number=19
     exon            1175..1228
                     /number=20
     exon            1229..1336
                     /number=21
     exon            1337..1390
                     /number=22
     exon            1391..1489
                     /number=23
     exon            1490..1543
                     /number=24
     exon            1544..1642
                     /number=25
     exon            1643..1696
                     /number=26
     exon            1697..1750
                     /number=27
     exon            1751..1804
                     /number=28
     exon            1805..1858
                     /number=29
     exon            1859..1903
                     /number=30
     exon            1904..2002
                     /number=31
     exon            2003..2110
                     /number=32
     exon            2111..2164
                     /number=33
     exon            2165..2218
                     /number=34
     exon            2219..2272
                     /number=35
     exon            2273..2326
                     /number=36
     exon            2327..2434
                     /number=37
     exon            2435..2488
                     /number=38
     exon            2489..2542
                     /number=39
     exon            2543..2704
                     /number=40
     exon            2705..2812
                     /number=41
     exon            2813..2920
                     /number=42
     exon            2921..2974
                     /number=43
     exon            2975..3082
                     /number=44
     exon            3083..3136
                     /number=45
     exon            3137..3244
                     /number=46
     exon            3245..3298
                     /number=47
     exon            3299..3406
                     /number=48
     exon            3407..3665
                     /number=49
     misc_feature    3452..3496
                     /note="C_telopeptide"
     misc_feature    3497..4237
                     /note="C_propeptide"
     exon            3666..3850
                     /number=50
     exon            3851..4093
                     /number=51
     misc_feature    3938..3946
                     /note="carbohydrate attachment site"
     exon            4094..5082
                     /number=52
     regulatory      4420..4424
                     /regulatory_class="polyA_signal_sequence"
     polyA_site      4450
     regulatory      4515..4520
                     /regulatory_class="polyA_signal_sequence"
     regulatory      4529..4534
                     /regulatory_class="polyA_signal_sequence"
     polyA_site      4550
     regulatory      4866..4871
                     /regulatory_class="polyA_signal_sequence"
     polyA_site      4885
     regulatory      5032..5037
                     /regulatory_class="polyA_signal_sequence"
     regulatory      5053..5058
                     /regulatory_class="polyA_signal_sequence"
     polyA_site      5082
BASE COUNT         1081 a         1290 c         1484 g         1229 t
ORIGIN      
        1 agcaccacgg cagcaggagg tttcggncta agttggaggt actggnccac gactgcatgc
       61 ccgcgcccgc caggtgatac ctccgccggt gacccagggg ctctgcgaca caaggagtct
      121 gcatgtctaa gtgctagaca tgctcagctt tgtggatacg cggactttgt tgctgcttgc
      181 agtaacctta tgcctagcaa catgccaatc tttacaagag gaaactgtaa gaaagggccc
      241 agccggagat agaggaccac gtggagaaag gggtccacca ggccccccag gcagagatgg
      301 tgaagatggt cccacaggcc ctcctggtcc acctggtcct cctggccccc ctggtctcgg
      361 tgggaacttt gctgctcagt atgatggaaa aggagttgga cttggccctg gaccaatggg
      421 cttaatggga cctagaggcc cacctggtgc agctggagcc ccaggccctc aaggtttcca
      481 aggacctgct ggtgagcctg gtgaacctgg tcaaactggt cctgcaggtg ctcgtggtcc
      541 agctggccct cctggcaagg ctggtgaaga tggtcaccct ggaaaacccg gacgacctgg
      601 tgagagagga gttgttggac cacagggtgc tcgtggtttc cctggaactc ctggacttcc
      661 tggcttcaaa ggcattaggg gacacaatgg tctggatgga ttgaagggac agcccggtgc
      721 tcctggtgtg aagggtgaac ctggtgcccc tggtgaaaat ggaactccag gtcaaacagg
      781 agcccgtggg cttcctggtg agagaggacg tgttggtgcc cctggcccag ctggtgcccg
      841 tggcagtgat ggaagtgtgg gtcccgtggg tcctgctggt cccattgggt ctgctggccc
      901 tccaggcttc ccaggtgccc ctggccccaa gggtgaaatt ggagctgttg gtaacgctgg
      961 tcctgctggt cccgccggtc cccgtggtga agtgggtctt ccaggcctct ccggccccgt
     1021 tggacctcct ggtaatcctg gagcaaacgg ccttactggt gccaagggtg ctgctggcct
     1081 tcccggcgtt gctggggctc ccggcctccc tggaccccgc ggtattcctg gccctgttgg
     1141 tgctgccggt gctactggtg ccagaggact tgttggtgag cctggtccag ctggctccaa
     1201 aggagagagc ggtaacaagg gtgagcccgg ctctgctggg ccccaaggtc ctcctggtcc
     1261 cagtggtgaa gaaggaaaga gaggccctaa tggggaagct ggatctgccg gccctccagg
     1321 acctcctggg ctgagaggta gtcctggttc tcgtggtctt cctggagctg atggcagagc
     1381 tggcgtcatg ggccctcctg gtagtcgtgg tgcaagtggc cctgctggag tccgaggacc
     1441 taatggagat gctggtcgcc ctggggagcc tggtctcatg ggacccagag gtcttcctgg
     1501 ttcccctgga aatatcggcc ccgctggaaa agaaggtcct gtcggcctcc ctggcatcga
     1561 cggcaggcct ggcccaattg gcccagctgg agcaagagga gagcctggca acattggatt
     1621 ccctggaccc aaaggcccca ctggtgatcc tggcaaaaac ggtgataaag gtcatgctgg
     1681 tcttgctggt gctcggggtg ctccaggtcc tgatggaaac aatggtgctc agggacctcc
     1741 tggaccacag ggtgttcaag gtggaaaagg tgaacagggt cccgctggtc ctccaggctt
     1801 ccagggtctg cctggcccct caggtcccgc tggtgaagtt ggcaaaccag gagaaagggg
     1861 tctccatggt gagtttggtc tccctggtcc tgctggtcca agaggggaac gcggtccccc
     1921 aggtgagagt ggtgctgccg gtcctactgg tcctattgga agccgaggtc cttctggacc
     1981 cccagggcct gatggaaaca agggtgaacc tggtgtggtt ggtgctgtgg gcactgctgg
     2041 tccatctggt cctagtggac tcccaggaga gaggggtgct gctggcatac ctggaggcaa
     2101 gggagaaaag ggtgaacctg gtctcagagg tgaaattggt aaccctggca gagatggtgc
     2161 tcgtggtgct catggtgctg taggtgcccc tggtcctgct ggagccacag gtgaccgggg
     2221 cgaagctggg gctgctggtc ctgctggtcc tgctggtcct cggggaagcc ctggtgaacg
     2281 tggcgaggtc ggtcctgctg gccccaacgg atttgctggt ccggctggtg ctgctggtca
     2341 accgggtgct aaaggagaaa gaggagccaa agggcctaag ggtgaaaacg gtgttgttgg
     2401 tcccacaggc cccgttggag ctgctggccc agctggtcca aatggtcccc ccggtcctgc
     2461 tggaagtcgt ggtgatggag gcccccctgg tatgactggt ttccctggtg ctgctggacg
     2521 gactggtccc ccaggaccct ctggtatttc tggccctcct ggtccccctg gtcctgctgg
     2581 gaaagaaggg cttcgtggtc ctcgtggtga ccaaggtcca gttggccgaa ctggagaagt
     2641 aggtgcagtt ggtccccctg gcttcgctgg tgagaagggt ccctctggag aggctggtac
     2701 tgctggacct cctggcactc caggtcctca gggtcttctt ggtgctcctg gtattctggg
     2761 tctccctggc tcgagaggtg aacgtggtct acctggtgtt gctggtgctg tgggtgaacc
     2821 tggtcctctt ggcattgccg gccctcctgg ggcccgtggt cctcctggtg ctgtgggtag
     2881 tcctggagtc aacggtgctc ctggtgaagc tggtcgtgat ggcaaccctg ggaacgatgg
     2941 tcccccaggt cgcgatggtc aacccggaca caagggagag cgcggttacc ctggcaatat
     3001 tggtcccgtt ggtgctgcag gtgcacctgg tcctcatggc cccgtgggtc ctgctggcaa
     3061 acatggaaac cgtggtgaaa ctggtccttc tggtcctgtt ggtcctgctg gtgctgttgg
     3121 cccaagaggt cctagtggcc cacaaggcat tcgtggcgat aagggagagc ccggtgaaaa
     3181 ggggcccaga ggtcttcctg gcttaaaggg acacaatgga ttgcaaggtc tgcctggtat
     3241 cgctggtcac catggtgatc aaggtgctcc tggctccgtg ggtcctgctg gtcctagggg
     3301 ccctgctggt ccttctggcc ctgctggaaa agatggtcgc actggacatc ctggtacggt
     3361 tggacctgct ggcattcgag gccctcaggg tcaccaaggc cctgctggcc cccctggtcc
     3421 ccctggccct cctggacctc caggtgtaag cggtggtggt tatgactttg gttacgatgg
     3481 agacttctac agggctgacc agcctcgctc agcaccttct ctcagaccca aggactatga
     3541 agttgatgct actctgaagt ctctcaacaa ccagattgag acccttctta ctcctgaagg
     3601 ctctagaaag aacccagctc gcacatgccg tgacttgaga ctcagccacc cagagtggag
     3661 cagtggttac tactggattg accctaacca aggatgcact atggatgcta tcaaagtata
     3721 ctgtgatttc tctactggcg aaacctgtat ccgggcccaa cctgaaaaca tcccagccaa
     3781 gaactggtat aggagctcca aggacaagaa acacgtctgg ctaggagaaa ctatcaatgc
     3841 tggcagccag tttgaatata atgtagaagg agtgacttcc aaggaaatgg ctacccaact
     3901 tgccttcatg cgcctgctgg ccaactatgc ctctcagaac atcacctacc actgcaagaa
     3961 cagcattgca tacatggatg aggagactgg caacctgaaa aaggctgtca ttctacaggg
     4021 ctctaatgat gttgaacttg ttgctgaggg caacagcagg ttcacttaca ctgttcttgt
     4081 agatggctgc tctaaaaaga caaatgaatg gggaaagaca atcattgaat acaaaacaaa
     4141 taagccatca cgcctgccct tccttgatat tgcacctttg gacatcggtg gtgctgacca
     4201 tgaattcttt gtggacattg gcccagtctg tttcaaataa atgaactcaa tctaaattaa
     4261 aaaagaaaga aatttgaaaa aactttctct ttgccatttc ttcttcttct tttttaactg
     4321 aaagctgaat ccttccattt cttctgcaca tctacttgct taaattgtgg gcaaaagaga
     4381 aaaagaagga ttgatcagag cattgtgcaa tacagtttca ttaactcctt cccccgctcc
     4441 cccaaaaatt tgaatttttt tttcaacact cttacacctg ttatggaaaa tgtcaacctt
     4501 tgtaagaaaa ccaaaataaa aattgaaaaa taaaaaccat aaacatttgc accacttgtg
     4561 gcttttgaat atcttccaca gagggaagtt taaaacccaa acttccaaag gtttaaacta
     4621 cctcaaaaca ctttcccatg agtgtgatcc acattgttag gtgctgacct agacagagat
     4681 gaactgaggt ccttgttttg ttttgttcat aatacaaagg tgctaattaa tagtatttca
     4741 gatacttgaa gaatgttgat ggtgctagaa gaatttgaga agaaatactc ctgtattgag
     4801 ttgtatcgtg tggtgtattt tttaaaaaat ttgatttagc attcatattt tccatcttat
     4861 tcccaattaa aagtatgcag attatttgcc caaagttgtc ctcttcttca gattcagcat
     4921 ttgttctttg ccagtctcat tttcatcttc ttccatggtt ccacagaagc tttgtttctt
     4981 gggcaagcag aaaaattaaa ttgtacctat tttgtatatg tgagatgttt aaataaattg
     5041 tgaaaaaaat gaaataaagc atgtttggtt ttccaaaaga acatat
//