LOCUS       Z74616                  5086 bp    mRNA    linear   HUM 21-OCT-2008
DEFINITION  H.sapiens mRNA for prepro-alpha2(I) collagen.
ACCESSION   Z74616
VERSION     Z74616.1
KEYWORDS    alpha2(I) collagen.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 54)
  AUTHORS   Dickson L.A., de Wet W., Di Liberto M., Weil D., Ramirez F.
  TITLE     Analysis of the promoter region and the N-propeptide domain of the
            human pro alpha 2(I) collagen gene
  JOURNAL   Nucleic Acids Res. 13(10), 3427-3438(1985).
   PUBMED   4011429
REFERENCE   2  (bases 55 to 64)
  AUTHORS   Sherwood A.L., Bottenus R.E., Martzen M.R., Bornstein P.
  TITLE     Structural and functional analysis of the first intron of the
            human alpha 2(I) collagen-encoding gene
  JOURNAL   Gene 89(2), 239-244(1990).
   PUBMED   2129528
REFERENCE   3  (bases 65 to 2434)
  AUTHORS   Kuivaniemi H., Tromp G., Chu M., Prockop D.J.
  TITLE     Structure of a full-length cDNA clone for the preproalpha2(I)
            chain of human type I procollagen
  JOURNAL   Biochem. J. 252(3), 633-640(1988).
   PUBMED   3421913
REFERENCE   4  (bases 2435 to 3015;4191 to 5086)
  AUTHORS   de Wet W., Bernard M., Benson-Chanda V., Chu M., Dickson L.,
            Weil D., Ramirez F.
  TITLE     Organization of the human pro-alpha 2(I) collagen gene
  JOURNAL   J Biol Chem 262(33), 16032-16036(1987).
   PUBMED   2824475
REFERENCE   5  (bases 3016 to 4190)
  AUTHORS   Makela J.K., Vuorio T., Vuorio E.
  TITLE     Growth-dependent modulation of type I collagen production and mRNA
            levels in cultured human skin fibroblasts
  JOURNAL   Biochim. Biophys. Acta 1049(2), 171-176(1990).
   PUBMED   2364107
REFERENCE   6  (bases 1 to 5086)
  AUTHORS   Dalgleish R.
  JOURNAL   Submitted (01-JUL-1996) to the INSDC. Raymond Dalgleish,
            Department of Genetics, University of Leicester, University Road,
            Leicester, LE1 7RH, United Kingdom
REFERENCE   7  (bases 1 to 5086)
  AUTHORS   Dalgleish R.
  TITLE     The human type I collagen mutation database
  JOURNAL   Nucleic Acids Res. 25(1), 181-187(1997).
   PUBMED   9016532
FEATURES             Location/Qualifiers
     source          1..5086
                     /db_xref="H-InvDB:HIT000328004"
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
     5'UTR           1..139
     exon            1..209
                     /number=1
     CDS             140..4240
                     /product="prepro-alpha2(I) collagen"
                     /db_xref="GOA:P08123"
                     /db_xref="H-InvDB:HIT000328004.13"
                     /db_xref="HGNC:HGNC:2198"
                     /db_xref="InterPro:IPR000885"
                     /db_xref="InterPro:IPR008160"
                     /db_xref="PDB:5CTD"
                     /db_xref="PDB:5CTI"
                     /db_xref="PDB:5CVA"
                     /db_xref="UniProtKB/Swiss-Prot:P08123"
                     /protein_id="CAA98969.1"
                     /translation="MLSFVDTRTLLLLAVTLCLATCQSLQEETVRKGPAGDRGPRGER
                     GPPGPPGRDGEDGPTGPPGPPGPPGPPGLGGNFAAQYDGKGVGLGPGPMGLMGPRGPP
                     GAAGAPGPQGFQGPAGEPGEPGQTGPAGARGPAGPPGKAGEDGHPGKPGRPGERGVVG
                     PQGARGFPGTPGLPGFKGIRGHNGLDGLKGQPGAPGVKGEPGAPGENGTPGQTGARGL
                     PGERGRVGAPGPAGARGSDGSVGPVGPAGPIGSAGPPGFPGAPGPKGEIGAVGNAGPA
                     GPAGPRGEVGLPGLSGPVGPPGNPGANGLTGAKGAAGLPGVAGAPGLPGPRGIPGPVG
                     AAGATGARGLVGEPGPAGSKGESGNKGEPGSAGPQGPPGPSGEEGKRGPNGEAGSAGP
                     PGPPGLRGSPGSRGLPGADGRAGVMGPPGSRGASGPAGVRGPNGDAGRPGEPGLMGPR
                     GLPGSPGNIGPAGKEGPVGLPGIDGRPGPIGPAGARGEPGNIGFPGPKGPTGDPGKNG
                     DKGHAGLAGARGAPGPDGNNGAQGPPGPQGVQGGKGEQGPAGPPGFQGLPGPSGPAGE
                     VGKPGERGLHGEFGLPGPAGPRGERGPPGESGAAGPTGPIGSRGPSGPPGPDGNKGEP
                     GVVGAVGTAGPSGPSGLPGERGAAGIPGGKGEKGEPGLRGEIGNPGRDGARGAHGAVG
                     APGPAGATGDRGEAGAAGPAGPAGPRGSPGERGEVGPAGPNGFAGPAGAAGQPGAKGE
                     RGAKGPKGENGVVGPTGPVGAAGPAGPNGPPGPAGSRGDGGPPGMTGFPGAAGRTGPP
                     GPSGISGPPGPPGPAGKEGLRGPRGDQGPVGRTGEVGAVGPPGFAGEKGPSGEAGTAG
                     PPGTPGPQGLLGAPGILGLPGSRGERGLPGVAGAVGEPGPLGIAGPPGARGPPGAVGS
                     PGVNGAPGEAGRDGNPGNDGPPGRDGQPGHKGERGYPGNIGPVGAAGAPGPHGPVGPA
                     GKHGNRGETGPSGPVGPAGAVGPRGPSGPQGIRGDKGEPGEKGPRGLPGLKGHNGLQG
                     LPGIAGHHGDQGAPGSVGPAGPRGPAGPSGPAGKDGRTGHPGTVGPAGIRGPQGHQGP
                     AGPPGPPGPPGPPGVSGGGYDFGYDGDFYRADQPRSAPSLRPKDYEVDATLKSLNNQI
                     ETLLTPEGSRKNPARTCRDLRLSHPEWSSGYYWIDPNQGCTMDAIKVYCDFSTGETCI
                     RAQPENIPAKNWYRSSKDKKHVWLGETINAGSQFEYNVEGVTSKEMATQLAFMRLLAN
                     YASQNITYHCKNSIAYMDEETGNLKKAVILQGSNDVELVAEGNSRFTYTVLVDGCSKK
                     TNEWGKTIIEYKTNKPSRLPFLDIAPLDIGGADHEFFVDIGPVCFK"
     sig_peptide     140..205
     misc_feature    206..376
                     /note="N_propeptide"
     exon            210..220
                     /number=2
     exon            221..235
                     /number=3
     exon            236..271
                     /number=4
     exon            272..364
                     /number=5
     exon            365..418
                     /number=6
     misc_feature    377..409
                     /note="N_telopeptide"
     misc_feature    410..3451
                     /note="triple_helix"
     exon            419..463
                     /number=7
     exon            464..517
                     /number=8
     exon            518..571
                     /number=9
     exon            572..625
                     /number=10
     exon            626..679
                     /number=11
     exon            680..733
                     /number=12
     exon            734..778
                     /number=13
     exon            779..832
                     /number=14
     exon            833..877
                     /number=15
     exon            878..931
                     /number=16
     exon            932..1030
                     /number=17
     exon            1031..1075
                     /number=18
     exon            1076..1174
                     /number=19
     exon            1175..1228
                     /number=20
     exon            1229..1336
                     /number=21
     exon            1337..1390
                     /number=22
     exon            1391..1489
                     /number=23
     exon            1490..1543
                     /number=24
     exon            1544..1642
                     /number=25
     exon            1643..1696
                     /number=26
     exon            1697..1750
                     /number=27
     exon            1751..1804
                     /number=28
     exon            1805..1858
                     /number=29
     exon            1859..1903
                     /number=30
     exon            1904..2002
                     /number=31
     exon            2003..2110
                     /number=32
     exon            2111..2164
                     /number=33
     exon            2165..2218
                     /number=34
     exon            2219..2272
                     /number=35
     exon            2273..2326
                     /number=36
     exon            2327..2434
                     /number=37
     exon            2435..2488
                     /number=38
     exon            2489..2542
                     /number=39
     exon            2543..2704
                     /number=40
     exon            2705..2812
                     /number=41
     exon            2813..2920
                     /number=42
     exon            2921..2974
                     /number=43
     exon            2975..3082
                     /number=44
     exon            3083..3136
                     /number=45
     exon            3137..3244
                     /number=46
     exon            3245..3298
                     /number=47
     exon            3299..3406
                     /number=48
     exon            3407..3665
                     /number=49
     misc_feature    3452..3496
                     /note="C_telopeptide"
     misc_feature    3497..4237
                     /note="C_propeptide"
     exon            3666..3850
                     /number=50
     exon            3851..4093
                     /number=51
     misc_feature    3938..3946
                     /note="carbohydrate attachment site"
     exon            4094..5082
                     /number=52
     regulatory      4420..4424
                     /regulatory_class="polyA_signal_sequence"
     polyA_site      4450
     regulatory      4515..4520
                     /regulatory_class="polyA_signal_sequence"
     regulatory      4529..4534
                     /regulatory_class="polyA_signal_sequence"
     polyA_site      4550
     regulatory      4866..4871
                     /regulatory_class="polyA_signal_sequence"
     polyA_site      4885
     regulatory      5032..5037
                     /regulatory_class="polyA_signal_sequence"
     regulatory      5053..5058
                     /regulatory_class="polyA_signal_sequence"
     polyA_site      5082
BASE COUNT     1081 a   1290 c   1484 g   1229 t
ORIGIN
        1 agcaccacgg cagcaggagg tttcggncta agttggaggt actggnccac gactgcatgc
       61 ccgcgcccgc caggtgatac ctccgccggt gacccagggg ctctgcgaca caaggagtct
      121 gcatgtctaa gtgctagaca tgctcagctt tgtggatacg cggactttgt tgctgcttgc
      181 agtaacctta tgcctagcaa catgccaatc tttacaagag gaaactgtaa gaaagggccc
      241 agccggagat agaggaccac gtggagaaag gggtccacca ggccccccag gcagagatgg
      301 tgaagatggt cccacaggcc ctcctggtcc acctggtcct cctggccccc ctggtctcgg
      361 tgggaacttt gctgctcagt atgatggaaa aggagttgga cttggccctg gaccaatggg
      421 cttaatggga cctagaggcc cacctggtgc agctggagcc ccaggccctc aaggtttcca
      481 aggacctgct ggtgagcctg gtgaacctgg tcaaactggt cctgcaggtg ctcgtggtcc
      541 agctggccct cctggcaagg ctggtgaaga tggtcaccct ggaaaacccg gacgacctgg
      601 tgagagagga gttgttggac cacagggtgc tcgtggtttc cctggaactc ctggacttcc
      661 tggcttcaaa ggcattaggg gacacaatgg tctggatgga ttgaagggac agcccggtgc
      721 tcctggtgtg aagggtgaac ctggtgcccc tggtgaaaat ggaactccag gtcaaacagg
      781 agcccgtggg cttcctggtg agagaggacg tgttggtgcc cctggcccag ctggtgcccg
      841 tggcagtgat ggaagtgtgg gtcccgtggg tcctgctggt cccattgggt ctgctggccc
      901 tccaggcttc ccaggtgccc ctggccccaa gggtgaaatt ggagctgttg gtaacgctgg
      961 tcctgctggt cccgccggtc cccgtggtga agtgggtctt ccaggcctct ccggccccgt
     1021 tggacctcct ggtaatcctg gagcaaacgg ccttactggt gccaagggtg ctgctggcct
     1081 tcccggcgtt gctggggctc ccggcctccc tggaccccgc ggtattcctg gccctgttgg
     1141 tgctgccggt gctactggtg ccagaggact tgttggtgag cctggtccag ctggctccaa
     1201 aggagagagc ggtaacaagg gtgagcccgg ctctgctggg ccccaaggtc ctcctggtcc
     1261 cagtggtgaa gaaggaaaga gaggccctaa tggggaagct ggatctgccg gccctccagg
     1321 acctcctggg ctgagaggta gtcctggttc tcgtggtctt cctggagctg atggcagagc
     1381 tggcgtcatg ggccctcctg gtagtcgtgg tgcaagtggc cctgctggag tccgaggacc
     1441 taatggagat gctggtcgcc ctggggagcc tggtctcatg ggacccagag gtcttcctgg
     1501 ttcccctgga aatatcggcc ccgctggaaa agaaggtcct gtcggcctcc ctggcatcga
     1561 cggcaggcct ggcccaattg gcccagctgg agcaagagga gagcctggca acattggatt
     1621 ccctggaccc aaaggcccca ctggtgatcc tggcaaaaac ggtgataaag gtcatgctgg
     1681 tcttgctggt gctcggggtg ctccaggtcc tgatggaaac aatggtgctc agggacctcc
     1741 tggaccacag ggtgttcaag gtggaaaagg tgaacagggt cccgctggtc ctccaggctt
     1801 ccagggtctg cctggcccct caggtcccgc tggtgaagtt ggcaaaccag gagaaagggg
     1861 tctccatggt gagtttggtc tccctggtcc tgctggtcca agaggggaac gcggtccccc
     1921 aggtgagagt ggtgctgccg gtcctactgg tcctattgga agccgaggtc cttctggacc
     1981 cccagggcct gatggaaaca agggtgaacc tggtgtggtt ggtgctgtgg gcactgctgg
     2041 tccatctggt cctagtggac tcccaggaga gaggggtgct gctggcatac ctggaggcaa
     2101 gggagaaaag ggtgaacctg gtctcagagg tgaaattggt aaccctggca gagatggtgc
     2161 tcgtggtgct catggtgctg taggtgcccc tggtcctgct ggagccacag gtgaccgggg
     2221 cgaagctggg gctgctggtc ctgctggtcc tgctggtcct cggggaagcc ctggtgaacg
     2281 tggcgaggtc ggtcctgctg gccccaacgg atttgctggt ccggctggtg ctgctggtca
     2341 accgggtgct aaaggagaaa gaggagccaa agggcctaag ggtgaaaacg gtgttgttgg
     2401 tcccacaggc cccgttggag ctgctggccc agctggtcca aatggtcccc ccggtcctgc
     2461 tggaagtcgt ggtgatggag gcccccctgg tatgactggt ttccctggtg ctgctggacg
     2521 gactggtccc ccaggaccct ctggtatttc tggccctcct ggtccccctg gtcctgctgg
     2581 gaaagaaggg cttcgtggtc ctcgtggtga ccaaggtcca gttggccgaa ctggagaagt
     2641 aggtgcagtt ggtccccctg gcttcgctgg tgagaagggt ccctctggag aggctggtac
     2701 tgctggacct cctggcactc caggtcctca gggtcttctt ggtgctcctg gtattctggg
     2761 tctccctggc tcgagaggtg aacgtggtct acctggtgtt gctggtgctg tgggtgaacc
     2821 tggtcctctt ggcattgccg gccctcctgg ggcccgtggt cctcctggtg ctgtgggtag
     2881 tcctggagtc aacggtgctc ctggtgaagc tggtcgtgat ggcaaccctg ggaacgatgg
     2941 tcccccaggt cgcgatggtc aacccggaca caagggagag cgcggttacc ctggcaatat
     3001 tggtcccgtt ggtgctgcag gtgcacctgg tcctcatggc cccgtgggtc ctgctggcaa
     3061 acatggaaac cgtggtgaaa ctggtccttc tggtcctgtt ggtcctgctg gtgctgttgg
     3121 cccaagaggt cctagtggcc cacaaggcat tcgtggcgat aagggagagc ccggtgaaaa
     3181 ggggcccaga ggtcttcctg gcttaaaggg acacaatgga ttgcaaggtc tgcctggtat
     3241 cgctggtcac catggtgatc aaggtgctcc tggctccgtg ggtcctgctg gtcctagggg
     3301 ccctgctggt ccttctggcc ctgctggaaa agatggtcgc actggacatc ctggtacggt
     3361 tggacctgct ggcattcgag gccctcaggg tcaccaaggc cctgctggcc cccctggtcc
     3421 ccctggccct cctggacctc caggtgtaag cggtggtggt tatgactttg gttacgatgg
     3481 agacttctac agggctgacc agcctcgctc agcaccttct ctcagaccca aggactatga
     3541 agttgatgct actctgaagt ctctcaacaa ccagattgag acccttctta ctcctgaagg
     3601 ctctagaaag aacccagctc gcacatgccg tgacttgaga ctcagccacc cagagtggag
     3661 cagtggttac tactggattg accctaacca aggatgcact atggatgcta tcaaagtata
     3721 ctgtgatttc tctactggcg aaacctgtat ccgggcccaa cctgaaaaca tcccagccaa
     3781 gaactggtat aggagctcca aggacaagaa acacgtctgg ctaggagaaa ctatcaatgc
     3841 tggcagccag tttgaatata atgtagaagg agtgacttcc aaggaaatgg ctacccaact
     3901 tgccttcatg cgcctgctgg ccaactatgc ctctcagaac atcacctacc actgcaagaa
     3961 cagcattgca tacatggatg aggagactgg caacctgaaa aaggctgtca ttctacaggg
     4021 ctctaatgat gttgaacttg ttgctgaggg caacagcagg ttcacttaca ctgttcttgt
     4081 agatggctgc tctaaaaaga caaatgaatg gggaaagaca atcattgaat acaaaacaaa
     4141 taagccatca cgcctgccct tccttgatat tgcacctttg gacatcggtg gtgctgacca
     4201 tgaattcttt gtggacattg gcccagtctg tttcaaataa atgaactcaa tctaaattaa
     4261 aaaagaaaga aatttgaaaa aactttctct ttgccatttc ttcttcttct tttttaactg
     4321 aaagctgaat ccttccattt cttctgcaca tctacttgct taaattgtgg gcaaaagaga
     4381 aaaagaagga ttgatcagag cattgtgcaa tacagtttca ttaactcctt cccccgctcc
     4441 cccaaaaatt tgaatttttt tttcaacact cttacacctg ttatggaaaa tgtcaacctt
     4501 tgtaagaaaa ccaaaataaa aattgaaaaa taaaaaccat aaacatttgc accacttgtg
     4561 gcttttgaat atcttccaca gagggaagtt taaaacccaa acttccaaag gtttaaacta
     4621 cctcaaaaca ctttcccatg agtgtgatcc acattgttag gtgctgacct agacagagat
     4681 gaactgaggt ccttgttttg ttttgttcat aatacaaagg tgctaattaa tagtatttca
     4741 gatacttgaa gaatgttgat ggtgctagaa gaatttgaga agaaatactc ctgtattgag
     4801 ttgtatcgtg tggtgtattt tttaaaaaat ttgatttagc attcatattt tccatcttat
     4861 tcccaattaa aagtatgcag attatttgcc caaagttgtc ctcttcttca gattcagcat
     4921 ttgttctttg ccagtctcat tttcatcttc ttccatggtt ccacagaagc tttgtttctt
     4981 gggcaagcag aaaaattaaa ttgtacctat tttgtatatg tgagatgttt aaataaattg
     5041 tgaaaaaaat gaaataaagc atgtttggtt ttccaaaaga acatat
//