LOCUS       HUMAPOAICI              8966 bp    DNA     linear   HUM 30-OCT-2000
DEFINITION  Human apolipoprotein A-I and C-III genes, complete cds.
ACCESSION   J00098 J00099 J00100 J00101 J03222 K01518 M10372
VERSION     J00098.1
KEYWORDS    apolipoprotein; apolipoprotein A-I; apolipoprotein C-III;
            glycoprotein; high density lipoprotein; isoform; lipoprotein;
            polymorphism; very low density lipoprotein.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
COMMENT     [3]  sites; translational and post-translational A-I cleavage
            [14]  sites; gene order and transcription directions. Draft entry
            and computer-readable sequence for [15] kindly provided by K.Reue,
            22-MAR-1988. Apolipoprotein A-I is the major protein component of
            high density lipoprotein (HDL) in the plasma. Synthesized in the
            liver and small intestine, the mature protein is a single chain of
            243 amino acids; an 18-amino acid signal peptide is removed
            co-translationally and a 6-amino acid propeptide is cleaved
            post-translationally [7].  Variation in the latter step, in
            addition to modifications leading to so-called isoforms, is
            responsible for some of the polymorphism observed.  The A-I gene
            structure resembles the C-III gene structure included in this
            entry, but most resembles the A-IV and E genes [14], suggesting
            that these apolipoprotein gene sequences have arisen by intergenic
            and intragenic duplication. In particular, a 66 bp repeat (starting
            at base 1843) characterizes the fourth exon of the A-I, A-IV and E
            genes [8],[14].  The mRNA initiation site for A-I has not been
            exactly specified; if initiation is around base 469 as argued by
            [9] and [12], a potential TATA box is found at 438-443. Because
            some of the sequence differences are thought to be natural
            polymorphisms, we have annotated all differences as variations. The
            MspI site at base 1221 is responsible for RFLP [9].  The assignment
            of the A-I, C-III and A-IV genes to chromosome 11 is from Bruns,
            Karathanasis and Breslow (1984) Arteriosclerosis 4, 97-102. The
            linkage of these genes is demonstrated by [4],[11] and [14]. The
            A-I and A-IV genes are transcribed from the same strand (see
            segment 2 of this entry), while the C-III gene is transcribed
            convergently in relation to A-I. It remains open whether these
            genes are coordinately controlled.  The C-III gene product is a
            major component of the very low density lipoprotein (VLDL) of
            plasma, and a minor component of the HDL. It is a glycoprotein
            existing as three or more isoforms. While it has a signal peptide,
            no propeptide is observed.  Variation in the C-III gene occurs at
            5163 in such a way as to promote hypertriglyceridaemia [4] (SacI
            site RFLP). A potential TATA box is seen at 8170-8176; potential
            CAAT boxes are situated at 8229-8237 and 8243-8251; and a possible
            polyadenylation signal can be found at 5037-5042 (all on the
            complementary strand) [11].  Draft entry and clean copy of the
            sequences for [9] kindly provided by J.Seilhamer, 23-MAY-1985.
            Complete source information:
            Human DNA [5],[8],[9],[12],[11] and liver, cDNA to mRNA [1],[2],
FEATURES             Location/Qualifiers
     source          1..8966
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
     variation       238..239
                     /note="gc in [5]"
     variation       282..284
                     /note="acg in [5]"
     variation       288..289
                     /note="cg in [5]"
     variation       310..312
                     /note="cct in [5]"
     variation       323
                     /note="t in [5]"
     variation       394
                     /note="a in [5]"
     variation       401..403
                     /note="ggc in [5]"
     gene            411..2335
     mRNA            join(411..486,683..745,932..1088,1677..2335)
                     /product="preproapolipoprotein A-I"
     variation       509..510
                     /note="gc in [5]"
     variation       562..563
                     /note="gg in [5],[9]"
     variation       612..613
                     /note="at in [5],[9]"
     variation       618..620
                     /note="ggg in [5],[9]"
     variation       653..655
                     /note="ccc in [5],[9]"
     mRNA            join(703..745,932..1088,1677..2280)
                     /product="preproapolipoprotein A-I"
     CDS             join(703..745,932..1088,1677..2280)
                     /note="preproapolipoprotein A-I"
     sig_peptide     join(703..745,932..942)
     mat_peptide     join(961..1088,1677..2277)
                     /product="apolipoprotein A-I"
     variation       766..767
                     /note="ga in [5]"
     variation       785
                     /note="t in [5]"
     variation       808..810
                     /note="cac in [5]"
     variation       1800
                     /note="a in [5]"
     old_sequence    1984
     old_sequence    2124
     old_sequence    2282..2283
     gene            complement(5016..8418)
     mRNA            complement(join(5016..5323,7161..7284,7420..7487,
                     /product="preapolipoprotein C-III"
     variation       5132
                     /note="a in [11]"
     variation       5163
                     /note="g in [10],[12],[11]; c in [4] (RFLP for SacI)"
     mRNA            complement(join(5203..5323,7161..7284,7420..7474))
                     /product="preapolipoprotein C-III"
     CDS             complement(join(5203..5323,7161..7284,7420..7474))
                     /note="preapolipoprotein C-III"
     mat_peptide     complement(join(5206..5323,7161..7237))
                     /product="apolipoprotein C-III"
     variation       7238
                     /note="g in [11]"
BASE COUNT         2045 a         2581 c         2645 g         1693 t
ORIGIN
        1 ctaaagaaga gcactggtgg gaggacaggg cgggggaagg gggaggggag tgaagtagtc
       61 tccctggaat gctggtggtg ggggaggcag tctccttggt ggaggagtcc cagcgtccct
      121 cccctcccct cctctgccaa cacaatggac aatggcaact gcccacacac tcccatggag
      181 gggaagggga tgagtgcagg gaaccccgac cccacccggg agacctgcaa gcctgcagca
      241 ctcccctccc gcccccactg aacccttgac ccctgccctg cacgccccgc agcttgctgt
      301 ttgcccactc ctatttgccc agtcccaggg acagagctga tccttgaact cttaagttcc
      361 acattgccag gaccagtgag cagcaacagg gccagggctg ggcttatcag cctcccagcc
      421 cagaccctgg ctgcagacat aaataggccc tgcaagagct ggctgcttag agactgcgag
      481 aaggaggtgc gtcctgctgc ctgccccggc actctggctc cccagctcaa ggttcaggcc
      541 ttgccccagg ccgggcctct gggtacctga ggtcttctcc cgctctgtgc ccttctcctc
      601 acctggctgc aatgagtggg ggagcacggg gcttctgcat gctgaaggca ccccactcag
      661 ccaggccctt cttctcctcc aggtccccca cggcccttca ggatgaaagc tgcggtgctg
      721 accttggccg tgctcttcct gacgggtagg tgtcccctaa cctaggagcc aaccatcggg
      781 gggctttctc cctaaatccc cgtggcccac cctcctgggc agaggcagca ggtttctcac
      841 tggccccctc tcccccacct ccaagcttgg cctttcggct cagatctcag cccacagctg
      901 gcctgatctg ggtctcccct cccaccctca gggagccagg ctcggcattt ctggcagcaa
      961 gatgaacccc cccagagccc ctgggatcga gtgaaggacc tggccactgt gtacgtggat
     1021 gtgctcaaag acagcggcag agactatgtg tcccagtttg aaggctccgc cttgggaaaa
     1081 cagctaaagt aaggacccag cctggggttg agggcagggg cagggggcag aggcctgtgg
     1141 gatgatgttg aagccagact ggccgagtcc tcacctaata tctgatgagc tgggccccac
     1201 agatggtctg gatggagaaa ccggaatgga tctccaggca gggtcacagc ccatgtcccc
     1261 tgcaaaggac agaccagggc tgcccgatgc gtgatcacag agccacattg tgcctgcaag
     1321 tgtagcaagc ccctttccct tcttcaccac ctcctctgct cctgcccagc aagactgtgg
     1381 gctgtcttcg gagaggagaa tgcgctggag gcatagaagc gaggtccttc aagggcccac
     1441 tttggagacc aacgtaactg ggcaccagtc ccagctctgt ctccttttta gctcctctct
     1501 gtgcctcggt ccagctgcac aacggggcat ggcctggcgg ggcaggggtg ttggttgaga
     1561 gtgtactgga aatgctaggc cactgcacct ccgcggacag gtgtcaccca gggctcaccc
     1621 ctgataggct ggggcgctgg gaggccagcc ctcaaccctt ctgtctcacc ctccagccta
     1681 aagctccttg acaactggga cagcgtgacc tccaccttca gcaagctgcg cgaacagctc
     1741 ggccctgtga cccaggagtt ctgggataac ctggaaaagg agacagaggg cctgaggcaa
     1801 gagatgagca aggatctgga ggaggtgaag gccaaggtgc agccctacct ggacgacttc
     1861 cagaagaagt ggcaggagga gatggagctc taccgccaga aggtggagcc gctgcgcgca
     1921 gagctccaag agggcgcgcg ccagaagctg cacgagctgc aagagaagct gagcccactg
     1981 ggcgaggaga tgcgcgaccg cgcgcgcgcc catgtggacg cgctgcgcac gcatctggcc
     2041 ccctacagcg acgagctgcg ccagcgcttg gccgcgcgcc ttgaggctct caaggagaac
     2101 ggcggcgcca gactggccga gtaccacgcc aaggccaccg agcatctgag cacgctcagc
     2161 gagaaggcca agcccgcgct cgaggacctc cgccaaggcc tgctgcccgt gctggagagc
     2221 ttcaaggtca gcttcctgag cgctctcgag gagtacacta agaagctcaa cacccagtga
     2281 ggcgcccgcc gccgcccccc ttcccggtgc tcagaataaa cgtttccaaa gtgggaagca
     2341 gcttctttct tttgggagaa tagagggggg tgcggggaca tccgggggag cccgggaggg
     2401 gcctttggcc ctggagcagg gacttcctgc cggatctcaa caactccgtg cccagactgg
     2461 acgtcttagg gccaagatcg acgttggagg acctgctgga cgcntggctg cttacgagtg
     2521 agggagtaga gtctgcctta gcaaggctca agtagaaagg aagtcacagc ggacnaggca
     2581 aagccacaga caatccaagg ccaggtgccc tgaaaggggc tcaaacaagg cctgcagcct
     2641 gtctgaggcg ggccaggaaa cagcggttgc tttagctggg agcagtcggg ttccccgtcc
     2701 ccagaggtgt gtccgtatag agccttctcc agcccagccg ctgtcagcgg gcgggacgga
     2761 gcggggcgcc tcagggagcc agccactggg attggggttt ggtcccgggt gcaagtgaag
     2821 cgcttggagt ttgcgcctgt cctcctttac taattcaaaa acctctcaaa cagacacttc
     2881 ccttttcttc tcacaaggcc agtatccccc tcccactact cccatcccgc ccagaaacag
     2941 ccgcggcttc ctcaggcaca gcagtggaag ccagtcctcc accccctgcg gctccatgcc
     3001 atgccacccc ctctttctgc cagccctggc agaagctggc ctgagtaaga aaattcacca
     3061 ccacctcttg caggtacatt tttatttcca agatgctctc atatctgtgc tctcactgca
     3121 tcctcccttc cccacatcct ggctagattg ccatcagacg cagagcatgg atgaggacac
     3181 tgaagcctgg acctgtgacg tcgcttgccc agtgaacagc aggatgggct aggcagcgct
     3241 ttttagaccc tgcacccctg gccatccatg attattgaaa agagtgtgcg ggtcgggtgc
     3301 ggtggctcaa gcctgtaatc ccagcacttt gggaggctga ggtgggcgta tcacttcagg
     3361 gccaggagtt tgagaccagc ctggccaata tggtgaaacc ctgtctctac taaaaataca
     3421 aaaaaaaaat cagctggcat ggtggcttgc acccgtaatc ccagctacta ggaaggctga
     3481 ggcaggagaa tcgcttgaac ctgggaggca gaggtcacag tgagccgaaa tcatgccact
     3541 gcactccagc ctgggcgacg gagcaagact ccagctaaaa aaaaaaaaaa aaaaaaaaag
     3601 agtgtgtggc ctggcactca agttcacatg ggtgtgcagg catgcctgtg tattctcaca
     3661 tgacctccct gctcacggtc cctccttgca ctcatgtctg aatgtccccg ctgcacgcac
     3721 atggcttcac agatctggca gtgccttccc taccctctct ctgcagggcc ttttgccccc
     3781 tcatgcaggc ccctggataa tcggccccat ccccatgtcc ccatctccag tgtatcttag
     3841 ctaccctagg taaaggagtg ggctttttag ttcctaacct tccagagcta caacagcagt
     3901 catccagcca ggtctgggtg ggaacatttt ctagatacgg gtgctgagat ctctcagccc
     3961 agagagaagc cctggggaat tttcagagag aaagcagtct ccaggtgggg ctggatgtac
     4021 tgatgccact gagatctgta aaggagtccc taacacctga cataggagtg acaaaactgt
     4081 tttctgcacc aactgagcag aatacacgca gctgacctgg gctcaaggtc tggccctgcc
     4141 acgtgctggc tctgtgatgc tggccaagtg ccttcgcctc tccgggccac agttttttga
     4201 tctgaagagt ggagccctac tcaagccatc tgcagctctc gggctctctg acctgacatc
     4261 tttcgggtgg tggggacaca aaggaagcag cctctattgg gagaccttgt gcttcttttt
     4321 ggtcccagga cactgccccc caccactcca gtccggtccc aagggcccag tcagctcaac
     4381 tgtaatcatg acaacattga tcaagcatct ttacgtgcag gtgctgtgcc aaacggttcg
     4441 aacgctctct catttcaatc tcacggcaaa cctacggtgg agggggtacg gttgtatcca
     4501 ctttacatgt aagaaactga ggctgatatc aagtggtgga gccaagaata gtgcctcgtt
     4561 gcatcttact ccaacctcta gcccatccgg cctcctccct tcacgtgcgc ctaagagggc
     4621 taggtggcct ggatagggga ggtcagctcc acagttttga gtaaacacac acagtctcaa
     4681 ctctgatgac acttaagtgc caggcatagt ggctggcatg gggcacacac tcaagtcatg
     4741 ttgtgcagca cctaacagtt tatcaaagta tcagcaaact tattgtcctg tttgaccttc
     4801 cgcacaaagc tgtcaaggaa ggcagggtac ggagggtgat tcctacctta gagatgaaga
     4861 aactgaggcc cagagactag cccagctacc agaaggtgga tagagcgctg gcctccatgc
     4921 ctgcctgacc tggagtctgt ccagtgccca cccacagaac agcctcggcc ctttcccatg
     4981 cccagacaca gatggcacac ttgcgacggc ccactcatag cagcttcttg tccagcttta
     5041 ttgggaggcc agcatgcctg gaggggggcc aggcatgagg tggggtagga gagcactgag
     5101 aatactgtcc cttttaagca acctacaggg gaagccctgg agattgcagg acccaaggag
     5161 ctggcaggat ggataggcag gtggacttgg ggtattgagg tctcaggcag ccacggctga
     5221 agttggtctg acctcagggt ccaaatccca gaactcagag aacttgtcct taacggtgct
     5281 ccagtagtct ttcagggaac tgaagccatc ggtcacccag cccctaaatc agtcagggga
     5341 agcaacagag cagggcatga gaactcctct gtaggcaacc atgggaccca cacccatgtc
     5401 cccactggac gacaccagtc aggaccacac caccctctca acttcactgg acgacagccc
     5461 tgagacctca ggcaggaatc ccccacagct gccaccctgg gagaagagat atccttgcag
     5521 gaaccccagc acaagtcaaa ccctgccatc tccaggtcac ctcagatgtt tatgcccctg
     5581 ggcctgaggc acaaagtgac agggtgggga gatgtgaaag gtcaaggctg tcattgtttt
     5641 ctgtgcaaac agcaccacct ggagttgcac aacctggtgg ctctgagcag ggtaggacag
     5701 agggaggcag cctctcattt ggaaagtcat tggaggattg attagttgtg tgatctgggg
     5761 caggtcacct aatcgctctg agcctcaatt ttctcatctg caaagtgaga aaataacacc
     5821 taccccaaag ccggttctgg ggactaagaa tgtttatgaa cacctctgct atgccagcta
     5881 atgccagtca gggtgaggtg gagaagggag taggggagag gagagtactg atgggcaggg
     5941 gcaggatggg agaggaagga cacacttctg ctcacactgg ccccaggtct tagaacaaag
     6001 cagaagcact cacgggcttg aattgggtca ggtggggcct cccaaggcaa acccctagcc
     6061 cagggtcccc aacccccaag tcatagatag gtactggtcc atggactgtt aggagccggg
     6121 ctgcacagca cgaagtgagt ggtgggtgag tgagcaatac gcctgagctc cgcctcctgt
     6181 cagatcaagt ggccttacat tctcatagga gtgtgaaccc tattgtgaac tgcacatgcc
     6241 agggatctag gtggagggct ccttatgaga atctaatgcc tgatgatctg aggtggaaca
     6301 gcttcatccc aaaaccatcc ccgctggccc atggaaaaat tgtccaccac aaaactggtc
     6361 cctggtgcca aaaaggctga ggaccactgc tctagccaga ccttcagaaa aggaaaatgg
     6421 ggccaggcgc agtggctcgt gcctgtaatc ccagcacttt gggaggccga ggcaggagga
     6481 tcccctgagg tcaggaattc aagaccaacc tggccaacat ggtgaaaccc catctctgct
     6541 aaaaatacaa aaattagctg ggtgtggtgg cgcgtgcctg taatcccagc tacttgggag
     6601 gctgaggcag gagaatgggt tgaacccggg agacggaggt tgcagtgagc cgagatggca
     6661 ccactgcact ccagcctggg tgacagaggg agactccatt aaaaaaaaaa gaagaagaaa
     6721 gaaagagaga aaggaagaga gggaaagaaa gaaaaaggaa agaaagaaag aaagagaaag
     6781 aaagaaaaga aagaaagaaa ggaaagaaag aaaggaaaga aagaaagaaa gaaagaaaga
     6841 aagaaagaaa gaaagaaaga aagaaaaata gaaagagaaa gaaagggaaa ggaaaggaaa
     6901 atggaaatga gggctaaaac ggcgcggccc taggactgct ccgggagaaa ggagggacct
     6961 gggaggaatg cagtctcaac tctgtcatct ctcccgcagc agcctgacaa aggccctgtg
     7021 aggaaagagg aggctgaaga gcacaggccc aggggaggct ggagcacctc cattccattg
     7081 ttgggatctc accagggcag gggcgggtgg gaatggaggc agctggcagg ggggatgggg
     7141 agggaggcca gcgggtgtac ctggcctgct gggccacctg ggactcctgc acgctgctca
     7201 gtgcatcctt ggcggtcttg gtggcgtgct tcatgtagcc ctgcatgaag ctgagaaggg
     7261 aggcatcctc ggcctctgaa gctcctgagg aaagagcagg gctgagtggg gtggatcggc
     7321 ctctggacga gccctggggc tcctgcttga ccacccattg gactgggatc cccaagttgc
     7381 ctccaccctg cccccagccc agtcccacca agtgcttacg ggcagaggcc aggagcgcca
     7441 ggagggcaac aacaaggagt acccggggct gcatggcacc tctgttcctg caaggaagtg
     7501 tcctgtgagg ggcaccccag gtccccatgc ctctggaccc ctccctgggg aggtggcgtg
     7561 gcccctaagg tagaacctta gctgggtctg ccagaaggag taggggccgg ctcctgctca
     7621 tacgggctct cagaaggggg actggtgagc ggcgagggat cgaggcccaa aggaggtggg
     7681 tgggatggag cagaaaaccc accagactga acatcaaggc actgcggtct ggactgatct
     7741 ccgtccagtc cagccaacat gctgtgtgtc tttgggtgat ttctggccct cttccaggcc
     7801 tcagtttccc tgtctggggt aggactgggc tgtctaaggc ttcctccagc cctaagcctg
     7861 aagaatgagg ggggaacctg cacttggagc cacttccagc cccaccccct atgtagcttt
     7921 gggcaagtga cacctcccgg gcctccatgt tcttcaggtt atgatgaggg gtggggggca
     7981 cccgtccagc tccgaggctt ccttagctct agcaagtgct tctccaggct tgctggctgg
     8041 gctgggcagg gagctcctct tgcccctctt catcctcctc ccctcctctt tcccctcccc
     8101 agagggcatt acctggagca gctgcctcta gggatgaact gagcagacag gcaggagggt
     8161 tctgacctgt tttatatcat ctccagggca gcaggcactg aggacccagg gcgctgggca
     8221 aaggtcacct gctgaccagt ggagatgagg gcctgaggca gggtgtccag atgcagcaag
     8281 cgggcgggag agttgggaaa tccctaggag actgagtcca cgctgctgtc ccgccagccc
     8341 tgcagcccag atgagctcag gaactggggg tgggcctggg gagcctcatt ccttcctagc
     8401 tgactggctc cccagggaga ggctggtgag aggggaaatg gcttggacat aggccagcgc
     8461 tcgggcccat ctcagccttt cacactggaa tttcaggccc ctccctccac cagccccaag
     8521 ccctgaacac agcctggagt agaggggtga ggggcttctt cagacttgag aacaagtggg
     8581 tggcttgggc tggggggtgt ttggagtaaa ggcacagaag accgggcatc agtggctcgc
     8641 aaggcctggt tacagggctg agctcacagc ccctcccagc acctccatct ctgggtttca
     8701 atccaggcag gcgagtgctg ctcacccagc agccccccac ccgcccccac cctgtgtgcc
     8761 ccaccgccgc ctccccctga gtgtagggca ggggttggtg gagaagcgca aggccgctca
     8821 gagcccgagg cctttgcccc tccctccacc aggctcccta ttttgccctc tggacccact
     8881 gataacatcc ctgggggagg agcccccacc aggactgatt ctctcgttca ccttcctggg
     8941 ttcctcagtg gggctggggg aagagc