LOCUS       MG848611                7901 bp    DNA     linear   VRL 20-MAR-2018
DEFINITION  Human papillomavirus 16 isolate NCI_123034, partial genome.
ACCESSION   MG848611
VERSION     MG848611.1
KEYWORDS    .
SOURCE      Human papillomavirus 16
  ORGANISM  Human papillomavirus 16
            Viruses; Monodnaviria; Shotokuvirae; Cossaviricota;
            Papovaviricetes; Zurhausenvirales; Papillomaviridae;
            Firstpapillomavirinae; Alphapapillomavirus; Alphapapillomavirus 9.
REFERENCE   1  (bases 1 to 7901)
  AUTHORS   Mirabello,L., Yeager,M., Yu,K., Clifford,G., Xiao,Y., Zhu,B.,
            Cullen,M., Boland,J., Wentzensen,N., Nelson,C., Raine-Bennett,T.,
            Zigui,C., Bass,S., Song,L., Yang,Q., Steinberg,M., Burdett,L.,
            Dean,M., Roberson,D., Mitchell,J., Lorey,T., Franceschi,S.,
            Castle,P., Walker,J., Zuna,R., Kreimer,A., Beachler,D.,
            Hildesheim,A., Gonzalez,P., Porras,C., Burk,R. and Schiffman,M.
  TITLE     HPV16 E7 Genetic Conservation Is Critical to Carcinogenesis
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 7901)
  AUTHORS   Mirabello,L., Yeager,M., Yu,K., Clifford,G., Xiao,Y., Zhu,B.,
            Cullen,M., Boland,J., Wentzensen,N., Nelson,C., Raine-Bennett,T.,
            Zigui,C., Bass,S., Song,L., Yang,Q., Steinberg,M., Burdett,L.,
            Dean,M., Roberson,D., Mitchell,J., Lorey,T., Franceschi,S.,
            Castle,P., Walker,J., Zuna,R., Kreimer,A., Beachler,D.,
            Hildesheim,A., Gonzalez,P., Porras,C., Burk,R. and Schiffman,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (25-JAN-2018) Division of Cancer Epidemiology and
            Genetics, National Cancer Institute, National Institutes of
            Health, Bethesda, MD 20850, USA
COMMENT     ##Assembly-Data-START##
            Assembly Method       :: Torrent Mapping Alignment Program v.
                                     v5.0.13
            Sequencing Technology :: IonTorrent
            ##Assembly-Data-END##
FEATURES             Location/Qualifiers
     source          1..7901
                     /organism="Human papillomavirus 16"
                     /mol_type="genomic DNA"
                     /isolate="NCI_123034"
                     /host="Homo sapiens"
                     /db_xref="taxon:333760"
                     /geo_loc_name="USA"
     gene            79..555
                     /gene="E6"
     CDS             79..555
                     /gene="E6"
                     /codon_start=1
                     /product="E6"
                     /protein_id="AVN73810.1"
                     /translation="MHQKRTAMFQDPQERPRKLPQLCTELQTTIHDIILEYVYCKQQL
                     LRREVYDFAFRDLCIVYRDGNPYAVCDKCLKFYSKISEYRHYCYSVYGTTLEQQYNKP
                     LCDLLIRCINCQKPLCPEEKQRHLDKKQRFHNIRGRWTGRCMSCCRSSRTRRETQL"
     gene            558..854
                     /gene="E7"
     CDS             558..854
                     /gene="E7"
                     /codon_start=1
                     /product="E7"
                     /protein_id="AVN73811.1"
                     /translation="MHGDTPTLHEYMLDLQPETTDLYCYEQLNDSSEEEDEIDGPAGQ
                     AEPDRAHYNIVTFCCKCDSTLRLCVQSTHVDIRTLEDLLMGTLGIVCPICSQKP"
     gene            861..2810
                     /gene="E1"
     misc_feature    861..2810
                     /gene="E1"
                     /note="nonfunctional E1 due to mutation"
     gene            2752..3849
                     /gene="E2"
     CDS             2752..3849
                     /gene="E2"
                     /codon_start=1
                     /product="E2"
                     /protein_id="AVN73812.1"
                     /translation="METLCQRLNVCQDKILTHYXNDSTDLRDHIDYWKHMRLECAIYY
                     KAREMGFKHINHQVVPTLAVSKNKALQAIELQLTLETIYNSQYSNEKWTLQDVSLEVY
                     LTAPTGCIKKHGYTVEVXFDGDICNTMHYTNWTHIYICEEASVTVVEGQVDYYGLYYV
                     HEGIRTYFVQFKDDAEKYSKNKVWEVHAGGQVILCPTSVFSSNEVSSPEIIRQHLANH
                     SAATHTKAVALGTEETQTTIQRPRSEPDTGNPCHTTKLLHRDSVDSAPILTAFNSSHK
                     GRINCNSNTTPIVHLKGDANTLKCLRYRFKKHCTLYTAVSSTWHWTGHNVKHKSAIVT
                     LTYDSEWQRDQFLSQVKIPKTITVSTGFMSI"
     gene            3329..3616
                     /gene="E4"
     misc_feature    3329..3616
                     /gene="E4"
                     /note="nonfunctional E4 due to mutation"
     gene            3846..4097
                     /gene="E5"
     CDS             3846..4097
                     /gene="E5"
                     /codon_start=1
                     /product="E5"
                     /protein_id="AVN73813.1"
                     /translation="MTNLDTASTTLLACFLLCFCVLLCVCLLIRPLLXXXXXXXXXXX
                     XXXXXXXXXXXAFRCFIVYIVFVYIPLFLIHTHARFLIT"
     gene            4233..5654
                     /gene="L2"
     CDS             4233..5654
                     /gene="L2"
                     /codon_start=1
                     /product="L2"
                     /protein_id="AVN73814.1"
                     /translation="MRHKRSAKRTKRASATQLYKTCKQAGTCPPDIIPKVEGKTIADQ
                     ILQYGSMGVFFGGLGIGTGSGTGGRTGYIPLGTRPPTATDTLAPVRPPLTVDPVGPSD
                     PSIVSLVEETSFIDAGAPTSVPSIPPDVSGFSITTSTDTTPAILDINNTVTTVTTHNN
                     PTFTDPSVLQPPTPAETGGHFTLSSSTISTHNYEEIPMDTFIVSTNPNTVTSSTPIPG
SRPVARLGLYSRTTQQVKVVDPAFVTTPTKLITYDNPAYEGIDVDNTLYFSSNDNSIN IAPDPDFLDIVALHRPALTSRRTGIRYSRIGNKQTLRTRSGKSIGAKVHYYYDLSTID PAEEIELQTITPSTYTTTSHAASPTSINNGLYDIYADDFITDTSTTPVPSVPSTSLSG YIPANTTIPFGGAYNIPLVSGPDIPINITDQAPSLIPIVPGSPQYTIIADAGDFYLHP SYYMLRKRRKRLPYFFSDVSLAA" BASE COUNT 2575 a 1357 c 1495 g 2387 t ORIGIN 1 caataattca tgtataaaac taagggcgta accgaaatcg gttgaaccga aaccggttag 61 tataaaagca gacattttat gcaccaaaag agaactgcaa tgtttcagga cccacaggag 121 cgacccagaa agttaccaca gttatgcaca gagctgcaaa caactataca tgatataata 181 ttagaatatg tgtactgcaa gcaacagtta ctacgacgtg aggtatatga ctttgctttt 241 cgggatttat gcatagtata tagagatggg aatccatatg ctgtatgtga taaatgttta 301 aagttttatt ctaaaattag tgagtataga cattattgtt atagtgtgta tggaacaaca 361 ttagaacagc aatacaacaa accgttgtgt gatttgttaa ttaggtgtat taactgtcaa 421 aagccactgt gtcctgaaga aaagcaaaga catctggaca aaaagcaaag attccataat 481 ataaggggtc ggtggaccgg tcgatgtatg tcttgttgca gatcatcaag aacacgtaga 541 gaaacccagc tgtaatcatg catggagata cacctacatt gcatgaatat atgttagatt 601 tgcaaccaga gacaactgat ctctactgtt atgagcaatt aaatgacagc tcagaggagg 661 aggatgaaat agatggtcca gctggacaag cagaaccgga cagagcccat tacaatattg 721 taaccttttg ttgcaagtgt gactctacgc ttcggttgtg cgtacaaagc acacacgtag 781 acattcgtac tttggaagac ctgttaatgg gcacactagg aattgtgtgc cccatctgtt 841 ctcagaaacc ataatctacc atggctgatc ctgcaggtac caatggggaa gagggtacgg 901 gatgtaatgg atggttttat gtagaggctg tagtggaaaa aaaaacaggg gatgctatat 961 cagatgacga gaacgaaaat gacagtgata caggtgaaga tttggtagat tttatagtaa 1021 atgataatga ttatttaaca caggcagaaa cagagacagc acatgcgttg tttactgcac 1081 aggaagcaaa acaacataga gatgcagtac aggttctaaa acgaaagtat ttgggtagtc 1141 cacttagtga tattagtgga tgtgtagaca ataatattag tcctagatta aaagctatat 1201 gtatagaaaa acaaagtaga gctgcaaaaa ggagattatt tgaaagcgaa gacagcgggt 1261 atggcaatac tgaagtggaa actcagcaga tgttacaggt agaagggcgc catgagactg 1321 aaacaccatg tagtcagtat agtggtggaa gtgggggtgg ttgcagtcag tacagtagtg 1381 gaagtggggg agagggtgtt agtgaaagac acactatatg ccaaacacca cttacaaata 1441 
ttttaaatgt actaaaaact agtaatgcaa aggcagcaat gttagcaaaa tttaaagagt 1501 tatacggggt gagttttaca gaattagtaa gaccatttaa aagtaataaa tcaacgtgtt 1561 gcgattggtg tattgctgca tttggactta cacccagtat agctgacagt ataaaaacac 1621 tattacaaca atattgttta tatttacaca ttcaaagttt agcatgttca tggggaatgg 1681 ttgtgttact attagtaaga tataaatgtg gaaaaaatag agaaacaatt gaaaaattgc 1741 tgtctaaact attatgtgtg tctccaatgt gtatgatgat agagcctcca aaattgcgta 1801 gtacagcagc agcattatat tggtataaaa caggtatatc aaatattagt gaagtgtatg 1861 gagacacgcc agaatggata caaagacaaa cagtattaca acatagtttt aatgattgta 1921 catttgaatt atcacagatg gtacaatggg cctacgataa tgacatagta gacgatagtg 1981 aaattgcata taaatatgca caattggcag acactaatag taatgcaagt gcctttctaa 2041 aaagtaattc acaggcaaaa attgtaaagg attgtgcaac aatgtgtaga cattataaac 2101 gagcagaaaa aaaacaaatg agtatgagtc aatggataaa atatagatgt gatagggtag 2161 atgatggagg tgattggaag caaattgtta tgtttttaag gtatcaaggt gtagagttta 2221 tgtcattttt aactgcatta aaaagatttt tgcaaggcat acctaaaaaa aattgcatat 2281 tactatatgg tgcagctaac acaggtaaat cattatttgg tatgagttta atgaaatttc 2341 tgcaagggtc tgtaatatgt tttgtaaatt ctaaaagcca tttttggtta caaccattag 2401 cagatgccaa aataggtatg ttagatgatg ctacagtgcc ctgttggaac tacatagatg 2461 acaatttaag aaatgcattg gatggaaatt tagtttctat ggatgtaaag catagaccat 2521 tggtacaact aaaatgccct ccattattaa ttacatctaa cattaatgct ggtacagatt 2581 ctaggtggcc ttatttacat aatagattgg tggtgtttac atttcctaat gagtttccat 2641 ttgacgaaaa cggaaatcca gtgtatgagc ttaatgataa gaactggaaa tcctttttct 2701 caaggacgtg gtccagatta agtttgcacg aggacgagga caaggaaaac gatggagact 2761 ctttgccaac gtttaaatgt gtgtcaggac aaaatactaa cacattatna aaatgatagt 2821 acagacctac gtgaccatat agactattgg aaacacatgc gcctagaatg tgctatttat 2881 tacaaggcca gagaaatggg atttaaacat attaaccacc aggtggtgcc aacactggct 2941 gtatcaaaga ataaagcatt acaagcaatt gaactgcaac taacgttaga aacaatatat 3001 aactcacaat atagtaatga aaagtggaca ttacaagacg ttagccttga agtgtattta 3061 actgcaccaa caggatgtat aaaaaaacat ggatatacag tggaagtgna gtttgatgga 3121 gacatatgca 
atacaatgca ttatacaaac tggacacata tatatatttg tgaagaagca 3181 tcagtaactg tggtagaggg tcaagttgac tattatggtt tatattatgt tcatgaagga 3241 atacgaacat attttgtgca gtttaaagat gatgcagaaa aatatagtaa aaataaagta 3301 tgggaagttc atgcgggtgg tcaggtaata ttatgtccta catctgtgtt tagcagcaac 3361 gaagtatcct ctcctgaaat tattaggcag cacttggcca accactccgc cgcgacccat 3421 accaaagccg tcgccttggg caccgaagaa acacagacga ctatccagcg accaagatca 3481 gagccagaca ccggaaaccc ctgccacacc actaagttgt tgcacagaga ctcagtggac 3541 agtgctccaa tcctcactgc atttaacagc tcacacaaag gacggattaa ctgtaatagt 3601 aacactacac ccatagtaca tttaaaaggt gatgctaata ctttaaaatg tttaagatat 3661 agatttaaaa agcattgtac attgtatact gcagtgtcgt ctacatggca ttggacagga 3721 cataatgtaa aacataaaag tgcaattgtt acacttacat atgatagtga atggcaacgt 3781 gaccaatttt tgtctcaagt taaaatacca aaaactatta cagtgtctac tggatttatg 3841 tctatatgac aaatcttgat actgcatcca caacattact ggcgtgcttt ttgctttgct 3901 tttgtgtgct tttgtgtgtc tgcctattaa tacgtccgct gcttttnnnn nnnnnnnnnn 3961 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn gcgtttaggt 4021 gttttattgt atatattgta tttgtttata taccattatt tttaatacat acacatgcac 4081 gctttttaat tacataatgt atatgtacat aatgtaattg ttacatataa ttgttgtata 4141 ccataactta ctattttttc ttttttattt tcatatataa tttttttttt tgtttgtttg 4201 tttgtttttt aataaactgt tatcacttaa caatgcgaca caaacgttct gcaaaacgca 4261 caaaacgtgc atcggctacc caactttata aaacatgcaa acaggcaggt acatgtccac 4321 ctgacattat acctaaggtt gaaggcaaaa ctattgctga tcaaatatta caatatggaa 4381 gtatgggtgt attttttggt gggttaggaa ttggaacagg gtcgggtaca ggcggacgca 4441 ctgggtatat tccattggga acaaggcctc ccacagctac agatacactt gctcctgtaa 4501 gacccccttt aacagtagat cctgtgggcc cttctgatcc ttctatagtt tctttagtgg 4561 aagaaactag ttttattgat gctggtgcac caacatctgt accttccatt cccccagatg 4621 tatcaggatt tagtattact acttcaactg ataccacacc tgctatatta gatattaata 4681 atactgttac tactgttact acacataata atcccacttt cactgaccca tctgtattgc 4741 agcctccaac acctgcagaa actggagggc attttacact ttcatcatcc actattagta 4801 cacataatta tgaagaaatt 
cctatggata catttattgt tagcacaaac cctaacacag 4861 taactagtag cacacccata ccagggtctc gcccagtggc acgcctagga ttatatagtc 4921 gcacaacaca acaagttaaa gttgtagacc ctgcttttgt aaccactccc actaaactta 4981 ttacatatga taatcctgca tatgaaggta tagatgtgga taatacatta tatttttcta 5041 gtaatgataa tagtattaat atagctccag atcctgactt tttggatata gttgctttac 5101 ataggccagc attaacctct aggcgtactg gcattaggta cagtagaatt ggtaataaac 5161 aaacactacg tactcgtagt ggaaaatcta taggtgctaa ggtacattat tattatgatt 5221 taagtactat tgatcctgca gaagaaatag aattacaaac tataacacct tctacatata 5281 ctaccacttc acatgcagcc tcacctactt ctattaataa tggattatat gatatttatg 5341 cagatgactt tattacagat acttctacaa ccccggtacc atctgtaccc tctacatctt 5401 tatcaggtta tattcctgca aatacaacaa ttccttttgg tggtgcatac aatattcctt 5461 tagtatcagg tcctgatata cccattaata taactgacca agctccttca ttaattccta 5521 tagttccagg gtctccacaa tatacaatta ttgctgatgc aggtgacttt tatttacatc 5581 ctagttatta catgttacga aaacgacgta aacgtttacc atattttttt tcagatgtct 5641 ctttggctgc ctagtgaggc cactgtctac ttgcctcctg tcccagtatc taaggttgta 5701 agcacggatg aatatgttgc acgcacaaac atatattatc atgcaggaac atccagacta 5761 cttgcagttg gacatcccta ttttcctatt aaaaaaccta acaataacaa aatattagtt 5821 cctaaagtat caggattaca atacagggta tttagaatac atttacctga ccccaataag 5881 tttggttttc ctgacacctc attttataat ccagatacac agcggctggt ttgggcctgt 5941 gtaggtgttg aggtaggtcg tggtcagcca ttaggtgtgg gcattagtgg ccatccttta 6001 ttaaataaat tggatgacac agaaaatgct agtgcttatg cagcaaatgc aggtgtggat 6061 aatagagaat gtatatctat ggattacaaa caaacacaat tgtgtttaat tggttgcaaa 6121 ccacctatag gggaacactg gggcaaagga tccccatgta ccaatgttgc agtaaatcca 6181 ggtgattgtc caccattaga gttaataaac acagttattc aggatggtga tatggttgat 6241 actggctttg gtgctatgga ctttactaca ttacaggcta acaaaagtga agttccactg 6301 gatatttgta catctatttg caaatatcca gattatatta aaatggtgtc agaaccatat 6361 ggcgacagct tattttttta tttacgaagg gaacaaatgt ttgttagaca tttatttaat 6421 agggctggtg ctgttggtga aaatgtacca gacgatttat acattaaagg ctctgggtct 6481 actgcaaatt tagccagttc aaattatttt 
cctacaccta gtggttctat ggttacctct 6541 gatgcccaaa tattcaataa accttattgg ttacaacgag cacagggcca caataatggc 6601 atttgttggg gtaaccaact atttgttact gttgttgata ctacacgcag tacaaatatg 6661 tcattatgtg ctgccatatc tacttcagaa actacatata aaaatactaa ctttaaggag 6721 tacctacgac atggggagga atatgattta cagtttattt ttcaactgtg caaaataacc 6781 ttaactgcag acgttatgac atacatacat tctatgaatt ccactatttt ggaggactgg 6841 aattttggtc tacaacctcc cccaggaggc acactagaag atacttatag gtttgtaaca 6901 tcccaggcaa ttgcttgtca aaaacataca cctccagcac ctaaagaaga tccccttaaa 6961 aaatacactt tttgggaagt aaatttaaag gaaaagtttt ctgcagacct agatcagttt 7021 cctttaggac gcaaattttt actacaagca ggattgaagg ccaaaccaaa atttacatta 7081 ggaaaacgaa aagctacacc caccacctca tctacctcta caactgctaa acgcaaaaaa 7141 cgtaagctgt aagtattgta tgtatgttga attagtgttg tttgttgttt atatgtttgt 7201 atgtgcttgt atgtgcttgt aaatattaag ttgtatgtgt gtttgtatgt atggtataat 7261 aaacacgtgt gtatgtgttt ttaaatgctt gtgtaactat tgtgtcatgc aacataaata 7321 aacttattgt ttcaacacct actaattgtg ttgtggttat tcattgtata taaactatat 7381 ttgctacatc ctgtttttgt tttatatata ctatattttg tagcgccagc gnnnnnnnnn 7441 nnnnnnnnnn nnaattcggt tgcatgcttt ttggcacaaa atgtgttttt ttaaatagtt 7501 ctatgtcagc aactatagtt taaacttgta cgtttcctgc ttgccatgcg tgccaaatcc 7561 ctgttttcct gacctgcact gcttgccaac cattccattg ttttttacac tgcactatgt 7621 gcaactactg aatcactatg tacattgtgt catataaaat aaatcactat gcgccaacgc 7681 cttacatacc gctgttaggc acatattttt ggcttgtttt aactaaccta attgcatatt 7741 tggcataagg tttaaacttc taaggccaac taaatgtcac cctagttcat acatgaactg 7801 tgtaaaggtt agtcatacat tgttcatttg taaaactgca catgggtgtg tgcaaaccgt 7861 tttgggttac acatttacaa gcaacttata taataatact a //