LOCUS       AEE86573.1              1839 aa    PRT              PLN 23-MAR-2023
DEFINITION  Arabidopsis thaliana RNA polymerase II large subunit protein.
ACCESSION   CP002687-6654
PROTEIN_ID  AEE86573.1
SOURCE      Arabidopsis thaliana (thale cress)
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliopsida; eudicotyledons; Gunneridae;
            Pentapetalae; rosids; malvids; Brassicales; Brassicaceae;
            Camelineae; Arabidopsis.
REFERENCE   1  (bases 1 to 18585056)
  AUTHORS   Mayer,K., Schuller,C., Wambutt,R., Murphy,G., Volckaert,G.,
            Pohl,T., Dusterhoft,A., Stiekema,W., Entian,K.D., Terryn,N.,
            Harris,B., Ansorge,W., Brandt,P., Grivell,L., Rieger,M.,
            Weichselgartner,M., de Simone,V., Obermaier,B., Mache,R.,
            Muller,M., Kreis,M., Delseny,M., Puigdomenech,P., Watson,M.,
            Schmidtheini,T., Reichert,B., Portatelle,D., Perez-Alonso,M.,
            Boutry,M., Bancroft,I., Vos,P., Hoheisel,J., Zimmermann,W.,
            Wedler,H., Ridley,P., Langham,S.A., McCullagh,B., Bilham,L.,
            Robben,J., Van der Schueren,J., Grymonprez,B., Chuang,Y.J.,
            Vandenbussche,F., Braeken,M., Weltjens,I., Voet,M., Bastiaens,I.,
            Aert,R., Defoor,E., Weitzenegger,T., Bothe,G., Ramsperger,U.,
            Hilbert,H., Braun,M., Holzer,E., Brandt,A., Peters,S., van
            Staveren,M., Dirske,W., Mooijman,P., Klein Lankhorst,R., Rose,M.,
            Hauf,J., Kotter,P., Berneiser,S., Hempel,S., Feldpausch,M.,
            Lamberth,S., Van den Daele,H., De Keyser,A., Buysshaert,C.,
            Gielen,J., Villarroel,R., De Clercq,R., Van Montagu,M., Rogers,J.,
            Cronin,A., Quail,M., Bray-Allen,S., Clark,L., Doggett,J., Hall,S.,
            Kay,M., Lennard,N., McLay,K., Mayes,R., Pettett,A.,
            Rajandream,M.A., Lyne,M., Benes,V., Rechmann,S., Borkova,D.,
            Blocker,H., Scharfe,M., Grimm,M., Lohnert,T.H., Dose,S., de
            Haan,M., Maarse,A., Schafer,M., Muller-Auer,S., Gabel,C., Fuchs,M.,
            Fartmann,B., Granderath,K., Dauner,D., Herzl,A., Neumann,S.,
            Argiriou,A., Vitale,D., Liguori,R., Piravandi,E., Massenet,O.,
            Quigley,F., Clabauld,G., Mundlein,A., Felber,R., Schnabl,S.,
            Hiller,R., Schmidt,W., Lecharny,A., Aubourg,S., Chefdor,F.,
            Cooke,R., Berger,C., Montfort,A., Casacuberta,E., Gibbons,T.,
            Weber,N., Vandenbol,M., Bargues,M., Terol,J., Torres,A.,
            Perez-Perez,A., Purnelle,B., Bent,E., Johnson,S., Tacon,D.,
            Jesse,T., Heijnen,L., Schwarz,S., Scholler,P., Heber,S., Francs,P.,
            Bielke,C., Frishman,D., Haase,D., Lemcke,K., Mewes,H.W.,
            Stocker,S., Zaccaria,P., Bevan,M., Wilson,R.K., de la Bastide,M.,
            Habermann,K., Parnell,L., Dedhia,N., Gnoj,L., Schutz,K., Huang,E.,
            Spiegel,L., Sehkon,M., Murray,J., Sheet,P., Cordes,M.,
            Abu-Threideh,J., Stoneking,T., Kalicki,J., Graves,T., Harmon,G.,
            Edwards,J., Latreille,P., Courtney,L., Cloud,J., Abbott,A.,
            Scott,K., Johnson,D., Minx,P., Bentley,D., Fulton,B., Miller,N.,
            Greco,T., Kemp,K., Kramer,J., Fulton,L., Mardis,E., Dante,M.,
            Pepin,K., Hillier,L., Nelson,J., Spieth,J., Ryan,E., Andrews,S.,
            Geisel,C., Layman,D., Du,H., Ali,J., Berghoff,A., Jones,K.,
            Drone,K., Cotton,M., Joshu,C., Antonoiu,B., Zidanic,M., Strong,C.,
            Sun,H., Lamar,B., Yordan,C., Ma,P., Zhong,J., Preston,R., Vil,D.,
            Shekher,M., Matero,A., Shah,R., Swaby,I.K., O'Shaughnessy,A.,
            Rodriguez,M., Hoffmann,J., Till,S., Granat,S., Shohdy,N.,
            Hasegawa,A., Hameed,A., Lodhi,M., Johnson,A., Chen,E., Marra,M.,
            Martienssen,R. and McCombie,W.R.
  TITLE     Sequence and analysis of chromosome 4 of the plant Arabidopsis
            thaliana
  JOURNAL   Nature 402 (6763), 769-777 (1999)
   PUBMED   10617198
REFERENCE   2  (bases 1 to 18585056)
  AUTHORS   Swarbreck,D., Lamesch,P., Wilks,C. and Huala,E.
  CONSRTM   TAIR
  TITLE     Direct Submission
  JOURNAL   Submitted (18-FEB-2011) Department of Plant Biology, Carnegie
            Institution, 260 Panama Street, Stanford, CA, USA
REFERENCE   3  (bases 1 to 18585056)
  AUTHORS   Krishnakumar,V., Cheng,C.-Y., Chan,A.P., Schobel,S., Kim,M.,
            Ferlanti,E.S., Belyaeva,I., Rosen,B.D., Micklem,G., Miller,J.R.,
            Vaughn,M. and Town,C.D.
  TITLE     Direct Submission
  JOURNAL   Submitted (17-MAY-2016) Plant Genomics, J. Craig Venter Institute,
            9704 Medical Center Dr, Rockville, MD 20850, USA
  REMARK    Protein update by submitter
FEATURES             Qualifiers
     source          /organism="Arabidopsis thaliana"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:3702"
                     /chromosome="4"
                     /ecotype="Columbia"
     protein         /gene="NRPB1"
                     /locus_tag="AT4G35800"
                     /gene_synonym="F4B14.70"
                     /gene_synonym="F4B14_70"
                     /gene_synonym="RNA polymerase II large subunit"
                     /gene_synonym="RNA POLYMERASE II LARGE SUBUNIT"
                     /gene_synonym="RNA_POL_II_LS"
                     /gene_synonym="RNA_POL_II_LSRNA_POL_II_LS"
                     /gene_synonym="RPB1"
                     /inference="Similar to RNA sequence,
                     EST:INSD:AV529513.1,INSD:EL001743.1,INSD:ES076744.1,
                     INSD:ES014751.1,INSD:EH856227.1,INSD:EL022232.1,
                     INSD:AV554811.1,INSD:AV546119.1,INSD:EH961547.1,
                     INSD:AV523886.1,INSD:EL295851.1,INSD:ES081344.1,
                     INSD:EL033260.1,INSD:ES040114.1,INSD:T45030.1,
                     INSD:AV547057.1,INSD:AV548708.1,INSD:EH808821.1,
                     INSD:AV554798.1,INSD:DR354557.1,INSD:EL237055.1,
                     INSD:EH813706.1,INSD:ES030120.1,INSD:ES186100.1,
                     INSD:AV545826.1,INSD:EL250326.1,INSD:AV529650.1,
                     INSD:AV556763.1,INSD:EL027500.1,INSD:AV530018.1,
                     INSD:EL143612.1,INSD:EH992720.1,INSD:ES104750.1,
                     INSD:EL316983.1,INSD:BP833580.1,INSD:BP636667.1,
                     INSD:ES030860.1,INSD:EH969346.1,INSD:AI994941.1,
                     INSD:EL126186.1,INSD:AV545813.1,INSD:EL022096.1,
                     INSD:EL023179.1,INSD:AV533967.1,INSD:EL098473.1,
                     INSD:ES078743.1,INSD:EL058948.1,INSD:EH974765.1,
                     INSD:AV560562.1,INSD:EH976537.1,INSD:AV523757.1,
                     INSD:EL066882.1,INSD:EG503883.1"
                     /inference="Similar to RNA sequence,
                     mRNA:INSD:BX828750.1,INSD:AK221166.1"
                     /note="RNA polymerase II large subunit (NRPB1); FUNCTIONS
                     IN: DNA-directed RNA polymerase activity, DNA binding;
                     INVOLVED IN: transcription, transcription from RNA
                     polymerase II promoter; LOCATED IN: nucleus, chloroplast,
                     DNA-directed RNA polymerase II, core complex, vacuole;
                     EXPRESSED IN: 26 plant structures; EXPRESSED DURING: 15
                     growth stages; CONTAINS InterPro DOMAIN/s: RNA polymerase,
                     N-terminal (InterPro:IPR006592), RNA polymerase, alpha
                     subunit (InterPro:IPR000722), RNA polymerase II,
                     heptapeptide repeat, eukaryotic (InterPro:IPR000684), RNA
                     polymerase Rpb1, domain 7 (InterPro:IPR007073), RNA
                     polymerase Rpb1, domain 3 (InterPro:IPR007066), RNA
                     polymerase Rpb1, domain 1 (InterPro:IPR007080), RNA
                     polymerase Rpb1, domain 4 (InterPro:IPR007083), RNA
                     polymerase Rpb1, domain 5 (InterPro:IPR007081), RNA
                     polymerase Rpb1, domain 6 (InterPro:IPR007075); BEST
                     Arabidopsis thaliana protein match is: nuclear RNA
                     polymerase C1 (TAIR:AT5G60040.1); Has 181834 Blast hits to
                     82224 proteins in 9254 species: Archaea - 731; Bacteria -
                     33255; Metazoa - 56600; Fungi - 34284; Plants - 19037;
                     Viruses - 3576; Other Eukaryotes - 34351 (source: NCBI
                     BLink)."
                     /db_xref="TAIR:AT4G35800"
                     /db_xref="Araport:AT4G35800"
     intron_pos      29:0 (1/12)
     intron_pos      117:0 (2/12)
     intron_pos      218:0 (3/12)
     intron_pos      271:0 (4/12)
     intron_pos      325:0 (5/12)
     intron_pos      398:2 (6/12)
     intron_pos      446:0 (7/12)
     intron_pos      558:0 (8/12)
     intron_pos      653:2 (9/12)
     intron_pos      740:0 (10/12)
     intron_pos      1760:2 (11/12)
     intron_pos      1784:2 (12/12)
BEGIN
        1 MDTRFPFSPA EVSKVRVVQF GILSPDEIRQ MSVIHVEHSE TTEKGKPKVG GLSDTRLGTI
       61 DRKVKCETCM ANMAECPGHF GYLELAKPMY HVGFMKTVLS IMRCVCFNCS KILADEEEHK
      121 FKQAMKIKNP KNRLKKILDA CKNKTKCDGG DDIDDVQSHS TDEPVKKSRG GCGAQQPKLT
      181 IEGMKMIAEY KIQRKKNDEP DQLPEPAERK QTLGADRVLS VLKRISDADC QLLGFNPKFA
      241 RPDWMILEVL PIPPPPVRPS VMMDATSRSE DDLTHQLAMI IRHNENLKRQ EKNGAPAHII
      301 SEFTQLLQFH IATYFDNELP GQPRATQKSG RPIKSICSRL KAKEGRIRGN LMGKRVDFSA
      361 RTVITPDPTI NIDELGVPWS IALNLTYPET VTPYNIERLK ELVDYGPHPP PGKTGAKYII
      421 RDDGQRLDLR YLKKSSDQHL ELGYKVERHL QDGDFVLFNR QPSLHKMSIM GHRIRIMPYS
      481 TFRLNLSVTS PYNADFDGDE MNMHVPQSFE TRAEVLELMM VPKCIVSPQA NRPVMGIVQD
      541 TLLGCRKITK RDTFIEKDVF MNTLMWWEDF DGKVPAPAIL KPRPLWTGKQ VFNLIIPKQI
      601 NLLRYSAWHA DTETGFITPG DTQVRIERGE LLAGTLCKKT LGTSNGSLVH VIWEEVGPDA
      661 ARKFLGHTQW LVNYWLLQNG FTIGIGDTIA DSSTMEKINE TISNAKTAVK DLIRQFQGKE
      721 LDPEPGRTMR DTFENRVNQV LNKARDDAGS SAQKSLAETN NLKAMVTAGS KGSFINISQM
      781 TACVGQQNVE GKRIPFGFDG RTLPHFTKDD YGPESRGFVE NSYLRGLTPQ EFFFHAMGGR
      841 EGLIDTAVKT SETGYIQRRL VKAMEDIMVK YDGTVRNSLG DVIQFLYGED GMDAVWIESQ
      901 KLDSLKMKKS EFDRTFKYEI DDENWNPTYL SDEHLEDLKG IRELRDVFDA EYSKLETDRF
      961 QLGTEIATNG DSTWPLPVNI KRHIWNAQKT FKIDLRKISD MHPVEIVDAV DKLQERLLVV
     1021 PGDDALSVEA QKNATLFFNI LLRSTLASKR VLEEYKLSRE AFEWVIGEIE SRFLQSLVAP
     1081 GEMIGCVAAQ SIGEPATQMT LNTFHYAGVS AKNVTLGVPR LREIINVAKR IKTPSLSVYL
     1141 TPEASKSKEG AKTVQCALEY TTLRSVTQAT EVWYDPDPMS TIIEEDFEFV RSYYEMPDED
     1201 VSPDKISPWL LRIELNREMM VDKKLSMADI AEKINLEFDD DLTCIFNDDN AQKLILRIRI
     1261 MNDEGPKGEL QDESAEDDVF LKKIESNMLT EMALRGIPDI NKVFIKQVRK SRFDEEGGFK
     1321 TSEEWMLDTE GVNLLAVMCH EDVDPKRTTS NHLIEIIEVL GIEAVRRALL DELRVVISFD
     1381 GSYVNYRHLA ILCDTMTYRG HLMAITRHGI NRNDTGPLMR CSFEETVDIL LDAAAYAETD
     1441 CLRGVTENIM LGQLAPIGTG DCELYLNDEM LKNAIELQLP SYMDGLEFGM TPARSPVSGT
     1501 PYHEGMMSPN YLLSPNMRLS PMSDAQFSPY VGGMAFSPSS SPGYSPSSPG YSPTSPGYSP
     1561 TSPGYSPTSP GYSPTSPTYS PSSPGYSPTS PAYSPTSPSY SPTSPSYSPT SPSYSPTSPS
     1621 YSPTSPSYSP TSPSYSPTSP AYSPTSPAYS PTSPAYSPTS PSYSPTSPSY SPTSPSYSPT
     1681 SPSYSPTSPS YSPTSPAYSP TSPGYSPTSP SYSPTSPSYG PTSPSYNPQS AKYSPSIAYS
     1741 PSNARLSPAS PYSPTSPNYS PTSPSYSPTS PSYSPSSPTY SPSSPYSSGA SPDYSPSAGY
     1801 SPTLPGYSPS STGQYTPHEG DKKDKTGKKD ASKDDKGNP
//