LEGACY CONTENT. If you are looking for Voteview.com, PLEASE CLICK HERE

This site is an archived version of Voteview.com archived from University of Georgia on May 23, 2017. This point-in-time capture includes all files publicly linked on Voteview.com at that time. We provide access to this content as a service to ensure that past users of Voteview.com have access to historical files. This content will remain online until at least January 1st, 2018. UCLA provides no warranty or guarantee of access to these files.

POLI 279 MEASUREMENT THEORY
Sixth Assignment
Due 23 May 2006


  1. In this problem we are going to use MCMCPack developed by Andrew Martin and Kevin Quinn to do a one-dimensional parametric unfolding of the 104th U.S. Senate. Download the two R programs below along with the Stata file for the 104th Senate roll calls and put them in the same directory:

    keith2.r -- Program to create R dataframe from Stata file
    keithMCMC.r -- Program to run Bayesian MCMC using MCMCPack
    Here is what keith2.r looks like:
    #
    #
    #    keith2.r -- This Program was written by Kevin Quinn and modfied by
    #                 Keith Poole to run on WINDOZE machines
    #
    rm(list=ls(all=TRUE))
    #
    library(foreign)              Be sure to load this library
    #
    ## does the 104th senate
    ## (assumes the relevant .dta files are in the current directory)
    #
    ## write out the filehead (only need to do that here)
    filehead <- "c:/ucsd_homework_5/sen104kh"   Change Path Statements as Appropriate
    filehead2 <- "sen104kh"                     This line needs to be changed as well
    #
    ## SHOULDN'T NEED TO CHANGE ANYTHING BELOW THIS LINE
    ## #######################################################
    #
    ## read data in
    filestring <- paste(filehead, ".dta", sep="")  Creates "c:/ucsd_homework_5/sen104kh.dta"
    mydata <- read.dta(filestring)                 read.dta is in the foreign library
    #
    ## add rownames corresponding to legislators' names
    rownames(mydata) <- gsub(" ", "", mydata$name) Pattern Match and Replacement -- takes the "name" column 
    #                                                in the Stata file and deletes all spaces  
    ## recode votes to 0/1/NA                        rownames uses the "name" variable as labels for the rows of the matrix 
    ## 1 yea, 0 nay, NA missing
    ncdata <- ncol(mydata)                         ncol gets the number of columns of the matrix 
    submat <- mydata[,10:ncdata]                   Note that the first roll call is the 10th variable in Stata
    submat[submat==0] <- -999
    submat[submat==7] <- -999
    submat[submat==8] <- -999
    submat[submat==9] <- -999
    submat[submat==4] <- 0
    submat[submat==5] <- 0
    submat[submat==6] <- 0
    submat[submat==2] <- 1
    submat[submat==3] <- 1
    submat[submat==-999] <- NA
    mydata[,10:ncdata] <- submat
    #
    ## change name of mydata to whatever filehead is    parse creates the character string "sen104kh <- mydata"
    expr <- parse(text=paste(filehead2, " <- mydata"))    using the paste command to concatenate the strings
    eval(expr)                                          expr (expression) creates an unevaluated call and eval(expression) 
    #                                                      evaluates the unevaluated expression.  Note that this is a clever way
    ## save as an .rda file                                 of passing R a command that would normally be entered from the command line     
    expr <- parse(text=paste("save(", filehead2, ", file=\"",
                    filehead, ".rda\", ascii=TRUE)", sep="")) This produces 
    eval(expr)                                         expression(save(sen104kh, file = "c:/ucsd_homework_5/sen104kh.rda", ascii = TRUE))
                                                        and eval(expr) executes the save(...) command
                                                        Note that the \" says use " as part of this string
    When you run keith2.r in R it creates an ASCII roll call file -- sen104kh.rda -- that MCMCPack reads.

    keithMCMC.r actually runs the Bayesian MCMC analysis of the roll call matrix created by keith2.r.
    #
    #
    #  keithMCMC.r -- Program to analyze Congressional Roll Calls using MCMCPack.
    #                    Program written by Kevin Quinn and modified by Keith Poole
    #
    library(MCMCpack)
    #
    load("c:/ucsd_homework_5/sen104kh.rda")
    #
    Sen.rollcalls <- sen104kh[,10:ncol(sen104kh)]    Pull in the roll calls
    #
    posterior <- MCMCirt1d(Sen.rollcalls,            Here is the call to the One-Dimensional IRT model in MCMCPack
                           theta.constraints=list("KENNEDY,ED"=-2, "HELMS"=2),  Note the Kennedy-Helms Constraint
                           burnin=2000, mcmc=100000, thin=20, verbose=500)      The Verbose switch writes to the screen every 500 iterations
    geweke.diag(posterior)        geweke.diag is a convergence diagnostic for Markov Chains
    pdf(file="c:/ucsd_homework_5/keithplots3.pdf")  This opens a PDF file for writing plots
    plot(posterior)    Note that posterior is 100000/20=5000 rows and 104 columns. This does a simple two panel plot of the Chain for every Senator
    summary(posterior) summary produces two tables -- a table of means and standard deviations, and a table of quantiles for the chain
    sss <- summary(posterior)
    write.table(sss$statistics,"c:/ucsd_homework_5/tab1.txt")  This just writes the statistics to a file
    write.table(sss$quantiles,"c:/ucsd_homework_5/tab2.txt")   This just writes the quantiles to a file
    dev.off()          This turns off the PDF output file
    plot(posterior[,1]) This produces a two panel plot of President Clinton -- the first column of the Markov Chain Matrix
    
    Run keithMCMC.r and you should see something like this:



    The quantiles file of the chains should look something like this:



    The means and standard deviations should look something like this:



    1. We need to get the Bayesian MCMC senator medians out of the file tab2.txt. To do this we are going to use a keyboard macro written as a text file like the one we used in Homework 4 Question 1. Using Epsilon bring up tab2.txt in a window and then split the screen (C-X2) and set it up so that you have tab2.txt in the top screen, sen104kh.ord the next window down, a blank file -- sen104MCMC.txt (you can call it something else if you want) -- and our macro, macro_hw5.txt (shown blank). Be sure to position the cursor at the top of each file except tab2.txt where it should be positioned in front of the record for President Clinton. It should look like this:



      Here is the first part of the macro and what it looks like executed. The nice thing about constructing a complex macro is that you can do it piecemeal and watch what happens as long as you do not overwrite your files!!! Note that in the macro SPACE means Space Bar, and LEFT, RIGHT, UP, and DOWN refer to the corresponding Arrow Keys. Also note that the macro is written in two lines. Epsilon interprets this as a C-M, a Line Feed or an Enter. This is handy because your macro does not have to be one long line!



      Here is what the completed macro looks like executed one time:



      And here it is at the finish:



      Here is the macro itself:

      macro_hw5.txt

      Save all of the above files and turn in a listing of your sen104MCMC.txt file.

    2. sen104MCMC.txt is still not in a form that we can use because note that the decimal points in the ideal point medians are not lined up! This needs to be fixed. To begin, use Epsilon to change Kennedy's coordinate from -2 to -2.00000 and Helms' coordinate from 2 to 2.00000 (the number of zeroes is not crucial). Now, as in the previous question, bring sen104MCMC.txt up in Epsilon, split the window twice, and create the dummy file fixit.txt with a window at the bottom for another Macro. It should look like this:



      The strategy here is to cut the coordinate out of sen104MCMC.txt, take it down to the fixit.txt window below and process it. The "trick" is to add some spaces at the beginning of the number, back up from the decimal point a set number of spaces and break the line. Note that this sets the decimal point a fixed number of spaces to the right. The number is then put into the kill buffer and returned to sen104MCMC.txt. This produces coordinates that are lined up.

      This is what it looks like mid-way through the macro. Note that I added some spaces at the end of the line in sen104MCMC.txt so that I have plenty of room when I return to the file with the adjusted number.



      Here it is at the end. Note that the coordinate medians are lined up appropriately.



      Here is the macro itself:

      macro_hw5_1.txt

      One last thing needs to be done. Note that R puts lots of digits after the decimal point! Write a simple keyboard macro to limit the number of digits after the decimal point to 5. Turn in a listing of that macro and the final version of your sen104MCMC.txt file. (Note that you will need this file for the next problem!)

    3. Perform a test to see if Ben Nighthorse Campbell (CO) significantly changed his voting behavior after he switched parties from Democrat to Republican. In particular, note that Campbell (D-CO) is column 13 of posterior and Campbell (R-CO) is column 14 of posterior. To do the test, compute the difference between the two chains and use the summary command to get the statistics. In particular, in R issue the three commands:

      campbelldiff <- posterior[,14]-posterior[,13]
      summary(campbelldiff)
      plot(campbelldiff)


      Turn in the results of the summary command and the plot.

      From a statistical standpoint what do these results show?

  2. The aim of this problem is to compare the rank-ordering from my Optimal Classification (OC) program with the ordering of the medians from MCMCPack. Download the program and the "control card" file:

    PERFL_2006.EXE -- Maximum Classification Scaling Program
    and place them in the same folder on a WINDOZE machine. Recall that PERFSTRT_2006.DAT was used in homework 5 problem 1 to scale the 108th Senate so you will have to use Epsilon to change the values in PERFSTRT_2006.DAT so it looks like this:
    SEN104KH.ORD
    NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF 104TH SENATE
        1  919   20   36    1    2   20 0.005    5
    (36A1,12000I1)
    (I5,1X,36A1,2I5,50F8.3)
    
    The first line is the roll call data file, the second line is the title, the third line, in order, is the number of dimensions, the number of roll calls, the number of iterations (leave as is), the number of "characters" to read off the front of each legislator's record (leave as is), the record number of a Senator that you believe is on the "Left" (this just sets "liberal" on the left -- lower ranks or negative values on the first dimension if the scaling is in two or more dimensions -- and "conservative" on the right), and the record number of a Senator that you believe is on "Up" or on positive side of the second dimension (in the U.S. context, these are the Southern Democrats in an earlier era). The last line is the format for the coordinate output file. Please study the examples on the Optimal Classification Program page because we will be using this program several more times!

    1. At the command line type

      PERFL_2006

      The program should only take a few seconds to run and it produces three output files -- PERF21.DAT, PERF23.DAT, and PERF25.DAT. PERF21.DAT should look something like this:
      
       09 MAY       2007  13.38.10.53.
       RANDOM NUMBER SEED     62600
      SEN104KH.ORD                                          
      NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF 104TH SENATE
          1  919   20   36    1    2   20 0.005    5
      (36A1,25000I1)                                                  
      (I5,1X,36A1,2I5,50F8.3)                                         
       ******************************************************************************
        1 ROLL CALLS   1    8178   81250  0.10065  0.89935  0.71583
        2 LEGISLATORS  1    8096   81250  0.09964  0.90036  0.71868  0.99848
        3 ROLL CALLS   1    8025   81250  0.09877  0.90123  0.72115
        4 LEGISLATORS  1    8016   81250  0.09866  0.90134  0.72146  0.99973
        5 ROLL CALLS   1    7989   81250  0.09833  0.90167  0.72240
        6 LEGISLATORS  1    7988   81250  0.09831  0.90169  0.72244  0.99991
        7 ROLL CALLS   1    7981   81250  0.09823  0.90177  0.72268
        8 LEGISLATORS  1    7981   81250  0.09823  0.90177  0.72268  0.99992
        9 ROLL CALLS   1    7970   81250  0.09809  0.90191  0.72306
       10 LEGISLATORS  1    7970   81250  0.09809  0.90191  0.72306  0.99999
       SHARPEN         2    7957   81250
       SHARPEN         3    7921   81250
       SHARPEN         4    7886   81250
       SHARPEN         5    7875   81250
       L-PERMUTATIONS  5    7875   81250  0.09692  0.90308  0.72636
                            13.38.10.54.
       ELAPSED TIME OF JOB  13.44.37.32.
      
      The seventh column reports the correct classification. For the 104th Senate the algorithm quickly converges to 0.90308 (90.3%).

      Turn in a copy of this file.

    2. The file PERF25.DAT contains the output coordinates for the Senators and the roll calls. In one dimension, this is reported as a rank ordering. The first part of the file shows the Senators in rank order and then in the order that they appear in the roll call file with their ranking:
      
       09 MAY       2007  13.38.10.53.
          1 1041427521 0ILLINOI 10000SIMON         88  818   0.892   1.000
          2 1044930925 0WISCONS 10000FEINGOLD     109  827   0.868   2.000
          3 1044910133 0MINNESO 10000WELLSTONE     54  826   0.935   3.000
          4 1041487172 0OREGON  10000WYDEN         24  264   0.909   4.000
          5 1041501171 0CALIFOR 10000BOXER         56  813   0.931   5.000
          6 10410808 3 0MASSACH 10000KENNEDY, ED   53  806   0.934   6.000
          7 1041470923 0MICHIGA 10000LEVIN, CARL   57  826   0.931   7.000
          8 1041470212 0NEW JER 10000BRADLEY       98  757   0.871   8.000
          9 1041491412 0NEW JER 10000LAUTENBERG    70  821   0.915   9.000
         10 10414307 6 0VERMONT 10000LEAHY         71  807   0.912  10.000
                       etc.
                       etc.
                       etc.
        100 1042936733 0MINNESO 20000GRAMS         43  816   0.947  99.500
        101 1044930447 0NORTH C 20000FAIRCLOTH     45  800   0.944 101.000
        102 1041480362 0COLORAD 20000BROWN, HANK   81  828   0.902 102.000
        103 1041503961 0ARIZONA 20000MCCAIN        86  801   0.893 103.000
        104 1041542961 0ARIZONA 20000KYL           40  818   0.951 104.000
      
      Note that below the legislators are the rank positions for the roll call cutting points. Write an Epsilon macro to combine the rank ordering of the Senators with the Bayesian MCMC coordinates (the median from the quantiles file). Turn in the macro and a printout of the combined file. Leave the header on the combined file.

    3. Write an R program that graphs the rank ordering (horizontal axis) against the Bayesian MCMC medians (vertical axis). Label the axes appropriately and label a few of the Senators including Campbell (D-CO) and Campbell (R-CO).

    4. Report the correlation between the OC rank ordering and the Bayesian MCMC medians.

    5. Use Epsilon to cut the listing of the eigenvalues out of the PERF23.DAT file and make a skree plot of the eigenvalues with the R program we discussed in class:

        plot_eigenvalues.r -- Simple R Program that does a Skree Plot of Eigenvalues from OC Output

    6. Use Epsilon to change the number of dimensions from 1 to 2 in PERFSTRT_2006.DAT and run PERFL_2006 again (be sure to save the output files from the one-dimensional run first!!!! -- they will be overwritten by the new ones!). PERF21.DAT should look something like this:
      
       09 MAY       2007  14.14.01.93.
       RANDOM NUMBER SEED     35000
      SEN104KH.ORD                                          
      NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF 104TH SENATE
          2  919   20   36    1    2   20 0.005    5
      (36A1,25000I1)                                                  
      (I5,1X,36A1,2I5,50F8.3)                                         
       ******************************************************************************
        1 ROLL CALLS   2    7240   81250  0.08911  0.91089  0.74843           0.00000
          LEGISLATORS  2    7199   81250  0.08860  0.91140  0.74985  0.00000
        2 ROLL CALLS   2    7099   81250  0.08737  0.91263  0.75333           0.99610
          LEGISLATORS  2    7087   81250  0.08722  0.91278  0.75374  0.99859
        3 ROLL CALLS   2    7065   81250  0.08695  0.91305  0.75451           0.99800
          LEGISLATORS  2    7061   81250  0.08690  0.91310  0.75465  0.99981
        4 ROLL CALLS   2    7050   81250  0.08677  0.91323  0.75503           0.99886
          LEGISLATORS  2    7046   81250  0.08672  0.91328  0.75517  0.99995
        5 ROLL CALLS   2    7039   81250  0.08663  0.91337  0.75541           0.99927
          LEGISLATORS  2    7036   81250  0.08660  0.91340  0.75552  0.99992
        6 ROLL CALLS   2    7033   81250  0.08656  0.91344  0.75562           0.99848
          LEGISLATORS  2    7031   81250  0.08654  0.91346  0.75569  0.99998
        7 ROLL CALLS   2    7025   81250  0.08646  0.91354  0.75590           0.99952
          LEGISLATORS  2    7024   81250  0.08645  0.91355  0.75593  0.99999
        8 ROLL CALLS   2    7021   81250  0.08641  0.91359  0.75604           0.99960
          LEGISLATORS  2    7020   81250  0.08640  0.91360  0.75607  1.00000
        9 ROLL CALLS   2    7020   81250  0.08640  0.91360  0.75607           0.99938
          LEGISLATORS  2    7020   81250  0.08640  0.91360  0.75607  0.99999
       10 ROLL CALLS   2    7019   81250  0.08639  0.91361  0.75611           0.99950
          LEGISLATORS  2    7016   81250  0.08635  0.91365  0.75621  0.99810
       11 ROLL CALLS   2    6994   81250  0.08608  0.91392  0.75698           0.99906
          LEGISLATORS  2    6994   81250  0.08608  0.91392  0.75698  1.00000
       12 ROLL CALLS   2    6992   81250  0.08606  0.91394  0.75705           0.99927
          LEGISLATORS  2    6992   81250  0.08606  0.91394  0.75705  1.00000
       13 ROLL CALLS   2    6992   81250  0.08606  0.91394  0.75705           0.99946
          LEGISLATORS  2    6992   81250  0.08606  0.91394  0.75705  1.00000
       14 ROLL CALLS   2    6992   81250  0.08606  0.91394  0.75705           0.99961
          LEGISLATORS  2    6992   81250  0.08606  0.91394  0.75705  1.00000
       15 ROLL CALLS   2    6990   81250  0.08603  0.91397  0.75711           0.99921
          LEGISLATORS  2    6990   81250  0.08603  0.91397  0.75711  1.00000
       16 ROLL CALLS   2    6990   81250  0.08603  0.91397  0.75711           0.99998
          LEGISLATORS  2    6990   81250  0.08603  0.91397  0.75711  1.00000
       17 ROLL CALLS   2    6990   81250  0.08603  0.91397  0.75711           0.99961
          LEGISLATORS  2    6990   81250  0.08603  0.91397  0.75711  1.00000
       18 ROLL CALLS   2    6989   81250  0.08602  0.91398  0.75715           0.99960
          LEGISLATORS  2    6989   81250  0.08602  0.91398  0.75715  1.00000
       19 ROLL CALLS   2    6988   81250  0.08601  0.91399  0.75718           0.99996
          LEGISLATORS  2    6988   81250  0.08601  0.91399  0.75718  1.00000
       20 ROLL CALLS   2    6988   81250  0.08601  0.91399  0.75718           0.99998
          LEGISLATORS  2    6988   81250  0.08601  0.91399  0.75718  1.00000
       MEAN VOLUME LEG.   0.0026   0.0032
       MACHINE PREC.   2    6988   81250  0.08601  0.91399  0.75718
       MACHINE PREC.   2    6988   81250  0.08601  0.91399  0.75718
                            14.14.01.93.
       ELAPSED TIME OF JOB  14.15.35.76.
      The correct classification is now 91.399% (0.91399). The first few lines of PERF25.DAT should look very similar to this:
      
       09 MAY       2007  14.14.01.93.
          1 1049990999 0USA     10000CLINTON        9  134   0.933   0.034  -0.256   0.014
          2 1041470541 0ALABAMA 10000HEFLIN        95  803   0.882   0.002  -0.086   0.591
          3 1049465941 0ALABAMA 20000SHELBY        58  805   0.928   0.002   0.253   0.176
          4 1041490781 0ALASKA  20000MURKOWSKI     32  798   0.960   0.002   0.245   0.088
          5 1041210981 0ALASKA  20000STEVENS       52  799   0.935   0.002   0.218   0.081
          6 1041542961 0ARIZONA 20000KYL           40  818   0.951   0.002   0.301  -0.017
          7 1041503961 0ARIZONA 20000MCCAIN        84  801   0.895   0.002   0.272  -0.195
          8 1041430042 0ARKANSA 10000BUMPERS       70  794   0.912   0.003  -0.334  -0.005
          9 1041079142 0ARKANSA 10000PRYOR         67  762   0.912   0.002  -0.293   0.169
         10 1041501171 0CALIFOR 10000BOXER         53  813   0.935   0.002  -0.344  -0.120
                             etc etc etc
      The last two columns are the two-dimensional coordinates. Make a two-dimensional plot of the coordinates using R. The party code is just to the right of the state name: 100=Democrat, 200=Republican. The state codes are to the left of the column of zeroes at the beginning of the state names. Use "D" for non-Southern Democrats, "S" for Southern Democrats, "R" for Republicans, and "P" representing President Clinton. You can modify my R program:

      plot_oc_output_X_and_Z.r -- R Program to Plot OC Output

      to do this. This program shows what state codes correspond to "South" and "non-South". Note that you will have to add a line to the program to display the "P" for President Clinton. Be sure to correctly label the graph. You also might want to use mtext(.....) to do the labeling.

  3. Repeat problem (1) only now use the 108th Senate. Download the data files:

    Sen108kh.dta -- Stata Data file for 108th Senate
    Sen108kh.ord -- ASCII Data file for 108th Senate

    Use Epsilon to change keith2.r so it looks like this:
    #
    rm(list=ls(all=TRUE))
    #
    library(foreign)
    #
    ## does the 102nd senate
    ## (assumes the relevant .dta files are in the current directory)
    #
    ## write out the filehead (only need to do that here)
    filehead <- "c:/ucsd_homework_5/sen108kh"
    filehead2 <- "sen108kh"
    #
    ## SHOULDN'T NEED TO CHANGE ANYTHING BELOW THIS LINE
    ## #######################################################
    #
    ## read data in
    filestring <- paste(filehead, ".dta", sep="")
    mydata <- read.dta(filestring)
    #
    ## add rownames corresponding to legislators' names
    rownames(mydata) <- gsub(" ", "", mydata$name)
    #
    ## recode votes to 0/1/NA
    ## 1 yea, 0 nay, NA missing
    ncdata <- ncol(mydata)
    submat <- mydata[,10:ncdata]
    submat[submat==0] <- -999
    submat[submat==7] <- -999
    submat[submat==8] <- -999
    submat[submat==9] <- -999
    submat[submat==4] <- 0
    submat[submat==5] <- 0
    submat[submat==6] <- 0
    submat[submat==2] <- 1
    submat[submat==3] <- 1
    submat[submat==-999] <- NA
    mydata[,10:ncdata] <- submat
    
    ## change name of mydata to whatever filehead is
    expr <- parse(text=paste(filehead2, " <- mydata"))
    eval(expr)
    
    ## save as an .rda file
    expr <- parse(text=paste("save(", filehead2, ", file=\"",
                    filehead, ".rda\", ascii=TRUE)", sep="")) 
    eval(expr)
    This will create the SEN108KH.RDA file.

    Use Epsilon to change keithMCMC.r so it looks like this:
    #
    library(MCMCpack)
    
    load("c:/ucsd_homework_5/sen108kh.rda")
    
    Sen.rollcalls <- sen108kh[,10:ncol(sen108kh)]
    
    posterior <- MCMCirt1d(Sen.rollcalls,
                           theta.constraints=list("KENNEDYED"=-2, "KYL"=2),
                           burnin=2000, mcmc=100000, thin=20, verbose=500)
    geweke.diag(posterior)
    pdf(file="c:/ucsd_homework_5/keithplots3.pdf")
    plot(posterior)
    summary(posterior)
    sss <- summary(posterior)
    write.table(sss$statistics,"c:/ucsd_homework_5/tab1.txt")
    write.table(sss$quantiles,"c:/ucsd_homework_5/tab2.txt")
    dev.off()
    plot(posterior[,1])
    1. Repeat parts (a) and (b) of problem (1) to create a neat listing of sen108MCMC.txt. Turn in the lists of your modified macros and sen108MCMC.txt.

    2. Use PERFL_2006 to get the rank-ordering for the 108th Senate. Use the R program that you wrote for (1.c) above and graph the rank ordering (horizontal axis) against the Bayesian MCMC medians (vertical axis). Label the axes appropriately and label a few of the Senators.