Chapter 2 Markov Chain Simulation

Section 1 Introduction to the Markov Chain Monte Carlo approach in Bayesian Anaylsis

The reference for this chapter is Bayesian Data Analysis (chapter 11) by Gelman et al.

The Markov chain Monte Carlo method (MCMC) is a method for approximating the posterior distribution for parameters of interest in the Bayesian framework. This option is invoked by using the command line option -mcmc N where N is the number of simulations performed. You will proabably also want to include the option -mcscale which dynamically scales the covariance matrix until a reasonable acceptance rate is observed. You may also want to use the -mcmult n option which scales the initial covariances matrix if the initial values are so large that arithmetic errors occur. One advantage of AD Model Builder over some other implementations of MCMC is that the mode of the posterior distribution together with the hessian at the mode is available to use for the MCMC routine. This information is used to implement a version of the Hastings-Metropolis algorithm. Another advantage is that with AD Model Builder it is possible to calculate the profile likelihood for a parameter of interest and compare the distribution to the MCMC distribution for that parameter. A large discrepancy may indicate that one or both estimates are inadequate. If you wish to do more simulations (and to carry on from where the last one ended use the -mcr option. The following figure compares the profile likelihood for the projected biomass to the estimates produced by the MCMC method for different sample sizes (25,000 and 2,500,000 samples) for the catage example.

A report containing the observed distributions is produced in the file root.hst. All objects of type sdreport i.e number, vector or matrix are included. It is possible to save the results of every n'th simulation by using the -mcsave n option. Afterwords these values can be used by running the model with the -mceval option which will evaluate the userfunction once for every saved simulation value. At this time the function mceval_phase() will return the value true and can be used as a switch to perform desired calculations. The results are saved in a binary file root.psv. If you want to convert tis file into ASCII see the next section.

AD Model Builder uses the hessian to produce an (almost) multivariate normal distribution for the Metropolis-Hastings algorithm. It is not exacly multivariate normal because the random vectors produced are modified to satisfy any bounds on the parameters.

There is also an option for using a fatter tailed distribution. This distribution is a mixture of the multivariate normal and a fat-tailed distribution. It is invoked with the -mcgrope n option where n is the amount of fat-tailed distribution in the mixture. Proabably a value of n between .05 and 0.10 is best.

Section 2 Reading AD Model Builder binary files

Often the data which AD Model Builder needs to save are saved in the form of a binary file using the uistream and uostream classes. If these data consist of a series of vectors all of which have the same dimension they are often saved in this form where the dimension is saved at the top of the file ad the vectors are saved afterword. It may be useful to convert thes numbers into binary form so that they can be put into other programs such as spreadsheets. the following code will read the contents of these binary files. YOu should call the program readbin.cpp. It should be a simple matter to modify this program for other uses.

/*  program to read a binary file (using ADMB's uistream and
    uostream stream classes)  of vectors of length n.
    It is assumed that the size n is stored at the top of
    the file. there is no information about any many vectors
    are stored so we must check for an eof after each read
    To use the program you type:

readbin filename */ void produce_comma_delimited_output(dvector& v) { int i1=v.indexmin(); int i2=v.indexmax(); for (int i=i1;i<=i2;i++) { cout << v(i) << ","; } cout << endl; }

main(int argc, char * argv[]) { if (argc < 2) { cerr << " Usage: progname inputfilename" << endl; exit(1); } uistream uis = uistream(argv[1]); if (!uis) { cerr << " Error trying to open binary input file " << argv[1] << endl; exit(1); } int ndim; uis >> ndim; if (!uis) { cerr << " Error trying to read dimension of the vector" " from the top of the file " << argv[1] << endl; exit(1); } if (ndim <=0) { cerr << " Read invalid dimension for the vector" " from the top of the file " << argv[1] << " the number was " << ndim << endl; exit(1); }

int nswitch; cout << " 1 to see all records" << endl << " 2 then after the prompts n1 and n2 to see all" << endl << " records between n1 and n2 inclusive" << endl << " 3 to see the dimension of the vector" << endl << " 4 to see how many vectors there are" << endl; cin >> nswitch; dvector rec(1,ndim); int n1=0; int n2=0; int ii=0; switch(nswitch) { case 2: cout << " Put in the number for the first record you want to see" << endl; cin >> n1; cout << " Put in the number for the second record you want to see" << endl; cin >> n2; case 1: do { uis >> rec; if (uis.eof()) break; if (!uis) { cerr << " Error trying to read vector number " << ii << " from file " << argv[1] << endl; exit(1); } ii++; if (!n1) { // comment out the one you don't want //cout << rec << endl; produce_comma_delimited_output(rec); } else { if (n1<=ii && ii<=n2) { // comment out the one you don't want //cout << rec << endl; produce_comma_delimited_output(rec); } } } while (1); break; case 4: do { uis >> rec; if (uis.eof()) break; if (!uis) { cerr << " Error trying to read vector number " << ii << " from file " << argv[1] << endl; exit(1); } ii++; } while (1); cout << " There are " << ii << " vectors" << endl; break; case 3: cout << " Dimension = " << ndim << endl; default: ; } }