Let index fishing years and index age classes with . The instantaneous fishing mortality rate is assumed to have the form where is called the catchability, is the observed fishing effort, is an age-dependent effect termed the selectivity, and the are deviations from the expected relationship between the observed fishing effort and the resulting fishing mortality. The are assumed to be normally distributed with mean 0. The instantaneous natural mortality rate is assumed to be independent of year and age class. It is not estimated in this version of the model. The instantaneous total mortality rate is given by . The survival rate is given by . The number of age class fish in the population in year is denoted by . The relationship is assumed to hold. Note that using this relationship if one knows then all the can be calculated from knowledge of the initial population in year , and knowledge of the recruitment in each year .

The purpose of the model is to estimate quantities of interest to managers such as the population size and exploitation rates and to make projections about the population. In particular we can get an estimate of the numbers of fish in the population in year for age classes 2 or greater from the relationship . If we have estimates for the mean weight at age , then the projected biomass level of age class fish for year can be computed from the relationship .

Besides getting a point estimate for quantities of interest like we also want to get an idea of how well determined the estimate is. AD Model Builder has completely automated the process of deriving good confidence limits for these parameters in a Bayesian context. One simply needs to declare the parameter to be of type likeprof_number. The results are given in the section on Bayesian inference.

The code for the catch-at-age model is:

DATA_SECTION // the number of years of data init_int nyrs // the number of age classes in the population init_int nages // the catch-at-age data init_matrix obs_catch_at_age(1,nyrs,1,nages) //estimates of fishing effort init_vector effort(1,nyrs) // natural mortality rate init_number M // need to have relative weight at age to calculate biomass of 2+ vector relwt(2,nages) INITIALIZATION_SECTION log_q -1 log_P 5 PARAMETER_SECTION init_number log_q(1) // log of the catchability init_number log_P(1) // overall population scaling parameter init_bounded_dev_vector log_sel_coff(1,nages-1,-15.,15.,2) init_bounded_dev_vector log_relpop(1,nyrs+nages-1,-15.,15.,2) init_bounded_dev_vector effort_devs(1,nyrs,-5.,5.,3) vector log_sel(1,nages) vector log_initpop(1,nyrs+nages-1); matrix F(1,nyrs,1,nages) // the instantaneous fishing mortality matrix Z(1,nyrs,1,nages) // the instantaneous total mortality matrix S(1,nyrs,1,nages) // the survival rate matrix N(1,nyrs,1,nages) // the predicted numbers at age matrix C(1,nyrs,1,nages) // the predicted catch at age objective_function_value f sdreport_number avg_F sdreport_vector predicted_N(2,nages) sdreport_vector ratio_N(2,nages) likeprof_number pred_B PRELIMINARY_CALCS_SECTION // this is just to invent some relative average // weight numbers relwt.fill_seqadd(1.,1.); relwt=pow(relwt,.5); relwt/=max(relwt); PROCEDURE_SECTION // example of using FUNCTION to structure the procedure section get_mortality_and_survival_rates();This model employs several instances of the init_bounded_dev_vector type. This type consists of a vector of numbers which sum to 0, that is they are deviations from a common mean, and are bounded. For example the quantities log_relpop are used to parameterize the logarithm of the variations in year class strength of the fish population. Putting bounds on the magnitude of the deviations helps to improve the stability of the model. The bounds are from -15.0 to 15.0 which gives the estimates of relative year class strength a dynamic range of .get_numbers_at_age();

get_catch_at_age();

evaluate_the_objective_function();

FUNCTION get_mortality_and_survival_rates // calculate the selectivity from the sel_coffs for (int j=1;j<nages;j++) { log_sel(j)=log_sel_coff(j); } // the selectivity is the same for the last two age classes log_sel(nages)=log_sel_coff(nages-1);

// This is the same as F(i,j)=exp(log_q)*effort(i)*exp(log_sel(j)); F=outer_prod(mfexp(log_q)*effort,mfexp(log_sel)); if (active(effort_devs)) { for (int i=1;i<=nyrs;i++) { F(i)=F(i)*exp(effort_devs(i)); } } // get the total mortality Z=F+M; // get the survival rate S=mfexp(-1.0*Z);

FUNCTION get_numbers_at_age log_initpop=log_relpop+log_P; for (int i=1;i<=nyrs;i++) { N(i,1)=mfexp(log_initpop(i)); } for (int j=2;j<=nages;j++) { N(1,j)=mfexp(log_initpop(nyrs+j-1)); } for (i=1;i<nyrs;i++) { for (j=1;j<nages;j++) { N(i+1,j+1)=N(i,j)*S(i,j); } } // calculated predicted numbers at age for next year for (j=1;j<nages;j++) { predicted_N(j+1)=N(nyrs,j)*S(nyrs,j); ratio_N(j+1)=predicted_N(j+1)/N(1,j+1); } // calculate predicted biomass for profile // likelihood report pred_B=predicted_N *relwt;

FUNCTION get_catch_at_age C=elem_prod(elem_div(F,Z),elem_prod(1.-S,N));

FUNCTION evaluate_the_objective_function // penalty functions to ``regularize '' the solution f+=.01*norm2(log_relpop); avg_F=sum(F)/double(size_count(F)); if (last_phase()) { // a very small penalty on the average fishing mortality f+= .001*square(log(avg_F/.2)); } else { // use a large penalty during the initial phases to keep the // fishing mortality high f+= 1000.*square(log(avg_F/.2)); } // errors in variables type objective function with errors in // the catch at age and errors in the effort fishing mortality // relationship if (active(effort_devs) { // only include the effort_devs in the objective function if // they are active parameters f+=0.5*double(size_count(C)+size_count(effort_devs)) * log( sum(elem_div(square(C-obs_catch_at_age),.01+C)) + 0.1*norm2(effort_devs)); } else { // objective function without the effort_devs f+=0.5*double(size_count(C)) * log( sum(elem_div(square(C-obs_catch_at_age),.01+C))); } REPORT_SECTION report << "Estimated numbers of fish " << endl; report << N << endl; report << "Estimated numbers in catch " << endl; report << C << endl; report << "Observed numbers in catch " << endl; report << obs_catch_at_age << endl; report << "Estimated fishing mortality " << endl; report << F << endl;

The FUNCTION keyword has been employed a number of times in the PARAMETER_SECTION to help structure the code. A function is defined simply by using the FUNCTION keyword followed by the name of the function.

FUNCTION get_mortality_and_survival_ratesDon't include the parentheses or semicolon here. To use the function simply write its name in the procedure section.

get_mortality_and_survival_rates();You must include the parentheses and the semicolon here.

The REPORT_SECTION shows how to generate a report for an AD Model Builder program. The default report generating machinery utilizes the C++ stream formalism. You don't need to know much about streams to make a report, but a few comments are in order. The stream formalism associates stream object with a file. In this case the stream object associated with the AD Model Builder report file is report. To write an object xxx into the report file you insert the line

report << xxx;into the REPORT_SECTION. If you want to skip to a new line after writing the object you can include the stream manipulator endl as in

report << "Estimated numbers of fish " << endl;Notice that the stream operations know about common C objects such as strings, so that it is a simple matter to put comments or labels into the report file.

AD Model Builder enables one to quickly build models with large numbers of parameters -- this is especially useful for employing Bayesian analysis. Traditionally however it has been difficult to interpret the results of analysis using such models. In a Bayesian context the results are represented by the posterior probability distribution for the model parameters. To get exact results from the posterior distribution it is necessary to evaluate integrals over large dimensional spaces and this can be computationally intractable. AD Model Builder provides an approximations to these integrals in the form of the profile likelihood. The profile likelihood can be used to estimates for extreme values (such as estimating a value so that for a parameter the probability that or the probability that ) for any model parameter. To use this facility simply declare the parameter of interest to be of type likeprof_number in the PARAMETER_SECTION and assign the correct value to the parameter in the PROCEDURE SECTION.

The code for the catch at age model estimates the profile likelihood for the projected biomass of age class 2+ fish. (Age class 2+ has been used to avoid the extra problem of dealing with the uncertainty of the recruitment of age class 1 fish). As a typical application of the method, the user of the model can estimate the probability that the biomass of fish for next year will be larger or smaller than a certain value. Estimates like these are obviously of great interest to managers of natural resources.

The profile likelihood report for a variable is in a file with the same name as the variable (truncated to eight letters, if necessary, with the suffix .PLT appended). For this example the report is in the file PRED_B.PLT. Part of the file is shown here.

pred_B: Profile likelihood -1411.23 1.1604e-09 -1250.5 1.71005e-09 -1154.06 2.22411e-09 ................... // skip some here ................... 278.258 2.79633e-05 324.632 5.28205e-05 388.923 6.89413e-05 453.214 8.84641e-05 517.505 0.0001116 581.796 0.000138412 ................... ................... 1289 0.000482459 1353.29 0.000494449 1417.58 0.000503261 1481.87 0.000508715 1546.16 0.0005107 1610.45 0.000509175 1674.74 0.000504171 1739.03 0.000490788 1803.32 0.000476089 1867.61 0.000460214 1931.91 0.000443313 1996.2 0.000425539 2060.49 0.000407049 2124.78 0.000388 2189.07 0.00036855 ................... ................... 4503.55 2.27712e-05 4599.98 2.00312e-05 4760.71 1.48842e-05 4921.44 1.07058e-05 5082.16 7.45383e-06 ................... ................... 6528.71 6.82689e-07 6689.44 6.91085e-07 6850.17 7.3193e-07 Minimum width confidence limits: significance level lower bound upper bound 0.90 572.537 3153.43 0.95 453.214 3467.07 0.975 347.024 3667.76One sided confidence limits for the profile likelihood:

The probability is 0.9 that pred_B is greater than 943.214 The probability is 0.95 that pred_B is greater than 750.503 The probability is 0.975 that pred_B is greater than 602.507

The probability is 0.9 that pred_B is less than 3173.97 The probability is 0.95 that pred_B is less than 3682.75 The probability is 0.975 that pred_B is less than 4199.03

The file contains the probability density function and the approximate confidence limits for the profile likelihood, the profile likelihood and the normal approximation. Since the format is the same for all three, we only discuss the profile likelihood here. The first part of the report contains pairs of numbers which consist of values of the parameter in the report (in this case PRED_B and the estimated value for the probability density associated with that parameter at the point. The probability that the parameter lies in the interval where can be estimated from the sum

The the profile likelihood indicates the fact that the biomass can not be less than zero. The normal approximation is not very useful for calculating the probability that the biomass is very low -- a question of great interest to managers who are probably not going to be impressed by the knowledge that there is an estimated probability of 0.975 that the biomass is greater than -52.660.

One sided confidence limits for the normal approximationThe probability is 0.9 that pred_B is greater than 551.235 The probability is 0.95 that pred_B is greater than 202.374 The probability is 0.975 that pred_B is greater than -52.660

PRELIMINARY_CALCS_SECTION u.set_stepnumber(10); // default value is 8 u.set_stepsize(0.2); // default value is 0.5set_stepnumber option set_stepsize option will set the number of steps equal to 21 (from -10 to 10) and will set the step size equal to 0.2 times the estimated standard deviation for the parameter u.

DATA_SECTION // will read data from file catchdat.dat !!USER_CODE ad_comm::change_datafile_name("catchdat.dat"); init_int nyrs init_int nages init_matrix obs_catch_at_age(1,nyrs,1,nages) // now read the effort data from the file effort.dat and save the current // file position in catchdat.dat in the object tmp !!USER_CODE streampos tmp = ad_comm::change_datafile_name("effort.dat"); init_vector effort(1,nyrs) // now read the rest of the data from the file catchdat.dat // including the ioption argument tmp will reset the file to that position !!USER_CODE ad_comm::change_datafile_name("catchdat.dat",tmp); init_number M// ....

PARAMETER_SECTION // will read parameters from file catch.par !!USER_CODE ad_comm::change_parfile_name("catch.par");

// calculate the selectivity from the sel_coffs for (int j=1;j<nages;j++) { log_sel(j)=log_sel_coff(j); } // the selectivity is the same for the last two age classes log_sel(nages)=log_sel_coff(nages-1);Notice that log_sel(1,nage-1) is not a distinct vector from log_sel. This means that an assignment to log_sel(1,nage-1) is an assignment to a part of log_sel. The next example is a bit more complicated. It involves taking a row of a matrix, to form a vector, forming a subvector, and changing the valid index range for the vector. \bestbreak// same code using the subvector operation log_sel(1,nage-1)=log_sel_coff; // the selectivity is the same for the last two age classes log_sel(nages)=log_sel_coff(nages-1);

// loop form of the code for (i=1;i<nyrs;i++) { for (j=1;j<nages;j++) { N(i+1,j+1)=N(i,j)*S(i,j); } }Notice that N(i+1) is a vector object so that N(i+1)(2,nyrs) is a subvector of N(i). Another point is that elem_prod(N(i)(1,nage-1),S(i)(1,nage-1)) is a vector object with minimum valid index 1 and maximum valid index nyrs-1. The operator ++ applied to a subvector increments the valid index range by 1 so that it has the same range of valid index values as N(i+1)(2,nyrs). The operator -- would decrement the valid index range by 1.// can only eliminate the inside loop for (i=1;i<nyrs;i++) { // ++ increments the index bounds by 1 N(i+1)(2,nyrs)=++elem_prod(N(i)(1,nage-1),S(i)(1,nage-1)); }

DATA_SECTION init_4darray d4(1,2,1,2,1,3,1,3) init_3darray d3(1,2,1,3,1,3) PARAMETER_SECTION init_matrix M(1,3,1,3) 4darray p4(1,2,1,2,1,3,2,3) objective_function_value f PRELIMINARY_CALCS_SECTION for (int i=1;i<=3;i++) { M(i,i)=1; // set M equal to the identity matrix to start } PROCEDURE_SECTION for (int i=1;i<=2;i++) { for (int j=1;j<=2;j++) { // d4(i,j) is a 3x3 matrix -- d3(i) is a 3x3 matrix // d4(i,j)*M is matrix multiplication -- inv(M) is matrix inverse f+= norm2( d4(i,j)*M + d3(i)+ inv(M) ); } } REPORT_SECTION report << "Printout of a 4 dimensional array" << endl << endl; report << d4 << endl << endl; report << "Printout of a 3 dimensional array" << endl << endl; report << d3 << endl << endl;In the DATA_SECTION you can use 3darrays, 4darrays, init_3darrays, init_4darrays. In the PARAMETER_SECTION you can use 3darrays, 4darrays, but at the time of writing init_3darrays, init_4darrays had not been implemented for the PARAMETER_SECTION.

If d4 is a 4darray then d4(i) is a three dimensional array and d4(i,j) is a matrix object so that d4(i,j)*M is matrix multiplication. Similarly if d3 is a 3darray then d3(i) is a matrix object so that d4(i,j)*M + d3(i) +inv(M) combines matrix multiplication, matrix inversion, and matrix addition.