PREV
Let
index fishing years
and
index age classes
with
.
The instantaneous fishing mortality rate is assumed to have the
form
where
is called the catchability,
is the observed fishing effort,
is an age-dependent effect
termed the selectivity, and the
are deviations from the
expected relationship between the observed fishing effort and
the resulting fishing mortality. The
are assumed to
be normally distributed with mean 0. The instantaneous
natural mortality rate
is assumed to be independent of
year and age class. It is not estimated in this version of the model.
The instantaneous total
mortality rate is given by
. The survival rate is given
by
. The number of age class
fish
in the population in year
is denoted by
.
The relationship
is assumed to hold.
Note that using this relationship if one knows
then all
the
can be calculated from knowledge of the
initial population in year
,
and knowledge of the recruitment in each year
.
The purpose of the model is to estimate quantities of interest to
managers such as the population size and exploitation rates
and to make projections about the population. In particular we can get
an estimate of the numbers of fish in the population in year
for age classes 2 or greater from the relationship
. If we have estimates
for the mean weight
at age
, then the projected biomass level
of age class
fish for year
can be computed from the relationship
.
Besides getting a point estimate for quantities of interest like
we also want to get an idea of how well determined the
estimate is. AD Model Builder has completely automated the process of deriving
good confidence limits for these parameters in a Bayesian context.
One simply needs to declare the parameter to be of type
likeprof_number. The results are given in the section
on Bayesian inference.
The code for the catch-at-age model is:
DATA_SECTION // the number of years of data init_int nyrs // the number of age classes in the population init_int nages // the catch-at-age data init_matrix obs_catch_at_age(1,nyrs,1,nages) //estimates of fishing effort init_vector effort(1,nyrs) // natural mortality rate init_number M // need to have relative weight at age to calculate biomass of 2+ vector relwt(2,nages) INITIALIZATION_SECTION log_q -1 log_P 5 PARAMETER_SECTION init_number log_q(1) // log of the catchability init_number log_P(1) // overall population scaling parameter init_bounded_dev_vector log_sel_coff(1,nages-1,-15.,15.,2) init_bounded_dev_vector log_relpop(1,nyrs+nages-1,-15.,15.,2) init_bounded_dev_vector effort_devs(1,nyrs,-5.,5.,3) vector log_sel(1,nages) vector log_initpop(1,nyrs+nages-1); matrix F(1,nyrs,1,nages) // the instantaneous fishing mortality matrix Z(1,nyrs,1,nages) // the instantaneous total mortality matrix S(1,nyrs,1,nages) // the survival rate matrix N(1,nyrs,1,nages) // the predicted numbers at age matrix C(1,nyrs,1,nages) // the predicted catch at age objective_function_value f sdreport_number avg_F sdreport_vector predicted_N(2,nages) sdreport_vector ratio_N(2,nages) likeprof_number pred_B PRELIMINARY_CALCS_SECTION // this is just to invent some relative average // weight numbers relwt.fill_seqadd(1.,1.); relwt=pow(relwt,.5); relwt/=max(relwt); PROCEDURE_SECTION // example of using FUNCTION to structure the procedure section get_mortality_and_survival_rates();This model employs several instances of the init_bounded_dev_vector type. This type consists of a vector of numbers which sum to 0, that is they are deviations from a common mean, and are bounded. For example the quantities log_relpop are used to parameterize the logarithm of the variations in year class strength of the fish population. Putting bounds on the magnitude of the deviations helps to improve the stability of the model. The bounds are from -15.0 to 15.0 which gives the estimates of relative year class strength a dynamic range ofget_numbers_at_age();
get_catch_at_age();
evaluate_the_objective_function();
FUNCTION get_mortality_and_survival_rates // calculate the selectivity from the sel_coffs for (int j=1;j<nages;j++) { log_sel(j)=log_sel_coff(j); } // the selectivity is the same for the last two age classes log_sel(nages)=log_sel_coff(nages-1);
// This is the same as F(i,j)=exp(log_q)*effort(i)*exp(log_sel(j)); F=outer_prod(mfexp(log_q)*effort,mfexp(log_sel)); if (active(effort_devs)) { for (int i=1;i<=nyrs;i++) { F(i)=F(i)*exp(effort_devs(i)); } } // get the total mortality Z=F+M; // get the survival rate S=mfexp(-1.0*Z);
FUNCTION get_numbers_at_age log_initpop=log_relpop+log_P; for (int i=1;i<=nyrs;i++) { N(i,1)=mfexp(log_initpop(i)); } for (int j=2;j<=nages;j++) { N(1,j)=mfexp(log_initpop(nyrs+j-1)); } for (i=1;i<nyrs;i++) { for (j=1;j<nages;j++) { N(i+1,j+1)=N(i,j)*S(i,j); } } // calculated predicted numbers at age for next year for (j=1;j<nages;j++) { predicted_N(j+1)=N(nyrs,j)*S(nyrs,j); ratio_N(j+1)=predicted_N(j+1)/N(1,j+1); } // calculate predicted biomass for profile // likelihood report pred_B=predicted_N *relwt;
FUNCTION get_catch_at_age C=elem_prod(elem_div(F,Z),elem_prod(1.-S,N));
FUNCTION evaluate_the_objective_function // penalty functions to ``regularize '' the solution f+=.01*norm2(log_relpop); avg_F=sum(F)/double(size_count(F)); if (last_phase()) { // a very small penalty on the average fishing mortality f+= .001*square(log(avg_F/.2)); } else { // use a large penalty during the initial phases to keep the // fishing mortality high f+= 1000.*square(log(avg_F/.2)); } // errors in variables type objective function with errors in // the catch at age and errors in the effort fishing mortality // relationship if (active(effort_devs) { // only include the effort_devs in the objective function if // they are active parameters f+=0.5*double(size_count(C)+size_count(effort_devs)) * log( sum(elem_div(square(C-obs_catch_at_age),.01+C)) + 0.1*norm2(effort_devs)); } else { // objective function without the effort_devs f+=0.5*double(size_count(C)) * log( sum(elem_div(square(C-obs_catch_at_age),.01+C))); } REPORT_SECTION report << "Estimated numbers of fish " << endl; report << N << endl; report << "Estimated numbers in catch " << endl; report << C << endl; report << "Observed numbers in catch " << endl; report << obs_catch_at_age << endl; report << "Estimated fishing mortality " << endl; report << F << endl;
.
The FUNCTION keyword has been employed a number of times in the PARAMETER_SECTION to help structure the code. A function is defined simply by using the FUNCTION keyword followed by the name of the function.
FUNCTION get_mortality_and_survival_ratesDon't include the parentheses or semicolon here. To use the function simply write its name in the procedure section.
get_mortality_and_survival_rates();You must include the parentheses and the semicolon here.
The REPORT_SECTION shows how to generate a report for an AD Model Builder program. The default report generating machinery utilizes the C++ stream formalism. You don't need to know much about streams to make a report, but a few comments are in order. The stream formalism associates stream object with a file. In this case the stream object associated with the AD Model Builder report file is report. To write an object xxx into the report file you insert the line
report << xxx;into the REPORT_SECTION. If you want to skip to a new line after writing the object you can include the stream manipulator endl as in
report << "Estimated numbers of fish " << endl;Notice that the stream operations know about common C objects such as strings, so that it is a simple matter to put comments or labels into the report file.
AD Model Builder enables one to quickly build models with large numbers of
parameters -- this is especially useful for employing
Bayesian analysis. Traditionally however it has been difficult to
interpret the results of analysis using such models.
In a Bayesian context the results are represented
by the posterior probability distribution for the
model parameters. To get exact results from
the posterior distribution it is necessary to evaluate
integrals over large dimensional spaces and this can be computationally
intractable. AD Model Builder provides an
approximations to these
integrals in the form of the profile likelihood. The
profile likelihood can be used to
estimates for extreme values
(such as estimating a value
so that for a
parameter
the probability that
or the probability that
) for
any model parameter.
To use this facility simply declare the parameter of interest
to be of type likeprof_number in the PARAMETER_SECTION and assign
the correct value to the parameter in the PROCEDURE SECTION.
The code for the catch at age model estimates the profile likelihood for the projected biomass of age class 2+ fish. (Age class 2+ has been used to avoid the extra problem of dealing with the uncertainty of the recruitment of age class 1 fish). As a typical application of the method, the user of the model can estimate the probability that the biomass of fish for next year will be larger or smaller than a certain value. Estimates like these are obviously of great interest to managers of natural resources.
The profile likelihood report for a variable is in a file with the same name as the variable (truncated to eight letters, if necessary, with the suffix .PLT appended). For this example the report is in the file PRED_B.PLT. Part of the file is shown here.
pred_B:
Profile likelihood
-1411.23 1.1604e-09
-1250.5 1.71005e-09
-1154.06 2.22411e-09
................... // skip some here
...................
278.258 2.79633e-05
324.632 5.28205e-05
388.923 6.89413e-05
453.214 8.84641e-05
517.505 0.0001116
581.796 0.000138412
...................
...................
1289 0.000482459
1353.29 0.000494449
1417.58 0.000503261
1481.87 0.000508715
1546.16 0.0005107
1610.45 0.000509175
1674.74 0.000504171
1739.03 0.000490788
1803.32 0.000476089
1867.61 0.000460214
1931.91 0.000443313
1996.2 0.000425539
2060.49 0.000407049
2124.78 0.000388
2189.07 0.00036855
...................
...................
4503.55 2.27712e-05
4599.98 2.00312e-05
4760.71 1.48842e-05
4921.44 1.07058e-05
5082.16 7.45383e-06
...................
...................
6528.71 6.82689e-07
6689.44 6.91085e-07
6850.17 7.3193e-07
Minimum width confidence limits:
significance level lower bound upper bound
0.90 572.537 3153.43
0.95 453.214 3467.07
0.975 347.024 3667.76
One sided confidence limits for the profile likelihood:
The probability is 0.9 that pred_B is greater than 943.214
The probability is 0.95 that pred_B is greater than 750.503
The probability is 0.975 that pred_B is greater than 602.507
The probability is 0.9 that pred_B is less than 3173.97
The probability is 0.95 that pred_B is less than 3682.75
The probability is 0.975 that pred_B is less than 4199.03
The file contains the probability density function and the
approximate confidence limits for the profile likelihood, the profile
likelihood and the normal approximation. Since the format is the same
for all three, we only discuss the profile likelihood here.
The first part of the report contains pairs of numbers
which consist of values of the parameter in the
report (in this case PRED_B and the estimated value for
the probability density associated with that parameter at the point.
The probability that the parameter lies in the interval
where
can be estimated from the sum

verses
gives the user an indication of what
the probability distribution of the parameter looks like.

The the profile likelihood indicates the fact that the biomass can not be less than zero. The normal approximation is not very useful for calculating the probability that the biomass is very low -- a question of great interest to managers who are probably not going to be impressed by the knowledge that there is an estimated probability of 0.975 that the biomass is greater than -52.660.
One sided confidence limits for the normal approximationThe probability is 0.9 that pred_B is greater than 551.235 The probability is 0.95 that pred_B is greater than 202.374 The probability is 0.975 that pred_B is greater than -52.660
PRELIMINARY_CALCS_SECTION u.set_stepnumber(10); // default value is 8 u.set_stepsize(0.2); // default value is 0.5set_stepnumber option set_stepsize option will set the number of steps equal to 21 (from -10 to 10) and will set the step size equal to 0.2 times the estimated standard deviation for the parameter u.
DATA_SECTION
// will read data from file catchdat.dat
!!USER_CODE ad_comm::change_datafile_name("catchdat.dat");
init_int nyrs
init_int nages
init_matrix obs_catch_at_age(1,nyrs,1,nages)
// now read the effort data from the file effort.dat and save the current
// file position in catchdat.dat in the object tmp
!!USER_CODE streampos tmp = ad_comm::change_datafile_name("effort.dat");
init_vector effort(1,nyrs)
// now read the rest of the data from the file catchdat.dat
// including the ioption argument tmp will reset the file to that position
!!USER_CODE ad_comm::change_datafile_name("catchdat.dat",tmp);
init_number M
// ....
PARAMETER_SECTION
// will read parameters from file catch.par
!!USER_CODE ad_comm::change_parfile_name("catch.par");
// calculate the selectivity from the sel_coffs
for (int j=1;j<nages;j++)
{
log_sel(j)=log_sel_coff(j);
}
// the selectivity is the same for the last two age classes
log_sel(nages)=log_sel_coff(nages-1);
// same code using the subvector operation
log_sel(1,nage-1)=log_sel_coff;
// the selectivity is the same for the last two age classes
log_sel(nages)=log_sel_coff(nages-1);
Notice that log_sel(1,nage-1) is not a distinct vector
from log_sel. This means that an assignment to log_sel(1,nage-1)
is an assignment to a part of log_sel.
The next example is a bit more complicated. It involves taking a row of a matrix,
to form a vector, forming a subvector, and changing the valid index range
for the vector.
\bestbreak
// loop form of the code
for (i=1;i<nyrs;i++)
{
for (j=1;j<nages;j++)
{
N(i+1,j+1)=N(i,j)*S(i,j);
}
}
// can only eliminate the inside loop
for (i=1;i<nyrs;i++)
{
// ++ increments the index bounds by 1
N(i+1)(2,nyrs)=++elem_prod(N(i)(1,nage-1),S(i)(1,nage-1));
}
Notice that N(i+1) is a vector object so that
N(i+1)(2,nyrs) is a subvector of N(i). Another
point is that elem_prod(N(i)(1,nage-1),S(i)(1,nage-1))
is a vector object with minimum valid index 1 and maximum valid
index nyrs-1. The operator ++ applied to a subvector
increments the valid index range by 1 so that it has the same
range of valid index values as N(i+1)(2,nyrs).
The operator -- would decrement the valid index range by 1.
DATA_SECTION
init_4darray d4(1,2,1,2,1,3,1,3)
init_3darray d3(1,2,1,3,1,3)
PARAMETER_SECTION
init_matrix M(1,3,1,3)
4darray p4(1,2,1,2,1,3,2,3)
objective_function_value f
PRELIMINARY_CALCS_SECTION
for (int i=1;i<=3;i++)
{
M(i,i)=1; // set M equal to the identity matrix to start
}
PROCEDURE_SECTION
for (int i=1;i<=2;i++)
{
for (int j=1;j<=2;j++)
{
// d4(i,j) is a 3x3 matrix -- d3(i) is a 3x3 matrix
// d4(i,j)*M is matrix multiplication -- inv(M) is matrix inverse
f+= norm2( d4(i,j)*M + d3(i)+ inv(M) );
}
}
REPORT_SECTION
report << "Printout of a 4 dimensional array" << endl << endl;
report << d4 << endl << endl;
report << "Printout of a 3 dimensional array" << endl << endl;
report << d3 << endl << endl;
In the DATA_SECTION you can use 3darrays, 4darrays,
init_3darrays, init_4darrays.
In the PARAMETER_SECTION you can use 3darrays, 4darrays, but at the
time of writing init_3darrays, init_4darrays had
not been implemented for the PARAMETER_SECTION.
If d4 is a 4darray then d4(i) is a three dimensional array and d4(i,j) is a matrix object so that d4(i,j)*M is matrix multiplication. Similarly if d3 is a 3darray then d3(i) is a matrix object so that d4(i,j)*M + d3(i) +inv(M) combines matrix multiplication, matrix inversion, and matrix addition.