% Upper-case    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
% Lower-case    a b c d e f g h i j k l m n o p q r s t u v w x y z
% Digits        0 1 2 3 4 5 6 7 8 9
% Exclamation   !           Double quote "          Hash (number) #
% Dollar        $           Percent      %          Ampersand     &
% Acute accent  '           Left paren   (          Right paren   )
% Asterisk      *           Plus         +          Comma         ,
% Minus         -           Point        .          Solidus       /
% Colon         :           Semicolon    ;          Less than     <
% Equals        =           Greater than >          Question mark ?
% At            @           Left bracket [          Backslash     \
% Right bracket ]           Circumflex   ^          Underscore    _
% Grave accent  `           Left brace   {          Vertical bar  |
% Right brace   }           Tilde        ~

% ---------------------------------------------------------------------|
% --------------------------- 72 characters ---------------------------|
% ---------------------------------------------------------------------|
%
% Optimal Foraging Theory Revisited: (old) Chapter 3. Review of
%                                    Classical Optimal Foraging Theory
% (this material not in final version of document; papers based on this
%  material may be generated later)
%
% (c) Copyright 2007 by Theodore P. Pavlic
%

\chapter{Review of Classical Optimal Foraging Theory}
\label{ch:review_classical_oft}

In this \chname{}, we review the classical \ac{OFT} approach to
optimization. Where possible, we will use notation from
\longref{ch:model}; this will let us examine strengths and weaknesses of
the classical \ac{OFT} approach and give a basis for comparison to our
approach. In \longref{sec:oft_long_term_rate}, we present the
optimization criterion used by classical \ac{OFT} and the justifications
for it.
%We present and respond to the conventional criticisms of
%classical \ac{OFT} in \longref{sec:oft_criticisms}.

\section{OFT Focus: Long-Term Rate of Net Gain}
\label{sec:oft_long_term_rate}

In biological contexts, it is expected that natural selection will favor
foraging behaviors that provide greater future reproductive success, a
common surrogate for Darwinian fitness. So, functions mapping specific
behaviors to quantitative measures of reproductive success can be
optimized to predict behaviors that should result from a long run of natural
selection. \Citet{Schoener71} defines such a model, and while quantities
in the model are too difficult to define for most cases, behaviors
predicted by the model fall on a continuum from foraging time minimizers
(when energy is held constant) to energy maximizers (when foraging time
is held constant). \Citet{PPC77} argue that the \emph{rate} of net
energy intake is the most general function to be maximized as it
captures both extremes on the \citeauthor{Schoener71} continuum by
asserting an upward pressure on energy intake and a downward pressure on
foraging time (assuming net energetic intake is always positive). This
will allow a forager to achieve its energy consumption needs while also
leaving it enough time for other activities such as reproduction and
predator avoidance. Defining the rate of net energy intake can be done
in a number of different ways. Using the terms from \longref{ch:model},
it could be defined for any $t \in \R_{\geq0}$ or $N \in \N$ as
$\oft{G}^N/\oft{T}^N$ or $\E( \oft{G}^N/\oft{T}^N )$ or $\oft{G}(t)/t$
or $\E( \oft{G}(t) )/t$. However, \citeauthor{PPC77} also argue that
rates should be calculated over the entire life history of the forager.
Thus, rather than taking a particular $t \in \R_{\geq0}$ or $N \in \N$,
the asymptotic limits can be taken. Conveniently,
\longrefs{eq:oft_payoff_long_rate} shows that all of these are
equivalent. In fact, by \longref{eq:oft_payoff_long_rate_equiv}, 
%
\begin{equation} 
        \begin{split}
        \frac{\E(\oft{G}_1)}{\E(\oft{T}_1)} 
        &= 
        \frac{ \E(\oft{G}^{N_*}) }{\E(\oft{T}^{N_*})}
        =
        \aslim\limits_{N \to \infty} \frac{\oft{G}^N}{\oft{T}^N} 
        =
        \lim\limits_{N \to \infty} 
        \E\left(\frac{\oft{G}^N}{\oft{T}^N}\right)\\
        &=
        \frac{\E(\oft{G}(t_*))}{\E(\oft{T}(t_*))} = 
        \aslim\limits_{t \to \infty} \frac{\oft{G}(t)}{t} =
        \lim\limits_{t \to \infty} \frac{\E( \oft{G}(t) )}{t}\\
        \end{split}
        \label{eq:oft_payoff_long_rate_summary} 
\end{equation}
%
for any $t_* \in \R_{>0}$ and $N_* \in \N$. For this reason, the ratio
$\E(\oft{G}_1)/\E(\oft{T}_1)$ has received significant interest in
classical \ac{OFT} \citep[\eg,][]{HouMc99,SC82,SK86}. We call this ratio
the \emph{long-term (average) rate of net gain}. Note that by
\longref{eq:RoE_equivalence} this ratio plays an identical role in our
analysis approach when we consider the asymptotic case.

\subsubsection{Opportunity Cost Interpretation}

%\Citet[ch.~4]{HouMc99} 
\Citet{HouMc99} provide an interesting interpretation of
$\E(\oft{G}_1)/\E(\oft{T}_1)$. They define constant $\gamma^* \in \R$ to
be the maximum value of $\E(\oft{G}_1)/\E(\oft{T}_1)$ (\ie, the
long-term rate of net gain) over the set of feasible agent behaviors.
They then treat rate $\gamma^*$ as a factor converting time spent between
encounters to the maximum points possible from that time. Therefore,
$\gamma^*$ converts time into its equivalent \emph{opportunity cost}.
They then show that during a single \ac{OFT} cycle the behavior for that
cycle that maximizes
%
\begin{equation}
        \E(\oft{G}_1 - \gamma^* \oft{T}_1)
        \label{eq:HouMc99_approach}
\end{equation}
%
will be the behavior that achieves the maximum long-term rate of gain
$\gamma^*$. So, maximizing the long-term rate of gain is equivalent to
maximizing the per-cycle \emph{gain} after being discounted by the
opportunity cost of the cycle time. Solving for this behavior can only
be done analytically if $\gamma^*$ is known, and so the method of
\citeauthor{HouMc99} numerically solves for the optimal behavior using
iteration, which could be a weakness of this approach. However, it
demonstrates an important interpretation of
$\E(\oft{G}_1)/\E(\oft{T}_1)$ as the \emph{opportunity cost} of
searching.
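
To sketch why this equivalence holds, assume (this is not stated
explicitly above) that $\E(\oft{G}_1)$ is finite and $\E(\oft{T}_1) > 0$
for every feasible behavior. Because $\gamma^*$ is defined as the
maximum long-term rate of net gain, every feasible behavior satisfies
%
\begin{equation*}
        \frac{\E(\oft{G}_1)}{\E(\oft{T}_1)} \leq \gamma^*
        \quad \iff \quad
        \E(\oft{G}_1 - \gamma^* \oft{T}_1)
        =
        \E(\oft{G}_1) - \gamma^* \E(\oft{T}_1)
        \leq 0
\end{equation*}
%
with equality exactly when the behavior achieves rate $\gamma^*$. Hence,
under these assumptions, the maximum value of
\longref{eq:HouMc99_approach} is zero, and it is attained only by
behaviors whose long-term rate of gain is $\gamma^*$.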

\subsubsection{Equilibrium Renewal Process as Attractive Alternative}

\Citet{Cha73} note that it is desirable to derive the \emph{equilibrium
renewal process} rate of net gain. That is, introduce a $T_1 \in
\R_{>0}$ and redefine the process to start after $T_1$ foraging time has
passed. Hence, runtime $t$ represents the length of the interval
immediately after time $T_1$. The quantity of interest to
\citeauthor{Cha73} is then $\E(G)/t$, which represents the average rate
of net gain returned to an agent when an agent is in equilibrium with
its environment (\ie, after the decay of any initial transients).
However, they point out that this rate is only known for such a process
if it is additionally assumed that the net gain on each \ac{OFT} cycle
is independent of the total time of each \ac{OFT} cycle (in particular,
the processing time of each cycle). In that case, $\E(G)/t$ can also be
expressed as the ratio $\E(\oft{G}_1)/\E(\oft{T}_1)$. Unfortunately, it
is rare that net gain and processing time will be independent in a
practical system. Analytical results are not available otherwise.
Because of this, when $\E(\oft{G}_1)/\E(\oft{T}_1)$ is used it is
usually assumed to be a limiting case (\ie, a rate over a long time
rather than a short-term rate after a long time).

\section{Classical OFT as a Fallacy of the Averages}
\label{sec:oft_fallacy_of_averages}

\Citet{TL81} suggest that the use of $\E(\oft{G}_1)/\E(\oft{T}_1)$
throughout classical \ac{OFT} is due to \emph{the fallacy of averages}.
This describes the incorrect assumption that for an arbitrary random
vector $\v{X}$ and (measurable) function $f$,
%
\begin{equation}
        \E( f(\v{X}) ) = f( \E( \v{X} ) )
        \label{eq:fallacy_of_averages}
\end{equation}
%
Behavioral ecologists are frequently criticized for committing this
error \citep[\eg{},][]{WPA88}. In many cases, the criticism or the error
itself arises from mistakenly dropping the limits in
\longref{eq:oft_payoff_long_rate_summary}. Sometimes the limiting
argument is implied by phrases like ``long term,'' but this is not
understood by critics. As explained in \longref{sec:oft_long_term_rate},
the ratio $\E(\oft{G}_1)/\E(\oft{T}_1)$ (\ie, the long-term rate of net
gain) is a reasonable optimization objective for evolutionary analysis.
However, the major responses to \citet{TL81} have not presented this
simple evolutionary argument. In fact, many of these responses only show
a misunderstanding of the true justification for using this criterion
and also neglect to point out the actual serious flaws in the argument
of \citeauthor{TL81}. Unfortunately, these incomplete or incorrect
responses have been the source of a great deal of confusion and mistrust
of classical \ac{OFT} in recent work.
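
As a simple illustration of why \longref{eq:fallacy_of_averages} fails
in general, consider a hypothetical scalar random variable $X$
(introduced here only for illustration) that takes the values $1$ and
$3$ with equal probability, and let $f(x) = 1/x$. Then
%
\begin{equation*}
        \E( f(X) )
        = \frac{1}{2} \left( 1 + \frac{1}{3} \right)
        = \frac{2}{3}
        \quad \text{whereas} \quad
        f( \E(X) ) = \frac{1}{\E(X)} = \frac{1}{2}
\end{equation*}
%
and so the two sides differ. Equality is guaranteed when $f$ is affine,
but a ratio is not an affine function of its denominator.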

\Citeauthor{TL81} criticize \citet{Cha76} and \citet{Pulliam74}, and
that criticism receives counter-arguments from \citet{GGP82} and
\citet{TGS82}. The claims of \citeauthor{TL81} are based in a
misunderstanding of \citet{Cha76}; unfortunately, while the responses
given by \citeauthor{GGP82} and \citeauthor{TGS82} lead to the correct
conclusion that some arguments of \citeauthor{TL81} are flawed, their
own counter-arguments are flawed due to a misunderstanding of
\citet{TL81}, \citet{Cha76}, or both. Even the direct response to
\citeauthor{TL81} from \citet{Cha81} is inadequate. Thus, the issue of
the fallacy of the averages lacks definitive closure and continues to
cause confusion in recent literature, as discussed in
\longref{sec:oft_short_term_rate}. We present the \citeauthor{Cha76}
model and the flawed criticism of it in \longref{sec:fa_charnov_model}.
We present the \citeauthor{Pulliam74} model and the flawed criticism of
it in \longref{sec:fa_pulliam_model}. We present issues with the
responses to these criticisms in \longref{sec:fa_responses}. 

\subsection{A Deterministic Foraging Model}
\label{sec:fa_charnov_model}

\Citet{Cha76} presents a ``completely deterministic''
\citep[p.~130]{Cha76} model for optimal foraging theory. The forager is
``assumed to make decisions as to maximize the net rate of energy intake
during a foraging bout'' \citep[p.~131]{Cha76}. The model analyzes a
forager searching for ``patches'' of which there are $n \in \N$ patch
types. The model calls $t$ the ``interpatch travel time,'' which
represents a deterministic time spent before encountering an additional
patch. For each $i \in \{1,2,\dots,n\}$, the model calls $P_i$ the
``proportion of the visited patches that are of type $i$'', $E_T$ the
``energy cost per unit time traveling between patches'', and $g_i(T_i)$
the  ``assimilated energy'' from handling a patch of type $i$ for a
total of $T_i$ time units. Therefore, if the agent encounters $N$
patches, $N P_i$ is the exact number (\ie, a deterministic quantity) of
patches of type $i$ and the total assimilated energy from all patches
divided by the total time spent searching for and handling all patches
is 
%
\begin{equation}
        \frac
        { \sum\limits_{i=1}^n N P_i g_i(T_i) - N t E_T }
        { N t + \sum\limits_{i=1}^n N P_i T_i }
        =
        \frac
        { \sum\limits_{i=1}^n P_i g_i(T_i) - t E_T }
        { t + \sum\limits_{i=1}^n P_i T_i }
        \label{eq:deterministic_net_rate}
\end{equation}
%
\Citeauthor{Cha76} calls this the \emph{net energy intake rate} for the
single bout. Note that while this rate looks similar in form to
$\E(\oft{G}_1)/\E(\oft{T}_1)$, this rate is \emph{not} meant to be an
expression of an expectation or even a quotient of expectations. This is
a ratio of two deterministic quantities representing total energy gained
over total time spent during an entire foraging bout. The only valid
criticism of \citeauthor{Cha76} might refer to some unnecessary uses of
the term \emph{average} in the description of
\longref{eq:deterministic_net_rate}; however, the form of the expression
is definitely correct. In fact, the fallacy of the averages is
impossible in a deterministic context like this one.

\subsubsection{Criticism and the Per-Patch Rate}

In \citet{TL81}, all references to the $P_i$ from \citet{Cha76} are
replaced with $p_i$, which is defined by \citeauthor{TL81} as the
``probability of encountering patch type $i$.'' By substituting $p_i$
for $P_i$, they have made the mistake of turning a deterministic model
of foraging behavior into a stochastic one, which justifies the use of
probabilistic expectation. Their major claim is that \citeauthor{Cha76}
meant to use the ``rate of energy intake in patch type $i$'' and instead
used the ``average energy intake per patch divided by the average time
required to obtain the energy per patch.'' They attribute this supposed
error to a fallacy of the averages. Therefore, with the caveat that interpatch
travel time $t$ is deterministic\footnote{\Citet{TL81} actually use the
term ``constant,'' which does not necessarily mean that $t$ is
deterministic; however, context implies that a constant and
deterministic $t$ is actually what is desired.}, \citeauthor{TL81} say
that the correct rate to use is
%
\begin{equation}
        \sum\limits_{i=1}^n p_i \frac{ g_i(T_i) - t E_T }{ t + T_i }
        \label{eq:tl_short_rate}
\end{equation}
%
If probability $p_i$ is replaced with the proportion $P_i$ from the
deterministic model of \citeauthor{Cha76} then this expression
represents the (arithmetic) mean \emph{per-patch} net energy intake
rate. However, \citet{Cha76} is very clear about optimizing net energy
intake over net time for an \emph{entire foraging bout} and not an
average per-patch rate. Thus, the criticisms that \citeauthor{TL81} make
of \citet{Cha76} are entirely due to a misunderstanding. 
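
To make the distinction concrete, consider a hypothetical bout whose
numbers are chosen purely for illustration and do not come from
\citet{Cha76} or \citet{TL81}: let $n = 2$, $P_1 = P_2 = 1/2$, $t = 1$,
$E_T = 0$, $T_1 = 1$, $T_2 = 3$, and $g_1(T_1) = g_2(T_2) = 4$. The
whole-bout rate in \longref{eq:deterministic_net_rate} is then
%
\begin{equation*}
        \frac{ \frac{1}{2}(4) + \frac{1}{2}(4) - (1)(0) }
             { 1 + \frac{1}{2}(1) + \frac{1}{2}(3) }
        = \frac{4}{3}
\end{equation*}
%
whereas \longref{eq:tl_short_rate} with $p_i = P_i$ gives the mean
per-patch rate
%
\begin{equation*}
        \frac{1}{2} \cdot \frac{4 - (1)(0)}{1 + 1}
        +
        \frac{1}{2} \cdot \frac{4 - (1)(0)}{1 + 3}
        = \frac{3}{2}
\end{equation*}
%
So the two expressions measure different quantities and need not agree
even in a purely deterministic setting.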

\paragraph{Contradictory References:} \citet{TL81} also briefly
reference the stochastic derivations by \citet{Cha73} in order to
criticize another work. This implies that they simultaneously agree with
\citet{Cha73} and disagree with the similar work of \citet{Cha76}. This
is an interesting contradiction. A proper response to \citeauthor{TL81}
should use the model from \citeauthor{Cha73}. However, the prominent
responses all assume that \citeauthor{TL81} are criticizing
\citet{Cha73} and defend that work instead. Since \citeauthor{TL81}
clearly agree with \citeauthor{Cha73} then all such responses are
nonsense.

\paragraph{Use of Per-Patch Rate in Literature:} As we will discuss in
\longref{sec:oft_short_term_rate}, some authors have presented
qualitative observational justifications for optimizing a mean per-patch
net energy intake rate, and those authors have used
\longref{eq:tl_short_rate} as a template for such a rate. Interestingly,
these authors neglect that \citeauthor{TL81} state that $t$ must be
deterministic. In fact, if the interpatch travel time is a random
variable, using \longref{eq:tl_short_rate} to represent a mean per-patch
net energy intake is itself a fallacy of the averages. Thus, it is
extremely rare that the form of \longref{eq:tl_short_rate} has any value
in analysis of real systems.

\subsection{A Stochastic Foraging Model}
\label{sec:fa_pulliam_model}

\Citet{Pulliam74} presents a stochastic model of foraging that is almost
identical to the one given by \citet{Cha73}. However, while
\citeauthor{Cha73} makes use of the theory of Poisson processes,
\citeauthor{Pulliam74} uses a notion of a ``random scatter of prey
items'' \citep[p.~62]{Pulliam74}. This notion generates something
similar to a Poisson process (\ie, it has the \emph{memorylessness}
property); however, it does not have the \emph{orderliness} property
that, among other things, forces simultaneous encounters to occur with
probability zero. Without this property, many of the arguments of
\citeauthor{Pulliam74} do not make rigorous mathematical sense. On the
other hand, the model assumes movement in such a way as to ``minimize
the probability of covering the same area twice''
\citep[p.~60]{Pulliam74} which could be argued to resemble orderliness
since prey are assumed to be stationary. Additionally, while some
language implies that limiting arguments are used, other language
implies that a limiting argument is not being used.  However, all of the
end results are correct from a limiting interpretation if the Poisson
assumption is made (\ie, orderliness is explicitly added to the existing
memorylessness assumption). Therefore, while the language of
\citeauthor{Pulliam74} can certainly be criticized, the end results
cannot. With a slight modification and an explicit limiting
interpretation, no fallacy of the averages has been committed. 

\subsubsection{Criticism and the Poisson Process}

\Citet{TL81} assume that \citet{Pulliam74} does not use a limiting
argument, and so they assume that a fallacy of averages has been
committed. In fact, they use the work of \citet{Cha73} to argue that the
argument of \citeauthor{Pulliam74} should only apply to large time
intervals (\ie, a limiting argument must be used). \Citeauthor{TL81}
then move to criticize the assumptions that \citet{Cha73} use to make
this argument true. In particular, they note that prey encounters must
be generated according to a Poisson process. However, \citeauthor{TL81}
misinterpret Poisson process properties when they incorrectly state that
encounters must be such that ``the probability of encountering a prey
item tends to zero on any finite interval as the number of prey items
tends to infinity'' which leads them to state that ``prey items be
rather rare'' \citep[p.~392]{TL81}. This conclusion is their
justification for why Poisson encounters poorly model most real systems.
However, for Poisson encounters of prey items, the probability of
encountering a prey item tends to zero on any finite interval as the number
of \emph{trials} tends to infinity. In other words, the probability of
encountering a prey item at any \emph{instant} of time is zero. This
assumption is very realistic; in fact, it would be strange for a model
of a real system not to assume this. It is more conventional to
criticize the independence of interarrival intervals and memorylessness
of interarrival times of a Poisson process, and it is strange that
\citeauthor{TL81} do not make this criticism considering that
\citeauthor{Pulliam74} does not make the Poisson assumption and instead
makes the explicit assumption of independent arrivals and memorylessness.
Therefore, while \citet{Pulliam74} certainly deserves criticism for the
language used, the \citeauthor{TL81} criticism of \citeauthor{Pulliam74}
comes primarily from a misunderstanding of Poisson processes.
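
The distinction can be sketched for a homogeneous Poisson process with
rate $\lambda \in \R_{>0}$, where the homogeneity assumption and the
symbol $A(h)$ for the number of encounters in an interval of length $h$
are introduced here only for illustration. In that case,
%
\begin{equation*}
        \Pr( A(h) \geq 1 ) = 1 - e^{-\lambda h} \to 0
        \quad \text{and} \quad
        \Pr( A(h) \geq 2 )
        = 1 - e^{-\lambda h} ( 1 + \lambda h )
        = o(h)
        \quad \text{as } h \to 0
\end{equation*}
%
for every finite $\lambda$, however large. Thus, the vanishing
probability is a statement about arbitrarily short intervals, and it
places no restriction on how abundant the prey are.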

\subsection{The Responses}
\label{sec:fa_responses}

\Citet{GGP82} come to the defense of \citet{Cha76} by arguing that
\citet{TL81} have committed mathematical mistakes in their calculations.
Of course, the mistakes of \citeauthor{TL81} are in their understanding
and not in their calculations. In fact, the flawed argument of
\citeauthor{GGP82} itself is the result of a number of their own
mathematical and conceptual mistakes. \Citet{TGS82} defend both
\citet{Cha76} and \citet{Pulliam74}. However, their defense of
\citeauthor{Cha76} primarily supports arguments that are never
introduced by \citet{Cha76} nor \citet{Cha73}. Their defense of
\citet{Pulliam74} is incomplete and commits a number of small
mathematical mistakes. \Citet{Cha81} gives a direct response to
\citeauthor{TL81}. While this latter response is much shorter than the
others, it still manages to make the same major mistake. We discuss the
problems with these responses below. Unfortunately, \citet{GGP82} and
\citet{TGS82} continue to be considered authoritative
\citep[\eg,{}][]{BK95,BW96,PHM90,REH90} despite having little analytical
value. In \longref{sec:oft_short_term_rate}, we present one example
where an entire body of research has its foundation in a
misunderstanding of \citet{TL81} and these inadequate responses to it.

\subsubsection{Fallacy of the Fallacy of the Traffic Policeman}

\Citet{GGP82} use the notation $p_i$ to represent the ``fraction of
patches that are of type $i$,'' which is similar to the $P_i$ introduced
by \citet{Cha76}; however, despite referring to the ``deterministic
character'' \citep[p.~877]{GGP82} of the \citeauthor{Cha76} model, they
introduce random variables for the ``energy gained ($G$) per patch
visited'' and the ``time ($T$) per visit'' and claim that
\longref{eq:deterministic_net_rate} is $\E(G)/\E(T)$
\citep[p.~875]{GGP82}. Of course, the \citeauthor{Cha76} model is
deterministic and the rate in \longref{eq:deterministic_net_rate} is
defined explicitly as the total net gain divided by the total time of an
entire foraging bout, and so \citeauthor{GGP82} are doubly mistaken.
They then call $t$ the ``average travel time between patches'' and state
that \longref{eq:tl_short_rate} represents $\E(G/T)$
\citep[p.~875]{GGP82}. This definition of $t$ is distinct from any of the
previous definitions and implies that travel time between patches must
be a random variable. However, if that is the case, then $\E(G/T)$ does
not equal \longref{eq:tl_short_rate} in general; in fact, to say that
these two are equal commits another kind of fallacy of the
averages\footnote{For a random variable $X$, $\E(1/X) \neq 1/\E(X)$ in
general.}. The series of errors by \citeauthor{GGP82} continues when they claim
that \citeauthor{TL81} call \longref{eq:tl_short_rate} an ``average rate
of energy intake'' \citep[p.~875]{GGP82}; in fact, \citeauthor{TL81}
call this the ``average value'' of the ``rate of energy intake in patch
type $i$'' \citep[pp.~390--391]{TL81}. Finally, \citeauthor{GGP82}
suggest that the formulation of $\E(G/T)$ given by \citeauthor{TL81}
contains a ``conceptual flaw'' \citep[p.~876]{GGP82} that can be fixed
by replacing each $p_i$ with $p_i'$, which is called ``the probability
of finding an animal in or traveling to patch type $i$ at a given
instant'' \citep[p.~877]{GGP82} and defined as
%
\begin{equation*}
        p_i' = \frac{ p_i ( t + T_i ) }
                    { \sum\limits_{j=1}^n p_j ( t + T_j ) }
\end{equation*}
%
This substitution lets $\E(G/T)$ be expressed by
\longref{eq:deterministic_net_rate}. However, $p_i'$ is not a
probability. If it were such a probability then it should be used in the
expression for $\E(G)/\E(T)$ as well. In that case, $\E(G)/\E(T)$ would
\emph{not} match \longref{eq:deterministic_net_rate} as it must in their
argument. Therefore, \citeauthor{GGP82} present a confusing,
contradictory, and counter-productive defense of \citeauthor{Cha76}.
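
The algebra behind this substitution can be checked directly. Assuming
that the fractions $p_i$ sum to one, replacing $p_i$ with $p_i'$ in
\longref{eq:tl_short_rate} gives
%
\begin{equation*}
        \sum\limits_{i=1}^n p_i' \frac{ g_i(T_i) - t E_T }{ t + T_i }
        =
        \frac
        { \sum\limits_{i=1}^n p_i ( g_i(T_i) - t E_T ) }
        { \sum\limits_{j=1}^n p_j ( t + T_j ) }
        =
        \frac
        { \sum\limits_{i=1}^n p_i g_i(T_i) - t E_T }
        { t + \sum\limits_{j=1}^n p_j T_j }
\end{equation*}
%
because each factor $( t + T_i )$ cancels. The result has the form of
\longref{eq:deterministic_net_rate} with $P_i$ replaced by $p_i$.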

\subsubsection{Fallacy of the Fallacy of the Fallacy of the Averages}

In \citet{TGS82}, it is correctly noted that \citet{Cha76} states that
the quantity to be maximized is the net rate of energy intake ``during a
foraging bout.'' However, much of the rest of their argument concerns
stochastic arguments that make no sense for the deterministic model of
\citeauthor{Cha76}. While it is possible that they mean to refer to
\citet{Cha73}, the particular stochastic limits used by
\citeauthor{TGS82} are never discussed by \citeauthor{Cha73}.
Additionally, it makes little sense to claim that \citeauthor{Cha73}
answers \citeauthor{TL81} since \citeauthor{TL81} use these same
arguments in their criticism of \citet{Pulliam74}. \Citeauthor{TGS82}
also reference \citet{GGP82} as giving ``the correct analysis of the
(short-term) average rate of energy intake'' \citep[p.~883]{TGS82} which
has already been shown to also be nonsensical, and so \citeauthor{TGS82}
inherit all of the previous objections to the argument of
\citeauthor{GGP82}.

\Citeauthor{TGS82} defend \citet{Pulliam74} for the same reasons that we
do. In particular, they point out that \citeauthor{TL81} misunderstand
the mathematical description of a continuous-time Poisson counting
process. However, \citeauthor{TGS82} introduce a quantity analogous to
our $\oft{T}^N$ and state that $\E(N/\oft{T}^N)$ converges to
$1/\E(\oft{T}_1)$ as $N \to \infty$ in general. As shown by
\citet{JohnsMiller63}, this is only true when there exists some $K \in
\N$ with $\E(1/\oft{T}^K) < \infty$. \Citeauthor{TGS82} continue this
discussion of convergence by giving the Taylor series expansion
%
\begin{equation*}
        \E\left( \frac{N}{\oft{T}^N} \right)
        \approx 
        \frac{1}{\E(\oft{T}_1)} 
        + 
        \frac{\var(\oft{T}_1)}{N \E(\oft{T}_1)^3}
\end{equation*}
%
While this is certainly a valid expansion, they claim that it is the
result of expanding $1/\oft{T}^N$ about $\E(\oft{T}_1)$. In fact, the
expansion given above is the result of expanding $1/\oft{T}^N$ about
$\E(\oft{T}^N)$ (\ie, $N \E(\oft{T}_1)$). These two expansions will only
be similar for small $N$. The purpose of this expansion is to justify
approximating $\E( N/\oft{T}^N )$ by $1/\E(\oft{T}_1)$ as the
approximation error will be small provided $N$ is sufficiently large.
However, as there is no similar Taylor series expansion for
$\E(\oft{G}^N/\oft{T}^N)$ because $\oft{G}^N$ and $\oft{T}^N$ will
covary in general\footnote{However, the covariance of these two random
variables should decrease quickly with increasing $N$.}, this is not a
helpful argument for justifying the fallacy of the averages. 
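
For reference, the expansion of $\E( N/\oft{T}^N )$ given above can be
recovered with a second-order sketch. Assume that the cycle times are
independent and identically distributed with finite variance, so that
$\E(\oft{T}^N) = N \E(\oft{T}_1)$ and
$\var(\oft{T}^N) = N \var(\oft{T}_1)$, and write $m$ (a shorthand
introduced only here) for $N \E(\oft{T}_1)$. Expanding about $m$,
%
\begin{equation*}
        \frac{1}{\oft{T}^N}
        \approx
        \frac{1}{m}
        - \frac{ \oft{T}^N - m }{ m^2 }
        + \frac{ ( \oft{T}^N - m )^2 }{ m^3 }
\end{equation*}
%
Taking expectations removes the linear term and leaves
%
\begin{equation*}
        \E\left( \frac{1}{\oft{T}^N} \right)
        \approx
        \frac{1}{m} + \frac{ \var(\oft{T}^N) }{ m^3 }
        =
        \frac{1}{N \E(\oft{T}_1)}
        +
        \frac{ \var(\oft{T}_1) }{ N^2 \E(\oft{T}_1)^3 }
\end{equation*}
%
Multiplying both sides by $N$ yields the two terms shown above.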

\subsubsection{A Direct Response}

Interestingly, a direct answer to \citet{TL81} is given by \citet{Cha81}
as a single paragraph. It states that \citeauthor{TL81} ``have chosen to
ignore the fact that cumulative renewal theory \emph{was} the basis for
using'' \longref{eq:tl_short_rate} as an average rate of energy intake.
It states that this ``is clearly stated in'' \citet{Cha76} and ``well
developed'' by \citet{Cha73}. However, as already discussed, the former
presents a ``completely deterministic'' model. Additionally, the latter
only uses \longref{eq:tl_short_rate} when describing deterministic
foraging models. Its treatment of stochastic models abandons the $P_i$
notation for split Poisson process probabilities $\lambda_i / \lambda$.
That is, the cumulative renewal theory results by \citet{Cha73} use
notation similar to what is used in our stochastic foraging model.
Thus, even \citet{Cha81} fails to provide an adequate response to
\citet{TL81}.

\section{The Short-Term Rate}
\label{sec:oft_short_term_rate}

\todo{Some biologists have argued from empirical data that $\E(G/T)$
should be used instead of $\E(G)/\E(T)$. I show that the empirical data
does NOT imply this and that their expressions for $\E(G/T)$ are
incorrect. I need to include these results here.}

%then this expression would be similar to our $\E( \oft{G}_1 /
%\oft{T}_1 )$, which has commonly come to be known as the
%\emph{short-term rate} \citep{REH90,Real91,TGS82}, the \emph{per-patch
%rate} \citep{SK86}, or the \emph{\acro{EoR}{expectation of rates}}
%\citep{BK96,BW96}. However, there is nothing by \citet{Cha73} nor
%\citet{Cha76} indicating that \citeauthor{Cha76} intended on maximizing
%this short-term rate. So, again, the arguments of \citeauthor{TL81} are
%simply due to a misreading of \citet{Cha73} and \citet{Cha76}.

\section{On-line Estimation of Long-Term Rate}
\label{sec:oft_est_long_term_rate}

\todo{There is some confusion in the literature on how to estimate
long-term rate on-line. Few realize that the RATE of POINT GAIN is not
what is actually being actively estimated and maintained but rather the
ENCOUNTER RATE. In this section, I show two methods of estimating the
encounter rate and give stochastic results that show that both
estimation techniques will converge to the actual encounter rate with
probability 1. This section actually relates to
\longref{sec:oft_short_term_rate} because much of the discussion of
estimation of rates in the literature has been done by advocates of the
short-term rate approach.}

The extensive use of the expression $\E(\oft{G}_1)/\E(\oft{T}_1)$ in
foraging literature without reference to what it represents has led to
some strange conclusions. 

Observational evidence \citep[\eg,][]{BK95,HR87} has suggested that some
foragers behave in a way that optimizes
%
\begin{align}
        \E\left(\frac{\oft{G}_1}{\oft{T}_1}\right)
        \label{eq:oft_payoff_EoR}
\end{align}
%
By itself, it is not clear how the maximization of this expression would
capture any important measures of Darwinian fitness \citep{GGP82,TGS82}.

However,  
$\E(\oft{G}_1)/\E(\oft{T}_1)$ is not necessarily


Whereas the long-term rate of gain seems to encapsulate important
measures of Darwinian fitness, \longref{eq:oft_payoff_EoR} alone does
not appear to measure  

\citep{GGP82,TGS82}

We stress that the expression
$\E(\oft{G}_1)/\E(\oft{T}_1)$ is meant to be an
analytical tool for understanding the impact of the structure of more
fundamental stochastic rates $\oft{G}/\oft{T}$ and $\oft{G}/t$ on
behavior. If an agent is designed to monitor its long-term rate of gain
as part of an algorithm to maximize its performance in an unknown
environment, it need not estimate the expectations
$\E(\oft{G}_1)$ and $\E(\oft{T}_1)$. Instead, it only
needs to calculate $\oft{G}/\oft{T}$ or $\oft{G}/t$. The former
expression requires storage of two sums which can be updated at each new
encounter and the calculation of a quotient of those sums. The latter
expression requires storage of a single sum which can be updated at each
new encounter and the calculation of the quotient of that sum with the
agent's runtime, which may be inferred from internal (\eg, a clock) or
external (\eg, the sun) cues. Thus, the latter expression in particular
requires relatively low computational burden.
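
A minimal sketch of this bookkeeping follows. It writes $\oft{G}^N$ and
$\oft{T}^N$ for the two running sums after $N$ encounters, assumes
$\oft{T}^N > 0$, and uses $\hat{\gamma}_N$ (a symbol introduced only for
this illustration) for the resulting estimate $\oft{G}^N/\oft{T}^N$:
%
\begin{align*}
        \oft{G}^{N+1} &= \oft{G}^N + \oft{G}_{N+1}\\
        \oft{T}^{N+1} &= \oft{T}^N + \oft{T}_{N+1}\\
        \hat{\gamma}_{N+1}
        &=
        \frac{ \oft{G}^{N+1} }{ \oft{T}^{N+1} }
        =
        \hat{\gamma}_N
        +
        \frac{ \oft{G}_{N+1} - \hat{\gamma}_N \oft{T}_{N+1} }
             { \oft{T}^{N+1} }
\end{align*}
%
So only the current sums (or, equivalently, the current estimate and the
total time) need to be stored. The correction term is the gain from the
newest cycle less the time of that cycle valued at the current rate
estimate.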

We do not wish to imply that it is necessary for foragers to have the
ability to calculate their long-term rate of gain on-line. If the
long-term rate of gain truly is a surrogate for Darwinian fitness,
agents that have heritable behaviors that happen to maximize long-term
rate of gain will be favored by natural selection. That is, being able
to perform calculations about the environment in order to adapt to it
will provide no survival advantage to foragers in a relatively static
environment. Thus, in these static environments, classical \ac{OFT} does
not predict \emph{how} foragers are making decisions but rather
\emph{why} certain decisions have been favored by the natural selection
accompanying a certain environment. However, there is some observational
evidence \citep[\eg,][]{BK95,HR87} that deviates from the expected
optimum behaviors predicted by classical \ac{OFT} (\ie, behaviors that
do not maximize long-term rate of gain). This has motivated some
behavioral ecologists to suggest that agents do perform quantitative
calculations internally \citep[\eg,][]{BK95,BK96,BW96,REH90}. These
ecologists claim that limitations in memory or computational ability
prevent some foragers from estimating $\E(\oft{G}_1)$ and
$\E(\oft{T}_1)$ accurately and so those foragers either use bad
estimates of long-term rate of gain

Observational evidence \citep[\eg,][]{BK95,HR87} has suggested that some
foragers have behaviors that happen to optimize
%
\begin{align}
        \E\left(\frac{\oft{G}_1}{\oft{T}_1}\right)
        \label{eq:oft_payoff_EoR_old}
\end{align}
%
In an effort to explain
these deviations from classical \ac{OFT} (\ie, optimizing
$\E(\oft{G}_1)/\E(\oft{T}_1)$), a number of papers \citep[in
particular,][]{BK95,BK96,BW96,REH90} suggest that foragers actively
evaluate performance in real-time by calculating payoff function
estimates internally and adjust behavior in order to maximize those
internal metrics. From this assumption, the papers conclude that if
$\E(\oft{G}_1)/\E(\oft{T}_1)$ is too complex to calculate, the foragers
will either use a very bad estimate of $\E(\oft{G}_1)/\E(\oft{T}_1)$
that leads to suboptimal behavior or will pick some other payoff
function that is simple to calculate and yet also leads to beneficial
behaviors. \Citet{REH90} states that there ``may not be a single
currency that is most appropriate for all biological systems'' and that
``there may be a most appropriate method for processing information,
conditional upon the organism's memory capabilities.'' The papers claim
that $\E(\oft{G}_1)/\E(\oft{T}_1)$ would be estimated by storing in
memory each individual gain $\oft{G}_{\oft{j}}$ and cycle time
$\oft{T}_{\oft{j}}$ for all $j \in \{1,\dots,N\}$ and calculating the
ratio of sums $\oft{G}/\oft{T}$, where $N$ is the total number of
foraging cycles completed. By \longref{eq:oft_payoff_long_rate},
$\oft{G}/\oft{T}$ will converge with probability 1 to
$\E(\oft{G}_1)/\E(\oft{T}_1)$ as $N$ gets large. However, these papers
suggest that storing those $2N$ values for a large $N$ would be too
complex for certain foragers. Thus, other payoff functions with ``no
obvious functional advantage'' \citep{BK96} are defined that require
less storage and also lead to behaviors that optimize
\longref{eq:oft_payoff_EoR}. We note that $\E(\oft{G}_1)/\E(\oft{T}_1)$
can be estimated without storing in memory a large number of previous
net gains and cycle times.  In fact, if the sums $\oft{G}$ and $\oft{T}$
are kept in memory after $N$ cycles, they can be updated after cycle
$N+1$ by adding $\oft{G}_{N+1}$ and $\oft{T}_{N+1}$, respectively.
Moreover, if only the sum $\oft{G}$ is stored in memory and lifetime $t$ is
inferred from environmental or physiological cues,
\longref{eq:oft_payoff_long_rate} shows that the ratio $\oft{G}/t$ will
also converge to $\E(\oft{G}_1)/\E(\oft{T}_1)$.  Additionally, while it
may be possible that foragers are calculating these rates in real-time,
it seems more likely that natural selection will choose foragers that
have behaviors that happen to maximize $\E(\oft{G}_1)/\E(\oft{T}_1)$ for
their relatively static environment. 

\section{Simultaneous Encounters}
\label{sec:oft_simultaneous_encounters}

\todo{Many authors (especially those mentioned in
\longref{sec:oft_short_term_rate}) use experiments with simultaneous
encounters and claim that the results invalidate the use of the
long-term rate. I show that these, in fact, do not invalidate the use of
the long-term rate. I stress that one of the key assumptions of OFT is
that simultaneous encounters occur with probability 0 and any
experimental design that does not meet this requirement simply does not
apply to OFT. Many authors forget that Poisson encounters require more
than just memorylessness; orderliness is important.}