% Upper-case A B C D E F G H I J K L M N O P Q R S T U V W X Y Z % Lower-case a b c d e f g h i j k l m n o p q r s t u v w x y z % Digits 0 1 2 3 4 5 6 7 8 9 % Exclamation ! Double quote " Hash (number) # % Dollar $ Percent % Ampersand & % Acute accent ' Left paren ( Right paren ) % Asterisk * Plus + Comma , % Minus - Point . Solidus / % Colon : Semicolon ; Less than < % Equals = Greater than > Question mark ? % At @ Left bracket [ Backslash \ % Right bracket ] Circumflex ^ Underscore _ % Grave accent ` Left brace { Vertical bar | % Right brace } Tilde ~ % ---------------------------------------------------------------------| % --------------------------- 72 characters ---------------------------| % ---------------------------------------------------------------------| % % Optimal Foraging Theory Revisited: Chapter 4. Optimization Results % % (c) Copyright 2007 by Theodore P. Pavlic % \chapter{Finite-Lifetime Optimization Results} \label{ch:optimization_results}% \index{solitary agent model!classical analysis!optimization|(}% \index{solitary agent model!processing-only analysis!optimization|(} In \longref{ch:optimization_objectives}, we discussed selecting agent behavior based on the optimization of functions that encapsulated a number of different optimization objectives. Some of these functions traded off objectives by maximizing the ratio of one to the other. Other functions traded off objectives by maximizing a linear combination of objectives. In either case, for many of the statistics defined in \longref{ch:model}, the resulting functions have a special structure in common. In this chapter, we optimize a general value function that also has this structure. We then apply these general results to value functions of interest in our model. The general results are given in \longref{sec:optimization_rational} and the specific results are given in \longref{sec:optimization_specific}. \section{Optimization of a Rational Objective Function} \label{sec:optimization_rational} \index{rational objective function|(} Before we discuss optimization of some of the functions introduced in \longref{ch:optimization_objectives}, we focus on a special general case. This general case may be applied to the optimization functions we have introduced or be used to derive optimal behavior for other novel valuation functions that have this structure. \subsection{The Generalized Problem} Take $n \in \N$ task types. Most statistics defined in \longref{ch:model} and optimization functions described in \longref{ch:optimization_objectives} have a special structure in common. Here, we present a generalized optimization problem that provides solutions to a broad range of problems in this model. \subsubsection{The Decision Variables and Constraints} The decision variables are preference probabilities and processing times. These variables are constrained, so their bounds must be defined as parameters. For each $i \in \{1,2,\dots,n\}$, define upper and lower preference constraint parameters $p_i^-,p_i^+ \in [0,1]$ and upper and lower time constraint parameters $\tau_i^- \in \R_{\geq0}$, and $\tau_i^+ \in \extR_{\geq0}$. 
Collect these constraint parameters into vectors $\v{p}^-,\v{p}^+ \in [0,1]^n$, $\v{\tau}^- \in \R_{\geq0}^n$, and $\v{\tau}^+ \in \extR_{\geq0}^n$ defined by % \begin{align*} \v{p}^- &\triangleq \begin{bmatrix} p_1^-, p_2^-, \dots, p_n^- \end{bmatrix}^\T & \v{\tau}^- &\triangleq \begin{bmatrix} \tau_1^-, \tau_2^-, \dots, \tau_n^- \end{bmatrix}^\T\\ \v{p}^+ &\triangleq \begin{bmatrix} p_1^+, p_2^+, \dots, p_n^+ \end{bmatrix}^\T & \v{\tau}^+ &\triangleq \begin{bmatrix} \tau_1^+, \tau_2^+, \dots, \tau_n^+ \end{bmatrix}^\T \end{align*} % Then, for an arbitrary preference probability vector $\v{p}$ and processing time vector $\v{\tau}$ defined so that % \begin{align*} \v{p} &\triangleq \begin{bmatrix} p_1, p_2, \dots, p_n\end{bmatrix}^\T & \v{\tau} &\triangleq \begin{bmatrix} \tau_1, \tau_2, \dots, \tau_n \end{bmatrix}^\T \end{align*} % it must be that % \begin{equation*} p_i^- \leq p_i \leq p_i^+ \quad \text{ and } \quad \tau_i^- \leq \tau_i \leq \tau_i^+ \end{equation*} % for all $i \in \{1,2,\dots,n\}$. \subsubsection{Generalized Advantage, Disadvantage, and Objective} For each $i \in \{1,2,\dots,n\}$, define the generalized task advantage function $a_i: \R_{\geq0} \cap [\tau_i^-,\tau_i^+] \mapsto \R$ and the generalized task disadvantage function $d_i: \R_{\geq0} \cap [\tau_i^-,\tau_i^+] \mapsto \R$ to be continuously differentiable functions% %\footnote{That is, these functions %are differentiable and have continuous derivatives at every point in %their domain.} . Also define the environment advantage $a \in \R$ and disadvantage $d \in \R$. The total advantage $A$ and total disadvantage $D$ are then defined by % \begin{align*} A(\v{p},\v{\tau}) &\triangleq a + \sum\limits_{i=1}^n p_i a_i(\tau_i) & D(\v{p},\v{\tau}) &\triangleq d + \sum\limits_{i=1}^n p_i d_i(\tau_i) \end{align*} % where $\v{p} \in [0,1]^n$ and $\v{\tau} \in \R_{\geq0}^n$ are arbitrary preference probability and processing time vectors. The generalized objective $J$, the advantage-to-disadvantage ratio, is defined by $J(\v{p},\v{\tau}) \triangleq A(\v{p},\v{\tau})/D(\v{p},\v{\tau})$. \subsubsection{Notation} Take $i \in \{1,2,\dots,n\}$. For the advantage $a_i$ and disadvantage $d_i$, use the notation % \begin{align*} a_i'(\tau_i) &\triangleq \frac{ \total }{ \total \tau_i } a_i(\tau_i) & a_i''(\tau_i) &\triangleq \frac{ \total^2 }{ \total \tau_i^2 } a_i(\tau_i) \\ d_i'(\tau_i) &\triangleq \frac{ \total }{ \total \tau_i } d_i(\tau_i) & d_i''(\tau_i) &\triangleq \frac{ \total^2 }{ \total \tau_i^2 } d_i(\tau_i) \end{align*} % to represent the first and second derivatives of each function evaluated at the point $\tau_i$. \subsection{The Optimization Procedure} \label{sec:optimization_procedure} The goal is to choose preference probabilities and processing times to (locally) maximize $J$. This can be formulated as the constrained minimization problem % \begin{equation*} \begin{split} &\text{minimize} \enskip {-J}\\ &\text{subject to} \enskip {-\tau_i} \leq {-\tau_i^-}, \enskip {\tau_i} \leq \tau_i^+, \enskip {-p_i} \leq {-p_i^-}, \enskip {p_i} \leq p_i^+ \text{ for all } i \in \{1,\dots,n\} \end{split} \end{equation*} % with $4n$ inequality constraints. This problem can be solved using \aimention{Joseph-Louis Lagrange}Lagrange multiplier theory \citep{Bertsekas95}.
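Before developing these analytical conditions, note that the generalized problem is simply a box-constrained maximization of $J$ and so can also be explored numerically. The following is a minimal sketch (assuming NumPy and SciPy are available) in which the environment terms and the task advantage and disadvantage functions are purely hypothetical placeholders; it illustrates the problem statement, not the analytical procedure developed below.
%
\begin{verbatim}
# Minimal numerical sketch of the generalized problem; all task
# functions and parameters here are hypothetical placeholders.
import numpy as np
from scipy.optimize import minimize

n = 2
a_env, d_env = -0.1, 1.0                      # environment advantage a, disadvantage d
a_funcs = [lambda t: 2.0 * (1 - np.exp(-t)),  # hypothetical a_i(tau_i)
           lambda t: 1.5 * (1 - np.exp(-2 * t))]
d_funcs = [lambda t: t, lambda t: t]          # hypothetical d_i(tau_i)

def J(x):
    p, tau = x[:n], x[n:]
    A = a_env + sum(p[i] * a_funcs[i](tau[i]) for i in range(n))
    D = d_env + sum(p[i] * d_funcs[i](tau[i]) for i in range(n))
    return A / D

# Box constraints p_i^- <= p_i <= p_i^+ and tau_i^- <= tau_i <= tau_i^+
bounds = [(0.0, 1.0)] * n + [(0.0, 10.0)] * n
x0 = np.array([0.5, 0.5, 1.0, 1.0])
result = minimize(lambda x: -J(x), x0, bounds=bounds)  # maximize J
print(result.x, -result.fun)
\end{verbatim}
%
The remainder of this section characterizes such maxima analytically.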
Define the \aimention{Joseph-Louis Lagrange}Lagrangian $L$ by % \begin{equation*} L \triangleq -J - \v{\mu}_-^\T ( \v{p} - \v{p}^- ) + \v{\mu}_+^\T ( \v{p} - \v{p}^+ ) - \v{\nu}_-^\T ( \v{\tau} - \v{\tau}^- ) + \v{\nu}_+^\T ( \v{\tau} - \v{\tau}^+ ) \end{equation*} % where $\v{\mu}_-,\v{\mu}_+,\v{\nu}_-,\v{\nu}_+ \in \R_{\geq0}^n$ are vectors of \aimention{Joseph-Louis Lagrange}Lagrange multipliers, denoted % \begin{align*} \v{\mu}_- &\triangleq \begin{bmatrix} \mu_{1-} & \mu_{2-} & \dots & \mu_{n-} \end{bmatrix}^\T & \v{\mu}_+ &\triangleq \begin{bmatrix} \mu_{1+} & \mu_{2+} & \dots & \mu_{n+} \end{bmatrix}^\T\\ \v{\nu}_- &\triangleq \begin{bmatrix} \nu_{1-} & \nu_{2-} & \dots & \nu_{n-} \end{bmatrix}^\T & \v{\nu}_+ &\triangleq \begin{bmatrix} \nu_{1+} & \nu_{2+} & \dots & \nu_{n+} \end{bmatrix}^\T \end{align*} % For ease of notation, we use the symbol $\v{m}^*$ to represent a collection of one of each of these four \aimention{Joseph-Louis Lagrange}Lagrange multiplier vectors. That is, $\v{m}^* \in (\R_{\geq0}^n)^4$ with $\v{m}^* \triangleq (\v{\mu}_-^*,\v{\mu}_+^*,\v{\nu}_-^*,\v{\nu}_+^*)$. Next, denote the feasible set $\set{F}$ of decision variables by % \begin{equation*} \set{F} \triangleq \left\{ (\v{p},\v{\tau}) \in [0,1]^n \times \R_{\geq0}^n : p_i^- \leq p_i \leq p_i^+, \tau_i^- \leq \tau_i \leq \tau_i^+, i \in \{1,2,\dots,n\} \right\} %\label{eq:feasible_set} \end{equation*} % Also, for each $(\v{p}^*,\v{\tau}^*) \in \set{F}$, define the sets of active inequality constraints\footnote{An \emph{active} inequality constraint is a constraint that holds only by equality. For example, the constraint $x \geq 1$ is active for $x = 1$ and \emph{inactive} for $x > 1$.} $\set{A}_p^-(\v{p}^*)$, $\set{A}_p^+(\v{p}^*)$, $\set{A}_\tau^-( \v{\tau}^* )$, $\set{A}_\tau^+( \v{\tau}^* )$ by % \begin{align*} \set{A}_p^-(\v{p}^*) &\triangleq \left\{ i \in \{1,2,\dots,n\} : p^*_i = p_i^- \right\} & \set{A}_\tau^-(\v{\tau}^*) &\triangleq \left\{ i \in \{1,2,\dots,n\} : \tau^*_i = \tau_i^- \right\}\\ \set{A}_p^+(\v{p}^*) &\triangleq \left\{ i \in \{1,2,\dots,n\} : p^*_i = p_i^+ \right\} & \set{A}_\tau^+(\v{\tau}^*) &\triangleq \left\{ i \in \{1,2,\dots,n\} : \tau^*_i = \tau_i^+ \right\} \end{align*} % For any $i \in \{1,2,\dots,n\}$, if $p_i^+=p_i^-$ ($\tau_i^+=\tau_i^-$), then the inequality \aimention{Joseph-Louis Lagrange}Lagrange multipliers $\mu_{i+}$ and $\mu_{i-}$ ($\nu_{i+}$ and $\nu_{i-}$) combine to form an equality \aimention{Joseph-Louis Lagrange}Lagrange multiplier $\mu_{i+} - \mu_{i-} \in \R$ ($\nu_{i+} - \nu_{i-} \in \R$). Therefore, it is clear that all points $(\v{p},\v{\tau}) \in \set{F}$ are regular\footnote{In this context, a \emph{regular} point is a point where all active constraint gradients are linearly independent.}.
Finally, for any point $(\v{p}^*,\v{\tau}^*)$, define the \emph{feasible variations} $\set{V}(\v{p}^*,\v{\tau}^*)$ by % \begin{equation*} \set{V}(\v{p}^*,\v{\tau}^*) \triangleq \left\{ \begin{bmatrix} \delta^p_1\\ \delta^p_2\\ \vdots\\ \delta^p_n\\ \delta^\tau_1\\ \delta^\tau_2\\ \vdots\\ \delta^\tau_n \end{bmatrix} \in \R^{2n} : \begin{aligned} \delta^p_i &= 0,& i &\in \set{A}_p^-(\v{p}^*) \cup \set{A}_p^+(\v{p}^*) ,\\ \delta^\tau_j &= 0,& j &\in \set{A}_\tau^-(\v{\tau}^*) \cup \set{A}_\tau^+(\v{\tau}^*) \end{aligned} \right\} \end{equation*} % We also define the gradient operator $\nabla$ and the \aimention{Ludwig Otto Hesse}Hessian operator $\nabla^2$ by % \begin{align*} \nabla &\triangleq \begin{bmatrix} \frac{\partial}{\partial p_1}, \frac{\partial}{\partial p_2}, \dots, \frac{\partial}{\partial p_n}, \frac{\partial}{\partial \tau_1}, \frac{\partial}{\partial \tau_2}, \dots, \frac{\partial}{\partial \tau_n} \end{bmatrix}^\T & \nabla^2 &\triangleq \nabla \nabla^\T \end{align*} % so that we have the gradient $\nabla L$ and the \aimention{Ludwig Otto Hesse}Hessian $\nabla^2 L$. When these are to be evaluated at a point $(\v{p}^*,\v{\tau}^*) \in \set{F}$ with multipliers $\v{m}^* \in (\R_{\geq0}^n)^4$, we use the notation $\nabla L(\v{p}^*,\v{\tau}^*,\v{m}^*)$ and $\nabla^2 L(\v{p}^*,\v{\tau}^*,\v{m}^*)$, respectively. Because the \aimention{Joseph-Louis Lagrange}Lagrangian has continuous second partial derivatives wherever the \aimention{Ludwig Otto Hesse}Hessian is used, its \aimention{Ludwig Otto Hesse}Hessian matrix will be symmetric% %\footnote{To %say that matrix $\mat{A}$ is symmetric means $\mat{A}^T = \mat{A}$.} . \subsubsection{First-Order Necessary Conditions} Assume that the point $(\v{p}^*,\v{\tau}^*) \in \set{F}$ is a local maximum of the objective function. For convenience, use the notation % \begin{equation} J^* \triangleq J(\v{p}^*,\v{\tau}^*) \quad A^* \triangleq A(\v{p}^*,\v{\tau}^*) \quad D^* \triangleq D(\v{p}^*,\v{\tau}^*) \label{eq:JAD_star_notation} \end{equation} % In order for $J^*$ to be well-defined, it must be assumed that $D^*$ is nonzero\footnote{While $J^*$ is not defined for $D^* = 0$, any case where $A^* > 0$ and $D^* = 0$ is certainly desirable.}. It is necessary that there exist \aimention{Joseph-Louis Lagrange}Lagrange multiplier vectors $\v{m}^* \in (\R_{\geq0}^n)^4$ such that % \begin{equation} \nabla L(\v{p}^*,\v{\tau}^*,\v{m}^*) = 0 \label{eq:first_order_gradient} \end{equation} % and for all $i \in \{1,2,\dots,n\}$, % \begin{subequations} \begin{gather} i \notin \set{A}_p^-(\v{p}^*) \implies \mu^*_{i-} = 0 \qquad \text{ and } \qquad i \in \set{A}_p^-(\v{p}^*) \implies \mu^*_{i-} \geq 0 \label{eq:first_order_multipliers_pminus}\\ i \notin \set{A}_p^+(\v{p}^*) \implies \mu^*_{i+} = 0 \qquad \text{ and } \qquad i \in \set{A}_p^+(\v{p}^*) \implies \mu^*_{i+} \geq 0 \label{eq:first_order_multipliers_pplus}\\ i \notin \set{A}_\tau^-(\v{\tau}^*) \implies \nu^*_{i-} = 0 \qquad \text{ and } \qquad i \in \set{A}_\tau^-(\v{\tau}^*) \implies \nu^*_{i-} \geq 0 \label{eq:first_order_multipliers_tauminus}\\ i \notin \set{A}_\tau^+(\v{\tau}^*) \implies \nu^*_{i+} = 0 \qquad \text{ and } \qquad i \in \set{A}_\tau^+(\v{\tau}^*) \implies \nu^*_{i+} \geq 0 \label{eq:first_order_multipliers_tauplus} \end{gather} \end{subequations} % where $\implies$ denotes logical implication% %\footnote{That is, if $A %\implies B$, then assertion of $B$ is necessary for the assertion of %$A$ and assertion of $A$ is sufficient to conclude assertion of $B$.
If %a statement is both necessary and sufficient for another statement, the %two statements are said to be \emph{logically equivalent}, which is %denoted with $\iff$; that is, $A \iff B$ is the statement that assertion %of $A$ occurs \emph{if and only if} assertion of $B$ also occurs.} . That is, all inequality multipliers are nonnegative; however, multipliers associated with inactive constraints are zero. Take $j \in \{1,2,\dots,n\}$. If $p^-_j = p^+_j$, then $p^*_j = p^-_j = p^+_j$. Similarly, if $\tau^-_j = \tau^+_j$, then $\tau^*_j = \tau^-_j = \tau^+_j$. We avoid these trivial cases by assuming that $p^-_j \neq p^+_j$ and $\tau^-_j \neq \tau^+_j$. Of course, if $\tau^+_j = \infty$, then it is impossible for $\tau^*_j = \tau^+_j$. \paragraph{Preference Probabilities:} First, consider the requirements on the preference probabilities. \longref{eq:first_order_gradient} requires that % \begin{equation*} \frac { D^* a_j(\tau^*_j) - A^* d_j(\tau^*_j) } { (D^*)^2 } = \mu_{j+}^* - \mu_{j-}^* \end{equation*} % There are three cases of interest. % \begin{subequations} \begin{enumerate}[{Special Case }1:] \item[$p^*_j \in (p^-_j,p^+_j)$:] By \longrefs{eq:first_order_multipliers_pminus} and \shortref{eq:first_order_multipliers_pplus}, $\mu_{j-}^* = \mu_{j+}^* = 0$. Therefore, % \begin{equation} D^* a_j(\tau^*_j) = A^* d_j(\tau^*_j) \label{eq:pj_first_order_unconst} \end{equation} \item[$p^*_j = p^-_j$:] By \longref{eq:first_order_multipliers_pminus}, $\mu_{j-}^* \geq 0$ and $\mu_{j+}^* = 0$. Therefore, % \begin{equation} D^* a_j(\tau^*_j) \leq A^* d_j(\tau^*_j) \label{eq:pj_first_order_minconst} \end{equation} \item[$p^*_j = p^+_j$:] By \longref{eq:first_order_multipliers_pplus}, $\mu_{j-}^* = 0$ and $\mu_{j+}^* \geq 0$. Therefore, % \begin{equation} D^* a_j(\tau^*_j) \geq A^* d_j(\tau^*_j) \label{eq:pj_first_order_maxconst} \end{equation} \end{enumerate} \end{subequations} % For the minimum constraint to be active, the partial derivative of $J$ at the constraint must be nonpositive. Similarly, for the maximum constraint to be active, the partial derivative of $J$ at the constraint must be nonnegative. Otherwise, the partial derivative of $J$ must be zero. Additionally, if the minimum and maximum constraints are equal, there is no restriction on the partial derivative of $J$ at that point. All of these conditions are intuitive and can be explained graphically. \paragraph{Processing Times:} Next, consider the requirements on the processing times. \longref{eq:first_order_gradient} requires that % \begin{equation*} \frac { D^* p^*_j a_j'(\tau^*_j) - A^* p^*_j d_j'(\tau^*_j) } { (D^*)^2 } = \nu_{j+}^* - \nu_{j-}^* \end{equation*} % There are three cases of interest. % \begin{subequations} \begin{enumerate}[{Special Case }1:] \item[$\tau^*_j \in (\tau^-_j,\tau^+_j)$:] By \longrefs{eq:first_order_multipliers_tauminus} and \shortref{eq:first_order_multipliers_tauplus}, $\nu_{j-}^* = \nu_{j+}^* = 0$. Therefore, % \begin{equation} D^* p^*_j a_j'(\tau^*_j) = A^* p^*_j d_j'(\tau^*_j) \label{eq:tauj_first_order_unconst} \end{equation} \item[$\tau^*_j = \tau^-_j$:] By \longref{eq:first_order_multipliers_tauminus}, $\nu_{j-}^* \geq 0$ and $\nu_{j+}^* = 0$. Therefore, % \begin{equation} D^* p^*_j a_j'(\tau^*_j) \leq A^* p^*_j d_j'(\tau^*_j) \label{eq:tauj_first_order_minconst} \end{equation} \item[$\tau^*_j = \tau^+_j$:] By \longref{eq:first_order_multipliers_tauplus}, $\nu_{j-}^* = 0$ and $\nu_{j+}^* \geq 0$.
Therefore, % \begin{equation} D^* p^*_j a_j'(\tau^*_j) \geq A^* p^*_j d_j'(\tau^*_j) \label{eq:tauj_first_order_maxconst} \end{equation} \end{enumerate} \end{subequations} % Clearly, the same interpretation applies here as applied for the requirements on optimal preference probabilities. \subsubsection{Second-Order Necessary Conditions} Once more, assume that the point $(\v{p}^*,\v{\tau}^*) \in \set{F}$ is a local maximum of the objective function and use the notation in \longref{eq:JAD_star_notation}. We also use the notation % \begin{equation*} J^*_{xy} \triangleq \left. \frac{\partial^2 J}{ \partial x \partial y } \right|_{(\v{p},\v{\tau})=(\v{p}^*,\v{\tau}^*)} \quad A^*_{xy} \triangleq \left. \frac{\partial^2 A}{ \partial x \partial y } \right|_{(\v{p},\v{\tau})=(\v{p}^*,\v{\tau}^*)} \quad D^*_{xy} \triangleq \left. \frac{\partial^2 D}{ \partial x \partial y } \right|_{(\v{p},\v{\tau})=(\v{p}^*,\v{\tau}^*)} \end{equation*} % Again, $D^*$ must be assumed to be nonzero. We also assume that the functions $a_i$ and $d_i$ are twice continuously differentiable\footnote{That is, the derivatives at each point in their domain are themselves continuously differentiable.} functions for all $i \in \{1,2,\dots,n\}$. It is necessary that there exist \aimention{Joseph-Louis Lagrange}Lagrange multiplier vectors $\v{m}^* \in (\R_{\geq0}^n)^4$ such that the first-order necessary conditions hold and % \begin{equation} \v{\delta}^\T \nabla^2 L(\v{p}^*,\v{\tau}^*,\v{m}^*) \v{\delta} \geq 0 \quad \text{ for all } \quad \v{\delta} \in \set{V}(\v{p}^*,\v{\tau}^*) \setdiff \{0\} \label{eq:second_order_hessian} \end{equation} % That is, at the point $(\v{p}^*,\v{\tau}^*)$, the \aimention{Ludwig Otto Hesse}Hessian of the \aimention{Joseph-Louis Lagrange}Lagrangian must be positive semidefinite over the set of feasible variations at that point. The \aimention{Ludwig Otto Hesse}Hessian $\nabla^2 L(\v{p}^*,\v{\tau}^*,\v{m}^*)$ does not depend upon the multipliers $\v{m}^*$, and so it is completely characterized by $J^*_{p_jp_k}$, $J^*_{\tau_j\tau_k}$, and $J^*_{p_j\tau_k}$ for all $j,k \in \{1,2,\dots,n\}$. Therefore, take $j,k \in \{1,2,\dots,n\}$. \paragraph{Elimination of Active Preference Probability Constraints:} First, assume that $j \in \set{A}_p^-(\v{p}^*) \cup \set{A}_p^+(\v{p}^*)$. That is, assume that an inequality constraint on the $j$\th{}\ preference probability is active (\ie, $p^*_j = p^-_j$ or $p^*_j = p^+_j$). In this case, for all $\v{\delta} \in \set{V}(\v{p}^*,\v{\tau}^*)$, $\delta^p_j = 0$. Therefore, because the feasible variations along active constraint directions are zero, $J^*_{p_jp_k}$ and $J^*_{p_j\tau_k}$ will have no impact on \longref{eq:second_order_hessian}. \paragraph{Elimination of Active Processing Time Constraints:} Next, instead assume that $j \in \set{A}_\tau^-(\v{\tau}^*) \cup \set{A}_\tau^+(\v{\tau}^*)$. That is, assume that an inequality constraint on the $j$\th{}\ processing time is active (\ie, $\tau^*_j = \tau^-_j$ or $\tau^*_j = \tau^+_j$). In this case, for all $\v{\delta} \in \set{V}(\v{p}^*,\v{\tau}^*)$, $\delta^\tau_j = 0$. Therefore, because the feasible variations along active constraint directions are zero, $J^*_{p_k\tau_j}$ and $J^*_{\tau_j\tau_k}$ will have no impact on \longref{eq:second_order_hessian}. 
\paragraph{Elimination of Off-Diagonal Terms:} By the reasoning about active constraints above, we can focus on coordinates of $(\v{p}^*,\v{\tau}^*)$ where constraints are inactive, and so we assume \longrefs{eq:pj_first_order_unconst} and \shortref{eq:tauj_first_order_unconst}. Therefore, % \begin{equation} J^*_{p_jp_k} = \frac { D^* A^*_{p_jp_k} - A^* D^*_{p_jp_k} } { (D^*)^2 } \quad \text{ and } \quad J^*_{p_j\tau_k} = \frac { D^* A^*_{p_j\tau_k} - A^* D^*_{p_j\tau_k} } { (D^*)^2 } \label{eq:second_derivative_J_pp_ptau} \end{equation} % and % \begin{equation} J^*_{\tau_j\tau_k} = \frac { D^* A^*_{\tau_j\tau_k} - A^* D^*_{\tau_j\tau_k} } { (D^*)^2 } \label{eq:second_derivative_J_tautau} \end{equation} % For the moment, we focus on the off-diagonal terms of the \aimention{Ludwig Otto Hesse}Hessian that correspond to inactive constraints. First, assume that $j \neq k$. Clearly, % \begin{equation*} A^*_{p_jp_k} = D^*_{p_jp_k} = A^*_{\tau_j\tau_k} = D^*_{\tau_j\tau_k} = A^*_{p_j\tau_k} = D^*_{p_j\tau_k} = 0 \end{equation*} % Thus, % \begin{equation*} J^*_{p_jp_k} = J^*_{\tau_j\tau_k} = J^*_{p_j\tau_k} = 0 \end{equation*} % Now we focus on the remaining off-diagonal terms. That is, take $j=k$. So, % \begin{equation*} J^*_{p_j\tau_j} = \frac { D^* a'_j(\tau^*_j) - A^* d'_j(\tau^*_j) } { (D^*)^2 } \end{equation*} % Recall that we are taking $j \notin \set{A}_p^-(\v{p}^*) \cup \set{A}_p^+(\v{p}^*)$ (\ie, the $j\th$ preference probability is unconstrained, so $p^*_j \in (p^-_j,p^+_j)$). Therefore, $p^*_j > 0$ and so \longref{eq:tauj_first_order_unconst} implies that $a'_j(\tau^*_j) = J^* d'_j(\tau^*_j)$. However, $D^* J^* = A^*$. Thus, by substitution, it is clear that $J^*_{p_j\tau_j} = 0$. Hence, the mixed terms $J^*_{p_j\tau_k}$ have no impact on \longref{eq:second_order_hessian} for any $j,k \in \{1,2,\dots,n\}$, and neither do $J^*_{p_jp_k}$ and $J^*_{\tau_j\tau_k}$ whenever $j \neq k$. \paragraph{Impact of Inactive Preference Probability Diagonals:} Next, we consider the diagonal terms of the \aimention{Ludwig Otto Hesse}Hessian that correspond to inactive preference probabilities. That is, assume that $j=k$ and $p^*_j \in (p_j^-,p_j^+)$. The condition in \longref{eq:second_order_hessian} requires that $J^*_{p_jp_j} \leq 0$. By \longref{eq:second_derivative_J_pp_ptau}, this means that % \begin{equation} D^* \times 0 \leq A^* \times 0 \label{eq:pj_second_order_necessary} \end{equation} % which is always true (\ie, it is always the case that $0 \leq 0$ with equality). Therefore, this necessary condition adds no more information than \longref{eq:pj_first_order_unconst}. \paragraph{Definiteness from Inactive Processing Time Diagonals:} By the reasoning above, the only second partial derivative that can prevent \longref{eq:second_order_hessian} from being true is $J^*_{\tau_j\tau_j}$ where $\tau^*_j \in (\tau_j^-,\tau_j^+)$. That is, the condition in \longref{eq:second_order_hessian} requires that $J^*_{\tau_j\tau_j} \leq 0$. By \longref{eq:second_derivative_J_tautau}, this means that % \begin{equation} D^* p^*_j a''_j(\tau^*_j) \leq A^* p^*_j d''_j(\tau^*_j) \label{eq:tauj_second_order_necessary} \end{equation} % If the constraint parameter $p^-_j = 0$ and the $j\th$ preference probability constraint is active (\ie, $p^*_j = 0$), then this condition is always true by equality. Otherwise, if $p^*_j > 0$, it must be that $D^* a''_j(\tau^*_j) \leq A^* d''_j(\tau^*_j)$.
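These interior conditions are easy to check numerically at a candidate point. The following is a minimal sketch (assuming NumPy is available) in which the task functions and the candidate point are purely hypothetical; it approximates the stationarity residual $D^* a_j'(\tau^*_j) - A^* d_j'(\tau^*_j)$ and the curvature quantity $D^* a_j''(\tau^*_j) - A^* d_j''(\tau^*_j)$ by finite differences.
%
\begin{verbatim}
# Sketch: finite-difference check of the interior first- and second-order
# conditions for a single task type j; all values are hypothetical.
import numpy as np

a_env, d_env = -0.1, 1.0                 # hypothetical environment terms a and d
a_j = lambda t: 2.0 * (1.0 - np.exp(-t)) # hypothetical a_j(tau_j)
d_j = lambda t: t                        # hypothetical d_j(tau_j)

def d1(f, t, h=1e-5):                    # central first difference
    return (f(t + h) - f(t - h)) / (2.0 * h)

def d2(f, t, h=1e-4):                    # central second difference
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2

p_star, tau_star = 1.0, 1.2              # hypothetical candidate (interior tau)
A_star = a_env + p_star * a_j(tau_star)
D_star = d_env + p_star * d_j(tau_star)

stationarity = D_star * d1(a_j, tau_star) - A_star * d1(d_j, tau_star)
curvature = D_star * d2(a_j, tau_star) - A_star * d2(d_j, tau_star)
print("stationarity residual (zero at an interior optimum):", stationarity)
print("curvature term (nonpositive at a local maximum):", curvature)
\end{verbatim}
%
Such a check only tests necessity; the sufficiency conditions considered next additionally require strict inequalities.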
\subsubsection{Second-Order Sufficiency Conditions} Now take an arbitrary feasible point $(\v{p}^*,\v{\tau}^*) \in \set{F}$ that may be a maximum of the objective function. If there exist \aimention{Joseph-Louis Lagrange}Lagrange multiplier vectors $\v{m}^* \in (\R_{\geq0}^n)^4$ such that \longref{eq:first_order_gradient} holds and \longrefs{eq:second_order_hessian} and \shortref{eq:first_order_multipliers_pminus}--% \shortref{eq:first_order_multipliers_tauplus} hold with \emph{strict} inequality % %\footnote{That is, replace each $\leq$ with $<$ and each %$\geq$ with $>$.} , then the point must be a local maximum of the objective function. This is effectively a statement of the local concavity % %\footnote{For $m \in \N$, to say that $\set{C} \subseteq \R^m$ %is a \emph{convex set} means that for any $x,y \in \set{C}$ and $t \in %[0,1]$, $tx + (1-t)y \in \set{C}$. For $m \in \N$ and convex set %$\set{C} \subseteq \R^m$, to say that $f: \set{C} \mapsto \R$ is a %\emph{concave function} means that for any $x,y \in \set{C}$ and $t \in %[0,1]$ with $x \neq y$, $f(tx + (1-t)y) \geq tf(x) + (1-t)f(y)$, where %\emph{strict} concavity means that $\geq$ can be replaced with $>$. This %is \emph{global concavity}. Any element of $\set{C}$ that is a local %maximum of concave function $f$ must also be a global maximum of $f$, %and if the concavity is strict, then this point will be the unique %global maximum of $f$. \emph{Local concavity} is the weaker analogous %concept when a function is restricted to a convex \emph{local %neighborhood} of a point. That is, this sufficiency condition %establishes that the restriction of the function to a certain local %neighborhood of a point will be strictly concave.} of the objective function at the point $(\v{p}^*,\v{\tau}^*)$. \paragraph{The Extreme-Preference Rule:} \index{extreme-preference rule (EPR)|(indexdef}In order for \longref{eq:second_order_hessian} to hold with strict inequality, \longrefs{eq:pj_second_order_necessary} and \shortref{eq:tauj_second_order_necessary} must both hold with strict inequality. However, this is impossible for \longref{eq:pj_second_order_necessary}. Therefore, if there is some $i \in \{1,2,\dots,n\}$ with $p^*_i \in (p^-_i,p^+_i)$, these conditions cannot be used to show that the point is a local maximum\footnote{In other words, strict concavity cannot hold at such a point.}. Our goal is to design strategies guaranteed to be local maxima, so these strategies will have $p^*_i = p^-_i$ or $p^*_i = p^+_i$ for all $i \in \{1,2,\dots,n\}$. We call this the \emph{\acro[extreme-preference rule~(EPR)]{EPR}{\index{extreme-preference rule (EPR)"|indexglo}extreme-preference rule}}. \Citet{SK86} assume that $(p^-_i,p^+_i)=(0,1)$ for all $i \in \{1,2,\dots,n\}$, so they call this the \index{zero-one rule|indexdef}\emph{zero-one rule}. This rule is part of a sufficiency condition; it is not at all necessary.\index{extreme-preference rule (EPR)|)indexdef}\index{EPR|see{extreme-preference rule}} \paragraph{Problems with Sufficiency at Zero Preference Probability:} Assume there exists some $j \in \{1,2,\dots,n\}$ such that $p^*_j = 0$. \longrefs{eq:tauj_first_order_minconst}, \shortref{eq:tauj_first_order_maxconst}, and \shortref{eq:tauj_second_order_necessary} cannot all hold with strict inequality for this $\v{p}^*$. In other words, strict concavity is impossible at this point because the objective function takes the same value for any choice of $\tau^*_j$.
However, it can be shown that if these all hold when $p^*_j$ is replaced with some arbitrarily small $\varepsilon$ with $0 < \varepsilon < p^+_j$, then the point $(\v{p}^*,\v{\tau}^*)$ is a local maximum of the objective function. In other words, even if the function is not strictly locally concave, under these $\varepsilon$-conditions it is certainly locally concave. \subsection{Solutions to Special Cases} Solutions to this generalized optimization problem can be difficult to find% %\footnote{The graphical interpretations in %\longref{sec:alternate_optimization_objectives} may provide valuable %intuition or inspire numerical methods to assist in finding optimal %solutions.} . In fact, mere existence of solutions cannot be taken for granted. However, there are two special cases that guarantee existence (but not uniqueness) of solutions and can be equipped with simple methods of finding one of those solutions. \subsubsection{Constant Disadvantage Case} \index{rational objective function!optimal solution!constant disadvantage|(} This case not only serves as an important example but is also useful in practice. It is our goal to construct a strategy $(\v{p}^*,\v{\tau}^*) \in \set{F}$ that meets all sufficiency conditions to be called a local maximum point of the objective function. This point will be a \emph{global} maximum if the objective function is concave. The point will be the unique global maximum if the objective function is strictly concave. Assume that for all $j,k \in \{1,2,\dots,n\}$, % \begin{enumerate}[(i)] \item $p_j^- = 0$ \label{item:constant_disadv_exclusion} \item $a_j$ and $d_j$ are twice continuously differentiable functions \label{item:constant_disadv_differentiability} \item for all $\tau_j \in \R_{\geq0} \cap [\tau_j^-,\tau_j^+]$ and all $\tau_k \in \R_{\geq0} \cap [\tau_k^-,\tau_k^+]$, \begin{compactitem} \item $d_j(\tau_j) d \geq 0$ \item $d_j(\tau_j) d_k(\tau_k) > 0$ \end{compactitem} \label{item:constant_disadv_same_sign} \item either $d \neq 0$ or there exists some $i \in \{1,2,\dots,n\}$ such that $p^*_i > 0$ \label{item:constant_disadv_nonzero_Dstar} \item $d_j'(\tau_j) = 0$ for all $\tau_j \in (\tau_j^-,\tau_j^+)$ \label{item:constant_disadv} \item if $\tau_j^- \neq \tau_j^+$, it is the case that \begin{compactenum}% %[({\ref*{item:constant_disadv_convexity}}.a)] [(a)] \item $d_j(\tau_j^-) a'_j(\tau_j^-) < 0$ or \label{item:constant_disadv_conv_left} \item $d_j(\tau_j^+) a'_j(\tau_j^+) > 0$ or \label{item:constant_disadv_conv_right} \item $d_j(\tau_j) a'_j(\tau_j) = 0$ with $d_j(\tau_j) a_j''(\tau_j) < 0$ for some $\tau_j \in (\tau_j^-,\tau_j^+)$ \label{item:constant_disadv_conv_middle} \end{compactenum} \label{item:constant_disadv_convexity} \end{enumerate} % If these assumptions do not hold, for each $j \in \{1,2,\dots,n\}$, $\tau^-_j$ and $\tau^+_j$ may be adjusted to surround a region where they do hold. These assumptions lead to the following for all $j \in \{1,2,\dots,n\}$. % \begin{description} \item\emph{Well-Defined Objective Function:} By (\shortref{item:constant_disadv_same_sign}) and (\shortref{item:constant_disadv_nonzero_Dstar}), $D^* \neq 0$ and $D^* d_j(\tau_j) > 0$. This implies that both $J^*$ and $a_j(\tau_j)/d_j(\tau_j)$ are well-defined for all choices of $\tau_j \in [\tau_j^-,\tau_j^+]$.
\item\emph{Maximum Type-Advantage-to-Type-Disadvantage Ratio Exists:} By (\shortref{item:constant_disadv_convexity}), there exists some $\tau_j^* \in [\tau_j^-,\tau_j^+]$ such that there is some $\delta_j \in \R_{>0}$ where $a_j(\tau_j)/d_j(\tau_j) \leq a_j(\tau_j^*)/d_j(\tau_j^*)$ for all $\tau_j \in (\tau_j^* - \delta_j,\tau_j^* + \delta_j) \cap [\tau_j^-,\tau_j^+]$. That is, the $a_j/d_j$ function has a local maximum on its domain. \item\emph{Parameterized Processing Times:} If $\tau_j^- = \tau_j^+$, then (\shortref{item:constant_disadv}) and (\shortref{item:constant_disadv_convexity}) are trivially met. This case is useful when processing times are parameters of the system and not decision variables. \Citet{SK86} use the name \index{task-type choice problem}\emph{prey model} for the case where no processing times are free decision variables (\ie, tasks are whole items of prey that come lumped with a rigid (average) processing time). \end{description} % If $\tau^-_j=\tau^+_j$, let $\tau^*_j=\tau^-_j$. Otherwise, let $\tau_j^*$ be a local maximum of $a_j/d_j$ that is described by (\shortref{item:constant_disadv_convexity}). Next, assume that the types are indexed so that % \begin{equation} \frac{a_1(\tau_1^*)}{d_1(\tau_1^*)} > \frac{a_2(\tau_2^*)}{d_2(\tau_2^*)} > \cdots > \frac{a_{n-1}(\tau_{n-1}^*)}{d_{n-1}(\tau_{n-1}^*)} > \frac{a_n(\tau_n^*)}{d_n(\tau_n^*)} \label{eq:generalized_prey_ordering} \end{equation} % Assume that for all $k \in \{0,1,2,\dots,n-1\}$, % \begin{equation*} \frac% { a + \sum\limits_{i=1}^{k} p_i^+ a_i(\tau_i^*)}% { d + \sum\limits_{i=1}^{k} p_i^+ d_i(\tau_i^*)} \neq \frac{a_{k+1}(\tau_{k+1}^*)}{d_{k+1}(\tau_{k+1}^*)} \end{equation*} % Finally, define $k^*$ by % \begin{equation*} k^* \triangleq \min\left(\left\{ k \in \{0,1,2,\dots,n-1\} : \frac% { a + \sum\limits_{i=1}^{k} p_i^+ a_i(\tau_i^*)}% { d + \sum\limits_{i=1}^{k} p_i^+ d_i(\tau_i^*)} > \frac{a_{k+1}(\tau_{k+1}^*)}{d_{k+1}(\tau_{k+1}^*)} \right\} \cup \{n\} \right) \end{equation*} % and let % \begin{equation*} p_j^* = \begin{cases} p_j^+ &\text{if } j \leq k^*\\ 0 &\text{if } j > k^* \end{cases} \end{equation*} % for all $j \in \{1,2,\dots,n\}$. Primarily because of assumption (\shortref{item:constant_disadv_convexity}) and the results that $D^* d_j(\tau_j^*) > 0$ and $d_j'(\tau_j^*) = d_j''(\tau_j^*) = 0$ for all $j \in \{1,2,\dots,n\}$, it is easy to show that $(\v{p}^*,\v{\tau}^*)$ meets the conditions described in \longref{sec:optimization_procedure} that guarantee it is a local maximum of the objective function\footnote{Because $p^-_j = 0$ for all $j \in \{1,2,\dots,n\}$, this statement requires the zero preference probability modification described at the end of \longref{sec:optimization_procedure}.}.% \index{rational objective function!optimal solution!constant disadvantage|)} \subsubsection{Decreasing Advantage-to-Disadvantage Ratio}% \index{rational objective function!optimal solution!decreasing advantage-to-disadvantage|(} Again, it is our goal to construct a strategy $(\v{p}^*,\v{\tau}^*) \in \set{F}$ that meets all sufficiency conditions to be called a local maximum point of the objective function. However, here we assume that the disadvantage functions are not constant with respect to processing time. This is a generalized version of the \index{combined task-type and processing-length choice problem}\emph{combined prey and patch model} discussed by \citet{SK86}, and so it illustrates the \index{marginal value theorem (MVT)}\ac{MVT} concept \citep{Cha76,Cha73}.
However, \citeauthor{SK86} make different assumptions than we do because their analysis depends on search costs being nil. Assume that for all $j,k \in \{1,2,\dots,n\}$, % \begin{enumerate}[(i)] \item $p_j^- = 0$ \label{item:decr_ratio_exclusion} \item $a_j$ and $d_j$ are twice continuously differentiable functions \label{item:decr_ratio_differentiability} \item for all $\tau_j \in \R_{\geq0}\cap[\tau_j^-,\tau_j^+]$ and all $\tau_k \in \R_{\geq0}\cap[\tau_k^-,\tau_k^+]$, \begin{compactitem} \item $d_j(\tau_j) d \geq 0$ \item $d_j(\tau_j) d_k(\tau_k) > 0$ \end{compactitem} \label{item:decr_ratio_same_sign} \item either $d \neq 0$ or there exists some $i \in \{1,2,\dots,n\}$ such that $p^*_i > 0$ \label{item:decr_ratio_nonzero_Dstar} \item $d_j(\tau_j) d_j'(\tau_j) > 0$ for all $\tau_j \in (\tau_j^-,\tau_j^+)$ \label{item:decr_ratio_same_sign_derivative} \item $(a_j(\tau_j)/d_j(\tau_j))' < 0$ for all $\tau_j \in (\tau_j^-,\tau_j^+)$ \label{item:decr_ratio} \item $(a_j'(\tau_j)/d_j'(\tau_j))' < 0$ for all $\tau_j \in (\tau_j^-,\tau_j^+)$ \label{item:decr_ratio_second} \end{enumerate} % If these assumptions do not hold, for each $j \in \{1,2,\dots,n\}$, $\tau^-_j$ and $\tau^+_j$ may be adjusted to surround a region where they do. These assumptions lead to the following for all $j \in \{1,2,\dots,n\}$. % \begin{description} \item\emph{Well-Defined Objective Function:} By (\shortref{item:decr_ratio_same_sign}) and (\shortref{item:decr_ratio_nonzero_Dstar}), $D^* \neq 0$ and $D^* d_j(\tau_j) > 0$. This implies that $J^*$, $a_j(\tau_j)/d_j(\tau_j)$, and $a_j'(\tau_j)/d_j'(\tau_j)$ are all well-defined for all choices of $\tau_j \in [\tau_j^-,\tau_j^+]$. \item\emph{Maximum Type-Advantage-to-Type-Disadvantage Ratio Exists:} By (\shortref{item:decr_ratio}), $\tau_j^-$ is such that $a_j(\tau_j)/d_j(\tau_j) \leq a_j(\tau_j^-)/d_j(\tau_j^-)$ for all $\tau_j \in [\tau_j^-,\tau_j^+]$. That is, the $a_j(\tau_j)/d_j(\tau_j)$ function achieves its maximum at $\tau_j = \tau_j^-$. \item\emph{Ordering of Ratios:} By (\shortref{item:decr_ratio}) and (\shortref{item:decr_ratio_same_sign_derivative}), $a_k(\tau_k)/d_k(\tau_k) > a_k'(\tau_k)/d_k'(\tau_k)$ for all $\tau_k \in (\tau_k^-,\tau_k^+)$. \item\emph{Parameterized Processing Times:} If $\tau_j^- = \tau_j^+$, then (\shortref{item:decr_ratio_same_sign_derivative})--% (\shortref{item:decr_ratio_second}) are trivially met. \end{description} % Assume the types are indexed so that % \begin{equation} \frac{a_1(\tau_1^-)}{d_1(\tau_1^-)} > \frac{a_2(\tau_2^-)}{d_2(\tau_2^-)} > \cdots > \frac{a_{n-1}(\tau_{n-1}^-)}{d_{n-1}(\tau_{n-1}^-)} > \frac{a_n(\tau_n^-)}{d_n(\tau_n^-)} \label{eq:generalized_prey_ordering_2} \end{equation} % In other words, as in the constant disadvantage case, order the task types by decreasing maximum advantage-to-disadvantage ratio. This is the same ordering used by \citeauthor{SK86}; however, because we have assumed that this ratio is strictly decreasing, the ratio at the minimum processing time will always be the maximum ratio.
Next, for all $k \in \{0,1,\dots,n\}$ and all $j \in \{1,2,\dots,n\}$, define $\tau_j^k$ so that % \begin{align*} \frac{ a'_j(\tau_j^k) }{ d'_j(\tau_j^k) } &> \frac% { a + \sum\limits_{i=1}^{k} p_i^+ a_i(\tau_i^k)}% { d + \sum\limits_{i=1}^{k} p_i^+ d_i(\tau_i^k)} \quad \text{ for } \tau_j^k = \tau_j^+\\ % \intertext{or} % \frac{ a'_j(\tau_j^k) }{ d'_j(\tau_j^k) } &< \frac% { a + \sum\limits_{i=1}^{k} p_i^+ a_i(\tau_i^k)}% { d + \sum\limits_{i=1}^{k} p_i^+ d_i(\tau_i^k)} \quad \text{ for } \tau_j^k = \tau_j^-\\ % \intertext{or} % \frac{ a'_j(\tau_j^k) }{ d'_j(\tau_j^k) } &= \frac% { a + \sum\limits_{i=1}^{k} p_i^+ a_i(\tau_i^k)}% { d + \sum\limits_{i=1}^{k} p_i^+ d_i(\tau_i^k)} \quad \text{ for } \tau_j^k \in (\tau_j^-,\tau_j^+) \end{align*} % By (\shortref{item:decr_ratio_second}), this is always possible. Unfortunately, for each $k \in \{0,1,\dots,n\}$, all elements of the set $\{ \tau_j^k : j \in \{1,2,\dots,k\} \}$ must be determined simultaneously. This is different from the constant disadvantage case. That is, because $d_j'(\tau_j) \neq 0$ for all $\tau_j \in (\tau_j^-,\tau_j^+)$, there is coupling among the optimal choices of processing time. It must also be assumed that % \begin{equation*} \frac% { a + \sum\limits_{i=1}^{k} p_i^+ a_i(\tau_i^k)}% { d + \sum\limits_{i=1}^{k} p_i^+ d_i(\tau_i^k)} \neq \frac{a_{k+1}(\tau_{k+1}^-)}{d_{k+1}(\tau_{k+1}^-)} \end{equation*} % for all $k \in \{0,1,2,\dots,n-1\}$. Now, define $k^*$ by % \begin{equation*} k^* \triangleq \min\left(\left\{ k \in \{0,1,2,\dots,n-1\} : \frac% { a + \sum\limits_{i=1}^{k} p_i^+ a_i(\tau_i^k)}% { d + \sum\limits_{i=1}^{k} p_i^+ d_i(\tau_i^k)} > \frac{a_{k+1}(\tau_{k+1}^-)}{d_{k+1}(\tau_{k+1}^-)} \right\} \cup \{n\} \right) \end{equation*} % Finally, let % \begin{equation*} \tau_j^* = \tau_j^{k^*} \quad \text{ and } \quad p_j^* = \begin{cases} p_j^+ &\text{if } j \leq k^*\\ 0 &\text{if } j > k^* \end{cases} \end{equation*} % for all $j \in \{1,2,\dots,n\}$. Primarily because of assumptions (\shortref{item:decr_ratio}) and (\shortref{item:decr_ratio_second}), it is easy to show that $(\v{p}^*,\v{\tau}^*)$ meets the conditions described in \longref{sec:optimization_procedure} that guarantee it is a local maximum of the objective function\footnote{Because $p^-_j = 0$ for all $j \in \{1,2,\dots,n\}$, this statement requires the zero preference probability modification described at the end of \longref{sec:optimization_procedure}.}.% \index{rational objective function!optimal solution!decreasing advantage-to-disadvantage|)}% \index{rational objective function|)} \section{Optimization of Specific Objective Functions} \label{sec:optimization_specific} \index{examples!analytical solutions|(indexdef} The optimization results given in \longref{sec:optimization_rational} may be applied to many of the functions introduced in \longref{ch:optimization_objectives}. We consider three of them here. Unfortunately, the reward-to-variability and reward-to-variance optimization functions do not fit the form of \longref{sec:optimization_rational} because the central moments used to define them involve a great deal of cross-coupling among task-type parameters and decision variables. Therefore, we do not consider solutions to these optimization functions. We also do not provide solutions for the constrained optimization functions; however, we have shown other ways to implement success thresholds that can be handled by the methods in \longref{sec:optimization_rational}.
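Because each of the objective functions considered below reduces to the generalized form, the constant-disadvantage selection rule of \longref{sec:optimization_rational} applies directly once processing times are fixed, provided the remaining assumptions of that case are met. The following is a minimal sketch of that selection rule in which every number is a hypothetical placeholder; it assumes the types are already indexed by decreasing $a_i(\tau_i^*)/d_i(\tau_i^*)$.
%
\begin{verbatim}
# Sketch of the constant-disadvantage selection rule: with types ordered
# by a_i(tau_i*)/d_i(tau_i*), include types while the running ratio
# (a + sum p_i^+ a_i)/(d + sum p_i^+ d_i) stays below the next type's
# ratio.  All numbers below are hypothetical.
a_env, d_env = -0.1, 1.0
p_max = [1.0, 1.0, 1.0]                    # p_i^+
a_star = [2.0, 1.2, 0.3]                   # a_i(tau_i^*), ordered by ratio
d_star = [1.0, 1.0, 1.0]                   # d_i(tau_i^*) (constant in tau_i)

def running_ratio(k):
    A = a_env + sum(p_max[i] * a_star[i] for i in range(k))
    D = d_env + sum(p_max[i] * d_star[i] for i in range(k))
    return A / D

n = len(a_star)
k_star = n
for k in range(n):
    if running_ratio(k) > a_star[k] / d_star[k]:
        k_star = k                         # first k where adding type k+1 hurts
        break

p_opt = [p_max[j] if j < k_star else 0.0 for j in range(n)]
print(k_star, p_opt, running_ratio(k_star))
\end{verbatim}
%
The same skeleton extends to the decreasing-ratio case, except that the processing times $\tau_j^k$ must be solved for simultaneously at each candidate $k$.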
\subsection{Maximization of Rate of Excess Net Point Gain}% \label{sec:max_renpg}% \index{examples!analytical solutions!rate of excess net gain|(indexdef}% \index{rate of excess net gain!analytical optimization|(} Consider the function $(\E(G_1)-G^T/N^p)/\E(T_1)$ where $G^T \in \R$ is a net gain success threshold. Using the statistics derived in \longref{ch:model}, this can be expressed by % \begin{equation*} \frac{\E(G_1) - \frac{G^T}{N^p}}{\E(T_1)} = \frac% {\overline{g^p}-\overline{c^p}-\frac{c^s}{\lambda^p} -\frac{G^T}{N^p}} {\overline{\tau^p}+\frac{1}{\lambda^p}} = \frac% {-c^s + \sum\limits_{i=1}^n p_i \lambda_i \left( g_i(\tau_i) - c_i \tau_i - \frac{G^T}{N^p} \right)} {1 + \sum\limits_{i=1}^n p_i \lambda_i \tau_i} \end{equation*} % Define % \begin{align*} a &\triangleq -c^s & a_j(\tau_j) &\triangleq \lambda_j \left( g_j(\tau_j) - c_j \tau_j - \frac{G^T}{N^p}\right)\\ d &\triangleq 1 & d_j(\tau_j) &\triangleq \lambda_j \tau_j \end{align*} % Using these definitions, $(\E(G_1)-G^T/N^p)/\E(T_1)$ fits the form studied in \longref{sec:optimization_rational}. \index{rate of excess net gain!analytical optimization|)}% \index{examples!analytical solutions!rate of excess net gain|)indexdef} \subsection{Maximization of Discounted Net Gain} \index{examples!analytical solutions!discounted net gain|(indexdef} Consider the function $\E(G_1) - w \E(T_1)$ where $w \in \R$. Using the statistics derived in \longref{ch:model}, this can be expressed by % \begin{equation*} \E(G_1) - w \E(T_1) = \overline{g^p}-\overline{c^p}-\frac{c^s}{\lambda^p} -w\overline{\tau^p}-w\frac{1}{\lambda^p} = \frac% { -(c^s+w) + \sum\limits_{i=1}^n p_i \lambda_i ( g_i(\tau_i) - c_i \tau_i - w \tau_i ) }% { \sum\limits_{i=1}^n p_i \lambda_i } \end{equation*} % Define % \begin{align*} a &\triangleq -(c^s+w) & a_j(\tau_j) &\triangleq \lambda_j ( g_j(\tau_j) - c_j \tau_j - w \tau_j )\\ d &\triangleq 0 & d_j(\tau_j) &\triangleq \lambda_j \end{align*} % Using these definitions, clearly $\E(G_1) - w\E(T_1)$ fits the form studied in \longref{sec:optimization_rational}. This is a constant disadvantage example. Consider fixing processing times to be parameters so that the excess rate of gain function in \longref{sec:max_renpg} is also a constant disadvantage example. Also take $G^T = 0$. In this case, the resulting rate of net gain function is nearly identical to the one studied in classical \ac{OFT}. In this constant disadvantage context (called the \emph{prey model} by \citet{SK86}), indexing by advantage-to-disadvantage ratio will often lead to the same ordering for both the rate of net gain and the discounted net gain functions. Therefore, if observational justification for the use of rate of point gain as an optimization objective is based entirely on task-type ranking, then discounted net gain is an equally valid optimization objective to consider. \index{examples!analytical solutions!discounted net gain|)indexdef} \subsection{Maximization of Rate of Excess Efficiency} \index{examples!analytical solutions!excess efficiency|(indexdef} Consider the function $(\E(G_1)+\E(C_1)-G_g^T/N^p)/\E(C_1)$ where $G_g^T \in \R$ is a gross gain success threshold. 
Using the statistics derived in \longref{ch:model}, this can be expressed by % \begin{equation*} \frac{\E(G_1)+\E(C_1)-\frac{G_g^T}{N^p}}{\E(C_1)} = \frac% {\overline{g^p}-\frac{G_g^T}{N^p}} {\overline{c^p}+\frac{c^s}{\lambda^p}} = \frac% {\sum\limits_{i=1}^n p_i \lambda_i \left( g_i(\tau_i) - \frac{G_g^T}{N^p} \right)} {c^s + \sum\limits_{i=1}^n p_i \lambda_i c_i \tau_i} \end{equation*} % Define % \begin{align*} a &\triangleq 0 & a_j(\tau_j) &\triangleq \lambda_j \left( g_j(\tau_j) - \frac{G_g^T}{N^p}\right)\\ d &\triangleq c^s & d_j(\tau_j) &\triangleq \lambda_j c_j \tau_j \end{align*} % Using these definitions, $(\E(G_1)+\E(C_1)-G_g^T/N^p)/\E(C_1)$ fits the form studied in \longref{sec:optimization_rational}. There are two major criticisms of optimizing efficiency \citep[p.~9]{SK86}. First, it ignores the impact of time. Second, it equates behaviors that bring small gains at small costs with behaviors that bring large gains at large costs. Together, an efficiency optimizer can spend large amounts of time for a small gain that is insufficient for survival. However, costs in our model are affinely related to time, so cost minimization exerts pressure on time as well. Additionally, efficiency is defined with a success threshold (\ie, excess efficiency), and so all behaviors that have positive efficiency also lead to survival. Therefore, if our model can be used, efficiency maximization may be a viable alternative to rate maximization. In a constant disadvantage context, the efficiency advantage-to-disadvantage indexing will be very similar to the indexing in \longref{sec:max_renpg}, and so evidence for the use of rate maximization may also justify the use of efficiency maximization. \index{examples!analytical solutions!excess efficiency|)indexdef} \index{examples!analytical solutions|)indexdef} \index{solitary agent model!processing-only analysis!optimization|)}% \index{solitary agent model!classical analysis!optimization|)}
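To make the correspondence with \longref{sec:optimization_rational} concrete, the following minimal sketch (with purely hypothetical gain functions, rates, and thresholds) encodes the excess-efficiency objective in the generalized $(a,a_i,d,d_i)$ form and evaluates the resulting $J$ for one candidate strategy.
%
\begin{verbatim}
# Sketch: encoding the excess-efficiency objective in the generalized
# (a, a_i, d, d_i) form and evaluating J = A/D for one candidate
# strategy.  The gain functions g_i and all parameters are hypothetical.
import math

lam = [0.5, 0.2]                     # encounter rates lambda_i
c = [0.1, 0.3]                       # processing cost rates c_i
c_s = 0.05                           # search cost rate c^s
G_T, N_p = 1.0, 50.0                 # gross-gain threshold G_g^T and N^p
g = [lambda t: 3.0 * (1 - math.exp(-t)),       # hypothetical g_1
     lambda t: 2.0 * (1 - math.exp(-0.5 * t))] # hypothetical g_2

a_env = 0.0
d_env = c_s
a_i = [lambda t, i=i: lam[i] * (g[i](t) - G_T / N_p) for i in range(2)]
d_i = [lambda t, i=i: lam[i] * c[i] * t for i in range(2)]

def J(p, tau):
    A = a_env + sum(p[i] * a_i[i](tau[i]) for i in range(2))
    D = d_env + sum(p[i] * d_i[i](tau[i]) for i in range(2))
    return A / D

print(J([1.0, 1.0], [1.5, 2.0]))     # excess efficiency of one candidate
\end{verbatim}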