% Upper-case A B C D E F G H I J K L M N O P Q R S T U V W X Y Z % Lower-case a b c d e f g h i j k l m n o p q r s t u v w x y z % Digits 0 1 2 3 4 5 6 7 8 9 % Exclamation ! Double quote " Hash (number) # % Dollar $ Percent % Ampersand & % Acute accent ' Left paren ( Right paren ) % Asterisk * Plus + Comma , % Minus - Point . Solidus / % Colon : Semicolon ; Less than < % Equals = Greater than > Question mark ? % At @ Left bracket [ Backslash \ % Right bracket ] Circumflex ^ Underscore _ % Grave accent ` Left brace { Vertical bar | % Right brace } Tilde ~ % ---------------------------------------------------------------------| % --------------------------- 72 characters ---------------------------| % ---------------------------------------------------------------------| % % Optimal Foraging Theory Revisited: Chapter 4. Optimization Results % % (c) Copyright 2007 by Theodore P. Pavlic % \chapter{Finite-Lifetime Optimization Results} \label{ch:optimization_results}% \index{solitary agent model!classical analysis!optimization|(}% \index{solitary agent model!processing-only analysis!optimization|(} In \longref{ch:optimization_objectives}, we discussed selecting agent behavior based on the optimization of functions that encapsulated a number of different optimization objectives. Some of these functions traded off objectives by maximizing the ratio of one to the other. Other functions traded off objectives by maximizing a linear combination of objectives. In either case, for many of the statistics defined in \longref{ch:model}, the resulting functions have a special structure in common. In this chapter, we optimize a general value function that also has this structure. We then apply these general results to value functions of interest in our model. The general results are given in \longref{sec:optimization_rational} and the specific results are given in \longref{sec:optimization_specific}. \section{Optimization of a Rational Objective Function} \label{sec:optimization_rational} \index{rational objective function|(} Before we discuss optimization of some of the functions introduced in \longref{ch:optimization_objectives}, we focus on a special general case. This general case may be applied to the optimization functions we have introduced or be used to derive optimal behavior for other novel valuation functions that have this structure. \subsection{The Generalized Problem} Take $n \in \N$ task types. Most statistics defined in \longref{ch:model} and optimization functions described in \longref{ch:optimization_objectives} have a special structure in common. Here, we present a generalized optimization problem that provides solutions to a broad range of problems in this model. \subsubsection{The Decision Variables and Constraints} The decision variables are preference probabilities and processing times. These variables are constrained, so their bounds must be defined as parameters. For each $i \in \{1,2,\dots,n\}$, define upper and lower preference constraint parameters $p_i^-,p_i^+ \in [0,1]$ and upper and lower time constraint parameters $\tau_i^- \in \R_{\geq0}$, and $\tau_i^+ \in \extR_{\geq0}$. 
Collect these constraint parameters into vectors $\v{p}^-,\v{p}^+ \in [0,1]^n$, $\v{\tau}^- \in \R_{\geq0}^n$, and $\v{\tau}^+ \in \extR_{\geq0}^n$ defined by % \begin{align*} \v{p}^- &\triangleq \begin{bmatrix} p_1^-, p_2^-, \dots, p_n^- \end{bmatrix}^\T & \v{\tau}^- &\triangleq \begin{bmatrix} \tau_1^-, \tau_2^-, \dots, \tau_n^- \end{bmatrix}^\T\\ \v{p}^+ &\triangleq \begin{bmatrix} p_1^+, p_2^+, \dots, p_n^+ \end{bmatrix}^\T & \v{\tau}^+ &\triangleq \begin{bmatrix} \tau_1^+, \tau_2^+, \dots, \tau_n^+ \end{bmatrix}^\T \end{align*} % Then, for an arbitrary preference probability vector $\v{p}$ and processing time vector $\v{\tau}$ defined so that % \begin{align*} \v{p} &\triangleq \begin{bmatrix} p_1, p_2, \dots, p_n\end{bmatrix}^\T & \v{\tau} &\triangleq \begin{bmatrix} \tau_1, \tau_2, \dots, \tau_n \end{bmatrix}^\T \end{align*} % it must be that % \begin{equation*} p_i^- \leq p_i \leq p_i^+ \quad \text{ and } \quad \tau_i^- \leq \tau_i \leq \tau_i^+ \end{equation*} % for all $i \in \{1,2,\dots,n\}$. \subsubsection{Generalized Advantage, Disadvantage, and Objective} For each $i \in \{1,2,\dots,n\}$, define the generalized task advantage function $a_i: \R_{\geq0} \cap [\tau_i^-,\tau_i^+] \mapsto \R$ and the generalized task disadvantage function $d_i: \R_{\geq0} \cap [\tau_i^-,\tau_i^+] \mapsto \R$ to be continuously differentiable functions% %\footnote{That is, these functions %are differentiable and have continuous derivatives at every point in %their domain.} . Also define the environment advantage $a \in \R$ and disadvantage $d \in \R$. The total advantage $A$ and total disadvantage $D$ are then defined by % \begin{align*} A(\v{p},\v{\tau}) &\triangleq a + \sum\limits_{i=1}^n p_i a_i(\tau_i) & D(\v{p},\v{\tau}) &\triangleq d + \sum\limits_{i=1}^n p_i d_i(\tau_i) \end{align*} % where $\v{p} \in [0,1]^n$ and $\v{\tau} \in \R_{\geq0}^n$ are arbitrary preference probability and processing time vectors. The generalized objective $J$, the advantage-to-disadvantage ratio, is defined by $J(\v{p},\v{\tau}) \triangleq A(\v{p},\v{\tau})/D(\v{p},\v{\tau})$. \subsubsection{Notation} Take $i \in \{1,2,\dots,n\}$. For the advantage $a_i$ and disadvantage $d_i$, use the notation % \begin{align*} a_i'(\tau_i) &\triangleq \frac{ \total }{ \total \tau_i } a_i(\tau_i) & a_i''(\tau_i) &\triangleq \frac{ \total^2 }{ \total \tau_i^2 } a_i(\tau_i) \\ d_i'(\tau_i) &\triangleq \frac{ \total }{ \total \tau_i } d_i(\tau_i) & d_i''(\tau_i) &\triangleq \frac{ \total^2 }{ \total \tau_i^2 } d_i(\tau_i) \end{align*} % to represent the first and second derivatives of each function evaluated at the point $\tau_i$. \subsection{The Optimization Procedure} \label{sec:optimization_procedure} The goal is to choose preference probabilities and processing times to (locally) maximize $J$. This can be formulated as the constrained minimization problem % \begin{equation*} \begin{split} &\text{minimize} \enskip {-J}\\ &\text{subject to} \enskip {-\tau_i} \leq {-\tau_i^-}, \enskip {\tau_i} \leq \tau_i^+, \enskip {-p_i} \leq {-p_i^-}, \enskip {p_i} \leq p_i^+ \text{ for all } i \in \{1,\dots,n\} \end{split} \end{equation*} % with $4n$ inequality constraints. This problem can be solved using \aimention{Joseph-Louis Lagrange}Lagrange multiplier theory \citep{Bertsekas95}.
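Before developing these analytical conditions, note that the generalized problem is simply a box-constrained maximization of $J$ and so can also be explored numerically. The following is a minimal sketch (assuming NumPy and SciPy are available) in which the environment terms and the task advantage and disadvantage functions are purely hypothetical placeholders; it illustrates the problem statement, not the analytical procedure developed below.
%
\begin{verbatim}
# Minimal numerical sketch of the generalized problem; all task
# functions and parameters here are hypothetical placeholders.
import numpy as np
from scipy.optimize import minimize

n = 2
a_env, d_env = -0.1, 1.0                      # environment advantage a, disadvantage d
a_funcs = [lambda t: 2.0 * (1 - np.exp(-t)),  # hypothetical a_i(tau_i)
           lambda t: 1.5 * (1 - np.exp(-2 * t))]
d_funcs = [lambda t: t, lambda t: t]          # hypothetical d_i(tau_i)

def J(x):
    p, tau = x[:n], x[n:]
    A = a_env + sum(p[i] * a_funcs[i](tau[i]) for i in range(n))
    D = d_env + sum(p[i] * d_funcs[i](tau[i]) for i in range(n))
    return A / D

# Box constraints p_i^- <= p_i <= p_i^+ and tau_i^- <= tau_i <= tau_i^+
bounds = [(0.0, 1.0)] * n + [(0.0, 10.0)] * n
x0 = np.array([0.5, 0.5, 1.0, 1.0])
result = minimize(lambda x: -J(x), x0, bounds=bounds)  # maximize J
print(result.x, -result.fun)
\end{verbatim}
%
The remainder of this section characterizes such maxima analytically.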
Define the \aimention{Joseph-Louis Lagrange}Lagrangian $L$ by % \begin{equation*} L \triangleq -J - \v{\mu}_-^\T ( \v{p} - \v{p}^- ) + \v{\mu}_+^\T ( \v{p} - \v{p}^+ ) - \v{\nu}_-^\T ( \v{\tau} - \v{\tau}^- ) + \v{\nu}_+^\T ( \v{\tau} - \v{\tau}^+ ) \end{equation*} % where $\v{\mu}_-,\v{\mu}_+,\v{\nu}_-,\v{\nu}_+ \in \R_{\geq0}^n$ are vectors of \aimention{Joseph-Louis Lagrange}Lagrange multipliers, denoted % \begin{align*} \v{\mu}_- &\triangleq \begin{bmatrix} \mu_{1-} & \mu_{2-} & \dots & \mu_{n-} \end{bmatrix}^\T & \v{\mu}_+ &\triangleq \begin{bmatrix} \mu_{1+} & \mu_{2+} & \dots & \mu_{n+} \end{bmatrix}^\T\\ \v{\nu}_- &\triangleq \begin{bmatrix} \nu_{1-} & \nu_{2-} & \dots & \nu_{n-} \end{bmatrix}^\T & \v{\nu}_+ &\triangleq \begin{bmatrix} \nu_{1+} & \nu_{2+} & \dots & \nu_{n+} \end{bmatrix}^\T \end{align*} % For ease of notation, we use the symbol $\v{m}^*$ to represent a collection of one of each of these four \aimention{Joseph-Louis Lagrange}Lagrange multiplier vectors. That is, $\v{m}^* \in (\R_{\geq0}^n)^4$ with $\v{m}^* \triangleq (\v{\mu}_-^*,\v{\mu}_+^*,\v{\nu}_-^*,\v{\nu}_+^*)$. Next, denote the feasible set $\set{F}$ of decision variables by % \begin{equation*} \set{F} \triangleq \left\{ (\v{p},\v{\tau}) \in [0,1]^n \times \R_{\geq0}^n : p_i^- \leq p_i \leq p_i^+, \tau_i^- \leq \tau_i \leq \tau_i^+, i \in \{1,2,\dots,n\} \right\} %\label{eq:feasible_set} \end{equation*} % Also, for each $(\v{p}^*,\v{\tau}^*) \in \set{F}$, define the sets of active inequality constraints\footnote{An \emph{active} inequality constraint is a constraint that holds only by equality. For example, the constraint $x \geq 1$ is active for $x = 1$ and \emph{inactive} for $x > 1$.} $\set{A}_p^-(\v{p}^*)$, $\set{A}_p^+(\v{p}^*)$, $\set{A}_\tau^-( \v{\tau}^* )$, $\set{A}_\tau^+( \v{\tau}^* )$ by % \begin{align*} \set{A}_p^-(\v{p}^*) &\triangleq \left\{ i \in \{1,2,\dots,n\} : p^*_i = p_i^- \right\} & \set{A}_\tau^-(\v{\tau}^*) &\triangleq \left\{ i \in \{1,2,\dots,n\} : \tau^*_i = \tau_i^- \right\}\\ \set{A}_p^+(\v{p}^*) &\triangleq \left\{ i \in \{1,2,\dots,n\} : p^*_i = p_i^+ \right\} & \set{A}_\tau^+(\v{\tau}^*) &\triangleq \left\{ i \in \{1,2,\dots,n\} : \tau^*_i = \tau_i^+ \right\} \end{align*} % For any $i \in \{1,2,\dots,n\}$, if $p_i^+=p_i^-$ ($\tau_i^+=\tau_i^-$), then the inequality \aimention{Joseph-Louis Lagrange}Lagrange multipliers $\mu_{i+}$ and $\mu_{i-}$ ($\nu_{i+}$ and $\nu_{i-}$) combine to form an equality \aimention{Joseph-Louis Lagrange}Lagrange multiplier $\mu_{i+} - \mu_{i-} \in \R$ ($\nu_{i+} - \nu_{i-} \in \R$). Therefore, it is clear that all points $(\v{p},\v{\tau}) \in \set{F}$ are regular\footnote{In this context, a \emph{regular} point is a point where all active constraint gradients are linearly independent.}.
Finally, for any point $(\v{p}^*,\v{\tau}^*)$, define the \emph{feasible variations} $\set{V}(\v{p}^*,\v{\tau}^*)$ by % \begin{equation*} \set{V}(\v{p}^*,\v{\tau}^*) \triangleq \left\{ \begin{bmatrix} \delta^p_1\\ \delta^p_2\\ \vdots\\ \delta^p_n\\ \delta^\tau_1\\ \delta^\tau_2\\ \vdots\\ \delta^\tau_n \end{bmatrix} \in \R^{2n} : \begin{aligned} \delta^p_i &= 0,& i &\in \set{A}_p^-(\v{p}^*) \cup \set{A}_p^+(\v{p}^*) ,\\ \delta^\tau_j &= 0,& j &\in \set{A}_\tau^-(\v{\tau}^*) \cup \set{A}_\tau^+(\v{\tau}^*) \end{aligned} \right\} \end{equation*} % We also define the gradient operator $\nabla$ and the \aimention{Ludwig Otto Hesse}Hessian operator $\nabla^2$ by % \begin{align*} \nabla &\triangleq \begin{bmatrix} \frac{\partial}{\partial p_1}, \frac{\partial}{\partial p_2}, \dots, \frac{\partial}{\partial p_n}, \frac{\partial}{\partial \tau_1}, \frac{\partial}{\partial \tau_2}, \dots, \frac{\partial}{\partial \tau_n} \end{bmatrix}^\T & \nabla^2 &\triangleq \nabla \nabla^\T \end{align*} % so that we have the gradient $\nabla L$ and the \aimention{Ludwig Otto Hesse}Hessian $\nabla^2 L$. When these are to be evaluated at a point $(\v{p}^*,\v{\tau}^*) \in \set{F}$ with multipliers $\v{m}^* \in (\R_{\geq0}^n)^4$, we use the notation $\nabla L(\v{p}^*,\v{\tau}^*,\v{m}^*)$ and $\nabla^2 L(\v{p}^*,\v{\tau}^*,\v{m}^*)$, respectively. Because the \aimention{Joseph-Louis Lagrange}Lagrangian has continuous second partial derivatives wherever the \aimention{Ludwig Otto Hesse}Hessian is used, its \aimention{Ludwig Otto Hesse}Hessian matrix will be symmetric% %\footnote{To %say that matrix $\mat{A}$ is symmetric means $\mat{A}^T = \mat{A}$.} . \subsubsection{First-Order Necessary Conditions} Assume that the point $(\v{p}^*,\v{\tau}^*) \in \set{F}$ is a local maximum of the objective function. For convenience, use the notation % \begin{equation} J^* \triangleq J(\v{p}^*,\v{\tau}^*) \quad A^* \triangleq A(\v{p}^*,\v{\tau}^*) \quad D^* \triangleq D(\v{p}^*,\v{\tau}^*) \label{eq:JAD_star_notation} \end{equation} % In order for $J^*$ to be well-defined, it must be assumed that $D^*$ is nonzero\footnote{While $J^*$ is not defined for $D^* = 0$, any case where $A^* > 0$ and $D^* = 0$ is certainly desirable.}. It is necessary that there exist \aimention{Joseph-Louis Lagrange}Lagrange multiplier vectors $\v{m}^* \in (\R_{\geq0}^n)^4$ such that % \begin{equation} \nabla L(\v{p}^*,\v{\tau}^*,\v{m}^*) = 0 \label{eq:first_order_gradient} \end{equation} % and for all $i \in \{1,2,\dots,n\}$, % \begin{subequations} \begin{gather} i \notin \set{A}_p^-(\v{p}^*) \implies \mu^*_{i-} = 0 \qquad \text{ and } \qquad i \in \set{A}_p^-(\v{p}^*) \implies \mu^*_{i-} \geq 0 \label{eq:first_order_multipliers_pminus}\\ i \notin \set{A}_p^+(\v{p}^*) \implies \mu^*_{i+} = 0 \qquad \text{ and } \qquad i \in \set{A}_p^+(\v{p}^*) \implies \mu^*_{i+} \geq 0 \label{eq:first_order_multipliers_pplus}\\ i \notin \set{A}_\tau^-(\v{\tau}^*) \implies \nu^*_{i-} = 0 \qquad \text{ and } \qquad i \in \set{A}_\tau^-(\v{\tau}^*) \implies \nu^*_{i-} \geq 0 \label{eq:first_order_multipliers_tauminus}\\ i \notin \set{A}_\tau^+(\v{\tau}^*) \implies \nu^*_{i+} = 0 \qquad \text{ and } \qquad i \in \set{A}_\tau^+(\v{\tau}^*) \implies \nu^*_{i+} \geq 0 \label{eq:first_order_multipliers_tauplus} \end{gather} \end{subequations} % where $\implies$ denotes logical implication% %\footnote{That is, if $A %\implies B$, then assertion of $B$ is necessary for the assertion of %$A$ and assertion of $A$ is sufficient to conclude assertion of $B$.
If %a statement is both necessary and sufficient for another statement, the %two statements are said to be \emph{logically equivalent}, which is %denoted with $\iff$; that is, $A \iff B$ is the statement that assertion %of $A$ occurs \emph{if and only if} assertion of $B$ also occurs.} . That is, all inequality multipliers are nonnegative; however, multipliers associated with inactive constraints are zero. Take $j \in \{1,2,\dots,n\}$. If $p^-_j = p^+_j$, then $p^*_j = p^-_j = p^+_j$. Similarly, if $\tau^-_j = \tau^+_j$, then $\tau^*_j = \tau^-_j = \tau^+_j$. We avoid these trivial cases by assuming that $p^-_j \neq p^+_j$ and $\tau^-_j \neq \tau^+_j$. Of course, if $\tau^+_j = \infty$, then it is impossible for $\tau^*_j = \tau^+_j$. \paragraph{Preference Probabilities:} First, consider the requirements on the preference probabilities. \longref{eq:first_order_gradient} requires that % \begin{equation*} \frac { D^* a_j(\tau^*_j) - A^* d_j(\tau^*_j) } { (D^*)^2 } = \mu_{j+}^* - \mu_{j-}^* \end{equation*} % There are three cases of interest. % \begin{subequations} \begin{enumerate}[{Special Case }1:] \item[$p^*_j \in (p^-_j,p^+_j)$:] By \longrefs{eq:first_order_multipliers_pminus} and \shortref{eq:first_order_multipliers_pplus}, $\mu_{j-}^* = \mu_{j+}^* = 0$. Therefore, % \begin{equation} D^* a_j(\tau^*_j) = A^* d_j(\tau^*_j) \label{eq:pj_first_order_unconst} \end{equation} \item[$p^*_j = p^-_j$:] By \longref{eq:first_order_multipliers_pminus}, $\mu_{j-}^* \geq 0$ and $\mu_{j+}^* = 0$. Therefore, % \begin{equation} D^* a_j(\tau^*_j) \leq A^* d_j(\tau^*_j) \label{eq:pj_first_order_minconst} \end{equation} \item[$p^*_j = p^+_j$:] By \longref{eq:first_order_multipliers_pplus}, $\mu_{j-}^* = 0$ and $\mu_{j+}^* \geq 0$. Therefore, % \begin{equation} D^* a_j(\tau^*_j) \geq A^* d_j(\tau^*_j) \label{eq:pj_first_order_maxconst} \end{equation} \end{enumerate} \end{subequations} % For the minimum constraint to be active, the partial derivative of $J$ at the constraint must be nonpositive. Similarly, for the maximum constraint to be active, the partial derivative of $J$ at the constraint must be nonnegative. Otherwise, the partial derivative of $J$ must be zero. Additionally, if the minimum and maximum constraints are equal, there is no restriction on the partial derivative of $J$ at that point. All of these conditions are intuitive and can be explained graphically. \paragraph{Processing Times:} Next, consider the requirements on the processing times. \longref{eq:first_order_gradient} requires that % \begin{equation*} \frac { D^* p^*_j a_j'(\tau^*_j) - A^* p^*_j d_j'(\tau^*_j) } { (D^*)^2 } = \nu_{j+}^* - \nu_{j-}^* \end{equation*} % There are three cases of interest. % \begin{subequations} \begin{enumerate}[{Special Case }1:] \item[$\tau^*_j \in (\tau^-_j,\tau^+_j)$:] By \longrefs{eq:first_order_multipliers_tauminus} and \shortref{eq:first_order_multipliers_tauplus}, $\nu_{j-}^* = \nu_{j+}^* = 0$. Therefore, % \begin{equation} D^* p^*_j a_j'(\tau^*_j) = A^* p^*_j d_j'(\tau^*_j) \label{eq:tauj_first_order_unconst} \end{equation} \item[$\tau^*_j = \tau^-_j$:] By \longref{eq:first_order_multipliers_tauminus}, $\nu_{j-}^* \geq 0$ and $\nu_{j+}^* = 0$. Therefore, % \begin{equation} D^* p^*_j a_j'(\tau^*_j) \leq A^* p^*_j d_j'(\tau^*_j) \label{eq:tauj_first_order_minconst} \end{equation} \item[$\tau^*_j = \tau^+_j$:] By \longref{eq:first_order_multipliers_tauplus}, $\nu_{j-}^* = 0$ and $\nu_{j+}^* \geq 0$.
Therefore, % \begin{equation} D^* p^*_j a_j'(\tau^*_j) \geq A^* p^*_j d_j'(\tau^*_j) \label{eq:tauj_first_order_maxconst} \end{equation} \end{enumerate} \end{subequations} % Clearly, the same interpretation applies here as applied for the requirements on optimal preference probabilities. \subsubsection{Second-Order Necessary Conditions} Once more, assume that the point $(\v{p}^*,\v{\tau}^*) \in \set{F}$ is a local maximum of the objective function and use the notation in \longref{eq:JAD_star_notation}. We also use the notation % \begin{equation*} J^*_{xy} \triangleq \left. \frac{\partial^2 J}{ \partial x \partial y } \right|_{(\v{p},\v{\tau})=(\v{p}^*,\v{\tau}^*)} \quad A^*_{xy} \triangleq \left. \frac{\partial^2 A}{ \partial x \partial y } \right|_{(\v{p},\v{\tau})=(\v{p}^*,\v{\tau}^*)} \quad D^*_{xy} \triangleq \left. \frac{\partial^2 D}{ \partial x \partial y } \right|_{(\v{p},\v{\tau})=(\v{p}^*,\v{\tau}^*)} \end{equation*} % Again, $D^*$ must be assumed to be nonzero. We also assume that the functions $a_i$ and $d_i$ are twice continuously differentiable\footnote{That is, the derivatives at each point in their domain are themselves continuously differentiable.} functions for all $i \in \{1,2,\dots,n\}$. It is necessary that there exist \aimention{Joseph-Louis Lagrange}Lagrange multiplier vectors $\v{m}^* \in (\R_{\geq0}^n)^4$ such that the first-order necessary conditions hold and % \begin{equation} \v{\delta}^\T \nabla^2 L(\v{p}^*,\v{\tau}^*,\v{m}^*) \v{\delta} \geq 0 \quad \text{ for all } \quad \v{\delta} \in \set{V}(\v{p}^*,\v{\tau}^*) \setdiff \{0\} \label{eq:second_order_hessian} \end{equation} % That is, at the point $(\v{p}^*,\v{\tau}^*)$, the \aimention{Ludwig Otto Hesse}Hessian of the \aimention{Joseph-Louis Lagrange}Lagrangian must be positive semidefinite over the set of feasible variations at that point. The \aimention{Ludwig Otto Hesse}Hessian $\nabla^2 L(\v{p}^*,\v{\tau}^*,\v{m}^*)$ does not depend upon the multipliers $\v{m}^*$, and so it is completely characterized by $J^*_{p_jp_k}$, $J^*_{\tau_j\tau_k}$, and $J^*_{p_j\tau_k}$ for all $j,k \in \{1,2,\dots,n\}$. Therefore, take $j,k \in \{1,2,\dots,n\}$. \paragraph{Elimination of Active Preference Probability Constraints:} First, assume that $j \in \set{A}_p^-(\v{p}^*) \cup \set{A}_p^+(\v{p}^*)$. That is, assume that an inequality constraint on the $j$\th{}\ preference probability is active (\ie, $p^*_j = p^-_j$ or $p^*_j = p^+_j$). In this case, for all $\v{\delta} \in \set{V}(\v{p}^*,\v{\tau}^*)$, $\delta^p_j = 0$. Therefore, because the feasible variations along active constraint directions are zero, $J^*_{p_jp_k}$ and $J^*_{p_j\tau_k}$ will have no impact on \longref{eq:second_order_hessian}. \paragraph{Elimination of Active Processing Time Constraints:} Next, instead assume that $j \in \set{A}_\tau^-(\v{\tau}^*) \cup \set{A}_\tau^+(\v{\tau}^*)$. That is, assume that an inequality constraint on the $j$\th{}\ processing time is active (\ie, $\tau^*_j = \tau^-_j$ or $\tau^*_j = \tau^+_j$). In this case, for all $\v{\delta} \in \set{V}(\v{p}^*,\v{\tau}^*)$, $\delta^\tau_j = 0$. Therefore, because the feasible variations along active constraint directions are zero, $J^*_{p_k\tau_j}$ and $J^*_{\tau_j\tau_k}$ will have no impact on \longref{eq:second_order_hessian}. 
\paragraph{Elimination of Off-Diagonal Terms:} By the reasoning about active constraints above, we can focus on coordinates of $(\v{p}^*,\v{\tau}^*)$ where constraints are inactive, and so we assume \longrefs{eq:pj_first_order_unconst} and \shortref{eq:tauj_first_order_unconst}. Therefore, % \begin{equation} J^*_{p_jp_k} = \frac { D^* A^*_{p_jp_k} - A^* D^*_{p_jp_k} } { (D^*)^2 } \quad \text{ and } \quad J^*_{p_j\tau_k} = \frac { D^* A^*_{p_j\tau_k} - A^* D^*_{p_j\tau_k} } { (D^*)^2 } \label{eq:second_derivative_J_pp_ptau} \end{equation} % and % \begin{equation} J^*_{\tau_j\tau_k} = \frac { D^* A^*_{\tau_j\tau_k} - A^* D^*_{\tau_j\tau_k} } { (D^*)^2 } \label{eq:second_derivative_J_tautau} \end{equation} % For the moment, we focus on the off-diagonal terms of the \aimention{Ludwig Otto Hesse}Hessian that correspond to inactive constraints. First, assume that $j \neq k$. Clearly, % \begin{equation*} A^*_{p_jp_k} = D^*_{p_jp_k} = A^*_{\tau_j\tau_k} = D^*_{\tau_j\tau_k} = A^*_{p_j\tau_k} = D^*_{p_j\tau_k} = 0 \end{equation*} % Thus, % \begin{equation*} J^*_{p_jp_k} = J^*_{\tau_j\tau_k} = J^*_{p_j\tau_k} = 0 \end{equation*} % Now we focus on the remaining off-diagonal terms. That is, take $j=k$. So, % \begin{equation*} J^*_{p_j\tau_j} = \frac { D^* a'_j(\tau^*_j) - A^* d'_j(\tau^*_j) } { (D^*)^2 } \end{equation*} % Recall that we are taking $j \notin \set{A}_p^-(\v{p}^*) \cup \set{A}_p^+(\v{p}^*)$ (\ie, the $j\th$ preference probability is unconstrained, so $p^*_j \in (p^-_j,p^+_j)$). Therefore, $p^*_j > 0$ and so \longref{eq:tauj_first_order_unconst} implies that $a'_j(\tau^*_j) = J^* d'_j(\tau^*_j)$. However, $D^* J^* = A^*$. Thus, by substitution, it is clear that $J^*_{p_j\tau_j} = 0$. Hence, the mixed terms $J^*_{p_j\tau_k}$ have no impact on \longref{eq:second_order_hessian} for any $j,k \in \{1,2,\dots,n\}$, and neither do $J^*_{p_jp_k}$ and $J^*_{\tau_j\tau_k}$ whenever $j \neq k$. \paragraph{Impact of Inactive Preference Probability Diagonals:} Next, we consider the diagonal terms of the \aimention{Ludwig Otto Hesse}Hessian that correspond to inactive preference probabilities. That is, assume that $j=k$ and $p^*_j \in (p_j^-,p_j^+)$. The condition in \longref{eq:second_order_hessian} requires that $J^*_{p_jp_j} \leq 0$. By \longref{eq:second_derivative_J_pp_ptau}, this means that % \begin{equation} D^* \times 0 \leq A^* \times 0 \label{eq:pj_second_order_necessary} \end{equation} % which is always true (\ie, it is always the case that $0 \leq 0$ with equality). Therefore, this necessary condition adds no more information than \longref{eq:pj_first_order_unconst}. \paragraph{Definiteness from Inactive Processing Time Diagonals:} By the reasoning above, the only second partial derivative that can prevent \longref{eq:second_order_hessian} from being true is $J^*_{\tau_j\tau_j}$ where $\tau^*_j \in (\tau_j^-,\tau_j^+)$. That is, the condition in \longref{eq:second_order_hessian} requires that $J^*_{\tau_j\tau_j} \leq 0$. By \longref{eq:second_derivative_J_tautau}, this means that % \begin{equation} D^* p^*_j a''_j(\tau^*_j) \leq A^* p^*_j d''_j(\tau^*_j) \label{eq:tauj_second_order_necessary} \end{equation} % If the constraint parameter $p^-_j = 0$ and the $j\th$ preference probability constraint is active (\ie, $p^*_j = 0$), then this condition is always true by equality. Otherwise, if $p^*_j > 0$, it must be that $D^* a''_j(\tau^*_j) \leq A^* d''_j(\tau^*_j)$.
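These interior conditions are easy to check numerically at a candidate point. The following is a minimal sketch (assuming NumPy is available) in which the task functions and the candidate point are purely hypothetical; it approximates the stationarity residual $D^* a_j'(\tau^*_j) - A^* d_j'(\tau^*_j)$ and the curvature quantity $D^* a_j''(\tau^*_j) - A^* d_j''(\tau^*_j)$ by finite differences.
%
\begin{verbatim}
# Sketch: finite-difference check of the interior first- and second-order
# conditions for a single task type j; all values are hypothetical.
import numpy as np

a_env, d_env = -0.1, 1.0                 # hypothetical environment terms a and d
a_j = lambda t: 2.0 * (1.0 - np.exp(-t)) # hypothetical a_j(tau_j)
d_j = lambda t: t                        # hypothetical d_j(tau_j)

def d1(f, t, h=1e-5):                    # central first difference
    return (f(t + h) - f(t - h)) / (2.0 * h)

def d2(f, t, h=1e-4):                    # central second difference
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2

p_star, tau_star = 1.0, 1.2              # hypothetical candidate (interior tau)
A_star = a_env + p_star * a_j(tau_star)
D_star = d_env + p_star * d_j(tau_star)

stationarity = D_star * d1(a_j, tau_star) - A_star * d1(d_j, tau_star)
curvature = D_star * d2(a_j, tau_star) - A_star * d2(d_j, tau_star)
print("stationarity residual (zero at an interior optimum):", stationarity)
print("curvature term (nonpositive at a local maximum):", curvature)
\end{verbatim}
%
Such a check only tests necessity; the sufficiency conditions considered next additionally require strict inequalities.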
\subsubsection{Second-Order Sufficiency Conditions} Now take an arbitrary feasible point $(\v{p}^*,\v{\tau}^*) \in \set{F}$ that may be a maximum of the objective function. If there exist \aimention{Joseph-Louis Lagrange}Lagrange multiplier vectors $\v{m}^* \in (\R_{\geq0}^n)^4$ such that \longref{eq:first_order_gradient} holds and \longrefs{eq:second_order_hessian} and \shortref{eq:first_order_multipliers_pminus}--% \shortref{eq:first_order_multipliers_tauplus} hold with \emph{strict} inequality % %\footnote{That is, replace each $\leq$ with $<$ and each %$\geq$ with $>$.} , then the point must be a local maximum of the objective function. This is effectively a statement of the local concavity % %\footnote{For $m \in \N$, to say that $\set{C} \subseteq \R^m$ %is a \emph{convex set} means that for any $x,y \in \set{C}$ and $t \in %[0,1]$, $tx + (1-t)y \in \set{C}$. For $m \in \N$ and convex set %$\set{C} \subseteq \R^m$, to say that $f: \set{C} \mapsto \R$ is a %\emph{concave function} means that for any $x,y \in \set{C}$ and $t \in %[0,1]$ with $x \neq y$, $f(tx + (1-t)y) \geq tf(x) + (1-t)f(y)$, where %\emph{strict} concavity means that $\geq$ can be replaced with $>$. This %is \emph{global concavity}. Any element of $\set{C}$ that is a local %maximum of concave function $f$ must also be a global maximum of $f$, %and if the concavity is strict, then this point will be the unique %global maximum of $f$. \emph{Local concavity} is the weaker analogous %concept when a function is restricted to a convex \emph{local %neighborhood} of a point. That is, this sufficiency condition %establishes that the restriction of the function to a certain local %neighborhood of a point will be strictly concave.} of the objective function at the point $(\v{p}^*,\v{\tau}^*)$. \paragraph{The Extreme-Preference Rule:} \index{extreme-preference rule (EPR)|(indexdef}In order for \longref{eq:second_order_hessian} to hold with strict inequality, \longrefs{eq:pj_second_order_necessary} and \shortref{eq:tauj_second_order_necessary} must both hold with strict inequality. However, this is impossible for \longref{eq:pj_second_order_necessary}. Therefore, if there is some $i \in \{1,2,\dots,n\}$ with $p^*_i \in (p^-_i,p^+_i)$, these conditions cannot be used to show that the point is a local maximum\footnote{In other words, strict concavity cannot hold at such a point.}. Our goal is to design strategies guaranteed to be local maxima, so these strategies will have $p^*_i = p^-_i$ or $p^*_i = p^+_i$ for all $i \in \{1,2,\dots,n\}$. We call this the \emph{\acro[extreme-preference rule~(EPR)]{EPR}{\index{extreme-preference rule (EPR)"|indexglo}extreme-preference rule}}. \Citet{SK86} assume that $(p^-_i,p^+_i)=(0,1)$ for all $i \in \{1,2,\dots,n\}$, so they call this the \index{zero-one rule|indexdef}\emph{zero-one rule}. This rule is part of a sufficiency condition; it is not at all necessary.\index{extreme-preference rule (EPR)|)indexdef}\index{EPR|see{extreme-preference rule}} \paragraph{Problems with Sufficiency at Zero Preference Probability:} Assume there exists some $j \in \{1,2,\dots,n\}$ such that $p^*_j = 0$. \longrefs{eq:tauj_first_order_minconst}, \shortref{eq:tauj_first_order_maxconst}, and \shortref{eq:tauj_second_order_necessary} cannot all hold with strict inequality for this $\v{p}^*$. In other words, strict concavity is impossible at this point because the objective function takes the same value for any choice of $\tau^*_j$.
However, it can be shown that if these all hold when $p^*_j$ is replaced with some arbitrarily small $\varepsilon$ with $0 < \varepsilon < p^+_j$, then the point $(\v{p}^*,\v{\tau}^*)$ is a local maximum of the objective function. In other words, even if the function is not strictly locally concave, under these $\varepsilon$-conditions it is certainly locally concave. \subsection{Solutions to Special Cases} Solutions to this generalized optimization problem can be difficult to find% %\footnote{The graphical interpretations in %\longref{sec:alternate_optimization_objectives} may provide valuable %intuition or inspire numerical methods to assist in finding optimal %solutions.} . In fact, mere existence of solutions cannot be taken for granted. However, there are two special cases that guarantee existence (but not uniqueness) of solutions and can be equipped with simple methods of finding one of those solutions. \subsubsection{Constant Disadvantage Case} \index{rational objective function!optimal solution!constant disadvantage|(} This case not only serves as an important example but is also useful in practice. It is our goal to construct a strategy $(\v{p}^*,\v{\tau}^*) \in \set{F}$ that meets all sufficiency conditions to be called a local maximum point of the objective function. This point will be a \emph{global} maximum if the objective function is concave. The point will be the unique global maximum if the objective function is strictly concave. Assume that for all $j,k \in \{1,2,\dots,n\}$, % \begin{enumerate}[(i)] \item $p_j^- = 0$ \label{item:constant_disadv_exclusion} \item $a_j$ and $d_j$ are twice continuously differentiable functions \label{item:constant_disadv_differentiability} \item for all $\tau_j \in \R_{\geq0} \cap [\tau_j^-,\tau_j^+]$ and all $\tau_k \in \R_{\geq0} \cap [\tau_k^-,\tau_k^+]$, \begin{compactitem} \item $d_j(\tau_j) d \geq 0$ \item $d_j(\tau_j) d_k(\tau_k) > 0$ \end{compactitem} \label{item:constant_disadv_same_sign} \item either $d \neq 0$ or there exists some $i \in \{1,2,\dots,n\}$ such that $p^*_i > 0$ \label{item:constant_disadv_nonzero_Dstar} \item $d_j'(\tau_j) = 0$ for all $\tau_j \in (\tau_j^-,\tau_j^+)$ \label{item:constant_disadv} \item if $\tau_j^- \neq \tau_j^+$, it is the case that \begin{compactenum}% %[({\ref*{item:constant_disadv_convexity}}.a)] [(a)] \item $d_j(\tau_j^-) a'_j(\tau_j^-) < 0$ or \label{item:constant_disadv_conv_left} \item $d_j(\tau_j^+) a'_j(\tau_j^+) > 0$ or \label{item:constant_disadv_conv_right} \item $d_j(\tau_j) a'_j(\tau_j) = 0$ with $d_j(\tau_j) a_j''(\tau_j) < 0$ for some $\tau_j \in (\tau_j^-,\tau_j^+)$ \label{item:constant_disadv_conv_middle} \end{compactenum} \label{item:constant_disadv_convexity} \end{enumerate} % If these assumptions do not hold, for each $j \in \{1,2,\dots,n\}$, $\tau^-_j$ and $\tau^+_j$ may be adjusted to surround a region where they do hold. These assumptions lead to the following for all $j \in \{1,2,\dots,n\}$. % \begin{description} \item\emph{Well-Defined Objective Function:} By (\shortref{item:constant_disadv_same_sign}) and (\shortref{item:constant_disadv_nonzero_Dstar}), $D^* \neq 0$ and $D^* d_j(\tau_j) > 0$. This implies that both $J^*$ and $a_j(\tau_j)/d_j(\tau_j)$ are well-defined for all choices of $\tau_j \in [\tau_j^-,\tau_j^+]$.
\item\emph{Maximum Type-Advantage-to-Type-Disadvantage Ratio Exists:} By (\shortref{item:constant_disadv_convexity}), there exists some $\tau_j^* \in [\tau_j^-,\tau_j^+]$ such that there is some $\delta_j \in \R_{>0}$ where $a_j(\tau_j)/d_j(\tau_j) \leq a_j(\tau_j^*)/d_j(\tau_j^*)$ for all $\tau_j \in (\tau_j^* - \delta_j,\tau_j^* + \delta_j) \cap [\tau_j^-,\tau_j^+]$. That is, the $a_j/d_j$ function has a local maximum on its domain. \item\emph{Parameterized Processing Times:} If $\tau_j^- = \tau_j^+$, then (\shortref{item:constant_disadv}) and (\shortref{item:constant_disadv_convexity}) are trivially met. This case is useful when processing times are parameters of the system and not decision variables. \Citet{SK86} use the name \index{task-type choice problem}\emph{prey model} for the case where no processing times are free decision variables (\ie, tasks are whole items of prey that come lumped with a rigid (average) processing time). \end{description} % If $\tau^-_j=\tau^+_j$, let $\tau^*_j=\tau^-_j$. Otherwise, let $\tau_j^*$ be a local maximum of $a_j/d_j$ that is described by (\shortref{item:constant_disadv_convexity}). Next, assume that the types are indexed so that % \begin{equation} \frac{a_1(\tau_1^*)}{d_1(\tau_1^*)} > \frac{a_2(\tau_2^*)}{d_2(\tau_2^*)} > \cdots > \frac{a_{n-1}(\tau_{n-1}^*)}{d_{n-1}(\tau_{n-1}^*)} > \frac{a_n(\tau_n^*)}{d_n(\tau_n^*)} \label{eq:generalized_prey_ordering} \end{equation} % Assume that for all $k \in \{0,1,2,\dots,n-1\}$, % \begin{equation*} \frac% { a + \sum\limits_{i=1}^{k} p_i^+ a_i(\tau_i^*)}% { d + \sum\limits_{i=1}^{k} p_i^+ d_i(\tau_i^*)} \neq \frac{a_{k+1}(\tau_{k+1}^*)}{d_{k+1}(\tau_{k+1}^*)} \end{equation*} % Finally, define $k^*$ by % \begin{equation*} k^* \triangleq \min\left(\left\{ k \in \{0,1,2,\dots,n-1\} : \frac% { a + \sum\limits_{i=1}^{k} p_i^+ a_i(\tau_i^*)}% { d + \sum\limits_{i=1}^{k} p_i^+ d_i(\tau_i^*)} > \frac{a_{k+1}(\tau_{k+1}^*)}{d_{k+1}(\tau_{k+1}^*)} \right\} \cup \{n\} \right) \end{equation*} % and let % \begin{equation*} p_j^* = \begin{cases} p_j^+ &\text{if } j \leq k^*\\ 0 &\text{if } j > k^* \end{cases} \end{equation*} % for all $j \in \{1,2,\dots,n\}$. Primarily because of assumption (\shortref{item:constant_disadv_convexity}) and the results that $D^* d_j(\tau_j^*) > 0$ and $d_j'(\tau_j^*) = d_j''(\tau_j^*) = 0$ for all $j \in \{1,2,\dots,n\}$, it is easy to show that $(\v{p}^*,\v{\tau}^*)$ meets the conditions described in \longref{sec:optimization_procedure} that guarantee it is a local maximum of the objective function\footnote{Because $p^-_j = 0$ for all $j \in \{1,2,\dots,n\}$, this statement requires the zero preference probability modification described at the end of \longref{sec:optimization_procedure}.}.% \index{rational objective function!optimal solution!constant disadvantage|)} \subsubsection{Decreasing Advantage-to-Disadvantage Ratio}% \index{rational objective function!optimal solution!decreasing advantage-to-disadvantage|(} Again, it is our goal to construct a strategy $(\v{p}^*,\v{\tau}^*) \in \set{F}$ that meets all sufficiency conditions to be called a local maximum point of the objective function. However, here we assume that the disadvantage functions are not constant with respect to processing time. This is a generalized version of the \index{combined task-type and processing-length choice problem}\emph{combined prey and patch model} discussed by \citet{SK86}, and so it illustrates the \index{marginal value theorem (MVT)}\ac{MVT} concept \citep{Cha76,Cha73}.
However, \citeauthor{SK86} make different assumptions than we do because their analysis depends on search costs being nil. Assume that for all $j,k \in \{1,2,\dots,n\}$, % \begin{enumerate}[(i)] \item $p_j^- = 0$ \label{item:decr_ratio_exclusion} \item $a_j$ and $d_j$ are twice continuously differentiable functions \label{item:decr_ratio_differentiability} \item for all $\tau_j \in \R_{\geq0}\cap[\tau_j^-,\tau_j^+]$ and all $\tau_k \in \R_{\geq0}\cap[\tau_k^-,\tau_k^+]$, \begin{compactitem} \item $d_j(\tau_j) d \geq 0$ \item $d_j(\tau_j) d_k(\tau_k) > 0$ \end{compactitem} \label{item:decr_ratio_same_sign} \item either $d \neq 0$ or there exists some $i \in \{1,2,\dots,n\}$ such that $p^*_i > 0$ \label{item:decr_ratio_nonzero_Dstar} \item $d_j(\tau_j) d_j'(\tau_j) > 0$ for all $\tau_j \in (\tau_j^-,\tau_j^+)$ \label{item:decr_ratio_same_sign_derivative} \item $(a_j(\tau_j)/d_j(\tau_j))' < 0$ for all $\tau_j \in (\tau_j^-,\tau_j^+)$ \label{item:decr_ratio} \item $(a_j'(\tau_j)/d_j'(\tau_j))' < 0$ for all $\tau_j \in (\tau_j^-,\tau_j^+)$ \label{item:decr_ratio_second} \end{enumerate} % If these assumptions do not hold, for each $j \in \{1,2,\dots,n\}$, $\tau^-_j$ and $\tau^+_j$ may be adjusted to surround a region where they do. These assumptions lead to the following for all $j \in \{1,2,\dots,n\}$. % \begin{description} \item\emph{Well-Defined Objective Function:} By (\shortref{item:decr_ratio_same_sign}) and (\shortref{item:decr_ratio_nonzero_Dstar}), $D^* \neq 0$ and $D^* d_j(\tau_j) > 0$. This implies that $J^*$, $a_j(\tau_j)/d_j(\tau_j)$, and $a_j'(\tau_j)/d_j'(\tau_j)$ are all well-defined for all choices of $\tau_j \in [\tau_j^-,\tau_j^+]$. \item\emph{Maximum Type-Advantage-to-Type-Disadvantage Ratio Exists:} By (\shortref{item:decr_ratio}), $\tau_j^-$ is such that $a_j(\tau_j)/d_j(\tau_j) \leq a_j(\tau_j^-)/d_j(\tau_j^-)$ for all $\tau_j \in [\tau_j^-,\tau_j^+]$. That is, the $a_j(\tau_j)/d_j(\tau_j)$ function achieves its maximum at $\tau_j = \tau_j^-$. \item\emph{Ordering of Ratios:} By (\shortref{item:decr_ratio}) and (\shortref{item:decr_ratio_same_sign_derivative}), $a_k(\tau_k)/d_k(\tau_k) > a_k'(\tau_k)/d_k'(\tau_k)$ for all $\tau_k \in (\tau_k^-,\tau_k^+)$. \item\emph{Parameterized Processing Times:} If $\tau_j^- = \tau_j^+$, then (\shortref{item:decr_ratio_same_sign_derivative})--% (\shortref{item:decr_ratio_second}) are trivially met. \end{description} % Assume the types are indexed so that % \begin{equation} \frac{a_1(\tau_1^-)}{d_1(\tau_1^-)} > \frac{a_2(\tau_2^-)}{d_2(\tau_2^-)} > \cdots > \frac{a_{n-1}(\tau_{n-1}^-)}{d_{n-1}(\tau_{n-1}^-)} > \frac{a_n(\tau_n^-)}{d_n(\tau_n^-)} \label{eq:generalized_prey_ordering_2} \end{equation} % In other words, as in the constant disadvantage case, order the task types by decreasing maximum advantage-to-disadvantage ratio. This is the same ordering used by \citeauthor{SK86}; however, because we have assumed that this ratio is strictly decreasing, the ratio at the minimum processing time will always be the maximum ratio.
Next, for all $k \in \{0,1,\dots,n\}$ and all $j \in \{1,2,\dots,n\}$, define $\tau_j^k$ so that % \begin{align*} \frac{ a'_j(\tau_j^k) }{ d'_j(\tau_j^k) } &> \frac% { a + \sum\limits_{i=1}^{k} p_i^+ a_i(\tau_i^k)}% { d + \sum\limits_{i=1}^{k} p_i^+ d_i(\tau_i^k)} \quad \text{ for } \tau_j^k = \tau_j^+\\ % \intertext{or} % \frac{ a'_j(\tau_j^k) }{ d'_j(\tau_j^k) } &< \frac% { a + \sum\limits_{i=1}^{k} p_i^+ a_i(\tau_i^k)}% { d + \sum\limits_{i=1}^{k} p_i^+ d_i(\tau_i^k)} \quad \text{ for } \tau_j^k = \tau_j^-\\ % \intertext{or} % \frac{ a'_j(\tau_j^k) }{ d'_j(\tau_j^k) } &= \frac% { a + \sum\limits_{i=1}^{k} p_i^+ a_i(\tau_i^k)}% { d + \sum\limits_{i=1}^{k} p_i^+ d_i(\tau_i^k)} \quad \text{ for } \tau_j^k \in (\tau_j^-,\tau_j^+) \end{align*} % By (\shortref{item:decr_ratio_second}), this is always possible. Unfortunately, for each $k \in \{0,1,\dots,n\}$, all elements of the set $\{ \tau_j^k : j \in \{1,2,\dots,k\} \}$ must be determined simultaneously. This is different from the constant disadvantage case. That is, because $d_j'(\tau_j) \neq 0$ for all $\tau_j \in (\tau_j^-,\tau_j^+)$, there is coupling among the optimal choices of processing time. It must also be assumed that % \begin{equation*} \frac% { a + \sum\limits_{i=1}^{k} p_i^+ a_i(\tau_i^k)}% { d + \sum\limits_{i=1}^{k} p_i^+ d_i(\tau_i^k)} \neq \frac{a_{k+1}(\tau_{k+1}^-)}{d_{k+1}(\tau_{k+1}^-)} \end{equation*} % for all $k \in \{0,1,2,\dots,n-1\}$. Now, define $k^*$ by % \begin{equation*} k^* \triangleq \min\left(\left\{ k \in \{0,1,2,\dots,n-1\} : \frac% { a + \sum\limits_{i=1}^{k} p_i^+ a_i(\tau_i^k)}% { d + \sum\limits_{i=1}^{k} p_i^+ d_i(\tau_i^k)} > \frac{a_{k+1}(\tau_{k+1}^-)}{d_{k+1}(\tau_{k+1}^-)} \right\} \cup \{n\} \right) \end{equation*} % Finally, let % \begin{equation*} \tau_j^* = \tau_j^{k^*} \quad \text{ and } \quad p_j^* = \begin{cases} p_j^+ &\text{if } j \leq k^*\\ 0 &\text{if } j > k^* \end{cases} \end{equation*} % for all $j \in \{1,2,\dots,n\}$. Primarily because of assumptions (\shortref{item:decr_ratio}) and (\shortref{item:decr_ratio_second}), it is easy to show that $(\v{p}^*,\v{\tau}^*)$ meets the conditions described in \longref{sec:optimization_procedure} that guarantee it is a local maximum of the objective function\footnote{Because $p^-_j = 0$ for all $j \in \{1,2,\dots,n\}$, this statement requires the zero preference probability modification described at the end of \longref{sec:optimization_procedure}.}.% \index{rational objective function!optimal solution!decreasing advantage-to-disadvantage|)}% \index{rational objective function|)} \section{Optimization of Specific Objective Functions} \label{sec:optimization_specific} \index{examples!analytical solutions|(indexdef} The optimization results given in \longref{sec:optimization_rational} may be applied to many of the functions introduced in \longref{ch:optimization_objectives}. We consider three of them here. Unfortunately, the reward-to-variability and reward-to-variance optimization functions do not fit the form of \longref{sec:optimization_rational} because the central moments used to define them involve a great deal of cross-coupling among task-type parameters and decision variables. Therefore, we do not consider solutions to these optimization functions. We also do not provide solutions for the constrained optimization functions; however, we have shown other ways to implement success thresholds that can be handled by the methods in \longref{sec:optimization_rational}.
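Because each of the objective functions considered below reduces to the generalized form, the constant-disadvantage selection rule of \longref{sec:optimization_rational} applies directly once processing times are fixed, provided the remaining assumptions of that case are met. The following is a minimal sketch of that selection rule in which every number is a hypothetical placeholder; it assumes the types are already indexed by decreasing $a_i(\tau_i^*)/d_i(\tau_i^*)$.
%
\begin{verbatim}
# Sketch of the constant-disadvantage selection rule: with types ordered
# by a_i(tau_i*)/d_i(tau_i*), include types while the running ratio
# (a + sum p_i^+ a_i)/(d + sum p_i^+ d_i) stays below the next type's
# ratio.  All numbers below are hypothetical.
a_env, d_env = -0.1, 1.0
p_max = [1.0, 1.0, 1.0]                    # p_i^+
a_star = [2.0, 1.2, 0.3]                   # a_i(tau_i^*), ordered by ratio
d_star = [1.0, 1.0, 1.0]                   # d_i(tau_i^*) (constant in tau_i)

def running_ratio(k):
    A = a_env + sum(p_max[i] * a_star[i] for i in range(k))
    D = d_env + sum(p_max[i] * d_star[i] for i in range(k))
    return A / D

n = len(a_star)
k_star = n
for k in range(n):
    if running_ratio(k) > a_star[k] / d_star[k]:
        k_star = k                         # first k where adding type k+1 hurts
        break

p_opt = [p_max[j] if j < k_star else 0.0 for j in range(n)]
print(k_star, p_opt, running_ratio(k_star))
\end{verbatim}
%
The same skeleton extends to the decreasing-ratio case, except that the processing times $\tau_j^k$ must be solved for simultaneously at each candidate $k$.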
\subsection{Maximization of Rate of Excess Net Point Gain}% \label{sec:max_renpg}% \index{examples!analytical solutions!rate of excess net gain|(indexdef}% \index{rate of excess net gain!analytical optimization|(} Consider the function $(\E(G_1)-G^T/N^p)/\E(T_1)$ where $G^T \in \R$ is a net gain success threshold. Using the statistics derived in \longref{ch:model}, this can be expressed by % \begin{equation*} \frac{\E(G_1) - \frac{G^T}{N^p}}{\E(T_1)} = \frac% {\overline{g^p}-\overline{c^p}-\frac{c^s}{\lambda^p} -\frac{G^T}{N^p}} {\overline{\tau^p}+\frac{1}{\lambda^p}} = \frac% {-c^s + \sum\limits_{i=1}^n p_i \lambda_i \left( g_i(\tau_i) - c_i \tau_i - \frac{G^T}{N^p} \right)} {1 + \sum\limits_{i=1}^n p_i \lambda_i \tau_i} \end{equation*} % Define % \begin{align*} a &\triangleq -c^s & a_j(\tau_j) &\triangleq \lambda_j \left( g_j(\tau_j) - c_j \tau_j - \frac{G^T}{N^p}\right)\\ d &\triangleq 1 & d_j(\tau_j) &\triangleq \lambda_j \tau_j \end{align*} % Using these definitions, $(\E(G_1)-G^T/N^p)/\E(T_1)$ fits the form studied in \longref{sec:optimization_rational}. \index{rate of excess net gain!analytical optimization|)}% \index{examples!analytical solutions!rate of excess net gain|)indexdef} \subsection{Maximization of Discounted Net Gain} \index{examples!analytical solutions!discounted net gain|(indexdef} Consider the function $\E(G_1) - w \E(T_1)$ where $w \in \R$. Using the statistics derived in \longref{ch:model}, this can be expressed by % \begin{equation*} \E(G_1) - w \E(T_1) = \overline{g^p}-\overline{c^p}-\frac{c^s}{\lambda^p} -w\overline{\tau^p}-w\frac{1}{\lambda^p} = \frac% { -(c^s+w) + \sum\limits_{i=1}^n p_i \lambda_i ( g_i(\tau_i) - c_i \tau_i - w \tau_i ) }% { \sum\limits_{i=1}^n p_i \lambda_i } \end{equation*} % Define % \begin{align*} a &\triangleq -(c^s+w) & a_j(\tau_j) &\triangleq \lambda_j ( g_j(\tau_j) - c_j \tau_j - w \tau_j )\\ d &\triangleq 0 & d_j(\tau_j) &\triangleq \lambda_j \end{align*} % Using these definitions, clearly $\E(G_1) - w\E(T_1)$ fits the form studied in \longref{sec:optimization_rational}. This is a constant disadvantage example. Consider fixing processing times to be parameters so that the excess rate of gain function in \longref{sec:max_renpg} is also a constant disadvantage example. Also take $G^T = 0$. In this case, the resulting rate of net gain function is nearly identical to the one studied in classical \ac{OFT}. In this constant disadvantage context (called the \emph{prey model} by \citet{SK86}), indexing by advantage-to-disadvantage ratio will often lead to the same ordering for both the rate of net gain and the discounted net gain functions. Therefore, if observational justification for the use of rate of point gain as an optimization objective is based entirely on task-type ranking, then discounted net gain is an equally valid optimization objective to consider. \index{examples!analytical solutions!discounted net gain|)indexdef} \subsection{Maximization of Rate of Excess Efficiency} \index{examples!analytical solutions!excess efficiency|(indexdef} Consider the function $(\E(G_1)+\E(C_1)-G_g^T/N^p)/\E(C_1)$ where $G_g^T \in \R$ is a gross gain success threshold. 
Using the statistics derived in \longref{ch:model}, this can be expressed by % \begin{equation*} \frac{\E(G_1)+\E(C_1)-\frac{G_g^T}{N^p}}{\E(C_1)} = \frac% {\overline{g^p}-\frac{G_g^T}{N^p}} {\overline{c^p}+\frac{c^s}{\lambda^p}} = \frac% {\sum\limits_{i=1}^n p_i \lambda_i \left( g_i(\tau_i) - \frac{G_g^T}{N^p} \right)} {c^s + \sum\limits_{i=1}^n p_i \lambda_i c_i \tau_i} \end{equation*} % Define % \begin{align*} a &\triangleq 0 & a_j(\tau_j) &\triangleq \lambda_j \left( g_j(\tau_j) - \frac{G_g^T}{N^p}\right)\\ d &\triangleq c^s & d_j(\tau_j) &\triangleq \lambda_j c_j \tau_j \end{align*} % Using these definitions, $(\E(G_1)+\E(C_1)-G_g^T/N^p)/\E(C_1)$ fits the form studied in \longref{sec:optimization_rational}. There are two major criticisms of optimizing efficiency \citep[p.~9]{SK86}. First, it ignores the impact of time. Second, it equates behaviors that bring small gains at small costs with behaviors that bring large gains at large costs. Together, an efficiency optimizer can spend large amounts of time for a small gain that is insufficient for survival. However, costs in our model are affinely related to time, so cost minimization exerts pressure on time as well. Additionally, efficiency is defined with a success threshold (\ie, excess efficiency), and so all behaviors that have positive efficiency also lead to survival. Therefore, if our model can be used, efficiency maximization may be a viable alternative to rate maximization. In a constant disadvantage context, the efficiency advantage-to-disadvantage indexing will be very similar to the indexing in \longref{sec:max_renpg}, and so evidence for the use of rate maximization may also justify the use of efficiency maximization. \index{examples!analytical solutions!excess efficiency|)indexdef} \index{examples!analytical solutions|)indexdef} \index{solitary agent model!processing-only analysis!optimization|)}% \index{solitary agent model!classical analysis!optimization|)}
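To make the correspondence with \longref{sec:optimization_rational} concrete, the following minimal sketch (with purely hypothetical gain functions, rates, and thresholds) encodes the excess-efficiency objective in the generalized $(a,a_i,d,d_i)$ form and evaluates the resulting $J$ for one candidate strategy.
%
\begin{verbatim}
# Sketch: encoding the excess-efficiency objective in the generalized
# (a, a_i, d, d_i) form and evaluating J = A/D for one candidate
# strategy.  The gain functions g_i and all parameters are hypothetical.
import math

lam = [0.5, 0.2]                     # encounter rates lambda_i
c = [0.1, 0.3]                       # processing cost rates c_i
c_s = 0.05                           # search cost rate c^s
G_T, N_p = 1.0, 50.0                 # gross-gain threshold G_g^T and N^p
g = [lambda t: 3.0 * (1 - math.exp(-t)),       # hypothetical g_1
     lambda t: 2.0 * (1 - math.exp(-0.5 * t))] # hypothetical g_2

a_env = 0.0
d_env = c_s
a_i = [lambda t, i=i: lam[i] * (g[i](t) - G_T / N_p) for i in range(2)]
d_i = [lambda t, i=i: lam[i] * c[i] * t for i in range(2)]

def J(p, tau):
    A = a_env + sum(p[i] * a_i[i](tau[i]) for i in range(2))
    D = d_env + sum(p[i] * d_i[i](tau[i]) for i in range(2))
    return A / D

print(J([1.0, 1.0], [1.5, 2.0]))     # excess efficiency of one candidate
\end{verbatim}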