Index of values


A
alpha [Obandit.AlphaUCBParam]
The $ \alpha $ parameter.
alpha [Obandit.AlphaPhiUCBParam]
The $ \alpha $ parameter.

C
c [Obandit.DecayingEpsilonGreedyParam]
The $ c $ hyperparameter.

D
d [Obandit.DecayingEpsilonGreedyParam]
The $ d $ hyperparameter, a tight lower bound on $ \max_{i=1,\cdots,K} \Delta_i $.
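As a rough illustration of how $ c $, $ d $ and $ K $ can combine, the sketch below computes the exploration rate $ \epsilon_t = \min(1, cK / (d^2 t)) $ used by the decaying epsilon-greedy policy of Auer et al. (2002). That Obandit uses exactly this schedule is an assumption, and `decaying_epsilon` is a hypothetical helper, not part of the library.

```ocaml
(* Hypothetical helper, not part of Obandit.
   Exploration rate at round t >= 1 under the assumed schedule
   epsilon_t = min(1, c * K / (d^2 * t)). *)
let decaying_epsilon ~c ~d ~k t =
  min 1.0 (c *. float_of_int k /. (d *. d *. float_of_int t))

(* Early rounds are clipped to 1 (pure exploration); later rounds decay as 1/t. *)
let () =
  List.iter
    (fun t -> Printf.printf "t=%d epsilon=%g\n" t (decaying_epsilon ~c:1.0 ~d:0.5 ~k:2 t))
    [ 1; 16; 64 ]
```

A smaller $ d $ keeps the rate clipped at 1 for longer, so the policy explores uniformly for more rounds before it starts exploiting.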

E
epsilon [Obandit.EpsilonGreedyParam]
The $ \epsilon $ parameter.
eta [Obandit.FixedExp3Param]
The fixed learning rate $ \eta $.
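To give a sense of what the learning rate controls, here is a minimal sketch of the exponential-weights distribution at the core of Exp3, assuming the policy samples arm $ i $ with probability proportional to $ \exp(\eta \, G_i) $, where $ G_i $ is a cumulative gain estimate. The function name `exp_weights` and the gain convention (rather than losses) are assumptions made for illustration, not Obandit's API.

```ocaml
(* Hypothetical helper, not part of Obandit.
   Turns cumulative gain estimates into a sampling distribution
   p_i proportional to exp (eta * gains.(i)). *)
let exp_weights ~eta gains =
  (* Subtract the maximum before exponentiating to avoid overflow;
     this does not change the normalized result. *)
  let m = Array.fold_left max neg_infinity gains in
  let w = Array.map (fun g -> exp (eta *. (g -. m))) gains in
  let z = Array.fold_left ( +. ) 0. w in
  Array.map (fun x -> x /. z) w
```

With $ \eta = 0 $ the distribution is uniform whatever the estimates are; larger values concentrate probability on the arms with the highest estimated gain.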

I
initialBandit [Obandit.RangedBandit]
initialBandit [Obandit.Bandit]
The initial state of the bandit algorithm.
invLFPhi [Obandit.AlphaPhiUCBParam]
The inverse of the Legendre-Fenchel transform of $ \psi $.
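As a hedged illustration of how `alpha` and `invLFPhi` could fit together, the sketch below computes an upper-confidence index of the form $ \hat{\mu}_i + (\psi^*)^{-1}(\alpha \ln t / T_i) $, the usual $ (\alpha, \psi) $-UCB rule. It assumes rewards in $ [0, 1] $ with $ \psi(\lambda) = \lambda^2 / 8 $, for which $ \psi^*(\epsilon) = 2 \epsilon^2 $ and therefore $ (\psi^*)^{-1}(x) = \sqrt{x / 2} $. The names `inv_lf_hoeffding` and `ucb_index` are hypothetical and not part of Obandit.

```ocaml
(* Hypothetical helpers, not part of Obandit. *)

(* Inverse Legendre-Fenchel transform for psi(lambda) = lambda^2 / 8
   (Hoeffding's lemma, rewards in [0, 1]). *)
let inv_lf_hoeffding x = sqrt (x /. 2.)

(* Index of an arm with empirical mean [mean], pulled [pulls] >= 1 times,
   at round [t] >= 1. *)
let ucb_index ~alpha ~inv_lf ~mean ~pulls ~t =
  mean +. inv_lf (alpha *. log (float_of_int t) /. float_of_int pulls)
```

Under these assumptions the policy would play the arm with the largest index at each round; arms that have been pulled rarely get a larger exploration bonus.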

K
k [Obandit.HorizonExp3Param]
The number of actions $ K $.
k [Obandit.FixedExp3Param]
The number of actions $ K $.
k [Obandit.EpsilonGreedyParam]
The number of actions $ K $.
k [Obandit.DecayingEpsilonGreedyParam]
The number of actions $ K $.
k [Obandit.RateBanditParam]
The number of actions $ K $.
k [Obandit.KBanditParam]
The number of actions $ K $.
k [Obandit.AlphaUCBParam]
The number of actions $ K $.
k [Obandit.AlphaPhiUCBParam]
The number of actions $ K $.

L
lower [Obandit.RangeParam]
The lower value of the range.

N
n [Obandit.HorizonExp3Param]
The $ n $ parameter, the horizon to optimize for.
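One common way to optimize Exp3 for a known horizon is to set the learning rate to $ \eta = \sqrt{2 \ln K / (n K)} $. The sketch below computes that value; whether Obandit derives its rate from `n` and `k` in exactly this way is an assumption, and `horizon_eta` is a hypothetical name.

```ocaml
(* Hypothetical helper, not part of Obandit.
   Learning rate tuned for a horizon of n rounds and k actions,
   assuming eta = sqrt (2 ln k / (n k)). *)
let horizon_eta ~k ~n =
  sqrt (2. *. log (float_of_int k) /. (float_of_int n *. float_of_int k))
```

The rate shrinks as the horizon grows: multiplying `n` by four halves the resulting value.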

R
rate [Obandit.RateBanditParam]
The rate.

S
step [Obandit.RangedBandit]
step [Obandit.Bandit]
step r advances the bandit game one step, where r is the reward for the last action.
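The interplay between `initialBandit` and `step` can be sketched with a self-contained toy module. The signature `TOY_BANDIT`, the `Greedy` policy and the `play` driver below are all hypothetical, and the real Obandit signatures may differ; in particular, returning the chosen action together with the new state is an assumption.

```ocaml
(* Hypothetical signature mirroring the two index entries. *)
module type TOY_BANDIT = sig
  type bandit
  val initialBandit : bandit
  (* [step b r] feeds reward [r] for the last action and returns
     the next action together with the updated state. *)
  val step : bandit -> float -> int * bandit
end

(* Toy deterministic policy over 2 arms: try each arm once,
   then keep playing the arm with the best empirical mean. *)
module Greedy : TOY_BANDIT = struct
  let k = 2

  type bandit = { last : int option; sums : float array; counts : int array }

  let initialBandit =
    { last = None; sums = Array.make k 0.; counts = Array.make k 0 }

  (* Unplayed arms get +infinity so that they are tried first. *)
  let mean b i =
    if b.counts.(i) = 0 then infinity
    else b.sums.(i) /. float_of_int b.counts.(i)

  let step b r =
    (* Copy the arrays so that earlier states stay valid. *)
    let sums = Array.copy b.sums and counts = Array.copy b.counts in
    (* Credit the reward to the previously played arm; on the very first
       call there is no previous action and the reward is ignored. *)
    (match b.last with
     | Some a ->
       sums.(a) <- sums.(a) +. r;
       counts.(a) <- counts.(a) + 1
     | None -> ());
    let b = { b with sums; counts } in
    let best = ref 0 in
    for i = 1 to k - 1 do
      if mean b i > mean b !best then best := i
    done;
    (!best, { b with last = Some !best })
end

(* Deterministic per-arm rewards, chosen for illustration only. *)
let rewards = [| 0.2; 0.8 |]

(* Thread the state through [rounds] steps and collect the actions taken. *)
let play rounds =
  let rec loop b r n acc =
    if n = 0 then List.rev acc
    else
      let a, b' = Greedy.step b r in
      loop b' rewards.(a) (n - 1) (a :: acc)
  in
  loop Greedy.initialBandit 0. rounds []
```

Because the state is an immutable value threaded through each call, the caller owns the game loop: it observes the reward for the returned action and passes it to the next `step`.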

U
upper [Obandit.RangeParam]
The upper value of the range.