Index of values

A
alpha [Obandit.AlphaUCBParam]

The $ \alpha $ parameter.

alpha [Obandit.AlphaPhiUCBParam]

The $ \alpha $ parameter.

C
c [Obandit.DecayingEpsilonGreedyParam]

The $ c $ hyperparameter.

D
d [Obandit.DecayingEpsilonGreedyParam]

The $ d $ hyperparameter, a lower bound on the smallest positive gap $ \min_{i : \Delta_i > 0} \Delta_i $.
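
As an illustration of how $ c $ and $ d $ can enter the exploration rate, the sketch below assumes the schedule $ \epsilon_t = \min(1, cK / (d^2 t)) $ from Auer, Cesa-Bianchi and Fischer (2002). The function name decaying_epsilon and its labelled arguments are hypothetical and not part of Obandit; the library's own computation may differ.

```ocaml
(* Hypothetical helper, not part of Obandit.
   Assumed schedule: eps_t = min(1, c * K / (d^2 * t)), for rounds t >= 1. *)
let decaying_epsilon ~c ~d ~k t =
  if t <= 0 then invalid_arg "decaying_epsilon: t must be positive";
  min 1.0 (c *. float_of_int k /. (d *. d *. float_of_int t))

(* Early rounds explore with probability 1; the rate then decays as 1/t. *)
let () =
  List.iter
    (fun t ->
      Printf.printf "t = %d, epsilon = %g\n" t
        (decaying_epsilon ~c:1.0 ~d:0.5 ~k:2 t))
    [ 1; 8; 16; 64 ]
```

Under this assumed schedule, a smaller $ d $ keeps the rate at 1 for longer, so the algorithm explores uniformly for more rounds before it starts to exploit.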

E
epsilon [Obandit.EpsilonGreedyParam]

The $ \epsilon $ parameter.

eta [Obandit.FixedExp3Param]

The fixed learning rate $ \eta $.

I
initialBandit [Obandit.RangedBandit]
initialBandit [Obandit.Bandit]

The initial state of the bandit algorithm.

invLFPhi [Obandit.AlphaPhiUCBParam]

The inverse of the Legendre-Fenchel transform of $ \psi $.
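
For a concrete instance, the sketch below takes $ \psi(\lambda) = \lambda^2 / 8 $, the bound given by Hoeffding's lemma for rewards in $ [0, 1] $. Its Legendre-Fenchel transform is $ \psi^*(\epsilon) = 2 \epsilon^2 $, whose inverse on the nonnegative reals is $ x \mapsto \sqrt{x / 2} $. The names below are hypothetical, and the float -> float type is an assumption about what this field expects.

```ocaml
(* Hypothetical example, not part of Obandit.
   psi(l) = l^2 / 8, so psi*(e) = sup_l (l * e - psi(l)) = 2 e^2,
   attained at l = 4 e. *)
let psi_star e = 2.0 *. e *. e

(* Inverse of psi* on nonnegative inputs: x -> sqrt (x / 2). *)
let inv_lf_hoeffding x = sqrt (x /. 2.0)

(* Round-trip check on a value where the arithmetic is exact. *)
let () = assert (psi_star (inv_lf_hoeffding 8.0) = 8.0)
```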

K
k [Obandit.HorizonExp3Param]

The number of actions $ K $.

k [Obandit.FixedExp3Param]

The number of actions $ K $.

k [Obandit.EpsilonGreedyParam]

The number of actions $ K $.

k [Obandit.DecayingEpsilonGreedyParam]

The number of actions $ K $.

k [Obandit.RateBanditParam]

The number of actions $ K $.

k [Obandit.KBanditParam]

The number of actions $ K $.

k [Obandit.AlphaUCBParam]

The number of actions $ K $.

k [Obandit.AlphaPhiUCBParam]

The number of actions $ K $.

L
lower [Obandit.RangeParam]

The lower bound of the range.

N
n [Obandit.HorizonExp3Param]

The $ n $ parameter, the horizon to optimize for.

R
rate [Obandit.RateBanditParam]

The rate.

S
step [Obandit.RangedBandit]
step [Obandit.Bandit]

step r advances the bandit game one step, where r is the reward for the last action.
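
The interplay of initialBandit and step can be sketched as a simple game loop. Everything below is an assumption made for illustration: the signature BANDIT_SKETCH, the toy RoundRobin module, the play function, and the choice to feed a reward of 0.0 before any action has been taken. The actual Obandit types may differ.

```ocaml
(* Hypothetical signature modelled on the two entries above;
   the real Obandit signature may differ. *)
module type BANDIT_SKETCH = sig
  type bandit
  val initialBandit : bandit
  (* Takes the reward of the last action; returns the next action
     and the updated state. *)
  val step : bandit -> float -> int * bandit
end

(* Toy implementation for illustration only: cycles through 3 actions
   and ignores the reward. *)
module RoundRobin : BANDIT_SKETCH = struct
  type bandit = int
  let initialBandit = 0
  let step t _reward = (t mod 3, t + 1)
end

(* Play [rounds] steps against a reward function and return the total
   reward. The first call to step receives 0.0, an arbitrary choice. *)
let play (module B : BANDIT_SKETCH) ~reward ~rounds =
  let rec loop b last_reward total n =
    if n = 0 then total
    else
      let action, b' = B.step b last_reward in
      let r = reward action in
      loop b' r (total +. r) (n - 1)
  in
  loop B.initialBandit 0.0 0.0 rounds

let total =
  play (module RoundRobin)
    ~reward:(fun a -> if a = 2 then 1.0 else 0.0)
    ~rounds:6
```

The state is threaded through the loop explicitly, which matches the purely functional style suggested by having an initial state value plus a step function.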

U
upper [Obandit.RangeParam]

The upper bound of the range.