Functor Obandit.WrapRange

module WrapRange: functor (R : RangeParam) -> functor (B : Bandit) -> RangedBandit with type bandit = B.bandit

The WrapRange functor wraps a bandit algorithm with the doubling trick. This heuristic makes it possible to use a bandit algorithm without knowing the reward range in advance. All rewards are linearly rescaled using the current range, which is initially given by the RangeParam argument R. When a value is observed above the range, the bandit algorithm is restarted and the range interval is doubled in that direction.
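The rescale-and-double logic described above can be sketched as follows. This is a minimal illustration, not Obandit's implementation: the names `range`, `rescale`, `widen`, `outcome` and `observe` are hypothetical, the target interval [0, 1] is an assumption, and handling values below the range symmetrically is also an assumption.

```ocaml
(* Hypothetical types and names for illustration only; not part of Obandit. *)
type range = { lower : float; upper : float }

(* Linearly map a reward from [lower, upper] to [0, 1] (assumed target). *)
let rescale r x = (x -. r.lower) /. (r.upper -. r.lower)

(* Double the interval width towards the out-of-range value, repeating
   until the value fits. The lower branch is an assumption. *)
let rec widen r x =
  let width = r.upper -. r.lower in
  if x > r.upper then widen { r with upper = r.upper +. width } x
  else if x < r.lower then widen { r with lower = r.lower -. width } x
  else r

(* Either a rescaled reward to pass to the wrapped bandit, or a signal that
   the wrapped bandit should be restarted with a wider range. *)
type outcome =
  | Rescaled of float
  | Restart of range

let observe r x =
  if x > r.upper || x < r.lower then Restart (widen r x)
  else Rescaled (rescale r x)
```

In this sketch, a reward inside the current range is simply rescaled, while an out-of-range reward produces the enlarged range with which the wrapped algorithm would be restarted.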

Parameters:

`R`	:	`RangeParam`
`B`	:	`Bandit`

type bandit

val initialBandit : bandit Obandit.rangedBandit

val step : bandit Obandit.rangedBandit ->
       float ->
       Obandit.rangedAction * bandit Obandit.rangedBandit
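The shape of `step` (state and reward in, action and new state out) suggests threading the returned state into the next call. The sketch below shows that pattern with a hypothetical helper `run` and a toy stand-in `toy_step`; neither is part of Obandit, and reading the float argument as the reward observed for the previous action is an assumption.

```ocaml
(* [run] is a hypothetical helper, not part of Obandit. It feeds the state
   returned by [step] into the next call, one observed reward per round,
   and collects the chosen actions in order. *)
let run ~step ~init rewards =
  let actions, final =
    List.fold_left
      (fun (acc, b) reward ->
        let action, b' = step b reward in
        (action :: acc, b'))
      ([], init) rewards
  in
  (List.rev actions, final)

(* Toy stand-in for [step], used only to make the sketch runnable: the state
   is a round counter and the "action" is the counter modulo 2. *)
let toy_step n _reward = (n mod 2, n + 1)
```

With a real instance, `step` and `init` would presumably be the module's `step` and `initialBandit`.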