 |
Active learning in multi-armed bandits
András Antos, Varun Grover and Csaba Szepesvári
Book Section
Item availability restricted.
(02 October 2008)
|
 |
Active learning with heteroscedastic noise
András Antos, Varun Grover and Csaba Szepesvári
Article
Item availability restricted.
(17 June 2010)
|
  |
Fitted Q-iteration in continuous action-space MDPs
András Antos, Rémi Munos and Csaba Szepesvári
Book Section
Item availability restricted.
(2008)
|
 |
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
András Antos, Csaba Szepesvári and Rémi Munos
Article
Item availability restricted.
(April 2008)
|
 |
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
András Antos, Csaba Szepesvári and Rémi Munos
Book Section
Item availability restricted.
(29 September 2006)
|
|
On codecell convexity of optimal multiresolution scalar quantizers for continuous sources
András Antos
Article
Item not available online.
(06 February 2012)
|
|
Online Markov Decision Processes under Bandit Feedback
Gergely Neu, András György, Csaba Szepesvári and András Antos
Book Section
Item not available online.
(December 2010)
|
 |
Online Markov decision processes under bandit feedback
Gergely Neu, András György, Csaba Szepesvári and András Antos
Conference or Workshop Item
Item availability restricted.
(06 December 2010)
|
|
Toward a classification of finite partial-monitoring games
András Antos, Gábor Bartók, Dávid Pál and Csaba Szepesvári
Article
Item not available online.
(21 February 2011)
|
 |
Value-iteration based fitted policy iteration: learning with a single trajectory
András Antos, Csaba Szepesvári and Rémi Munos
Conference or Workshop Item
Item availability restricted.
(05 April 2007)
|