Joao_Carvalho_Balloon Estimators for Improving and Scaling the Nonparametric Off-Policy Policy Gradient_Figure1

Caption

Figure 1: Mean log-likelihood at different bandwidth factors for the Balloon estimator and the Gaussian (bandwidth) estimator. The Balloon estimator achieves a higher likelihood across a range of bandwidth factors.
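
The comparison described in the caption can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one-dimensional data, a Gaussian kernel, and a Balloon bandwidth set to the distance from each query point to its k-th nearest training sample, scaled by a bandwidth factor. The function names and the parameters `k` and `factor` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def gaussian_kde_logpdf(x_query, x_train, h):
    """Log-density of a Gaussian kernel estimate at each query point.

    h is either a scalar (one fixed bandwidth) or an array with one
    bandwidth per query point (Balloon-style, query-dependent).
    """
    h = np.broadcast_to(np.asarray(h, dtype=float), x_query.shape)
    z = (x_query[:, None] - x_train[None, :]) / h[:, None]
    log_k = -0.5 * z**2 - 0.5 * np.log(2.0 * np.pi) - np.log(h)[:, None]
    # Log-sum-exp over training points for numerical stability.
    m = log_k.max(axis=1, keepdims=True)
    log_sum = m[:, 0] + np.log(np.exp(log_k - m).sum(axis=1))
    return log_sum - np.log(len(x_train))

def balloon_bandwidth(x_query, x_train, k, factor):
    """Assumed Balloon rule: factor times the k-th nearest-neighbor distance."""
    dist = np.abs(x_query[:, None] - x_train[None, :])
    kth = np.sort(dist, axis=1)[:, k - 1]
    return factor * kth

def mean_log_likelihood(x_query, x_train, h):
    """Average held-out log-likelihood, the quantity on the figure's y-axis."""
    return float(np.mean(gaussian_kde_logpdf(x_query, x_train, h)))
```

A sweep over bandwidth factors would call `mean_log_likelihood` once with a scalar `h` for the fixed-bandwidth Gaussian estimate and once with `balloon_bandwidth(...)` for the Balloon estimate, on held-out query points. Note that a Balloon estimate with query-dependent bandwidth does not in general integrate to one, so the two likelihoods are not strictly comparable without normalization.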
