Joao_Carvalho_Balloon Estimators for Improving and Scaling the Nonparametric Off-Policy Policy Gradient_Figure1
Caption
Figure 1: Mean log-likelihood as a function of the bandwidth factor for the Balloon estimator and the Gaussian (bandwidth) estimator. The Balloon estimator achieves a higher mean log-likelihood across a range of bandwidth factors.