The aim is to find weights for MC events that make the original distribution look like the target distribution. I've found a way to use gradient boosting over regression trees here, though it is quite different from usual GBRT in its loss, update rule and splitting criterion.
Update rule of reweighting GB
Here is the definition of weights:
$$ w = \begin{cases} w, & \text{event from target distribution} \\ e^{\text{pred}} \, w, & \text{event from original distribution} \end{cases} $$
so, as you see, we are looking for a multiplier $e^{\text{pred}}$, which will reweight the original distribution. Pred is the raw prediction of the gradient boosting.
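The update rule above can be sketched in a few lines of numpy. This is a minimal illustration, not the full training loop; the function name `reweight_original` is mine, and `pred` stands for the summed raw predictions of the boosting for each event of the original sample:

```python
import numpy as np

def reweight_original(weights, pred):
    """Apply the update rule: original-sample weights are
    multiplied by e^pred; target weights are left untouched."""
    return weights * np.exp(pred)

# original weights start at 1; positive predictions increase the weight,
# negative predictions decrease it
new_w = reweight_original(np.ones(3), np.array([0.0, np.log(2.0), -np.log(2.0)]))
```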
Splitting criterion of reweighting GB
The most obvious way to select a split is to maximize the binned χ2 statistic, so we are looking for a tree which splits the space into the most 'informative' cells, where the difference between distributions is significant.
$$ \chi^2 = \sum_{\text{bins}} \frac{(w_\text{target} - w_\text{original})^2}{w_\text{target} + w_\text{original}} $$
Here I selected a slightly more symmetric version (both distributions enter the denominator), though this was not necessary.
Computing optimal value in the leaf
Since we are going to remove the difference between distributions, the optimal value is obviously:
$$ \text{leaf\_value} = \log \frac{w_\text{target}}{w_\text{original}}, $$
where $w_\text{target}$ and $w_\text{original}$ are the sums of weights of target and original events in the leaf. After the update, the original weights in the leaf are multiplied by $e^{\text{leaf\_value}} = w_\text{target} / w_\text{original}$, so the summed weights in the leaf become equal.
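In code this is a one-liner; a small regularization term (my addition, not part of the formula above) guards against empty leaves:

```python
import numpy as np

def leaf_value(w_target, w_original, reg=1e-6):
    """Optimal leaf value: log-ratio of summed target and original weights
    in the leaf. `reg` is a small hypothetical regularization to avoid
    log(0) or division by zero in leaves with no events of one class."""
    return np.log((w_target + reg) / (w_original + reg))

# a leaf with twice as much target weight gets value log(2),
# so original weights there are doubled by the update rule
v = leaf_value(2.0, 1.0)
```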