How to change the reward function

Dear developers,
I’m using the commonroad_RL package now. Where can I change the default reward function? I’m using the PPO algorithm and I have changed the values in configs.yaml, but nothing seems to change: the total reward still converges to the same value. Is there any other place that needs to be changed?

Best Regards,
Kevin

Can you give more details? Which reward type did you use and which reward values did you change?

So the problem is that the mean reward during training converges to -0.1, and I chose the dense reward mode. Even after setting all the reward values to be positive, it still doesn’t change. Here are the learning curve and reward function for my situation; the blue curve is the one I am talking about.


Is there normalization or some other kind of processing applied to the results automatically?

Hello, I am also using Commonroad-RL. You said you use the dense reward, but from your picture it looks like you configured the hybrid reward parameters. Maybe you can try configuring the dense reward parameters instead, like this:
[screenshot: dense reward parameters in configs.yaml]
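For reference, switching over to the dense reward in configs.yaml could be done with a small script along these lines. This is a sketch only: the "env_configs" and "reward_type" keys and the "dense_reward" value are assumptions, so check the configs.yaml shipped with commonroad-rl for the exact names.

```python
import yaml

# Sketch only: the key names ("env_configs", "reward_type") and the value
# "dense_reward" are assumptions; verify them against the configs.yaml
# that ships with commonroad-rl.
with open("configs.yaml") as f:
    cfg = yaml.safe_load(f)

cfg["env_configs"]["reward_type"] = "dense_reward"  # switch away from the hybrid reward
print(cfg["env_configs"])  # inspect which reward sections and coefficients exist

with open("configs.yaml", "w") as f:
    yaml.safe_dump(cfg, f)
```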
However, I configured the adapted parameters and still get a similar result to yours. I use the sparse reward model and configured the same parameters as in the Commonroad-RL paper.
It would be great if you make any progress and are willing to share your ideas. Good luck to everyone!


Yeah, I mean, even for the dense reward, there shouldn’t be a negative value for the total reward given that configuration, right?

Hi Kevin,

Actually, the signs of all the reward terms are incorporated in the code, since whether a term is a punishment or a reward is already settled there, i.e., users only need to tune the values of the coefficients, not the signs.
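As a schematic illustration (this is not the actual commonroad-rl source, just a sketch of the idea):

```python
# Illustration only, not the actual commonroad-rl implementation: the environment
# fixes the sign of each term, so the coefficient from configs.yaml is a magnitude.
def collision_term(collided: bool, reward_collision: float) -> float:
    # reward_collision is configured as a positive number, e.g. 50.0;
    # the code applies the minus sign, so a collision contributes -50.0.
    return -reward_collision if collided else 0.0
```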

But yes, the VecNormalize wrapper provided by stable baselines provides normalization for the rewards. See here.
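If you want to check the unnormalized rewards, a minimal sketch is below; make_env and the "commonroad-v0" id are placeholders for however you construct the environment, and the commonroad-rl training script may already build this wrapper for you.

```python
import gym
from stable_baselines.common.vec_env import DummyVecEnv, VecNormalize

def make_env():
    # Placeholder for however you construct the CommonRoad gym environment;
    # the "commonroad-v0" id is an assumption, check the commonroad-rl README.
    return gym.make("commonroad-v0")

env = DummyVecEnv([make_env])

# norm_reward=True (the default) means the rewards you see during training are
# scaled by a running estimate of the return variance; set it to False to log
# the raw sum of the configured reward terms.
env = VecNormalize(env, norm_obs=True, norm_reward=False)
```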

Best,
Xiao