How to change the reward function

Dear developers,
I’m using the commonroad_RL package now. Where can I change the default reward function? I’m using the PPO algorithm and I have changed the values in configs.yaml, but nothing seems to change: the total reward still converges to the same value. Is there any other place that needs to be changed?

Best Regards,
Kevin

Can you give more details? Which reward type did you use and which reward values did you change?

So the problem is that the mean reward during training converges to -0.1, and I chose the dense reward mode. Even after setting all the reward values to be positive, it still doesn’t change. Here are the learning curve and reward function for my situation; the blue curve is the one I am talking about.


Is there normalization or some other kind of processing applied to the results automatically?

Hello, I am also using Commonroad-RL. You said you use the dense reward, but from your picture it looks like you configured the hybrid reward parameters. Maybe you can try configuring the dense reward parameters instead, like this:
[screenshot: dense reward parameters in configs.yaml]
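For reference, switching over to the dense reward in configs.yaml could be done with a small script along these lines. This is a sketch only: the "env_configs" and "reward_type" keys and the "dense_reward" value are assumptions, so check the configs.yaml shipped with commonroad-rl for the exact names.

```python
import yaml

# Sketch only: the key names ("env_configs", "reward_type") and the value
# "dense_reward" are assumptions; verify them against the configs.yaml
# that ships with commonroad-rl.
with open("configs.yaml") as f:
    cfg = yaml.safe_load(f)

cfg["env_configs"]["reward_type"] = "dense_reward"  # switch away from the hybrid reward
print(cfg["env_configs"])  # inspect which reward sections and coefficients exist

with open("configs.yaml", "w") as f:
    yaml.safe_dump(cfg, f)
```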
However, I configured the adapted parameters and still get a similar result to yours. I use the sparse reward model and configured the same parameters as in the Commonroad-RL paper.
It would be great if you make any progress and are willing to share your ideas. Good luck to everyone!


Yeah, I mean, even for the dense reward, there shouldn’t be a negative value for the total reward given that configuration, right?

Hi Kevin,

Actually, the signs of all the reward terms are incorporated in the code, since whether a term is a punishment or a reward is already settled there, i.e., users only need to tune the values of the coefficients, not the signs.
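As a schematic illustration (this is not the actual commonroad-rl source, just a sketch of the idea):

```python
# Illustration only, not the actual commonroad-rl implementation: the environment
# fixes the sign of each term, so the coefficient from configs.yaml is a magnitude.
def collision_term(collided: bool, reward_collision: float) -> float:
    # reward_collision is configured as a positive number, e.g. 50.0;
    # the code applies the minus sign, so a collision contributes -50.0.
    return -reward_collision if collided else 0.0
```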

But yes, the VecNormalize wrapper provided by stable baselines provides normalization for the rewards. See here.
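If you want to check the unnormalized rewards, a minimal sketch is below; make_env and the "commonroad-v0" id are placeholders for however you construct the environment, and the commonroad-rl training script may already build this wrapper for you.

```python
import gym
from stable_baselines.common.vec_env import DummyVecEnv, VecNormalize

def make_env():
    # Placeholder for however you construct the CommonRoad gym environment;
    # the "commonroad-v0" id is an assumption, check the commonroad-rl README.
    return gym.make("commonroad-v0")

env = DummyVecEnv([make_env])

# norm_reward=True (the default) means the rewards you see during training are
# scaled by a running estimate of the return variance; set it to False to log
# the raw sum of the configured reward terms.
env = VecNormalize(env, norm_obs=True, norm_reward=False)
```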

Best,
Xiao