HighD dataset out-of-memory

Hi,

I am trying to load the complete highD dataset by following the tutorial at Files · master · tum-cps / commonroad-rl · GitLab, converting the .csv files to .xml and then to .pickle.

However, the resulting dataset is quite large (nearly 10 GB), and when I initialize the commonroad-rl envs, the machine reports an out-of-memory error. I suspect this is because commonroad-rl loads all the data into memory. Is there a way to do this more efficiently? I assume others must have run into similar issues before, but I could not find any clues after a quick search. Could you briefly explain this or point me to the right place?

Looking forward to your reply,

Galen Liu

Hi Galen,

Are you using multiple processes to train on the whole highD dataset? We provided a script to scatter the whole dataset into subfolders here, so that each subprocess only loads from its own subfolder. Was that the problem, or does the size of the total dataset (10 GB) already exceed the RAM of your machine?
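The splitting idea can be illustrated with a minimal sketch (this is not the actual commonroad-rl script; the function name and the .pickle suffix filter are assumptions): distribute the scenario files round-robin into one subfolder per worker, so each subprocess only ever opens its own shard.

```python
import os
import shutil


def split_into_subfolders(src_dir, dst_dir, n_workers):
    """Copy scenario files round-robin into one subfolder per worker.

    Subfolders are named "0" .. str(n_workers - 1); worker i then loads
    only from dst_dir/i instead of the whole dataset.
    """
    files = sorted(f for f in os.listdir(src_dir) if f.endswith(".pickle"))
    for i, name in enumerate(files):
        shard = os.path.join(dst_dir, str(i % n_workers))
        os.makedirs(shard, exist_ok=True)
        shutil.copy(os.path.join(src_dir, name), shard)
```

With 5 files and 2 workers, worker 0 gets 3 files and worker 1 gets 2, so no single process ever touches the full 10 GB.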

Best,
Xiao

Hi Xiao,

I am indeed using multiple processes, and I have split the dataset with your script. I still end up loading the complete environment, and tracing the memory used by commonroad-rl shows it eventually occupies 78888.636416 MB (about 78 GB) of RAM rather than 10 GB. That is not surprising, since 10 GB is just the data and the envs must store other things as well. My local machine cannot support this cost, and I am already using a very powerful cluster. Loading the entire dataset and all envs into RAM just feels inefficient. Perhaps keeping part of the dataset on the hard disk and streaming it in continuously during training would be more feasible.

I am not sure if I have made my point clear. Please let me know if you have more thoughts on this.

Best,
Galen

Hi Galen,

Yes, this is a trade-off between RAM and runtime. Loading scenarios online will result in longer training times, but if RAM is the bottleneck, it is the better option for you. The reset() method of our gym env supports passing the scenario and planning problem as arguments, see here. In your training script, whenever done=True and the env needs to be reset, you can load an .xml from the hard disk and pass the loaded scenario and planning problem to env.reset().

However, be aware that the env wrappers in stable baselines 2 do not support arguments to the reset() method, so you would need to modify the source code of those wrappers locally after installing stable baselines 2.

Best,
Xiao

Hi Xiao,

Thanks for your help. I will check and see how to implement it.

Best,
Galen