Once you have decided to use reinforcement learning to train a robot, it matters little to the learning algorithm whether the robot is hard (rigid-body) or soft (flexible-body). The RL methods typically used for this are model-free: they are designed to learn through trial-and-error interaction with the environment, without an explicit mathematical model of the robot.
In a pure simulation setting, the real challenge is more likely to be modeling the flexible-body robot in Simulink with Simscape blocks. Although RL methods are model-free, you still need to build a physical model of the robot so that the RL agent has something to interact with and receives physically correct responses. Most beginners start from the existing models in the Simscape examples, because building the robot model from scratch requires a thorough mathematical understanding of the coupled dynamics of the flexible bodies.
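
To make the "model-free, but still needs a simulator" point concrete, here is a minimal sketch in Python rather than MATLAB. Everything in it is an assumption for illustration: `ToyLinkEnv` is a hypothetical stand-in for a simulated robot (it is not a flexible-body model and has nothing to do with Simscape), and `train_q_learning` and `greedy_rollout` are made-up helper names. The agent is plain tabular Q-learning and only ever calls `reset()` and `step()`; it never reads the update rule hidden inside the environment.

```python
import random


class ToyLinkEnv:
    """Hypothetical stand-in for a simulated robot (e.g. a Simscape model).

    The agent never sees the update rule inside step(); it only observes
    the resulting state and reward, as it would with a real simulator.
    """

    N_STATES = 7      # discretized tip positions 0..6
    N_ACTIONS = 2     # 0 = push left, 1 = push right
    GOAL = 6

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # "Physics" hidden from the agent: move one cell, clipped to the track.
        delta = 1 if action == 1 else -1
        self.pos = min(max(self.pos + delta, 0), self.N_STATES - 1)
        done = self.pos == self.GOAL
        return self.pos, -1.0, done   # cost of -1 for every step taken


def train_q_learning(env, episodes=500, max_steps=50,
                     alpha=0.5, gamma=0.95, epsilon=0.2, seed=0):
    """Tabular Q-learning that uses only env.reset() and env.step()."""
    rng = random.Random(seed)
    q = [[0.0] * env.N_ACTIONS for _ in range(env.N_STATES)]
    for _ in range(episodes):
        s = env.reset()
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.randrange(env.N_ACTIONS)                      # explore
            else:
                a = max(range(env.N_ACTIONS), key=lambda i: q[s][i])  # exploit
            s2, r, done = env.step(a)
            # Bootstrapped target from the observed transition only.
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
            if done:
                break
    return q


def greedy_rollout(env, q, max_steps=50):
    """Follow the learned greedy policy; return the number of steps taken."""
    s = env.reset()
    for t in range(1, max_steps + 1):
        a = max(range(env.N_ACTIONS), key=lambda i: q[s][i])
        s, _, done = env.step(a)
        if done:
            return t
    return None   # goal not reached within max_steps


if __name__ == "__main__":
    env = ToyLinkEnv()
    q = train_q_learning(env)
    print("steps to goal under greedy policy:", greedy_rollout(env, q))
```

The training loop would stay the same if `ToyLinkEnv` were swapped for a rigid-body or flexible-body simulation exposing the same two methods, although a realistic robot would need a continuous-state agent rather than a table. All of the modeling effort sits inside `step()`, which is the part that is hard to get right for a flexible body.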
