Vanilla IL (e.g., Behavior Cloning) is known to be sensitive to covariate
shift: as soon as the learned policy deviates from the expert policy, errors begin to compound, leading the system to drift to new and possibly dangerous states.
-- On the Sample Complexity of Stability Constrained Imitation
Learning