Look at the shutter speed at the top of his screen while he's flying. Notice how it keeps changing, sometimes going as high as 1/500 to 1/1000.
He's not shooting properly for video. If you shoot with automatic shutter you're going to get that jerky look, especially at lower frame rates and when turning or moving. This is why you'd follow something like the 180 degree shutter rule, which gives the motion blur a more natural appearance.
So if you shoot at 30 fps, you double that number and shoot at a shutter of 1/60. But conditions could be too bright for that, and without a variable aperture, that's where ND filters come in. If auto is picking 1/500, then he needs to cut the light by about 3 stops (1/500 -> 1/250 -> 1/125 -> 1/60) to get more natural motion blur. An ND8 (0.9) will give you 3 stops less light.
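If you want to work out the ND strength for other combinations, here's a rough sketch of the arithmetic above. The function names are made up for illustration, and it assumes ISO and aperture stay fixed so that shutter time is the only thing changing exposure.

```python
import math

def nd_stops_needed(auto_shutter_denominator, fps):
    """Stops of light to remove so the shutter can drop to 1/(2*fps).

    auto_shutter_denominator: the N in the 1/N shutter that auto mode picked.
    Assumes ISO and aperture stay the same (hypothetical simplification).
    """
    target_denominator = 2 * fps  # 180 degree rule: shutter = 1/(2*fps)
    return math.log2(auto_shutter_denominator / target_denominator)

def nd_filter_for(stops):
    """Round to whole stops and return (ND factor, optical density)."""
    whole_stops = max(0, round(stops))
    factor = 2 ** whole_stops              # ND8 = 2**3
    density = round(0.3 * whole_stops, 1)  # about 0.3 density per stop
    return factor, density

stops = nd_stops_needed(500, 30)   # auto picked 1/500 at 30 fps
factor, density = nd_filter_for(stops)
print(f"{stops:.2f} stops -> ND{factor} ({density})")
```

For 1/500 at 30 fps this comes out to just over 3 stops, which rounds to the ND8 (0.9) mentioned above.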
The reason a high shutter speed causes jerkiness is that the faster the shutter, the more each frame looks like a freeze-frame. You get a nice crisp image, and you could likely take a single frame out of his footage and print it, even with him turning a bit. But because every frame is crisp, there's no motion blur to bridge one frame to the next, and if you move fast enough there's a big gap between frames that looks jerky when the picture jumps from one to the next.
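To put a number on that gap, you can convert a shutter speed into a shutter angle, meaning how much of each frame interval the shutter is actually open. A minimal sketch (the function name is just for illustration):

```python
def shutter_angle(fps, shutter_denominator):
    """Shutter angle in degrees for a 1/shutter_denominator shutter at fps.

    360 degrees means the shutter is open for the whole frame interval,
    180 degrees means it is open for half of it.
    """
    return 360.0 * fps / shutter_denominator

for denominator in (60, 500, 1000):
    angle = shutter_angle(30, denominator)
    open_fraction = angle / 360.0
    print(f"1/{denominator} at 30 fps: {angle:.1f} degrees, "
          f"open {open_fraction:.0%} of each frame interval")
```

At 30 fps, 1/60 is the 180 degree case (open half the time), while 1/500 is only about 21.6 degrees, so the motion during roughly 94% of each frame interval never gets recorded as blur.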
I haven't had any video/photo quality issues with Litchi since the Mini 1, Mini 2, and now the Air 2S because for video I don't shoot in auto mode (though I am liking the automatic ISO on the Air 2S as a form of automatic exposure since you don't want to touch your shutter speed). But the same issue would have been present in the Fly app too so it's not software specific.
Edit: The other way you can end up with jerkiness, outside of the drone itself, is in video editing. Even though the software (mainly DJI's) tells you the frame rate is 24fps, it's actually 23.976 (often shown as 23.98). That's really important to know: if you set up software like Premiere or DaVinci Resolve with a 24fps timeline and the footage isn't exactly 24fps, it will retime the footage to compensate. The default retiming method is nearest neighbor, and you end up with jerky looking footage as a result of filling in the missing frames.
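Here's a simplified model of what nearest-frame retiming does when 23.976 fps footage (24000/1001) lands on a true 24 fps timeline. This is not how any particular editor is implemented, just a sketch of the idea with made-up function names: each timeline frame grabs whichever source frame is closest in time, so every so often the same source frame gets shown twice.

```python
from fractions import Fraction

SOURCE_FPS = Fraction(24000, 1001)  # "23.976" footage
TIMELINE_FPS = Fraction(24, 1)      # true 24 fps timeline

def nearest_source_frame(timeline_frame):
    """Index of the source frame closest in time to a timeline frame."""
    t = Fraction(timeline_frame) / TIMELINE_FPS  # timeline time in seconds
    return round(t * SOURCE_FPS)

def repeated_timeline_frames(n_frames):
    """Timeline frames that reuse the same source frame as the one before."""
    repeats = []
    previous = None
    for n in range(n_frames):
        src = nearest_source_frame(n)
        if src == previous:
            repeats.append(n)
        previous = src
    return repeats

print(repeated_timeline_frames(2002))
```

In this model a source frame gets doubled once every 1001 timeline frames, so roughly every 42 seconds. Matching the timeline to the footage's real frame rate avoids the retiming entirely.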