On the CSE site, we worked around the issue of captioned videos rendering tiny by creating a "Video With Caption" view mode (machine name "video-widget") and enabling it for videos. This gives the figure element a view-mode class that we can attach CSS to (the default view options do not provide this class):
figure.media--view-mode-video-widget.caption,
.media--view-mode-video-widget.caption > figcaption {
  display: block;
}
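For a site that uses a different view mode, the same rule would presumably apply with that machine name substituted. Below is a minimal sketch, assuming the class follows the `media--view-mode-<machine name>` pattern seen above and that the caption wrapper is a `figure` with a `figcaption` child. Here `my-video-mode` is a hypothetical machine name, and the markup in the comment is illustrative rather than copied from real output:

```css
/*
 * Assumed markup (illustrative only):
 *
 *   <figure class="caption media--view-mode-my-video-mode">
 *     ...embedded video...
 *     <figcaption>Caption text</figcaption>
 *   </figure>
 */

/* Lay out the figure and its caption as blocks so the figure can
   take the width of its container. */
figure.media--view-mode-my-video-mode.caption,
.media--view-mode-my-video-mode.caption > figcaption {
  display: block;
}
```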
This may be a useful approach to apply at the Folwell level. Let us know if you have any questions.