Video with different audio and video durations for HLS playback

I am trying to generate an ABR manifest for a video whose video stream is slightly longer than its audio stream. Running the mediastreamvalidator tool on the manifest, I get the error "Different content duration detected between discontinuities". I assume it is caused by my generated manifest having video segments that extend a few seconds past the last audio segment.

Should I shorten the video playback to the shorter of the two durations?

Should I pad out the video with empty segments so that both streams have the same duration?
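
To make the mismatch concrete, this is the kind of check I mean: sum the EXTINF durations of the video and audio media playlists and compare the totals. A rough Python sketch (the playlist file names are placeholders):

```python
import re

def total_duration(playlist_path):
    """Sum the EXTINF segment durations (in seconds) of one media playlist."""
    total = 0.0
    with open(playlist_path) as f:
        for line in f:
            m = re.match(r"#EXTINF:([0-9.]+)", line.strip())
            if m:
                total += float(m.group(1))
    return total

# Placeholder file names -- substitute the actual video and audio media playlists.
video_total = total_duration("video_variant.m3u8")
audio_total = total_duration("audio_rendition.m3u8")
print(f"video {video_total:.3f}s, audio {audio_total:.3f}s, "
      f"difference {video_total - audio_total:.3f}s")
```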

My first suggestion would be to run hlsreport with the `--disc` option (or the `--verbose` option). That will show you all the discontinuity domain start times and durations. Then you can determine whether it is complaining about the last domain or something else.
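
If you want to script that, something like the following should work (a sketch only; the flag names are the ones mentioned above, and I'm assuming mediastreamvalidator writes its JSON results to validation_data.json in the working directory):

```python
import subprocess

# Placeholder URL -- point this at the multivariant playlist being validated.
MASTER_URL = "https://example.com/master.m3u8"

# Run the validator; as far as I know it writes its results to
# validation_data.json in the working directory (check your tool version).
subprocess.run(["mediastreamvalidator", MASTER_URL], check=True)

# Then produce the report with the --disc (or --verbose) option to list
# each discontinuity domain's start time and duration.
subprocess.run(["hlsreport", "--disc", "validation_data.json"], check=True)
```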

You say the video is longer than the audio. Extending the audio with silence is a reasonable approach. (Is that what you meant by your second suggestion above?)

I don't understand what the discontinuity information duration table is saying. What is the domains column? What does it mean when the start times are all empty dashes? Also, I tried running the validator tool against the same file but with matching video and audio stream durations: there was no "Different content duration detected between discontinuities" error, but there was still a shorter discontinuity information table. So could something other than the total durations of the two streams differing by some seconds be causing that warning?

And yes, I meant to extend the audio stream to match the video stream's duration, either by padding the request and manifest with silence or by padding the video with empty P-frames.
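
For the silence-padding option, the rough idea would be something like this (a sketch using ffmpeg, which isn't otherwise part of this thread; the apad filter appends silence and -shortest trims the output once the video stream ends):

```python
import subprocess

def pad_audio_to_video(src, dst):
    """Re-mux src so the audio track is padded with silence up to the video
    duration. Sketch only: ffmpeg's apad filter appends silence and -shortest
    stops the output at the now-shorter video stream; the video itself is
    stream-copied, so only the audio is re-encoded."""
    subprocess.run(
        ["ffmpeg", "-i", src,
         "-c:v", "copy",   # keep the video stream untouched
         "-af", "apad",    # append silence to the end of the audio
         "-shortest",      # stop when the video stream ends
         dst],
        check=True,
    )

# Hypothetical file names for illustration.
pad_audio_to_video("source.mp4", "padded.mp4")
```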

I see that R11-R14 are 5 seconds shorter than the others. That is what it is unhappy about. The dashes for start time mean it was not able to get the start time, perhaps because your content is encrypted. Yes, pad the audio.

Also, can we just allow the manifest to have a video stream that is longer than the audio stream, and vice versa? The manifest points to JIT-transcoded segments, and it is extra complexity to pad with appropriate silence or empty frames rather than simply matching the source file, which has a shorter audio stream. Is matching content duration between the audio and video streams really required? It's also not impossible for the discrepancy between the audio and video stream lengths in a source file to exceed 10 or even 30 seconds.

DASH and VLC can play back files with differing video and audio stream lengths fine, so I'm wondering whether this is a real requirement.
