-
Media Playback
Smooth audio and video playback is vital to the overall Apple TV experience across many types and categories of apps. Gain insight into how to ensure robust playback of high quality media with AVFoundation, how to provide the best native media playback experience within your app, and how FairPlay Protected Streaming can offer industrial strength content protection for your media.
Apple TV Tech Talks - Session 6 - tvOS
-
So the Apple TV is very much focused on a communal living room experience. And so smooth and beautiful audio and video playback is a component of so many apps on the platform, not just those focused on, say, TV shows and movies.
When you think about it, media plays a very heavy role in gaming or shopping or all sorts of different types of content. And so, given that all of these different apps require the ability to play back media, let's take a closer look at all of this. In this segment, we're going to talk about media playback on tvOS and how to provide the best possible experience in your apps. First, I'm going to give you some insights and tips on how to get the most out of the native player UI on tvOS, AVPlayerViewController. Then we're going to dig in a bit into how AVFoundation works and how you can use it to play back both file-based and streaming resources. Then, looking closer at that, HTTP Live Streaming is a really great way to stream media content to the Apple TV, so we're going to look at how it works and how to best prepare your media for it. And finally, we're going to very briefly talk about FairPlay Protected Streaming, the strongest content protection on Apple devices. With that, I'd like to introduce you to AVPlayerViewController, the native media playback UI on tvOS. One of the hallmark elements of tvOS is this new video player. The new AVPlayerViewController offers pretty much all of the features you might need, such as an amazing and smooth scrubbing experience including trick play functionality.
Trick play offers this beautiful sliding thumbnail view allowing people to be both extremely fast as well as accurate when trying to seek to a certain spot in the video that they're watching.
It also has an info panel: when people swipe down on the remote, they can reveal it to view metadata about the current piece of content, or navigate between chapters or markers in the video. It also takes care of subtitles and closed captions, with selection in the Subtitles tab, as well as selection of audio language, dynamic range control for night viewing, and the ability to select an AirPlay destination for audio output in the Audio tab. So all of this, and some more that we're going to speak about, is available to you with very little effort, and it allows you to offer people a really intuitive, native video playback experience that they quickly come to expect on the platform. Now let's take a step back and look at the media stack on tvOS.
Now, this should be quite familiar, since it mirrors very much what is on iOS and OS X. AVFoundation sits on top of Core Audio, Core Media and Core Animation. These act as the engine for your media playback, allowing you not to have to worry about the underlying implementation details involved in playing back both file-based and streaming media. AVKit then sits on top of UIKit and handles the presentation and user interactions involved with media playback. Now, zooming in a bit on how AVFoundation and AVKit work together, we can see that the lower three classes, AVAsset, AVPlayerItem and AVPlayer, are all part of AVFoundation. And as I mentioned, AVPlayerViewController is part of AVKit, providing the user interface for media playback. Now let's take a brief look at how these all fit together.
An AVAsset represents a single media asset. You create one by pointing it either to a file-based asset, something stored locally on the device (maybe you downloaded it using On-Demand Resources, or maybe it was included in the bundle initially), or to a streaming asset such as an HTTP Live Streaming (HLS) master playlist. An important thing to note as you venture into using AVFoundation is that these classes are really asynchronous in nature. They're designed to avoid blocking as they're loading media or accessing remote resources.
So as a result, you're going to find that you work a lot with key-value observing as well as listening for notifications. Now, if you need to put your media assets on a remote server behind authentication, or you need some sort of custom behavior when fetching encryption keys, please take a look at the documentation for AVAssetResourceLoaderDelegate, which can help you out with all of this.
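To make that asynchronous design concrete, here's a minimal sketch (written in current Swift syntax rather than the Swift 2 of this session's era, and with a placeholder URL) of loading an asset's "playable" key before use:

```swift
import AVFoundation

// Load the "playable" key asynchronously before building a player item;
// the URL below is a placeholder.
let asset = AVURLAsset(url: URL(string: "https://example.com/master.m3u8")!)
asset.loadValuesAsynchronously(forKeys: ["playable"]) {
    var error: NSError?
    switch asset.statusOfValue(forKey: "playable", error: &error) {
    case .loaded:
        break // Safe to create an AVPlayerItem and begin playback.
    default:
        break // Handle failure, cancellation, or still-loading states.
    }
}
```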
The next layer up in this stack is AVPlayerItem. You initialize one of these by pointing it to an AVAsset. This class is then responsible for managing the presentation state of your media.
Now, this might include things like audio track selection or subtitle language selection. On tvOS, it also includes some new properties for the programmatic selection and association of metadata with a given asset. We're going to take a closer look at this in a moment.
Managing the playback of the AVPlayerItem is the job of AVPlayer. AVPlayer handles pause, play, seeking, as well as playback rate. It references a single player item at a time, but AVQueuePlayer allows you to load up a sequence of items to play back one after another. AVQueuePlayer will perform buffering and preparation for playback ahead of playback time, allowing for a much more seamless transition between disparate pieces of media.
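As a rough sketch of that, assuming placeholder URLs:

```swift
import AVFoundation

// Queue two items for back-to-back playback; AVQueuePlayer buffers
// the next item ahead of time.
let items = [
    AVPlayerItem(url: URL(string: "https://example.com/intro.m3u8")!),
    AVPlayerItem(url: URL(string: "https://example.com/episode.m3u8")!)
]
let queuePlayer = AVQueuePlayer(items: items)
```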
It should be noted, though, that each of the items in that queue is represented as an individual, unique timeline in AVPlayerViewController, as opposed to presenting all of the items end to end in a single timeline in the UI. Which leads us to AVPlayerViewController. This is part of the AVKit framework. This class provides all of the player UI and displays the audiovisual output from an instance of AVPlayer or AVQueuePlayer. As discussed earlier, it provides the system playback controls, including that gesture-based trick play functionality. It also has that info panel displaying the media metadata such as title and description, offers chapter navigation, and we now have some management of interstitial content in here as well.
But first, how do we get the metadata into that panel? Jumping back to AVPlayerItem, there are now some extensions on AVPlayerItem that come as part of AVKit on tvOS.
Now, you might choose to embed your metadata into your media file, and AVFoundation and AVKit will just utilize it to populate the info panel automatically. But what if you're unable to embed the metadata into the media file, or what if you want to perform the association of that metadata at runtime? There are now three new properties on AVPlayerItem on tvOS that allow you to programmatically manage the metadata presented in AVPlayerViewController. These properties are navigationMarkerGroups, externalMetadata and interstitialTimeRanges. So let's take a closer look at these. First, let's talk about externalMetadata. This metadata shows up in the first tab of the info panel. It includes things like the title, description, artwork, as well as content rating.
To set the external metadata property, you can provide it with an array of AVMetadataItem objects, one for each piece of supported metadata that you would like to display.
And although the system supports a very large number of metadata identifiers, the info panel currently supports five types.
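Here's a minimal sketch of populating externalMetadata; the titles and the playerItem variable are placeholders, and the helper function is purely illustrative:

```swift
import AVFoundation
import AVKit

// Build one AVMetadataItem per piece of metadata to display.
func makeMetadataItem(_ identifier: AVMetadataIdentifier,
                      value: NSCopying & NSObjectProtocol) -> AVMetadataItem {
    let item = AVMutableMetadataItem()
    item.identifier = identifier
    item.value = value
    item.extendedLanguageTag = "und" // "undetermined" applies to every locale
    return item
}

playerItem.externalMetadata = [
    makeMetadataItem(.commonIdentifierTitle, value: "Epic Intro" as NSString),
    makeMetadataItem(.commonIdentifierDescription,
                     value: "A short opening video." as NSString)
]
```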
It should also be noted that populating this information is required to inform the system of what is currently playing; this is part of enabling the "What did she say?" feature of Siri to work. We'll take a look at that a little bit later. Moving on, the navigationMarkerGroups property allows you to define important points in your media, such as chapter markers, or perhaps goals scored in a sports match of some kind. This metadata shows up in that first tab of the info panel, allowing people to select and jump to a specific point in time in the media. The navigationMarkerGroups property is set using an array of AVNavigationMarkersGroup objects.
Now, although this is an array, we currently support only the first group. Each of these groups then contains an array of AVTimedMetadataGroup objects, one for each point in time you would like to represent. Each of these, in turn, is formed from AVMetadataItem objects providing the title and artwork metadata associated with a given range in time.
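As a sketch, reusing the makeMetadataItem helper from above (the chapter title and times are placeholders):

```swift
import AVFoundation
import AVKit
import CoreMedia

// One chapter marker covering the first 90 seconds; only the first
// group in the array is currently honored.
let chapterOne = AVTimedMetadataGroup(
    items: [makeMetadataItem(.commonIdentifierTitle, value: "Chapter 1" as NSString)],
    timeRange: CMTimeRange(start: .zero,
                           duration: CMTime(seconds: 90, preferredTimescale: 600)))

playerItem.navigationMarkerGroups = [
    AVNavigationMarkersGroup(title: "Chapters", timedNavigationMarkers: [chapterOne])
]
```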
Finally, there's the interstitialTimeRanges property. Interstitials typically include regions of video that are not actually part of the core content. This might be something like advertisements, legal text or maybe some content warnings. You might want specific playback or presentation behavior during these time ranges, such as preventing people from fast-forwarding through them. The best practice when dealing with interstitials on our platforms is to provide a single, unified master playlist or media item that contains both the content and the interstitials. Now, some of you might have experienced or developed an approach where you have two separate players, one for the content and one for the interstitials, and you're constantly swapping them back and forth.
This experience tends to be really jarring for people using the app, and it doesn't provide a smooth transition as you move between the two items. It's much better to provide a stitched interstitial. Now, we'll actually collapse these interstitialTimeRanges in the UI in AVPlayerViewController. It's important to note, though, that if you're performing seeks or otherwise interacting with the asset programmatically, you're still dealing with the full asset time, inclusive of the marked interstitial regions.
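A minimal sketch of declaring interstitials, assuming two 30-second ad breaks at placeholder offsets:

```swift
import AVKit
import CoreMedia

// Mark the ad breaks so AVPlayerViewController collapses them from the
// visible timeline; programmatic seeks still use full asset time.
let adBreaks = [
    CMTimeRange(start: .zero,
                duration: CMTime(seconds: 30, preferredTimescale: 600)),
    CMTimeRange(start: CMTime(seconds: 630, preferredTimescale: 600),
                duration: CMTime(seconds: 30, preferredTimescale: 600))
]
playerItem.interstitialTimeRanges = adBreaks.map { AVInterstitialTimeRange(timeRange: $0) }
```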
So let's see how this looks. Here's a paused AVPlayerViewController. I want you to note the time underneath the scrubber mark as well as the time remaining on the right-hand side. These times only include actual content time. Even though the overall asset includes some ads, because the ad timeframes are declared as interstitials, that time is subtracted out from what is displayed in the UI. Note that we also represent the four interstitial regions with little dots on the timeline.
These affordances are really helpful for the person enjoying your content to understand where they are between advertisement breaks and to orient themselves in your program. Your AVPlayerViewControllerDelegate can now receive callbacks when the playhead crosses into an interstitialTimeRange as well as when it leaves one. This is your opportunity to perform any business logic you may need, such as disabling seek and scrub actions, which you can do by setting the requiresLinearPlayback property on AVPlayerViewController. Or, at this point, you might want to programmatically seek the playhead past that interstitial region, maybe if that commercial has already been seen once, for instance.
Now you might also want to prevent the person using your app from skipping past an interstitial region.
Well, you can do that by implementing playerViewController(_:willResumePlaybackAfterUserNavigatedFrom:to:), which is going to inform you every single time the person performs a seek using the remote. So let's look at a timeline to see how this all plays out in practice. Here we have a single, long AVAsset. The green sections are what we would describe as interstitials: legal text and advertisements. And the blue sections are the actual content, the movie or TV show itself.
But this is just one long continuous video stream. Now we're going to use the interstitialTimeRanges property to declare the green sections as interstitial regions, which means that as described earlier, we're actually going to collapse those off the visual timeline in the AVPlayerViewController.
So the top timeline in here represents the actual asset, what you're going to be interacting with programmatically.
The bottom one represents the timeline as presented in the UI. Now, when the person presses the play/pause button on the remote to begin playback, the first thing that's going to happen is you're going to get the willPresentInterstitialTimeRange callback on your AVPlayerViewControllerDelegate.
This is going to be your opportunity to do something like disable seeking using the remote by setting that requiresLinearPlayback property on AVPlayerViewController. Then, when the interstitial completes, the didPresentInterstitialTimeRange callback is going to occur, and this is your opportunity to get back into normal content mode, such as again allowing seeking through content using the Siri Remote.
Also note that the playhead hasn't actually moved on screen yet. It still shows as being at the beginning of the content, because we haven't actually watched any content yet, only an interstitial. As that first section of content plays back, the playhead moves in the UI until we get to that next interstitial, where we're going to get that willPresent callback again, and then that didPresent callback after the ad completes.
And again with the next ad, and then the callback after that last ad completes. Now, note that the playhead is never moving onscreen during any of these ad breaks, but time is still ticking along in the underlying asset. Now, what if at this point someone decides that they want to seek to a certain point in the content? Well, as long as requiresLinearPlayback is set to false, they're going to be able to use the Siri Remote to swipe the playhead back into the content, which is going to trigger the player underneath to perform a seek event.
But your AVPlayerViewController delegate also gets a callback every single time a person performs a seek event, which is going to let you know, in AVAsset time, where the seek started and where it finished. At this point, you could simply allow playback to continue, or maybe you need to enforce an ad break, at which point you could seek the user programmatically to watch that ad, and then once that ad completes, seek them back to where they originally wanted to go. And that's how you can manage interstitial content using AVPlayerViewController on tvOS.
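Pulling those callbacks together, here's a minimal delegate sketch; the seek-enforcement policy in the comments is illustrative, not prescriptive:

```swift
import AVKit
import CoreMedia

class PlaybackDelegate: NSObject, AVPlayerViewControllerDelegate {

    func playerViewController(_ playerViewController: AVPlayerViewController,
                              willPresent interstitial: AVInterstitialTimeRange) {
        // Entering an ad break: disable scrubbing for its duration.
        playerViewController.requiresLinearPlayback = true
    }

    func playerViewController(_ playerViewController: AVPlayerViewController,
                              didPresent interstitial: AVInterstitialTimeRange) {
        // Back in content: allow seeking again.
        playerViewController.requiresLinearPlayback = false
    }

    func playerViewController(_ playerViewController: AVPlayerViewController,
                              willResumePlaybackAfterUserNavigatedFrom oldTime: CMTime,
                              to targetTime: CMTime) {
        // Called after every user-initiated seek, in asset time. Here you
        // could detect a skipped ad break and seek back to enforce it.
    }
}
```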
Now, before we wrap up this section, let's take a look at a quick bit of code showing how you might instantiate the playback of a simple HLS video in an AVPlayerViewController. First, we're going to create an NSURL pointing to the master playlist, ensure that it's not nil, and create an AVURLAsset with it. We'll create an AVPlayerItem using the asset, and then an AVPlayer with that item.
Build out the AVPlayerViewController. And then assign the player to the player property of the AVPlayerViewController.
Then we're going to present it like we would any other view controller. In this case, we're actually using the completion block to trigger playback. And that's it. It's pretty straightforward to set this up.
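Here's that sequence as a sketch, assuming it runs inside a view controller and using a placeholder URL (current Swift syntax):

```swift
import AVFoundation
import AVKit

func playVideo() {
    // Create the URL, ensure it's valid, then build asset, item, and player.
    guard let url = URL(string: "https://example.com/master.m3u8") else { return }
    let asset = AVURLAsset(url: url)
    let playerItem = AVPlayerItem(asset: asset)
    let player = AVPlayer(playerItem: playerItem)

    // Build the AVPlayerViewController and hand it the player.
    let playerViewController = AVPlayerViewController()
    playerViewController.player = player

    // Present it like any other view controller; start playback on completion.
    present(playerViewController, animated: true) {
        player.play()
    }
}
```

And the trimmed variant, skipping the intermediate objects:

```swift
let player = AVPlayer(url: url) // given the same url as above
```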
But we can trim a couple of lines out if you don't need to access those intermediate objects along the way. Here, we're just going to create the AVPlayer directly from the NSURL object, as in the shorter variant above. This is for when you don't need to directly access the AVAsset or AVPlayerItem objects during that setup sequence. Dismissal of AVPlayerViewController is generally handled automatically when the view is presented modally: when the person clicks the menu button to back out, it's going to dismiss that view controller just as you would expect. However, it might be the case that you want a callback when that controller is being dismissed. Maybe you need to know that the person has decided to stop watching the video.
Or maybe you've embedded the AVPlayerViewController inside another view controller to allow for some sort of custom experience, or maybe you only want the view controller to occupy a small part of the screen. Well, in this case, you're going to have to take care of the dismissal yourself. Note that we strongly discourage subclassing AVPlayerViewController; in fact, the header says outright, don't do it. So how are you going to handle this? In this case, you're going to assign a UITapGestureRecognizer to the view of the AVPlayerViewController, allowing presses of type menu. Now, if you do this, you're also going to be responsible for pausing playback and setting the player's current item to nil. That way, you can prevent the auto-resume functionality of the AVPlayerViewController.
Otherwise, it's possible that you could experience continued playback even after you remove the object from the view hierarchy. So let's take a look at a brief code sample showing how to set up this UITapGestureRecognizer. First, we're going to create the recognizer and set it up to call a method that handles the custom dismissal. We're setting allowedPressTypes to look for presses of type menu, since in this case we only care about the menu button. And then we're going to assign that recognizer to the view of the AVPlayerViewController.
Now, this way, when the menu button is clicked and the view controller would otherwise be automatically dismissed, our method is called instead.
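Here's a sketch of that custom-dismissal path; the container class and method names are illustrative, and this uses current Swift/UIKit naming (the era's UIPressType is now UIPress.PressType):

```swift
import AVKit
import UIKit

class PlayerContainerViewController: UIViewController {
    let playerViewController = AVPlayerViewController()

    func installMenuHandler() {
        // Intercept Menu presses on the player's view so we control dismissal.
        let recognizer = UITapGestureRecognizer(target: self,
                                                action: #selector(menuPressed))
        recognizer.allowedPressTypes = [NSNumber(value: UIPress.PressType.menu.rawValue)]
        playerViewController.view.addGestureRecognizer(recognizer)
    }

    @objc func menuPressed() {
        // Pause and clear the current item to prevent auto-resume, then
        // perform whatever dismissal or removal the embedding requires.
        playerViewController.player?.pause()
        playerViewController.player?.replaceCurrentItem(with: nil)
        dismiss(animated: true, completion: nil)
    }
}
```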
And that wraps up the section on how to leverage AVPlayerViewController for media playback in your apps.
In summary, AVKit provides an amazing playback experience that people intuitively understand, and it also offers a lot of control and features for you as developers. In most cases, we strongly recommend that you use AVPlayerViewController for your media playback on tvOS. But there are some rare cases where it might be more appropriate to create your own player experience, and if you absolutely need to do that, AVPlayerLayer is available to you on tvOS.
It provides no controls or UI; it simply displays the media. Now, a great example of this use case is Zova, a fitness app that we've shown a couple of times here already and that I'd like to dig into more now. So, we're all going to do some dumbbell squats. Nobody is getting -- oh, here we go.
Excellent, excellent. So imagine that I'm doing these dumbbell squats, and we're really just going to imagine that.
And I'm either interrupted or I just want to try them again. All I have to do is swipe back on the remote and we'll start the exercise over again.
Really simple. All right. Now, more likely, maybe I don't feel like doing any dumbbell squats. Well, all I have to do now in this case is swipe to the right in order to skip on to the next exercise.
The key here is that a traditional video player experience wouldn't actually make any sense for this app.
And this sort of custom implementation actually works a whole lot better in this case. But in our experience, these kinds of use cases are very few and far between. So please be cautious before you go this route, and make sure that it really offers the best user experience for your application. Now that we've talked through the new player, I'd like to take a step back and discuss the types of media that you're likely to encounter while developing your apps.
And to kick this off, I'd like to demo a game that we've built, with an epic video sequence to kick it off. Now, our team can never seem to get this poor app right, but let's see how this goes. So, remote in hand, we're ready to rock here, and I can't tell you how excited I am to show this to you. OK. So, the campaign all kicks off with this epic intro video, so we're going to select that.
So excited. OK. Ah, here we go. All right.
So check this out. All right. It'll probably just be one.
Oh, no, I guess I need to do that a couple of times. [ Music ] Well, that's a pretty rough kickoff to a game.
Ugh. All right, well, I'm sorry about that. I realize that wasn't really quite what I promised. That was a lot of stalls.
But I'm sure you've realized that this isn't actually an app; we mocked this video up to highlight an issue that we see all the time. I'm sure you've all experienced something like this before, and it's a really disappointing way to kick off an app or get into your content. So let's talk a little more about how to avoid this. In this case, the app had a really high quality video located on a remote web server. If we had a super fast connection, we might have waited only a short while before the video started playing, and we may even have made it through the entire video without it stalling or dropping frames. But in this case, our connection really wasn't very good, and so it was a pretty rough experience. It goes to show that you can't make any assumptions about the available bandwidth on a device.
It's highly dynamic and can change from moment to moment, even in the living room environment. Which brings us to this.
There are really two core ways to deliver media content: file-based and streaming. With file-based assets, the asset typically resides locally on the device. It might have been part of the app bundle, or maybe it was downloaded using On-Demand Resources.
The file needs to be mostly downloaded before playback starts, and ideally it resides entirely on local storage, which was very much not the case for that example. This means that it's really only appropriate for very short clips if they are not stored locally.
It also doesn't support live media, only pre-recorded. With streaming assets, we're going to leverage HTTP Live Streaming, a technology that adapts to the available bandwidth: media assets are broken down into short segments that are downloaded sequentially and then reassembled while the asset is being played back. This allows for a really fast startup time and supports both live and pre-recorded streams. It's an ideal way to deliver video content to tvOS. So let's take a closer look at HLS.
HLS allows you to provide a wide variety of bitrates and resolutions of video, and AVFoundation will dynamically select the appropriate variant based on the attached output device as well as the available bandwidth. It also allows for high performance seeking and scrubbing through the use of I-frame-only playlists, enabling that trick play functionality we discussed earlier. In addition, it supports multiple audio variants as well as multiple subtitles. You can even name multiple sources for the content, such as primary and backup servers, allowing for seamless failover if necessary. The HLS master playlist is a plain text file that provides details on all of the available variants, allowing AVFoundation to make optimal choices before and during playback in order to provide the best possible experience.
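For orientation only, here's an illustrative (not production-ready) master playlist sketch with two video variants, an audio rendition, subtitles, and an I-frame playlist; every URI and value is a placeholder:

```
#EXTM3U
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aud",NAME="English",LANGUAGE="en",DEFAULT=YES,URI="audio/en/stereo.m3u8"
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="English",LANGUAGE="en",URI="subs/en.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=2000000,RESOLUTION=960x540,AUDIO="aud",SUBTITLES="subs"
mid/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=6000000,RESOLUTION=1920x1080,AUDIO="aud",SUBTITLES="subs"
high/index.m3u8
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=500000,URI="iframe/index.m3u8"
```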
You create an AVAsset by pointing it to the URL of the master playlist, which typically ends in .m3u8 as the file extension.
Now, we're not going to go into detail about how to author the playlist. But please visit developer.apple.com, where you can find the HLS authoring guide for tvOS as well as some sample streams that you can reference to help get started.
As I mentioned a moment ago, this allows the device to adapt to available bandwidth, switching between available variants as appropriate.
So let's take a look at how this might work in practice. Each of the blocks on the right hand side is a segment of video stored remotely on the server.
AVFoundation is going to sequentially download and play them back recreating the continuous video stream. So this is what might happen on a high performance connection.
The initial variant is always the 2000 kbps one, as recommended in our authoring guidelines. But in this case, because the available bandwidth is really good, AVFoundation is going to automatically switch to the highest bitrate option right away. It's also now leveraging the AC-3 audio track, because the Apple TV happens to be attached to a home theater receiver that supports multi-channel audio, and the person watching the video has opted to display English subtitles, so we're going to download those as well. Note that we're only downloading the parts that are actually required; we're not downloading all of the options all of the time. Now, that's a pretty simple example, but how about on a lower bandwidth connection that might be fluctuating? Well, starting again with that default at 2000 kbps, AVFoundation is going to automatically switch to lower bitrate options in order to ensure that video playback doesn't stall or drop frames because of the low bandwidth conditions. But if more bandwidth becomes available, it can also switch back to higher bitrate options as appropriate. The important things to note here are that playback starts quickly, it doesn't stall, and frames aren't dropped. Let's take a look at how this media is put together. Your video needs to be encoded using H.264, and Apple TV supports up to High Profile Level 4.2. We strongly recommend that you put key frames every two seconds in order to improve the seek performance of your content. And a great feature of the new Apple TV is support for 60 frames per second video, which is really great for sports and fast action content; it just looks absolutely stunning on the big screen. The authoring guide I mentioned earlier, available on developer.apple.com, describes the recommended bitrates, resolutions and frame rates for each of the video variants that you should host on your server for playback.
Note that we recommend using the 2000 kbps variant as the default, and you do this by listing it first in the master playlist. This is a good compromise between fast startup on a slow connection and a reasonable resolution on a faster one. Now, for audio, we support one or more elementary audio streams. Although you could embed the audio into the video payload, it's better to keep it separate. At a minimum, you need to provide a stereo audio track encoded using AAC.
But if you want, you can also provide a Dolby AC-3 variant to support up to 5.1 surround audio. In addition to that, with the new Apple TV we also support Enhanced AC-3, allowing for 7.1 audio. I'd also love to point out that we support discrete surround audio output for your games as well. This is a great opportunity to create a really rich listening environment that has the game extend right off the screen and into the living room. We would also encourage you to provide the correct loudness metadata in your audio files. This is going to help ensure that audio levels are consistent from media item to media item, as well as between apps.
If you provide multiple audio options, such as multiple languages, people can select their preferred language using that Audio panel in AVPlayerViewController, but tvOS will automatically select the appropriate language based on the device locale.
The panel simply allows for a bit of an override. You can also select a specific audio track programmatically if you'd like.
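A minimal sketch of programmatic selection, preferring a French track if one exists (for brevity this assumes the asset's media selection data is already loaded):

```swift
import AVFoundation

// Find the audible selection group and pick an option matching a locale.
if let group = playerItem.asset.mediaSelectionGroup(forMediaCharacteristic: .audible) {
    let french = AVMediaSelectionGroup.mediaSelectionOptions(
        from: group.options, with: Locale(identifier: "fr"))
    if let option = french.first {
        playerItem.select(option, in: group)
    }
}
```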
Another common use of additional audio track options is to provide a described video track. This allows those with vision impairments to hear a special audio track with additional narration about the visual scene, really increasing the accessibility of your content.
If you do offer a described video track, please include the special attributes in your HLS playlist; this way, we'll automatically select this track if the system accessibility preferences indicate that we should do so.
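For reference, a described video rendition is declared with the public.accessibility.describes-video characteristic; the group ID and URI in this illustrative line are placeholders:

```
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aud",NAME="English (Describes Video)",LANGUAGE="en",AUTOSELECT=YES,CHARACTERISTICS="public.accessibility.describes-video",URI="audio/en/described.m3u8"
```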
Now let's move on to trick play. Trick play offers this beautiful sliding thumbnail view, allowing people to be extremely fast and accurate when seeking to a certain spot in the video they're watching.
In order to support this when using HLS, we require the addition of some special playlists, and you have two options for supporting this.
The best option is to offer a set of dense, one-frame-per-second I-frame tracks. This is effectively a rapid sequence of thumbnails, and it provides the best possible performance and user experience. If you're unable to provide those special encodes, your playlist can also reference the I-frames in the main video media. It's not going to work as well as the prior option, but it's still much better than not providing either of them.
The last type of track to include as part of your HLS payload is subtitles. Subtitles allow those who are hearing impaired to enjoy your content by providing a text representation of the music, dialogue and sound effects. This also enables the Siri "What did she say?" functionality, which causes the video to seek backwards by 15 seconds, temporarily enabling subtitles during that section of playback.
It's a really great experience. So let's take a look. A kindred spirit.
A dragon. So I'll press the Siri button on the remote: what did she say? [ Inaudible Remarks ] Someone very dear. A kindred spirit. A dragon. And after we leave that 15-second repeat, it's just going to turn subtitles off again. It's great when you're in a noisy room and want to see what was said right there; all you have to do is include subtitles in your video, set the metadata, and you're set. It's a really cool feature.
You provide captions or subtitles either by embedding them into the video transport stream itself or by providing subtitles in the WebVTT format. Please do include the appropriate attributes and language tags in your master playlist so that AVFoundation can make selections appropriately based on the accessibility and localization preferences set on the device.
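For reference, a WebVTT subtitle file is just a plain text file of timed cues; this tiny example uses made-up timings:

```
WEBVTT

00:00:01.000 --> 00:00:04.000
[dragon roars]

00:00:05.000 --> 00:00:07.500
Someone very dear. A kindred spirit.
```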
There's also a great session from WWDC 2013 that goes into a lot of depth on managing media accessibility and we really encourage you to check it out.
One other important note: like all network connections starting with iOS 9 and now tvOS, App Transport Security requires the use of secured connections. We really encourage you to deliver your playlists and media content securely, as this is going to be a requirement in the future. So it's best not to simply put a whole bunch of exceptions into your Info.plist and forget about it.
We would encourage you to migrate to secured delivery as soon as you possibly can. Please take a look at the WWDC session from this year entitled "Networking with NSURLSession" for more information on App Transport Security. Now, once you've encoded all your media, hosted it on your web server and created your playlists, you want to make sure that you've done this all properly. To help with this, we provide a tool called Media Stream Validator as part of our HTTP Live Streaming Tools package. It's a command line tool that you point at your master playlist, and it validates the syntax of your playlist, the serving of the assets, as well as the validity of the media. Note that it doesn't actually assess asset quality, though, so look to other quality control methods to ensure that your encoding is delivering a great looking and sounding experience.
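Invocation is a single command pointed at your master playlist; the URL here is a placeholder:

```
mediastreamvalidator https://example.com/master.m3u8
```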
The last point I want to touch on in this segment is FairPlay Protected Streaming. It's really important for anyone delivering content to tvOS, iOS and OS X devices who needs to protect their content from capture. FairPlay Streaming is the strongest content protection available on Apple devices, and it's already broadly utilized in the premium content industry. Key delivery is individualized, and decryption is managed entirely inside of the kernel boundary using industrial strength content protection techniques.
We also recently announced that this technology is now available as part of the Apple Developer Program and that separate licensing is no longer required. FPS provides a secure key delivery mechanism that ensures the encryption key for your content is protected both while it's in transit over the network to the client and while it resides on the client during playback.
You can also integrate this into existing key server infrastructure. So maybe you already have some other encryption mechanisms or key serving that you work with; you can integrate this as well. FairPlay Protected Streaming also ensures that the HDMI connection is HDCP compliant, and it will refuse video output if it's not. Now, if you want to manage policies around the distribution of keys for playback of your media, such as the number of devices allowed to view your content at any given time, you're going to have to build out those elements separately, but they're very straightforward to combine with FPS. We don't provide any sort of rights expression engine as part of this. As mentioned earlier, this was a key topic at WWDC 2015, and we'd encourage you to check out that session for a detailed discussion of FairPlay Protected Streaming and its implementation. So, many if not most apps on tvOS are going to involve media playback of some kind. In this session, we've discussed how to leverage the rich APIs offered by AVFoundation in order to play back media content from both file and streaming sources, how to use AVPlayerViewController to deliver a great native playback experience, how HTTP Live Streaming allows streaming of your high quality live or on-demand content over a wide variety of network conditions, and how FairPlay Protected Streaming can offer you industrial strength content protection for your media. For more information on these topics, please visit the AVFoundation, HTTP Live Streaming and FairPlay Protected Streaming landing pages on developer.apple.com for links to the relevant programming guides and documentation. Thank you very much.