Learn about the Common Media Application Format (CMAF).
The **Common Media Application Format (CMAF)** is an extensible standard for encoding, packaging, and decoding segmented media objects. In other words, it is a standard media streaming format that uses a single set of files for all target platforms and devices. Its goal is to consolidate competing codecs, protocols, media formats, and platforms into a single format. Doing so removes the need to provide multiple copies of the same content, which improves efficiency and can theoretically save over 70% of the encoding, packaging, and storage demand.
CMAF uses a fragmented MP4 file format that enables the use of only one set of video and audio files along with a lightweight manifest that maps video, audio, and other meta information together for presentation and rights management. Online Video Platforms (OVPs) can encode or transcode, package, and store the video only once, thus saving on storage and processing needs. CMAF also provides low-latency streaming and efficient caching due to its fragmented encoding and transfer paradigm.
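To make the fragmented MP4 layout more concrete, the following minimal sketch walks the top-level boxes of a byte string. It assumes the usual ISO Base Media File Format layout (a 32-bit big-endian size followed by a four-character type) and uses synthetic, empty boxes as a stand-in for a CMAF Header (`ftyp` plus `moov`) and one CMAF Fragment (`moof` plus `mdat`). The helper names `make_box` and `iter_top_level_boxes` are hypothetical and not part of any library.

```python
import struct
from typing import Iterator, Tuple


def make_box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Build a minimal box: 32-bit big-endian size, 4-byte type, payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


def iter_top_level_boxes(data: bytes) -> Iterator[Tuple[str, int]]:
    """Yield (type, size) for each top-level box found in data."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            (size,) = struct.unpack_from(">Q", data, offset + 8)
        elif size == 0:
            # A size of 0 means the box runs to the end of the data.
            size = len(data) - offset
        if size < 8:
            raise ValueError(f"invalid box size {size} at offset {offset}")
        yield box_type.decode("ascii"), size
        offset += size


# Synthetic stand-ins: real boxes carry codec setup and media samples.
header = make_box(b"ftyp", b"cmfc" + bytes(4)) + make_box(b"moov")
fragment = make_box(b"moof") + make_box(b"mdat", bytes(16))

print(list(iter_top_level_boxes(header + fragment)))
# [('ftyp', 16), ('moov', 8), ('moof', 8), ('mdat', 24)]
```

Because every fragment is a self-describing `moof`/`mdat` pair, a server or cache can store and deliver fragments individually without rewriting the header.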
CMAF also supports Digital Rights Management (DRM) for streamed content. It supports AES-CTR, or Counter mode (Widevine and PlayReady), and AES-CBC, or Cipher Block Chaining mode (Apple FairPlay), both of which are part of the Common Encryption Scheme (CENC).
The standard allows a wide range of implementations, including HTTP Live Streaming (HLS) and MPEG's Dynamic Adaptive Streaming over HTTP (MPEG-DASH). CMAF specifications define the following media objects:
The CMAF Hypothetical Reference Model defines how tracks can be delivered, combined, and synchronized in CMAF Presentations. Because it is a hypothetical reference model, any compatible implementation can be used. Different implementations can share the same resources, called CMAF Addressable Objects, which allows efficient caching even when delivering to multiple platforms. CMAF Addressable Media objects consist of:
In HLS, the manifest consists of an HLS Multivariant Playlist and Media Playlists, which describe a single CMAF Presentation or a sequence of CMAF Presentations. The HLS Multivariant Playlist defines different tiers of the presentation using `EXT-X-STREAM-INF` tags. Tiers differ in bit rate, required codecs, resolution, other attributes, and the HLS Media Playlist they specify. Each tier may also have additional HLS Renditions, which are also Media Playlists, declared by `EXT-X-MEDIA` tags. An HLS Rendition can present video, audio, or subtitles. `EXT-X-MEDIA` tags associate video, audio, and subtitles with one another so that they are presented together under a single `EXT-X-STREAM-INF` tag. This enables several `EXT-X-STREAM-INF` tags to share a single HLS Rendition.
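Both `EXT-X-STREAM-INF` and `EXT-X-MEDIA` carry their information as a comma-separated attribute list. As a minimal sketch, assuming well-formed input, such a tag line can be split into a tag name and a dictionary of attributes. The function name `parse_tag` is hypothetical, and the regular expression is a simplification rather than a full implementation of the HLS grammar.

```python
import re
from typing import Dict, Tuple

# NAME=value pairs. A value is either a quoted string, which may itself
# contain commas, or a run of characters up to the next comma.
_ATTRIBUTE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^",]*)')


def parse_tag(line: str) -> Tuple[str, Dict[str, str]]:
    """Split a playlist tag line into its name and a dict of attributes."""
    name, _, attribute_list = line.lstrip("#").partition(":")
    attributes = {
        key: value.strip('"') for key, value in _ATTRIBUTE.findall(attribute_list)
    }
    return name, attributes


tag, attrs = parse_tag(
    '#EXT-X-STREAM-INF:BANDWIDTH=1123000,CODECS="avc1.64001f,mp4a.40.2",'
    'AUDIO="audio-stereo-64",RESOLUTION=620x334'
)
print(tag)                        # EXT-X-STREAM-INF
print(attrs["CODECS"].split(","))  # ['avc1.64001f', 'mp4a.40.2']
print(attrs["AUDIO"])             # audio-stereo-64
```

The `AUDIO` value of a tier is what links it to every `EXT-X-MEDIA` tag whose `GROUP-ID` has the same value, which is how several tiers can share one Rendition.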
Each CMAF Track has one HLS Media Playlist, which contains CMAF Segments. Each CMAF Segment has an `EXT-X-MAP` tag that references the CMAF Header and accompanying CMAF Fragments. The `EXT-X-INDEPENDENT-SEGMENTS` tag should be included in the HLS Media Playlist, since all CMAF Fragments are independently decodable. If the data is encrypted, the `EXT-X-SESSION-KEY` tag should be included in the HLS Multivariant Playlist to enable prefetching of keys. The `EXT-X-BYTERANGE` tag indicates that a CMAF Segment is a byte range inside a larger resource. The `EXT-X-I-FRAMES-ONLY` tag indicates that CMAF Segments start on a CMAF Fragment boundary. The `EXT-X-DISCONTINUITY` tag is used to concatenate multiple CMAF Tracks of the same media type in a Media Playlist.
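The mapping from a CMAF Track to a Media Playlist can be sketched as a small generator. This is an illustrative sketch only: the function name `build_media_playlist` is hypothetical, the playlist is assumed to be video on demand, and the target duration is taken as the longest segment duration rounded up, which is one simple way to choose a valid upper bound.

```python
import math
from typing import List, Tuple


def build_media_playlist(header_uri: str, segments: List[Tuple[str, float]]) -> str:
    """Render a VOD Media Playlist for one CMAF Track.

    header_uri points at the CMAF Header; segments is a list of
    (uri, duration_in_seconds) pairs, one per CMAF Segment.
    """
    target_duration = math.ceil(max(duration for _, duration in segments))
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:6",
        f"#EXT-X-TARGETDURATION:{target_duration}",
        "#EXT-X-PLAYLIST-TYPE:VOD",
        # CMAF Fragments are independently decodable.
        "#EXT-X-INDEPENDENT-SEGMENTS",
        # The CMAF Header is referenced once, before the first segment.
        f'#EXT-X-MAP:URI="{header_uri}"',
    ]
    for uri, duration in segments:
        lines.append(f"#EXTINF:{duration:.1f},")
        lines.append(uri)
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


print(build_media_playlist("VIDEO-HEADER", [("VF1", 2.0), ("VF2", 2.0), ("VF3", 2.0)]))
```

Encryption, byte ranges, and discontinuities are left out to keep the sketch short; each would add its own tag before the segments it applies to.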
Each Track in a video CMAF Switching Set should appear in the Multivariant Playlist as a Media Playlist URI. The URI is preceded by an `EXT-X-STREAM-INF` tag that describes the Track and specifies additional renditions that are intended to play with the video by referencing the appropriate `EXT-X-MEDIA` tag.
Each Track in an audio CMAF Switching Set should be represented in the Multivariant Playlist by an `EXT-X-MEDIA` tag. The `URI` attribute of the tag references the Media Playlist for one or more Tracks.
CMAF Selection Sets can offer either alternate encodings of the same source content or homogeneous encodings of different versions of the source content. In the first case, each Switching Set in the Selection Set appears as a set of `EXT-X-STREAM-INF` tags, for video, or a set of `EXT-X-MEDIA` tags, for other media types. In the second case, each Track of a member Switching Set should appear as an `EXT-X-MEDIA` tag.
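The rules above for representing Switching Sets can be sketched in code. The following is a simplified illustration, not a complete packager: the `AudioTrack` and `VideoTrack` classes, the function names, and all URIs, bit rates, and codec strings are hypothetical values chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AudioTrack:
    name: str
    language: str
    uri: str
    default: bool = False


@dataclass
class VideoTrack:
    uri: str
    bandwidth: int
    codecs: str
    resolution: str


def audio_lines(group_id: str, tracks: List[AudioTrack]) -> List[str]:
    """One EXT-X-MEDIA tag per audio Track, all sharing one GROUP-ID."""
    lines = []
    for track in tracks:
        default = "YES" if track.default else "NO"
        lines.append(
            f'#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="{group_id}",NAME="{track.name}",'
            f'LANGUAGE="{track.language}",DEFAULT={default},AUTOSELECT=YES,'
            f'URI="{track.uri}"'
        )
    return lines


def video_lines(tracks: List[VideoTrack], audio_group: str) -> List[str]:
    """One EXT-X-STREAM-INF tag plus Media Playlist URI per video Track."""
    lines = []
    for track in tracks:
        lines.append(
            f'#EXT-X-STREAM-INF:BANDWIDTH={track.bandwidth},CODECS="{track.codecs}",'
            f'RESOLUTION={track.resolution},AUDIO="{audio_group}"'
        )
        lines.append(track.uri)
    return lines


audio = [AudioTrack("English", "en", "audio-en.m3u8", default=True)]
video = [
    VideoTrack("low.m3u8", 1000000, "avc1.64001f,mp4a.40.2", "640x360"),
    VideoTrack("high.m3u8", 5000000, "avc1.640028,mp4a.40.2", "1920x1080"),
]
playlist = ["#EXTM3U", "#EXT-X-VERSION:6"] + audio_lines("aud", audio) + video_lines(video, "aud")
print("\n".join(playlist))
```

A Selection Set that offers alternate codecs would call `video_lines` once per Switching Set, so that each codec contributes its own group of `EXT-X-STREAM-INF` entries.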
An example of an HLS Media Playlist for a video CMAF Track (video.m3u8) that is built from 2-second-long CMAF Fragments (VF1, VF2, VF3):
```
#EXTM3U
#EXT-X-TARGETDURATION:4
#EXT-X-VERSION:6
#EXT-X-PLAYLIST-TYPE:VOD
#EXT-X-MAP:URI="VIDEO-HEADER"
#EXTINF:2.0,
VF1
#EXTINF:2.0,
VF2
#EXTINF:2.0,
VF3
#EXT-X-ENDLIST
```
The Media Playlist for the video is accompanied by the Media Playlist for the audio CMAF Track (audio.m3u8), which is built from 2-second-long CMAF Fragments (AF1, AF2, AF3):
```
#EXTM3U
#EXT-X-TARGETDURATION:4
#EXT-X-VERSION:6
#EXT-X-PLAYLIST-TYPE:VOD
#EXT-X-MAP:URI="AUDIO-HEADER"
#EXTINF:2.0,
AF1
#EXTINF:2.0,
AF2
#EXTINF:2.0,
AF3
#EXT-X-ENDLIST
```
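As a quick check on the two Media Playlists above, the following sketch reads each one back and confirms that the video and audio Tracks reference their own CMAF Header, have the same total duration, and keep every segment within the declared target duration. The function name `parse_media_playlist` is hypothetical, and the parser handles only the tags used in these examples.

```python
from typing import Dict

VIDEO = """#EXTM3U
#EXT-X-TARGETDURATION:4
#EXT-X-VERSION:6
#EXT-X-PLAYLIST-TYPE:VOD
#EXT-X-MAP:URI="VIDEO-HEADER"
#EXTINF:2.0,
VF1
#EXTINF:2.0,
VF2
#EXTINF:2.0,
VF3
#EXT-X-ENDLIST
"""
AUDIO = VIDEO.replace("VIDEO-HEADER", "AUDIO-HEADER").replace("VF", "AF")


def parse_media_playlist(text: str) -> Dict[str, object]:
    """Extract the header URI, target duration, and (uri, duration) segments."""
    target, header, segments, pending = None, None, [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#EXT-X-TARGETDURATION:"):
            target = int(line.split(":", 1)[1])
        elif line.startswith("#EXT-X-MAP:"):
            header = line.split('URI="', 1)[1].split('"', 1)[0]
        elif line.startswith("#EXTINF:"):
            # The duration precedes the comma; an optional title follows it.
            pending = float(line.split(":", 1)[1].split(",", 1)[0])
        elif line and not line.startswith("#"):
            segments.append((line, pending))
            pending = None
    return {"target": target, "header": header, "segments": segments}


video = parse_media_playlist(VIDEO)
audio = parse_media_playlist(AUDIO)
video_total = sum(duration for _, duration in video["segments"])
audio_total = sum(duration for _, duration in audio["segments"])

print(video["header"], audio["header"])  # VIDEO-HEADER AUDIO-HEADER
print(video_total == audio_total)        # True
print(all(round(d) <= video["target"] for _, d in video["segments"]))  # True
```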
Media Playlists can be grouped into a Multivariant Playlist that enables the selection of an appropriate configuration. The following example shows that CMAF Selection Sets can appear as separate Renditions (english.m3u8 and slovene.m3u8) or as separate sets of tiers determined by different codecs: video.m3u8 and video-hq.m3u8 together present the AVC Switching Set, and hevc-video.m3u8 and hevc-video-hq.m3u8 together present the HEVC Switching Set. Together they form a Selection Set that allows the selection of a codec.
```
#EXTM3U
#EXT-X-VERSION:6
#EXT-X-INDEPENDENT-SEGMENTS
#EXT-X-MEDIA:NAME="English",TYPE=AUDIO,GROUP-ID="audio-stereo-64",LANGUAGE="en",DEFAULT=YES,AUTOSELECT=YES,URI="english.m3u8"
#EXT-X-MEDIA:NAME="Slovene",TYPE=AUDIO,GROUP-ID="audio-stereo-64",LANGUAGE="sl",DEFAULT=NO,AUTOSELECT=YES,URI="slovene.m3u8"
#EXT-X-MEDIA:NAME="English",TYPE=AUDIO,GROUP-ID="audio-stereo-128",LANGUAGE="en",DEFAULT=YES,AUTOSELECT=YES,URI="english-hi.m3u8"
#EXT-X-MEDIA:NAME="Slovene",TYPE=AUDIO,GROUP-ID="audio-stereo-128",LANGUAGE="sl",DEFAULT=NO,AUTOSELECT=YES,URI="slovene-hi.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=1123000,CODECS="avc1.64001f,mp4a.40.2",AUDIO="audio-stereo-64",RESOLUTION=620x334
video.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=8187000,CODECS="avc1.640028,mp4a.40.2",AUDIO="audio-stereo-128",RESOLUTION=1916x1032
video-hq.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=623000,CODECS="hvc1.1.6.L120.B0,mp4a.40.2",AUDIO="audio-stereo-64",RESOLUTION=620x334
hevc-video.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=4187000,CODECS="hvc1.1.6.L120.B0,mp4a.40.2",AUDIO="audio-stereo-128",RESOLUTION=1916x1032
hevc-video-hq.m3u8
```
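The selection that this Multivariant Playlist enables can be sketched as follows. The tiers and audio groups are transcribed by hand from the playlist above. The `select` function and its rule (take the first supported codec, then the highest bit rate that fits the available bandwidth) are assumptions made for illustration, since real players weigh many more factors.

```python
from typing import List, Optional, Tuple

# Tiers from the EXT-X-STREAM-INF tags of the example playlist.
VARIANTS = [
    {"uri": "video.m3u8", "bandwidth": 1123000, "video_codec": "avc1.64001f", "audio": "audio-stereo-64"},
    {"uri": "video-hq.m3u8", "bandwidth": 8187000, "video_codec": "avc1.640028", "audio": "audio-stereo-128"},
    {"uri": "hevc-video.m3u8", "bandwidth": 623000, "video_codec": "hvc1.1.6.L120.B0", "audio": "audio-stereo-64"},
    {"uri": "hevc-video-hq.m3u8", "bandwidth": 4187000, "video_codec": "hvc1.1.6.L120.B0", "audio": "audio-stereo-128"},
]

# Renditions from the EXT-X-MEDIA tags, keyed by GROUP-ID and then NAME.
AUDIO_GROUPS = {
    "audio-stereo-64": {"English": "english.m3u8", "Slovene": "slovene.m3u8"},
    "audio-stereo-128": {"English": "english-hi.m3u8", "Slovene": "slovene-hi.m3u8"},
}


def select(codec_preference: List[str], available_bps: int, audio_name: str) -> Optional[Tuple[str, str]]:
    """Return (video playlist, audio playlist), or None if nothing fits."""
    for prefix in codec_preference:
        # Each codec prefix corresponds to one Switching Set of tiers.
        candidates = [
            v for v in VARIANTS
            if v["video_codec"].startswith(prefix) and v["bandwidth"] <= available_bps
        ]
        if candidates:
            best = max(candidates, key=lambda v: v["bandwidth"])
            return best["uri"], AUDIO_GROUPS[best["audio"]][audio_name]
    return None


print(select(["hvc1", "avc1"], 5000000, "Slovene"))  # ('hevc-video-hq.m3u8', 'slovene-hi.m3u8')
print(select(["avc1"], 2000000, "English"))          # ('video.m3u8', 'english.m3u8')
```

Choosing the codec first corresponds to picking a Switching Set within the Selection Set. Moving between tiers afterwards stays inside that Switching Set.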
A CDN, or "Content Delivery Network," is a network of servers (typically placed around the world) used to deliver content (such as videos, photos, and CSS).