When you upload a video to the internet, you want to reach the broadest audience possible and deliver the best possible viewing experience. Video encoding optimizes your video for transmission via the internet and compatibility with different devices and platforms. Understanding this concept will help you troubleshoot issues and improve your video strategy.
- What is video encoding?
- Why do videos need encoding?
- The solution: video compression
- How does video compression work?
- What are codecs?
- What are video containers?
- What is adaptive bitrate?
- Video encoding vs video transcoding
- Live stream encoding
What is Video Encoding?
So, what does encoding mean? What does a video encoder do? In short, video encoding is the initial conversion of video (in essence, light) to a digital format that is usable for software and hardware applications. As part of the process, the video file size is reduced through compression to facilitate its consumption.
Why do videos need encoding?
Video encoding is essential for OTT video streaming, both live and on-demand. To best explain video encoding, let’s first look at the two problems it’s trying to solve.
With video being so popular, countless ways to capture, share, apply, and view video have brought us so many different video formats over time, that compatibility quickly became an issue. Encoding and transcoding let you match the video format with your use case and objective.
The second problem is file size. Video is a heavy and bulky data format. Think about it: most videos are shot at 30 frames per second, so one minute of video is essentially 1,800 images. Make those images HD or 4K quality and you've got yourself a massive file. That's why video encoding is required.
Raw video has a constant data rate (the speed at which data is transmitted within a computer or between a computer and a device or network, measured in bits per second) based on pixel representation, image resolution, and frame rate. A video stream in HD 1080 at 30 frames per second has a data rate of approximately 1.5 Gbit/s. According to Statista.com, the average global internet connection speed as of August 2020 was 34.82 Mbps on mobile and 84.33 Mbps on fixed broadband. Streaming a 4K film on such a connection would be akin to trying to channel a river through a drinking straw.
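That 1.5 Gbit/s figure is easy to verify yourself. A quick sketch of the arithmetic, assuming 8 bits per channel RGB (24 bits per pixel) and no compression at all:

```python
# Back-of-the-envelope raw video data rate: pixels per frame,
# times bits per pixel, times frames per second.
def raw_bitrate_bps(width: int, height: int, fps: int, bits_per_pixel: int = 24) -> int:
    """Bits per second for uncompressed video."""
    return width * height * bits_per_pixel * fps

hd = raw_bitrate_bps(1920, 1080, 30)        # full HD at 30 fps
print(f"{hd / 1e9:.2f} Gbit/s")             # 1.49 Gbit/s

# Compare with a typical fixed-broadband connection (~84.33 Mbit/s):
print(f"{hd / 84.33e6:.0f}x the connection's capacity")
```

In other words, the raw stream needs roughly 18 times more bandwidth than the average fixed-broadband connection can provide, before we even get to 4K.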
The solution: Video compression
To reduce the data size and enable transmission over an internet connection, we'll need to remove unnecessary information, a process known as lossy compression. This can be done using compression techniques such as interframe compression, image resizing, and chroma subsampling. More on that later.
The higher the compression, the more data is irreversibly thrown out. On the viewer’s end, the video is built up again, minus the data that was removed, resulting in a lower-quality rendition of the content, the dreaded pixelated or muddy image.
It’s this balance that forms the main challenge of encoding: low compression delivers a higher image quality with a higher risk of buffering and lagging, whereas high compression yields a lower image quality but smooth playback. This challenge resulted in a variety of compression formats for different use cases (more on those later).
How Does Video Compression Work?
In a sentence, the video encoding software detects and removes inessential information within the video data. The positive result of this data loss is the smaller file size. The negative consequence is that trimmed data cannot be recovered at playback (lossy compression), resulting in lower image quality.
Several compression techniques can do the job. Let’s take a look at the three most common techniques: interframe compression, image resizing, and chroma subsampling.
Interframe compression (motion compensation)
Let’s look again at our 30 images per second video. From frame to frame, much data will be repetitive or identical. A three-second shot of a motionless landscape is essentially 90 times the same image. Interframe compression removes the data of the non-essential images and replaces them with a reference.
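To make the idea concrete, here's a toy sketch of the keyframe-plus-deltas principle. Real codecs use motion vectors and residuals rather than per-pixel diffs, but the core trick is the same: don't re-send data that hasn't changed.

```python
# Toy interframe compression: store the first frame in full (a
# "keyframe"), then store only the pixels that changed in each
# following frame.
def compress(frames):
    keyframe = frames[0]
    deltas = []
    for prev, cur in zip(frames, frames[1:]):
        # Record (position, new_value) for changed pixels only.
        deltas.append([(i, v) for i, (p, v) in enumerate(zip(prev, cur)) if p != v])
    return keyframe, deltas

def decompress(keyframe, deltas):
    frames = [list(keyframe)]
    for delta in deltas:
        frame = list(frames[-1])
        for i, v in delta:
            frame[i] = v
        frames.append(frame)
    return frames

# A "motionless landscape": three identical frames compress to one
# keyframe plus two empty delta lists.
frames = [[10, 10, 20, 20]] * 3
key, deltas = compress(frames)
assert deltas == [[], []]
assert decompress(key, deltas) == frames
```

The three-second landscape shot from above collapses from 90 full images to one image plus 89 (nearly empty) change lists.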
Image resizing (lowering the resolution)
Lowering the resolution is another way to reduce the amount of data in a video. By removing pixels, the image size is reduced from, e.g., 1920×1080 pixels (full HD) to 1280×720 pixels (HD), and even lower. The further you reduce the number of pixels, the larger each pixel becomes when the video is displayed full screen, resulting in a blocky or pixelated image. Adaptive bitrate streaming is based on this compression technique (more on that later).
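A minimal sketch of what downscaling does, using simple 2×2 block averaging on a grayscale image (real resamplers use smarter filters, but the data reduction is the same: half the width and half the height means a quarter of the pixels):

```python
# Average each 2x2 block of pixels into one, halving both dimensions
# (e.g. 1920x1080 -> 960x540). Assumes even dimensions for brevity.
def downscale_2x(pixels):
    """pixels: 2D list of grayscale values; returns an image half the size."""
    h, w = len(pixels), len(pixels[0])
    return [
        [
            (pixels[y][x] + pixels[y][x + 1]
             + pixels[y + 1][x] + pixels[y + 1][x + 1]) // 4
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]

img = [[0, 0, 100, 100],
       [0, 0, 100, 100]]
print(downscale_2x(img))  # [[0, 100]] -- four times fewer pixels to store
```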
Chroma subsampling
A third compression technique we'll discuss here is reducing the amount of color information in a video frame, otherwise known as chroma subsampling. The color quality is reduced while luminance is emphasized, by switching from RGB channels to a YCbCr color space. This compression technique can reduce your file size by 50%. On the downside, chroma subsampling can lead to color artifacts in your image.
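Here is where that 50% figure comes from, assuming the common 4:2:0 scheme: luma (Y) stays at full resolution, while each of the two chroma planes (Cb, Cr) is stored at half width and half height, i.e. a quarter of the samples.

```python
# Samples per pixel: one luma plane plus two chroma planes, with the
# chroma planes stored at some fraction of full resolution.
def samples_per_pixel(chroma_fraction=1.0):
    return 1.0 + 2 * chroma_fraction

full = samples_per_pixel(chroma_fraction=1.0)   # 4:4:4 -> 3.0 samples/pixel
sub  = samples_per_pixel(chroma_fraction=0.25)  # 4:2:0 -> 1.5 samples/pixel
print(f"savings: {1 - sub / full:.0%}")          # savings: 50%
```

The trick works because human vision is far more sensitive to brightness detail than to color detail, so halving the chroma resolution is usually hard to notice.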
What Are Codecs?
Video encoding can apply various compression standards. These are called ‘codecs’, a contraction of the words “encoder” and “decoder”. Each codec has two components: an encoder that compresses the video for transmission and a decoder that builds the video back up for playback.
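To make the encoder/decoder contract concrete, here's a toy lossless codec (simple run-length encoding). Real video codecs like h.264 are vastly more sophisticated, but they honor the same contract: decoding the encoded stream plays back the content.

```python
# Toy codec: the encoder turns a row of pixels into (value, run_length)
# pairs; the decoder expands them back.
def encode(pixels):
    runs, count = [], 1
    for prev, cur in zip(pixels, pixels[1:]):
        if cur == prev:
            count += 1
        else:
            runs.append((prev, count))
            count = 1
    runs.append((pixels[-1], count))
    return runs

def decode(runs):
    return [value for value, count in runs for _ in range(count)]

row = [5, 5, 5, 5, 9, 9, 5]
packed = encode(row)
print(packed)                  # [(5, 4), (9, 2), (5, 1)]
assert decode(packed) == row   # round-trips exactly (lossless)
```

Note that this toy example is lossless; the video codecs discussed below additionally throw away data (lossy compression), which is why playback quality depends on the compression level chosen.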
Each of these codecs comes with specifications that are suitable for different use cases or applications. Some of the most widely used video codecs are h.264, h.265, VP9, AV1, MPEG-2, and RV40. The audio stream of a video uses its own compression standards, such as MP3, FLAC, and the most popular, AAC.
Also known as AVC (Advanced Video Coding), h.264 is the most commonly used video codec. It's considered the best video encoding format for its combination of high quality, bitrate efficiency, and encoding speed.
h.265 or HEVC (High-Efficiency Video Coding) offers a compression rate double that of h.264, making the file size half as large for excellent image quality. This makes h.265 a great solution for high-resolution video like 2K or 4K, and for live streaming. On the downside, h.265 requires much more resources to encode and isn’t widely supported (yet).
VP9 is an open-source video codec developed by Google that delivers a consistent and reliable stream. The biggest issue with VP9 is that it is not supported by Apple.
What Are Video Containers?
Once your video file is encoded, it gets packaged into a video container. Video containers encapsulate the video codec, audio codec, and metadata such as subtitles, thumbnails, et cetera. The video container dictates the compatibility with devices and apps. The most common containers are .MP4, .MOV, .WMV, .MKV, .FLV, and .AVI.
MP4 is the most commonly used and versatile video format, developed by the International Organization for Standardization (ISO). Its ubiquity comes at the cost of somewhat heavier compression than some alternatives. MP4 is said to support h.264 encoded videos the best.
Developed for its QuickTime media player in the 1990s, MOV is Apple's answer to Microsoft's AVI format. This video container has much the same specs as MP4.
Developed by Microsoft in 1992, AVI is the granddaddy of all video containers and has since been replaced by WMV. This container applies less compression for higher image quality. AVI doesn't support modern codecs like h.264, nor does it support chaptering and metadata.
How to check the video encoding format and codec?
Both the video codec and the container are listed in the video file metadata. On a Mac computer, you can access this information by control-clicking the video file and choosing 'Get Info' in the context menu. The container is listed under General as 'Kind', while the codecs are listed under More Info. You'll typically see a video codec and an audio codec (assuming your video has both). On PC, you can access the same information by right-clicking your video file and selecting Properties from the context menu. You can also find the codec of a video through video editing software or media players like VLC.
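If you're curious what "listed in the file metadata" means at the byte level: an MP4-family file begins with an 'ftyp' box that names the container's "major brand". A minimal, illustrative parser (real tools like VLC or ffprobe read much more than this):

```python
import struct

# An MP4/MOV-style file starts with an 'ftyp' box: a 4-byte big-endian
# box size, the ASCII tag b'ftyp', then a 4-byte major brand.
def mp4_major_brand(data: bytes) -> str:
    size, box_type = struct.unpack(">I4s", data[:8])
    if box_type != b"ftyp":
        raise ValueError("not an MP4/MOV-style container")
    return data[8:12].decode("ascii")

# Simulated first 16 bytes of a typical MP4 file
# (major brand 'isom', minor version 512):
header = struct.pack(">I4s4sI", 16, b"ftyp", b"isom", 512)
print(mp4_major_brand(header))  # isom
```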
What Is Adaptive Bitrate?
First, what is bitrate? Bitrate is the measurement of the speed at which data is transmitted across a network, expressed in bits per second (Kbps, Mbps, or Gbps).
We mentioned earlier that the average global connection speed currently stands at 34.82 Mbps on mobile and 84.33 Mbps on fixed broadband. Basing your compression level on a global average would not be user-friendly in this age of hyper-personalized service. It will deliver a subpar viewing experience to anyone with a connection below or above the average, which is of course the majority. The trick would be to deliver the highest quality that each individual user’s internet connection or viewing application allows for seamless streaming.
This is where adaptive bitrate (ABR) streaming comes in. ABR (sometimes confused with variable bitrate, or VBR, which is an encoding setting rather than a delivery technique) delivers the viewer the best bitrate (and thus image quality) their internet connection allows. When the connection is great, the video quality is high. If the viewer has a slower internet connection or if the strength of the signal is lower at the viewer's geographical location, the video quality will be automatically lowered to avoid buffering. This can even happen mid-video. Because viewers tend to prefer a reliable stream over high video quality, ABR has become one of the go-to video encoding technologies.
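The core ABR decision is simple to sketch: encode the video at several quality levels (a "bitrate ladder"), measure the viewer's throughput from recent segment downloads, and pick the highest rendition the connection can sustain. The ladder values below are illustrative, not a standard:

```python
# Illustrative bitrate ladder: rendition height -> bitrate in kbps.
LADDER_KBPS = {240: 400, 360: 800, 720: 2500, 1080: 5000}

def pick_rendition(measured_kbps: float, safety: float = 0.8):
    """Return (height, bitrate) of the best rendition that fits."""
    budget = measured_kbps * safety  # leave headroom to avoid buffering
    fitting = [(h, b) for h, b in LADDER_KBPS.items() if b <= budget]
    if not fitting:
        # Connection is too slow for everything: fall back to the lowest rung.
        return min(LADDER_KBPS.items(), key=lambda hb: hb[1])
    return max(fitting, key=lambda hb: hb[1])

print(pick_rendition(7000))  # fast connection -> (1080, 5000)
print(pick_rendition(1500))  # mid-video slowdown -> (360, 800)
```

The `safety` headroom reflects the reliability-over-quality preference mentioned above: players deliberately undershoot the measured bandwidth rather than risk a rebuffer.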
Some of the most popular video formats supporting ABR are Apple’s HLS (HTTP Live Streaming) and the open-source format MPEG-DASH (Dynamic Adaptive Streaming over HTTP). More on these formats and on Adaptive Bitrate streaming in our Low Latency Video Streaming Guide.
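In HLS, for example, the bitrate ladder is declared in a master playlist that points to one media playlist per rendition; the player switches between them as conditions change. A minimal illustrative example (paths and values are made up for the sketch):

```
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
360p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
720p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080p/index.m3u8
```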
Video Encoding vs Video Transcoding
Now that you know more about the video encoding and decoding process, we should also look at a related process: video transcoding.
What is video transcoding?
Video encoding transforms raw video into a compressed format ready for transmission and playback in software or hardware applications. Video transcoding is the process of decompressing a video file from one codec and compressing it again into another codec. In other words, transcoding is a form of encoding that converts video files.
Transcoding is typically applied for compatibility reasons, to change a video’s bitrate, resolution, or aspect ratio, or to convert your video into a newer format.
A video transcoding rule of thumb: always aim to encode your raw video from the get-go in the right format and at the highest quality for your needs. Every lossy re-encode degrades the image further, and transcoding from a lower-quality format into a higher-quality one cannot restore the detail that was already thrown away.
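Why can't transcoding "up" restore quality? Once a lossy step has quantized the signal, the discarded precision is gone for good. A toy sketch, with coarse quantization standing in for a low-quality codec:

```python
# Lossy "encode": snap every value to a multiple of `step`.
def quantize(samples, step):
    return [round(s / step) * step for s in samples]

original = [3, 7, 12, 18, 26]
low_quality = quantize(original, 10)     # aggressive compression
# "Transcoding" to a finer step doesn't bring the detail back:
retranscoded = quantize(low_quality, 1)
print(low_quality)                        # [0, 10, 10, 20, 30]
print(retranscoded == low_quality)        # True: finer steps, same coarse data
print(retranscoded == original)           # False: the detail is gone for good
```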
Live Stream Encoding
The encoding process is one of the most time-consuming steps in video processing. It can take hours. This is rarely an issue for video-on-demand. However, it is a challenge for live streaming, where recording, encoding, and distribution should ideally happen simultaneously. Live streamers need a dedicated video encoder to handle that process. Video encoders for live streaming can be software, like OBS, Wirecast, or FMLE. Video encoder hardware, like Teradek or TriCaster, is more powerful but also much pricier. Top-tier live broadcasting and webcasting platforms like Kaltura Townhalls and Live Events have the encoding technology integrated into the software, saving you the extra step. Kaltura automatically encodes your live stream in the background.
Video encoding and transcoding is a broad and technical topic. In this article, we covered the essentials, but there's much more to learn. We recommend you peruse the Kaltura Knowledge Center for best practices and to find out which encoders are supported by Kaltura products.
Are you planning a live event? Kaltura can help!