Benchmarking VPUs and GPUs for Media Workloads

Akamai Cloud recently introduced support for NETINT’s Quadra T1U video processing units (VPUs), making these powerful cards available at an hourly rate when deploying cloud virtual machines. They join the NVIDIA RTX™ 4000 Ada Generation GPUs already offered in the Akamai Cloud portfolio.

To assess real-world performance and energy efficiency, we partnered with Cires21, a pioneer in the streaming industry since 2008. Using their C21 Live Encoder, which is optimized for both NETINT and NVIDIA architectures, we benchmarked VPUs and GPUs across demanding media workloads, using some of the most challenging video content to process as our test material.

We found that in the most demanding scenarios, VPUs were 4.7x more energy efficient than GPUs and, in some scenarios, even outperformed the NVIDIA cards. The GPUs did shine in some tests, but they always drew more power than the VPUs.

If you’re at a company planning your Scope 3 emissions for the coming year, or just struggling with energy requirements in today’s AI landscape, VPUs should be at the top of your list to test. You can learn more about leveraging VPUs here.

Benchmarking Setup

We ran each test in the same Akamai Cloud region (Frankfurt) using the following configurations: 

  • A single GPU instance with the NVIDIA RTX4000 Ada x1 Small plan
  • A single VPU with the NETINT Quadra T1U x1 Small plan
  • Both instances ran optimized Docker containers for C21 Live Encoder: one with CUDA (GPU) and the other with Libxcoder (VPU)

We tested the encoding and decoding stages of a typical media workflow on each card, using the H.264/AVC codec.

The table below includes an adaptive bitrate (ABR) ladder, generated with simultaneous outputs at 1080p, 720p, 576p, 432p, and 360p, all at 60 frames per second, in a 1:N fashion from a single input. The “Max Jobs” columns give the maximum number of simultaneous transcoding jobs that each encoder implementation could run in real time. The test used a raw 6-minute video at 1080p resolution. The encoding workload processed this video and re-encoded it to the same format, H.264/AVC, as the target output.

Summary Table: Transcode Job Capacity by Resolution

Resolution   NETINT Max Jobs   NVIDIA Max Jobs   NETINT Watts   NVIDIA Watts
1080p        19                16                12             59
720p         22                24                11             69
576p         20                25                8              61
432p         21                28                8              55
360p         20                30                7              51
ABR          6                 8                 13             82

As the data above shows, even when the GPU delivered higher job capacity at lower resolutions, it did so at a significantly higher power draw. VPUs offered better energy performance at every resolution tested, especially in high-throughput and high-resolution workloads.
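To make the comparison concrete, the following minimal sketch divides each card's power draw by its job capacity. It assumes the Watts columns report the card's draw while running its maximum number of jobs, which the table does not state explicitly; the function names are illustrative only.

```python
# Figures copied from the summary table:
# (NETINT max jobs, NVIDIA max jobs, NETINT watts, NVIDIA watts)
RESULTS = {
    "1080p": (19, 16, 12, 59),
    "720p": (22, 24, 11, 69),
    "576p": (20, 25, 8, 61),
    "432p": (21, 28, 8, 55),
    "360p": (20, 30, 7, 51),
    "ABR": (6, 8, 13, 82),
}


def watts_per_job(max_jobs: int, watts: float) -> float:
    """Power attributed to one job, assuming watts were measured at max_jobs."""
    return watts / max_jobs


def efficiency_ratio(row: tuple) -> float:
    """How many times more power per job the GPU draws than the VPU."""
    netint_jobs, nvidia_jobs, netint_watts, nvidia_watts = row
    gpu = watts_per_job(nvidia_jobs, nvidia_watts)
    vpu = watts_per_job(netint_jobs, netint_watts)
    return gpu / vpu


if __name__ == "__main__":
    for name, row in RESULTS.items():
        print(f"{name}: GPU draws {efficiency_ratio(row):.2f}x the power per job")
```

Under that assumption, the ABR row gives the smallest ratio, and it matches the 4.7x figure quoted earlier in this post.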

In the table below, we estimated the energy consumption for 1,000 simultaneous streams running continuously for a year (using 8,760 hours in a year).

Table: Energy usage for 1,000 streams for 1 year

Configuration     Energy (Wh)
NETINT @ 1080p    5,532,631.58
GPU @ 1080p       32,302,500.00
NETINT @ ABR      18,980,000.00
GPU @ ABR         89,790,000.00
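The estimate can be reproduced from the summary table with a short calculation. The sketch below assumes that the number of cards scales linearly with the stream count (fractional cards allowed) and that each card draws its measured wattage continuously; the function name is illustrative. Multiplying watts by hours gives watt-hours, and the results are numerically identical to the table's figures when expressed in that unit (about 5.5 MWh and 32.3 MWh for the 1080p rows).

```python
HOURS_PER_YEAR = 8_760


def annual_energy_wh(streams: int, max_jobs_per_card: int,
                     watts_per_card: float,
                     hours: int = HOURS_PER_YEAR) -> float:
    """Estimated energy in watt-hours to run `streams` jobs for `hours`."""
    # Fractional cards are allowed here because the table's figures imply it.
    cards = streams / max_jobs_per_card
    return cards * watts_per_card * hours


if __name__ == "__main__":
    scenarios = {
        "NETINT @ 1080p": (19, 12),
        "GPU @ 1080p": (16, 59),
        "NETINT @ ABR": (6, 13),
        "GPU @ ABR": (8, 82),
    }
    for name, (jobs, watts) in scenarios.items():
        wh = annual_energy_wh(1_000, jobs, watts)
        print(f"{name}: {wh:,.2f} Wh ({wh / 1e6:.2f} MWh)")
```

A real deployment would round the card count up to a whole number (for example with `math.ceil`), which raises the totals slightly.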

For any organization looking to reduce their supply chain emissions or even increase their throughput in certain workload scenarios, the NETINT VPUs are a great solution and are available hourly from Akamai as needed. 

Video Multimethod Assessment Fusion (VMAF) Scores

Of course, lower power and more throughput are nothing if you sacrifice quality. The VMAF scores included below show that VPUs also do a great job here.

To validate output accuracy, we ran VMAF tests across all outputs.

In the graph above, our test results show the VMAF scores for High Efficiency Video Coding (HEVC) output produced by the NETINT Libxcoder and the NVIDIA Encoder. Each score is reported at a specific output resolution, with the corresponding bitrate listed in parentheses. These scores provide an objective measure of video quality, allowing us to compare how well each encoder preserves visual fidelity across different encoding settings.

In this next graph (above), our test results show the AV1 VMAF scores for the NETINT Libxcoder and the NVIDIA Encoder at each output resolution, with the corresponding bitrates shown in parentheses. These scores indicate how effectively each encoder maintains visual quality when compressing video using the AV1 codec, highlighting differences in efficiency and quality tradeoffs across resolutions.
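For readers who want to run a similar quality check on their own outputs, one common approach is FFmpeg's `libvmaf` filter. This post does not say which tool produced the scores above, so the following is only a sketch of that general approach: the file names are hypothetical, and it assumes an FFmpeg build with libvmaf enabled. The filter takes the encoded (distorted) video as its first input and the reference as its second, and both are scaled to a common resolution before comparison.

```python
import shlex


def build_vmaf_command(distorted: str, reference: str,
                       width: int = 1920, height: int = 1080) -> list:
    """Return an ffmpeg argument list that scores `distorted` against `reference`.

    Both inputs are scaled to width x height so that lower-resolution
    renditions can be compared with a 1080p source.
    """
    filter_graph = (
        f"[0:v]scale={width}:{height}:flags=bicubic[dist];"
        f"[1:v]scale={width}:{height}:flags=bicubic[ref];"
        "[dist][ref]libvmaf"
    )
    # "-f null -" discards the video output; the score is written to the log.
    return ["ffmpeg", "-i", distorted, "-i", reference,
            "-lavfi", filter_graph, "-f", "null", "-"]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_vmaf_command("encoded_720p.mp4", "source_1080p.mp4")
    print(shlex.join(cmd))
```

The function only builds the command; passing the list to `subprocess.run` would execute it. Inputs with different frame rates or start times need extra alignment that this sketch does not handle.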

Available Now in Akamai Cloud

With a processing pipeline optimized for NETINT hardware, VPUs can do more with less. With a peak measured power draw of 13 watts, the NETINT Quadra T1U VPUs show that you can scale media workloads efficiently without compromising performance or quality.

Migrating from GPU-based to VPU-based video encoding for 1,000 year-long 1080p streams could reduce annual energy consumption from about 32 MWh to about 5.5 MWh, a savings of roughly 26.5 MWh. This translates to approximately 80% in annual energy savings, along with a reduction of around 12.6 tonnes of CO₂ emissions per year, assuming a global average emissions factor.
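The arithmetic behind these figures can be sketched as follows, using the rounded MWh values from the paragraph above. The emissions factor is not stated in this post; the value of 0.475 tonnes of CO₂ per MWh used here is an assumption, chosen because it roughly reproduces the 12.6-tonne figure.

```python
GPU_MWH = 32.0   # rounded annual energy for 1,000 GPU-encoded 1080p streams
VPU_MWH = 5.5    # rounded annual energy for 1,000 VPU-encoded 1080p streams

# Assumed emissions factor in tonnes of CO2 per MWh (not given in the post).
ASSUMED_TONNES_CO2_PER_MWH = 0.475


def savings_mwh(before: float, after: float) -> float:
    """Absolute energy saved, in MWh."""
    return before - after


def savings_percent(before: float, after: float) -> float:
    """Energy saved as a percentage of the starting consumption."""
    return 100.0 * (before - after) / before


def co2_tonnes(mwh: float, factor: float = ASSUMED_TONNES_CO2_PER_MWH) -> float:
    """Emissions attributed to `mwh` of electricity under the assumed factor."""
    return mwh * factor


if __name__ == "__main__":
    saved = savings_mwh(GPU_MWH, VPU_MWH)
    print(f"Saved: {saved:.1f} MWh ({savings_percent(GPU_MWH, VPU_MWH):.1f}%)")
    print(f"CO2 avoided: {co2_tonnes(saved):.1f} tonnes (assumed factor)")
```

Grid emissions factors vary widely by region and year, so the CO₂ result should be recalculated with a factor that matches the deployment location.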

If you want to try these NETINT VPU cards, they are available today in Akamai Cloud at an hourly rate. 

Deploy NETINT VPUs or NVIDIA GPUs on demand with our Accelerated Compute plan, or use the C21 Live Encoder from Cires21 to streamline setup and maximize efficiency with GPU and VPU infrastructure.
