May 21, 2026

What’s new in KleidiCV 26.03

See how KleidiCV 26.03 improves OpenCV acceleration on Arm CPUs with new optical-flow support, platform updates, and up to 14x gains.

By Mawussi

KleidiCV 26.03 expands optimized computer vision support on Arm CPUs. The release adds image-processing and optical-flow support, clarifies backend selection, and improves OpenCV performance on Arm CPUs. It also introduces calendar versioning to provide a more predictable release cadence for production software teams. These updates make it easier for you to evaluate and deploy KleidiCV across more environments.

What's new in this update

KleidiCV 26.03 adds support for more image-processing and sparse optical-flow functions, clarifies backend selection behavior, and improves performance across Arm CPUs with support for Neon, SVE2, SME, and SME2 technologies. On Arm Cortex-A710 cores, some routines run up to 14x faster than with OpenCV alone, as shown in the performance chart later in this post.
The release also adds support for macOS and Windows 11 on Arm-based PCs. This makes it easier to evaluate and deploy KleidiCV across more environments.  
KleidiCV already integrates with OpenCV. From OpenCV 4.13 onwards, it is enabled by default for AArch64 builds on Android, Linux, and macOS. With KleidiCV 26.03, you can also integrate it directly into your software more easily. These changes demonstrate the value of KleidiCV as a direct dependency in production software. 
The release also moves from semantic versioning 0.7.0 to calendar versioning 26.03. This change gives users a more predictable release cadence. 

Expanded computer vision coverage and backend control

Broader image-processing coverage

The latest release also extends support to many common image-processing stages. This enables more of your camera and computer vision pipeline to run on optimized Arm CPU paths. New and expanded support includes selected remap and warpPerspective functions, improved rotate, multithreaded out-of-place transpose, and linear resize for downscaling with scale factors between one-third and one. The release also expands filtering and color conversion support. KleidiCV now supports larger and more flexible GaussianBlur and MedianBlur kernels on Neon. It also adds more YUV conversion paths, including planar formats, and supports sepFilter2D in the OpenCV HAL.

These updates extend acceleration to more parameter sets and data layouts without changing the OpenCV programming model.

Broader sparse optical-flow pipeline support

KleidiCV has expanded support for sparse optical-flow pipelines. Many computer vision applications use multiple processing stages for motion estimation and tracking, instead of isolated filters. KleidiCV now supports standalone Lucas-Kanade routines, the LK optical-flow pyramid builder, and the pyramidal LK optical-flow calculation API. It also improves supporting stages. These include Scharr interleaved multi-channel processing, multi-channel blur operations, and downsample operations. These updates extend acceleration to workloads that depend on pyramid construction, derivatives, and iterative tracking. 
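To make the tracking stage concrete, here is a minimal sketch of a single textbook Lucas-Kanade step in plain Python. It is not KleidiCV code and does not use the KleidiCV or OpenCV APIs. The function names, the synthetic quadratic test image, and the window size are assumptions chosen for illustration. The sketch computes spatial and temporal derivatives in a small window and solves the resulting 2x2 least-squares system for one motion vector.

```python
def lucas_kanade_step(prev, curr, cx, cy, half_window=2):
    """One textbook Lucas-Kanade step around (cx, cy).

    Images are lists of rows, indexed as img[y][x]. Returns (u, v), the
    estimated motion of the image content from prev to curr, or None if
    the window has too little texture to solve the system.
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(cy - half_window, cy + half_window + 1):
        for x in range(cx - half_window, cx + half_window + 1):
            # Central differences for the spatial derivatives.
            ix = (prev[y][x + 1] - prev[y][x - 1]) / 2.0
            iy = (prev[y + 1][x] - prev[y - 1][x]) / 2.0
            # Frame difference for the temporal derivative.
            it = curr[y][x] - prev[y][x]
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    # Solve [[sxx, sxy], [sxy, syy]] * [u, v] = [-sxt, -syt].
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-9:
        return None
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


def make_image(size, shift_x=0.0, shift_y=0.0):
    """Hypothetical synthetic image: a smooth quadratic surface, shifted."""
    def f(x, y):
        return x * x + x * y + y * y
    return [[f(x - shift_x, y - shift_y) for x in range(size)]
            for y in range(size)]


prev_frame = make_image(21)
curr_frame = make_image(21, shift_x=0.2, shift_y=-0.1)
u, v = lucas_kanade_step(prev_frame, curr_frame, cx=10, cy=10)
print(u, v)  # expect values close to the true shift (0.2, -0.1)
```

Pyramidal Lucas-Kanade repeats this kind of step iteratively on progressively finer pyramid levels, which is why the pyramid builder, derivative, and downsample stages matter for end-to-end tracking speed.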

Clearer backend control and dispatch

Figure: Diagram showing KleidiCV backend selection on Arm CPUs, illustrating how execution is dispatched to either an SME-based backend or a Neon-based backend depending on API choice and hardware support.
For developers who integrate KleidiCV into their own software, direct access to implementations for particular hardware acceleration features can be important. Choosing between backends, for example, is now clearer and easier to control as KleidiCV expands support for Arm CPU instruction sets. Recent releases separate SME-only implementations from SME2 implementations. They have also included build options such as KLEIDICV_ENABLE_SME and KLEIDICV_LIMIT_SME_TO_SELECTED_ALGORITHMS, and introduced the KLEIDICV_PREFER_SME_BACKEND environment variable.

KleidiCV 26.03 updates dispatch behavior. APIs without the _sme postfix use Neon or SVE2 by default. APIs with the _sme postfix use SME or SME2 and fall back if those features are not available. This gives you more control over how KleidiCV maps workloads to available Arm CPU features while preserving default behavior. For more detail, see the KleidiCV Reference Guide.
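The dispatch rule described above can be modeled with a short Python sketch. This is an illustrative model only, not KleidiCV source code. The routine names, the CpuFeatures type, the preference of SME2 over SME and of SVE2 over Neon, and the choice of the default path as the fallback target are all assumptions made for the example. Real backend selection may differ per algorithm, so check the KleidiCV Reference Guide for actual behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuFeatures:
    """Hypothetical description of the features a CPU reports."""
    neon: bool = True
    sve2: bool = False
    sme: bool = False
    sme2: bool = False


def select_backend(api_name: str, cpu: CpuFeatures) -> str:
    """Model of the rule: only *_sme APIs may use SME or SME2."""
    if api_name.endswith("_sme"):
        if cpu.sme2:
            return "sme2"
        if cpu.sme:
            return "sme"
        # Assumption: without SME support, fall back to the default path.
    # Default path (assumed order): SVE2 when present, otherwise Neon.
    if cpu.sve2:
        return "sve2"
    return "neon"


# "some_routine" is a made-up name, not a real KleidiCV function.
sme_cpu = CpuFeatures(sve2=True, sme=True, sme2=True)
print(select_backend("some_routine", sme_cpu))      # default API stays off SME
print(select_backend("some_routine_sme", sme_cpu))  # SME-postfixed API uses SME2
```

Keeping the choice in the API name means existing callers see no change in behavior, while callers that want the SME path opt in explicitly.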

Platform and ecosystem progress

KleidiCV has improved platform support and ecosystem integration. OpenCV integration progresses from version 4.11 to 4.13. KleidiCV also adds a dynamic dispatcher on compatible M-series Apple silicon running macOS. The release also expands supported development and deployment environments. KleidiCV now supports Windows 11 on Arm-based PCs as a platform target. Arm tests AArch64 Ubuntu on Arm Neoverse cores for correctness and performance.

How to get KleidiCV 26.03

KleidiCV is available from the KleidiCV GitLab repository under the Apache License Version 2.0. You can build it as a standalone library or use it through OpenCV, with KleidiCV integration enabled. For direct integration, see the KleidiCV documentation and Reference Guide for build instructions, supported routines, and backend selection details.

Performance speedups versus OpenCV

KleidiCV improves the execution speed of key computer vision routines on Arm CPUs. For this update, we compare selected routines in baseline OpenCV with the same routines in OpenCV built with KleidiCV enabled. We use the same benchmark script as for previous KleidiCV releases, and we report percentage speedups instead of raw benchmark data.

The figure highlights routines added since the previous release. It also includes a small number of existing routines that now show larger speedups. The results show what changed between releases 0.4.0 and 26.03.

These benchmarks were run on a Samsung Galaxy S22 Ultra using two threads pinned to two Cortex-A710 cores. Unless otherwise specified, benchmarks use 1080p images.
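Speedups can be quoted as a multiple (such as 14x) or as a percentage, so it helps to be explicit about the conversion when you reproduce this kind of comparison. The sketch below uses one common convention, where a 2x speedup equals 100% faster. That convention, the function names, and the timings are assumptions for illustration. The timings are made-up numbers, not measurements from this release.

```python
def speedup_factor(baseline_ms: float, accelerated_ms: float) -> float:
    """How many times faster the accelerated run is than the baseline."""
    if baseline_ms <= 0 or accelerated_ms <= 0:
        raise ValueError("timings must be positive")
    return baseline_ms / accelerated_ms


def percent_faster(baseline_ms: float, accelerated_ms: float) -> float:
    """Speedup as a percentage gain over baseline (assumed convention)."""
    return (speedup_factor(baseline_ms, accelerated_ms) - 1.0) * 100.0


# Hypothetical timings for one routine, in milliseconds.
baseline_ms = 10.0      # OpenCV built without KleidiCV (made-up value)
accelerated_ms = 4.0    # OpenCV built with KleidiCV (made-up value)

factor = speedup_factor(baseline_ms, accelerated_ms)
percent = percent_faster(baseline_ms, accelerated_ms)
print(f"{factor:.1f}x faster, or {percent:.0f}% faster")  # 2.5x, 150%
```

When you compare your own results with published figures, check which convention the source uses, because some reports express the same ratio as 250% of baseline throughput instead.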

Figure: Bar chart of percentage speedups for selected OpenCV routines with KleidiCV enabled on Arm Cortex-A710 cores, showing gains up to 14x over baseline OpenCV.

Some of the largest speedups occur in areas that KleidiCV has expanded since the previous release. These include larger GaussianBlur and MedianBlur kernels, new downscaling support, transpose, planar YUV conversion, and more of the sparse optical-flow pipelines.

These updates improve performance for individual kernels and extend acceleration across practical image-processing pipelines.

Conclusion

KleidiCV 26.03 expands algorithm coverage, improves backend behavior, and increases performance for selected routines compared to OpenCV. If you use OpenCV, you can benefit from KleidiCV acceleration with little or no application-level change. If you are evaluating direct integration, you can more easily assess KleidiCV as a dependency for production image-processing software on Arm CPUs. KleidiCV does not yet support all OpenCV functionality. Future releases will expand algorithm coverage, refine backend behavior, and improve end-to-end performance for computer vision pipelines.



Re-use is only permitted for informational and non-commercial or personal use only.
