Write your own Arm NN backend plugin
The example custom plugin provides a useful template for writing your own backend. We will look at the different things that you need to do when writing your own backend. We will use the code from the example plugin to illustrate the process.
Build system integration
Before you can build your custom plugin, you will need to integrate the plugin with the Arm NN build system. Arm NN uses the CMake build management system.
Follow these steps to write your own Arm NN backend plugin:
- Create a directory for your custom plugin in armnn/src/backends, for example custom:
mkdir <armnn_install_dir>/armnn/src/backends/custom
- Create a backend.cmake file to specify what needs to be built. The backend.cmake file in the example plugin contains:
add_subdirectory(${PROJECT_SOURCE_DIR}/src/backends/custom)
list(APPEND armnnLibraries armnnCustomBackend)
list(APPEND armnnLibraries armnnCustomBackendWorkloads)
list(APPEND armnnUnitTestLibraries armnnCustomBackendUnitTests)
- Create CMakeLists.txt files in each directory to specify the rules to build the new build targets. For example, here is the CMakeLists.txt file in the top-level custom directory:
list(APPEND armnnCustomBackend_sources
    CustomBackend.cpp
    CustomBackend.hpp
    CustomBackendUtils.cpp
    CustomBackendUtils.hpp
    CustomLayerSupport.cpp
    CustomLayerSupport.hpp
    CustomPreCompiledObject.cpp
    CustomPreCompiledObject.hpp
    CustomWorkloadFactory.cpp
    CustomWorkloadFactory.hpp
)

add_library(armnnCustomBackend OBJECT ${armnnCustomBackend_sources})
target_include_directories(armnnCustomBackend PRIVATE ${PROJECT_SOURCE_DIR}/src/armnn)
target_include_directories(armnnCustomBackend PRIVATE ${PROJECT_SOURCE_DIR}/src/armnnUtils)
target_include_directories(armnnCustomBackend PRIVATE ${PROJECT_SOURCE_DIR}/src/backends)

add_subdirectory(workloads)

if(BUILD_UNIT_TESTS)
    add_subdirectory(test)
endif()
- Create a backend.mk file to specify the source files. This file is used for Android builds:
BACKEND_SOURCES := \
    CustomBackend.cpp \
    CustomBackendUtils.cpp \
    CustomLayerSupport.cpp \
    CustomPreCompiledObject.cpp \
    CustomWorkloadFactory.cpp \
    workloads/CustomAdditionWorkload.cpp \
    workloads/CustomPreCompiledWorkload.cpp

BACKEND_TEST_SOURCES := \
    test/CustomCreateWorkloadTests.cpp \
    test/CustomEndToEndTests.cpp
Identify and register your plugin
All backends must identify themselves with a unique BackendId.
Here is the code in CustomBackend.cpp that provides the unique ID:
const BackendId& CustomBackend::GetIdStatic()
{
    static const BackendId s_Id{"Custom"};
    return s_Id;
}
Plugins must also register with the BackendRegistry. A helper structure, BackendRegistry::StaticRegistryInitializer, is provided to register the backend:
static BackendRegistry::StaticRegistryInitializer g_RegisterHelper
{
    BackendRegistryInstance(),
    CustomBackend::GetIdStatic(),
    []()
    {
        return IBackendInternalUniquePtr(new CustomBackend());
    }
};
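The effect of this pattern can be sketched in plain C++. The following is a minimal, self-contained illustration of static registration; Registry, IBackend, and the other names here are simplified stand-ins for the real Arm NN types, not the actual API:

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

// Simplified stand-in for IBackendInternal (illustration only).
struct IBackend
{
    virtual ~IBackend() = default;
    virtual std::string Id() const = 0;
};

using BackendPtr = std::unique_ptr<IBackend>;
using FactoryFunction = std::function<BackendPtr()>;

// Simplified stand-in for the BackendRegistry singleton: maps a
// backend ID to a factory function that creates the backend.
class Registry
{
public:
    static Registry& Instance() { static Registry r; return r; }
    void Register(const std::string& id, FactoryFunction f) { m_Factories[id] = std::move(f); }
    BackendPtr Create(const std::string& id) const { return m_Factories.at(id)(); }
    bool Has(const std::string& id) const { return m_Factories.count(id) != 0; }
private:
    std::map<std::string, FactoryFunction> m_Factories;
};

// Helper whose constructor runs during static initialization,
// mirroring the role of BackendRegistry::StaticRegistryInitializer.
struct StaticRegistryInitializer
{
    StaticRegistryInitializer(Registry& registry, const std::string& id, FactoryFunction f)
    {
        registry.Register(id, std::move(f));
    }
};

struct CustomBackend : IBackend
{
    std::string Id() const override { return "Custom"; }
};

// Registration happens as a side effect of constructing this static object,
// before main() runs, so the runtime can discover the backend by ID.
static StaticRegistryInitializer g_RegisterHelper
{
    Registry::Instance(),
    "Custom",
    []() { return BackendPtr(new CustomBackend()); }
};
```

Because registration is a side effect of static construction, simply linking the plugin into the application is enough to make the backend discoverable.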
Implement the IBackendInternal interface
All backends need to implement the IBackendInternal
interface. Here are the interface functions to implement:
IMemoryManagerUniquePtr CreateMemoryManager()
IWorkloadFactoryPtr CreateWorkloadFactory(IMemoryManagerSharedPtr)
- The returned IWorkloadFactory object is used to create the workload layer computation units.
IBackendContextPtr CreateBackendContext(IRuntime::CreationOptions)
- The returned IBackendContext object is notified before and after each network is loaded or unloaded.
ILayerSupportSharedPtr GetLayerSupport()
- During optimization, Arm NN needs to decide which layers are supported by the backend. The IsLayer<x>Supported() functions indicate whether the backend supports the specified layer.
OptimizationViews OptimizeSubGraph(SubGraph)
- The subgraph to optimize is passed as the input to this function.
- The function returns an object containing a list of subgraph substitutions, a list of failed subgraph optimizations, and a list of untouched subgraphs.
The following sections look at each of these functions in more detail, as seen in CustomBackend.cpp.
Memory management: CreateMemoryManager()
The purpose of memory management is to minimize memory usage by allocating memory just before it is needed, and releasing it when the memory is no longer required.
All backends must support the IBackendInternal
interface CreateMemoryManager()
method, which returns a unique pointer to an IMemoryManager
object:
IBackendInternal::IMemoryManagerUniquePtr MyBackend::CreateMemoryManager() const
{
return std::make_unique<MyMemoryManager>(...);
}
In this example, MyMemoryManager
is a class that is derived from IBackendInternal::IMemoryManager
.
A backend that does not support a memory manager, such as the example plugin, should return an empty pointer, as you can see here:
IBackendInternal::IMemoryManagerUniquePtr MyBackend::CreateMemoryManager() const
{
return IBackendInternal::IMemoryManagerUniquePtr{};
}
The IMemoryManager
interface defines two pure virtual methods that are implemented by the derived class for the backend:
virtual void Acquire() = 0;
- Acquire() is called by the LoadedNetwork before the model is executed.
- The backend memory manager should allocate any memory that it needs for running the inference.
virtual void Release() = 0;
- Release() is called by the LoadedNetwork, in its destructor, after the model is executed.
- The backend memory manager should free any memory that it previously allocated.
The backend memory manager uses internal memory management to further optimize memory usage.
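As an illustration of the Acquire()/Release() lifecycle, a hypothetical memory manager might record the sizes requested at network-load time and only allocate the pools in Acquire(). This is a sketch under simplified assumptions; PoolMemoryManager and its Reserve()/GetPointer() methods are invented for illustration and are not part of the Arm NN interface:

```cpp
#include <cstddef>
#include <vector>

// Simplified stand-in for IBackendInternal::IMemoryManager (illustration only).
class IMemoryManager
{
public:
    virtual void Acquire() = 0;
    virtual void Release() = 0;
    virtual ~IMemoryManager() = default;
};

// Hypothetical manager: allocation is deferred until Acquire() so that
// memory is only held while an inference can actually run.
class PoolMemoryManager : public IMemoryManager
{
public:
    // Called while the network is being loaded: record how much memory a
    // workload will need, but do not allocate yet. Returns a handle.
    std::size_t Reserve(std::size_t bytes)
    {
        m_Sizes.push_back(bytes);
        return m_Sizes.size() - 1;
    }

    // Called by the LoadedNetwork before execution: allocate every pool.
    void Acquire() override
    {
        for (std::size_t bytes : m_Sizes)
        {
            m_Pools.emplace_back(bytes);
        }
    }

    // Called from the LoadedNetwork destructor: free everything.
    void Release() override
    {
        m_Pools.clear();
    }

    void* GetPointer(std::size_t handle) { return m_Pools.at(handle).data(); }
    std::size_t AllocatedPools() const { return m_Pools.size(); }

private:
    std::vector<std::size_t> m_Sizes;
    std::vector<std::vector<unsigned char>> m_Pools;
};
```

The key property is that memory exists only between Acquire() and Release(), which is what lets several loaded networks share a device's limited memory.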
Workload factories: CreateWorkloadFactory()
Each layer is executed using a workload. A workload is used to enqueue a layer for computation.
A WorkloadFactory creates workloads that are specific to each layer and to the backend that executes it. This means that each backend needs its own WorkloadFactory.
All workloads need to:
- implement the IWorkload interface
- execute the operator on the backend hardware by:
  - reading the input tensors
  - writing the result to the output tensors
The WorkloadFactory's Create<x> methods construct the appropriate workload for each layer type.
You can see the example code in CustomWorkloadFactory.cpp
.
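The read-inputs, write-outputs pattern can be sketched as follows. AdditionWorkload and the plain std::vector tensors here are simplified stand-ins: real Arm NN workloads receive a QueueDescriptor with tensor handles rather than raw vectors:

```cpp
#include <cstddef>
#include <vector>

// Simplified stand-in for the IWorkload interface (illustration only).
struct IWorkload
{
    virtual void Execute() const = 0;
    virtual ~IWorkload() = default;
};

// Hypothetical addition workload: reads two input tensors and writes
// their element-wise sum to the output tensor.
class AdditionWorkload : public IWorkload
{
public:
    AdditionWorkload(const std::vector<float>& in0,
                     const std::vector<float>& in1,
                     std::vector<float>& out)
        : m_In0(in0), m_In1(in1), m_Out(out) {}

    void Execute() const override
    {
        for (std::size_t i = 0; i < m_Out.size(); ++i)
        {
            m_Out[i] = m_In0[i] + m_In1[i];   // element-wise addition
        }
    }

private:
    const std::vector<float>& m_In0;
    const std::vector<float>& m_In1;
    std::vector<float>& m_Out;
};
```

A backend that offloads to dedicated hardware would replace the loop body with calls into its driver, but the Execute() contract is the same.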
Backend context: CreateBackendContext()
The IBackendContext interface defines virtual methods that are implemented by the derived class for the backend. The example plugin does not need a context, so it returns an empty pointer:
IBackendInternal::IBackendContextPtr CustomBackend::CreateBackendContext(const IRuntime::CreationOptions&) const
{
return IBackendContextPtr{};
}
Here you can see how these virtual methods are defined in armnn/src/backends/backendsCommon/IBackendContext.hpp
:
class IBackendContext
{
protected:
IBackendContext(const IRuntime::CreationOptions&) {}
public:
// Before and after Load network events
virtual bool BeforeLoadNetwork(NetworkId networkId) = 0;
virtual bool AfterLoadNetwork(NetworkId networkId) = 0;
// Before and after Unload network events
virtual bool BeforeUnloadNetwork(NetworkId networkId) = 0;
virtual bool AfterUnloadNetwork(NetworkId networkId) = 0;
virtual ~IBackendContext() {}
};
The IBackendContext interface provides callback-like functionality. Arm NN calls these methods before and after a network is loaded or unloaded. They allow the user to run code triggered by a specific load or unload network event, for example to clear a cache or synchronize threads.
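As a sketch of how a derived context might use these callbacks, the following hypothetical CustomBackendContext tracks which networks are currently loaded, the kind of bookkeeping a backend might use to warm up or clear a cache at the right moment. The types are simplified stand-ins for the real Arm NN ones:

```cpp
#include <cstddef>
#include <set>

using NetworkId = int;  // stand-in for armnn::NetworkId

// Simplified stand-in for IBackendContext (illustration only).
class IBackendContext
{
public:
    virtual bool BeforeLoadNetwork(NetworkId networkId) = 0;
    virtual bool AfterLoadNetwork(NetworkId networkId) = 0;
    virtual bool BeforeUnloadNetwork(NetworkId networkId) = 0;
    virtual bool AfterUnloadNetwork(NetworkId networkId) = 0;
    virtual ~IBackendContext() = default;
};

// Hypothetical context that records which networks are live on the backend.
class CustomBackendContext : public IBackendContext
{
public:
    bool BeforeLoadNetwork(NetworkId) override { return true; }

    bool AfterLoadNetwork(NetworkId networkId) override
    {
        m_Loaded.insert(networkId);   // network is now live on this backend
        return true;
    }

    bool BeforeUnloadNetwork(NetworkId) override { return true; }

    bool AfterUnloadNetwork(NetworkId networkId) override
    {
        m_Loaded.erase(networkId);    // e.g. clear any per-network cache here
        return true;
    }

    std::size_t LoadedCount() const { return m_Loaded.size(); }

private:
    std::set<NetworkId> m_Loaded;
};
```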
Deciding which backends to assign to each layer: GetLayerSupport()
During optimization, Arm NN must decide which layers are supported by the backend.
The IsLayer<x>Supported()
functions indicate whether the backend supports the specified layer. For example:
bool CustomLayerSupport::IsAdditionSupported(const TensorInfo& input0,
                                             const TensorInfo& input1,
                                             const TensorInfo& output,
                                             Optional<std::string&> reasonIfUnsupported) const
{
    ignore_unused(input1);
    ignore_unused(output);
    return IsDataTypeSupported(input0.GetDataType(), reasonIfUnsupported);
}
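A helper like IsDataTypeSupported() can be sketched as follows. This is an assumption-laden illustration: the DataType enum is cut down, and the reason is reported through a plain pointer rather than Arm NN's Optional<std::string&>:

```cpp
#include <string>

// Cut-down stand-in for armnn::DataType (illustration only).
enum class DataType { Float32, QAsymmU8, Signed32 };

// Hypothetical helper: this sketch of a Custom backend only handles Float32.
// When the type is rejected, a human-readable reason is written out so the
// optimizer can report why the layer was not assigned to this backend.
bool IsDataTypeSupported(DataType type, std::string* reasonIfUnsupported)
{
    if (type == DataType::Float32)
    {
        return true;
    }
    if (reasonIfUnsupported != nullptr)
    {
        *reasonIfUnsupported = "custom backend: only Float32 is supported";
    }
    return false;
}
```

Populating reasonIfUnsupported is worth the effort: it surfaces in Arm NN's diagnostics when a layer falls back to another backend.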
Optimization: OptimizeSubGraph(SubGraph)
The optimizer calls OptimizeSubGraph()
on the selected backend, for each subgraph.
From the IBackendInternal
interface:
OptimizationViews OptimizeSubGraph(const SubGraph& subGraph) const = 0;

class OptimizationViews
{
    ...
    Substitutions SuccesfulOptimizations;  // Proposed substitutions from successful optimizations
    Subgraphs FailedOptimizations;         // Subgraphs from the original subgraph which cannot be supported
    Subgraphs UntouchedSubgraphs;          // Subgraphs from the original subgraph which remain unmodified
};

struct SubstitutionPair
{
    // Subgraph of Layers from the original graph which should be replaced
    SubgraphView SubstitutableSubgraph;

    // A subgraph of new layers which will replace layers in m_SubstitutableSubgraph
    SubgraphView ReplacementSubgraph;
};
Example optimizations might include:
- merging layers, for more efficient execution
- adding permute layers to modify the data layout for execution on the backend
The OptimizeSubGraph()
function does the following:
- If no optimization was attempted for part of the input subgraph, the optimization function adds it to the list of untouched subgraphs.
- If part of the input subgraph cannot be supported by the backend, the optimization function adds it to the list of failed optimizations.
Arm NN tries to re-assign each failed subgraph to other backends, if they are available.
- If part of the input subgraph can be optimized, the optimization function creates a substitution pair.
The substitutable subgraph in the original graph is replaced with the corresponding replacement subgraph.
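The decision flow above can be sketched in simplified form. Here a "subgraph" is just a list of layer names, and only a hypothetical Addition layer is supported; the real Arm NN types (SubgraphView, OptimizationViews) carry actual layers and connections rather than strings:

```cpp
#include <string>
#include <vector>

// A "subgraph" reduced to a list of layer names (illustration only).
using Subgraph = std::vector<std::string>;

struct SubstitutionPair
{
    Subgraph SubstitutableSubgraph;  // layers to replace in the original graph
    Subgraph ReplacementSubgraph;    // layers that replace them
};

struct OptimizationViews
{
    std::vector<SubstitutionPair> Substitutions;
    std::vector<Subgraph> FailedOptimizations;
    std::vector<Subgraph> UntouchedSubgraphs;
};

// Hypothetical optimizer: if every layer in the subgraph is supported,
// propose replacing the whole subgraph with a single pre-compiled layer;
// otherwise report it as failed so Arm NN can try another backend.
OptimizationViews OptimizeSubgraph(const Subgraph& subgraph)
{
    OptimizationViews views;

    bool allSupported = true;
    for (const std::string& layer : subgraph)
    {
        if (layer != "Addition")   // assume only Addition is supported
        {
            allSupported = false;
        }
    }

    if (allSupported)
    {
        views.Substitutions.push_back({subgraph, {"PreCompiled"}});
    }
    else
    {
        views.FailedOptimizations.push_back(subgraph);
    }
    return views;
}
```

Fusing a supported run of layers into one pre-compiled replacement is the typical payoff: the backend executes one optimized unit instead of several individual workloads.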