Write your own Arm NN backend plugin

The example custom plugin provides a useful template for writing your own backend. This guide looks at the different tasks involved in writing your own backend, using code from the example plugin to illustrate the process.

Build system integration

Before you can build your custom plugin, you will need to integrate the plugin with the Arm NN build system. Arm NN uses the CMake build management system.

Follow these steps to write your own Arm NN backend plugin:

  1. Create a directory for your custom plugin in armnn/src/backends, for example custom:

    mkdir <armnn_install_dir>/armnn/src/backends/custom
  2. Create a backend.cmake file to specify what needs to be built. The backend.cmake file in the example plugin contains:

    add_subdirectory(${PROJECT_SOURCE_DIR}/src/backends/custom)
    list(APPEND armnnLibraries armnnCustomBackend)
    list(APPEND armnnLibraries armnnCustomBackendWorkloads)
    list(APPEND armnnUnitTestLibraries armnnCustomBackendUnitTests)
  3. Create CMakeLists.txt files in each directory to specify the rules to build the new build targets. For example, here is the CMakeLists.txt file in the top-level custom directory:

    list(APPEND armnnCustomBackend_sources
         CustomBackend.cpp
         CustomBackend.hpp
         CustomBackendUtils.cpp
         CustomBackendUtils.hpp
         CustomLayerSupport.cpp
         CustomLayerSupport.hpp
         CustomPreCompiledObject.cpp
         CustomPreCompiledObject.hpp
         CustomWorkloadFactory.cpp
         CustomWorkloadFactory.hpp
    )
    
    add_library(armnnCustomBackend OBJECT ${armnnCustomBackend_sources})
    target_include_directories(armnnCustomBackend PRIVATE ${PROJECT_SOURCE_DIR}/src/armnn)
    target_include_directories(armnnCustomBackend PRIVATE ${PROJECT_SOURCE_DIR}/src/armnnUtils)
    target_include_directories(armnnCustomBackend PRIVATE ${PROJECT_SOURCE_DIR}/src/backends)
    
    add_subdirectory(workloads)
    
    if(BUILD_UNIT_TESTS)
        add_subdirectory(test)
    endif()
    
  4. Create a backend.mk file to specify the source files. This file is used for Android builds:

    BACKEND_SOURCES := \
            CustomBackend.cpp \
            CustomBackendUtils.cpp \
            CustomLayerSupport.cpp \
            CustomPreCompiledObject.cpp \
            CustomWorkloadFactory.cpp \
            workloads/CustomAdditionWorkload.cpp \
            workloads/CustomPreCompiledWorkload.cpp
    
    BACKEND_TEST_SOURCES := \
             test/CustomCreateWorkloadTests.cpp \
             test/CustomEndToEndTests.cpp
    

Identify and register your plugin

All backends must identify themselves with a unique BackendId.

Here is the code in CustomBackend.cpp that provides the unique ID:

const BackendId& CustomBackend::GetIdStatic()
{
    static const BackendId s_Id{"Custom"};
    return s_Id;
}

Plugins must also register with the BackendRegistry. A helper structure, BackendRegistry::StaticRegistryInitializer, is provided to register the backend:

static BackendRegistry::StaticRegistryInitializer g_RegisterHelper
{
    BackendRegistryInstance(),
    CustomBackend::GetIdStatic(),
    []()
    {
        return IBackendInternalUniquePtr(new CustomBackend());
    }
};
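The effect of this static initializer can be illustrated with a simplified, self-contained model of the registry. The MockRegistry and StaticRegistryInitializer types below are stand-ins for illustration only; the real classes are armnn::BackendRegistry and BackendRegistry::StaticRegistryInitializer:

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

// Simplified stand-ins for the Arm NN types involved in registration.
struct IBackend { virtual ~IBackend() = default; };
using BackendPtr = std::unique_ptr<IBackend>;
using FactoryFunction = std::function<BackendPtr()>;

class MockRegistry
{
public:
    void Register(const std::string& id, FactoryFunction factory)
    {
        m_Factories[id] = std::move(factory);
    }
    bool IsRegistered(const std::string& id) const
    {
        return m_Factories.count(id) == 1;
    }
    BackendPtr Create(const std::string& id) const
    {
        return m_Factories.at(id)();
    }
private:
    std::map<std::string, FactoryFunction> m_Factories;
};

MockRegistry& RegistryInstance()
{
    static MockRegistry instance;
    return instance;
}

// Mirrors BackendRegistry::StaticRegistryInitializer: registering in a
// constructor lets a global object register the backend before main() runs.
struct StaticRegistryInitializer
{
    StaticRegistryInitializer(MockRegistry& registry,
                              const std::string& id,
                              FactoryFunction factory)
    {
        registry.Register(id, std::move(factory));
    }
};

struct CustomBackend : IBackend {};

// Runs at static initialization time, just like g_RegisterHelper above.
static StaticRegistryInitializer g_MockRegisterHelper
{
    RegistryInstance(),
    "Custom",
    []() { return BackendPtr(new CustomBackend()); }
};
```

Because registration happens during static initialization, Arm NN can look up the backend by its BackendId as soon as the shared library is loaded, without the core library knowing about the plugin at compile time.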

Implement the IBackendInternal interface

All backends need to implement the IBackendInternal interface. Here are the interface functions to implement:

  • IMemoryManagerUniquePtr CreateMemoryManager()

  • IWorkloadFactoryPtr CreateWorkloadFactory(IMemoryManagerSharedPtr)
    • The returned IWorkloadFactory object is used to create the workloads, the computation units that execute each layer.

  • IBackendContextPtr CreateBackendContext(IRuntime::CreationOptions)

  • ILayerSupportSharedPtr GetLayerSupport()
    • During optimization, Arm NN needs to decide which layers are supported by the backend.
    • IsLayer<x>Supported() functions indicate whether the backend supports the specified layer.

  • OptimizationViews OptimizeSubGraph(SubGraph)
    • The subgraph to optimize is passed as the input to this function.
    • The function returns an object containing a list of subgraph substitutions, a list of failed subgraph optimizations, and a list of untouched subgraphs.

The following sections look at each of these functions in more detail, as seen in CustomBackend.cpp.

Memory management: CreateMemoryManager()

The purpose of memory management is to minimize memory usage by allocating memory just before it is needed, and releasing it when the memory is no longer required.

All backends must support the IBackendInternal interface CreateMemoryManager() method, which returns a unique pointer to an IMemoryManager object:

IBackendInternal::IMemoryManagerUniquePtr MyBackend::CreateMemoryManager() const
{
    return std::make_unique<MyMemoryManager>(...);
}

In this example, MyMemoryManager is a class that is derived from IBackendInternal::IMemoryManager.

A backend that does not support a memory manager, such as the example plugin, should return an empty pointer, as you can see here:

IBackendInternal::IMemoryManagerUniquePtr MyBackend::CreateMemoryManager() const
{
    return IBackendInternal::IMemoryManagerUniquePtr{};
}

The IMemoryManager interface defines two pure virtual methods that are implemented by the derived class for the backend:

  • virtual void Acquire() = 0;
    • Acquire() is called by the LoadedNetwork before the model is executed.
    • The backend memory manager should allocate any memory that it needs for running the inference.

  • virtual void Release() = 0;
    • Release() is called by the LoadedNetwork, in its destructor, after the model is executed.
    • The backend memory manager should free any memory that it previously allocated.

The backend memory manager uses internal memory management to further optimize memory usage.
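The Acquire()/Release() contract can be sketched with a self-contained pool-based manager that allocates its memory up front and frees it when the network is unloaded. The IMemoryManager class below is a simplified stand-in for IBackendInternal::IMemoryManager, and PoolMemoryManager is a hypothetical implementation:

```cpp
#include <cstddef>
#include <vector>

// Simplified stand-in for IBackendInternal::IMemoryManager.
class IMemoryManager
{
public:
    virtual void Acquire() = 0;  // called before the model is executed
    virtual void Release() = 0;  // called after the model is executed
    virtual ~IMemoryManager() = default;
};

// A pool-based manager: one allocation before inference, one free afterwards.
class PoolMemoryManager : public IMemoryManager
{
public:
    explicit PoolMemoryManager(std::size_t poolBytes) : m_PoolBytes(poolBytes) {}

    void Acquire() override
    {
        m_Pool.resize(m_PoolBytes); // allocate just before it is needed
    }
    void Release() override
    {
        m_Pool.clear();             // free as soon as it is no longer required
        m_Pool.shrink_to_fit();
    }

    std::size_t AllocatedBytes() const { return m_Pool.size(); }

private:
    std::size_t m_PoolBytes;
    std::vector<unsigned char> m_Pool;
};
```

With this design, no backend memory is held between Release() and the next Acquire(), which is the usage pattern the LoadedNetwork expects.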

Workload factories: CreateWorkloadFactory()

Each layer is executed using a workload. A workload is used to enqueue a layer for computation.

The WorkloadFactory creates these workloads, and each workload is specific to both a layer type and a backend. This means that each backend needs its own WorkloadFactory.

All workloads need to:

  • implement the IWorkload interface
  • execute the operator on the backend hardware by:
    • reading the input tensors
    • writing the result to the output tensors

The WorkloadFactory implements the Create<x> methods that construct the workload for each supported layer type.

You can see the example code in CustomWorkloadFactory.cpp.
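To make the shape of a workload concrete, here is a self-contained sketch of an addition workload: Execute() reads the two input tensors and writes their element-wise sum to the output. The IWorkload stand-in below is simplified; the real CustomAdditionWorkload in the example plugin also deals with queue descriptors and tensor handles:

```cpp
#include <cstddef>
#include <vector>

// Simplified stand-in for the IWorkload interface.
class IWorkload
{
public:
    virtual void Execute() const = 0;
    virtual ~IWorkload() = default;
};

// An element-wise addition workload over flat float tensors.
class AdditionWorkload : public IWorkload
{
public:
    AdditionWorkload(const std::vector<float>& input0,
                     const std::vector<float>& input1,
                     std::vector<float>& output)
        : m_Input0(input0), m_Input1(input1), m_Output(output) {}

    void Execute() const override
    {
        // Read the input tensors and write the result to the output tensor.
        m_Output.resize(m_Input0.size());
        for (std::size_t i = 0; i < m_Input0.size(); ++i)
        {
            m_Output[i] = m_Input0[i] + m_Input1[i];
        }
    }

private:
    const std::vector<float>& m_Input0;
    const std::vector<float>& m_Input1;
    std::vector<float>& m_Output;
};
```

A WorkloadFactory's CreateAddition() method would construct one of these objects per Addition layer, binding it to the layer's input and output tensor handles.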

Backend context: CreateBackendContext()

The IBackendContext interface defines virtual methods that are implemented by the derived class for the backend, as seen here:

IBackendInternal::IBackendContextPtr CustomBackend::CreateBackendContext(const IRuntime::CreationOptions&) const
{
    return IBackendContextPtr{};
}

Here you can see how these virtual methods are defined in armnn/src/backends/backendsCommon/IBackendContext.hpp:

class IBackendContext
{
protected:
    IBackendContext(const IRuntime::CreationOptions&) {}
public:
    // Before and after Load network events
    virtual bool BeforeLoadNetwork(NetworkId networkId) = 0;
    virtual bool AfterLoadNetwork(NetworkId networkId) = 0;
 
    // Before and after Unload network events
    virtual bool BeforeUnloadNetwork(NetworkId networkId) = 0;
    virtual bool AfterUnloadNetwork(NetworkId networkId) = 0;
 
    virtual ~IBackendContext() {}
};

The IBackendContext interface includes methods that provide callback-like functionality. Arm NN calls these methods before and after loading or unloading a network. They allow the user to run any code, for example to clear a cache or synchronize threads, triggered by a specific load or unload network event.
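A minimal context that acts on these events might look like the following self-contained sketch. NetworkId is an int stand-in here, the CreationOptions parameter is omitted for brevity, and the per-network bookkeeping is only illustrative:

```cpp
#include <set>

using NetworkId = int; // stand-in for armnn::NetworkId

// The load/unload callbacks from IBackendContext.hpp (CreationOptions omitted).
class IBackendContext
{
public:
    virtual bool BeforeLoadNetwork(NetworkId networkId) = 0;
    virtual bool AfterLoadNetwork(NetworkId networkId) = 0;
    virtual bool BeforeUnloadNetwork(NetworkId networkId) = 0;
    virtual bool AfterUnloadNetwork(NetworkId networkId) = 0;
    virtual ~IBackendContext() = default;
};

// Tracks which networks are loaded; a real context might also warm up the
// device on load or clear per-network caches on unload.
class CustomBackendContext : public IBackendContext
{
public:
    bool BeforeLoadNetwork(NetworkId) override { return true; }
    bool AfterLoadNetwork(NetworkId networkId) override
    {
        m_LoadedNetworks.insert(networkId);
        return true;
    }
    bool BeforeUnloadNetwork(NetworkId) override { return true; }
    bool AfterUnloadNetwork(NetworkId networkId) override
    {
        m_LoadedNetworks.erase(networkId); // e.g. clear caches for this network
        return true;
    }
    bool IsLoaded(NetworkId networkId) const
    {
        return m_LoadedNetworks.count(networkId) == 1;
    }
private:
    std::set<NetworkId> m_LoadedNetworks;
};
```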

Deciding which backends to assign to each layer: GetLayerSupport()

During optimization, Arm NN must decide which layers are supported by the backend.

The IsLayer<x>Supported() functions indicate whether the backend supports the specified layer. For example:

bool CustomLayerSupport::IsAdditionSupported(const TensorInfo& input0,
                                             const TensorInfo& input1,
                                             const TensorInfo& output,
                                             Optional<std::string&> reasonIfUnsupported) const
{
    ignore_unused(input1);
    ignore_unused(output);
    return IsDataTypeSupported(input0.GetDataType(), reasonIfUnsupported);
}
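The reasonIfUnsupported parameter lets the backend explain a rejection to the optimizer. Here is a self-contained sketch of what an IsDataTypeSupported() helper could look like; a plain pointer stands in for Optional<std::string&>, and the Float32-only policy is an assumption for illustration:

```cpp
#include <string>

enum class DataType { Float32, Float16, QAsymmU8 };

// Hypothetical helper in the spirit of CustomLayerSupport.cpp: accept Float32
// tensors and report a reason for anything else.
bool IsDataTypeSupported(DataType type, std::string* reasonIfUnsupported)
{
    if (type == DataType::Float32)
    {
        return true;
    }
    if (reasonIfUnsupported)
    {
        *reasonIfUnsupported = "custom backend supports Float32 tensors only";
    }
    return false;
}
```

When the helper returns false, the optimizer can surface the reason string to the user and try to assign the layer to another backend.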

Optimization: OptimizeSubGraph(SubGraph)

The optimizer calls OptimizeSubGraph() on the selected backend, for each subgraph.

From the IBackendInternal interface:

virtual OptimizationViews OptimizeSubGraph(const SubGraph& subGraph) const = 0;

class OptimizationViews
{
  ...
  Substitutions SuccesfulOptimizations; // Proposed substitutions from successful optimizations
  Subgraphs FailedOptimizations; // Subgraphs from the original subgraph which cannot be supported
  Subgraphs UntouchedSubgraphs;  // Subgraphs from the original subgraph which remain unmodified
};

struct SubstitutionPair
{
  // Subgraph of Layers from the original graph which should be replaced
  SubgraphView SubstitutableSubgraph;

  // A subgraph of new layers which will replace layers in SubstitutableSubgraph
  SubgraphView ReplacementSubgraph;
};

Example optimizations might include:

  • merging layers, for more efficient execution
  • adding permute layers to modify the data layout for execution on the backend

The OptimizeSubGraph() function does the following:

  • If no optimization was attempted for part of the input subgraph, the optimization function adds it to the list of untouched subgraphs.

  • If part of the input subgraph cannot be supported by the backend, the optimization function adds it to the list of failed optimizations.

    Arm NN tries to re-assign each failed subgraph to other backends, if they are available.

  • If part of the input subgraph can be optimized, the optimization function creates a substitution pair.

    The substitutable subgraph in the original graph is replaced with the corresponding replacement subgraph.
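The three-way classification above can be sketched as a self-contained function over a toy subgraph. Layers are reduced to plain strings, the Convolution + Activation fusion is a hypothetical example optimization, and "Unsupported" marks a layer the backend cannot handle:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-ins: a "layer" is just a name, a subgraph a list of layers.
using Layer = std::string;
using Subgraph = std::vector<Layer>;

struct SubstitutionPair
{
    Subgraph SubstitutableSubgraph; // layers to be replaced
    Subgraph ReplacementSubgraph;   // new layers that replace them
};

struct OptimizationViews
{
    std::vector<SubstitutionPair> Substitutions;
    std::vector<Subgraph> FailedOptimizations;
    std::vector<Subgraph> UntouchedSubgraphs;
};

// Sketch of the classification: fused pairs become substitutions, unsupported
// layers become failed subgraphs, and everything else is left untouched.
OptimizationViews OptimizeSubGraph(const Subgraph& subGraph)
{
    OptimizationViews views;
    for (std::size_t i = 0; i < subGraph.size(); ++i)
    {
        // Hypothetical fusion: Convolution followed by Activation.
        if (i + 1 < subGraph.size() &&
            subGraph[i] == "Convolution" && subGraph[i + 1] == "Activation")
        {
            views.Substitutions.push_back(
                {{subGraph[i], subGraph[i + 1]}, {"FusedConvolution"}});
            ++i; // the next layer was consumed by the fusion
        }
        else if (subGraph[i] == "Unsupported")
        {
            views.FailedOptimizations.push_back({subGraph[i]});
        }
        else
        {
            views.UntouchedSubgraphs.push_back({subGraph[i]});
        }
    }
    return views;
}
```

Arm NN then applies each substitution pair to the graph, re-assigns the failed subgraphs to other backends where possible, and leaves the untouched subgraphs as they are.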