Building an emulation-aware instrumentation client

The ability to instrument emulated applications is a recent addition to the DynamoRIO API. Consequently, most of the samples that ship with DynamoRIO (and Arm Instruction Emulator) cannot interpret emulated instructions. This tutorial demonstrates how to modify an existing native-only client so that it also handles emulated instructions, and how to write emulation-aware clients of your own.

Before you begin

Procedure

  1. Run the pre-built libbbcount.so client with Arm Instruction Emulator. This counts the number of basic blocks executed by an application:

    armie -msve-vector-bits=128 -i libbbcount.so -- ./example

    This returns:

    Client bbcount is running
    i       a[i]    b[i]    c[i]
    =============================
    0       197     283     86
    1       262     277     15
    . . .
    1021    165     234     69
    1022    232     295     63
    1023    204     235     31
    Instrumentation results:
        449561 basic block executions
          1971 basic blocks needed flag saving
             0 basic blocks did not

    We will change this to write both native and emulated basic block execution counts to stdout.

  2. Add the emulated basic block counter variable. Copy the bbcount.c file to bbcount_tut2.c in:
     /path/to/your/arm-instruction-emulator-<xx.y>_Generic-AArch64_<OS>_aarch64-linux/samples.

    Edit bbcount_tut2.c to add a global emulation counter variable:

                                    bbcount.c                                                           bbcount_tut2.c
    
      /* we only have a global count */                                           |  /* we have global native and emulated counts */
      static int global_count;                                                    |  static int native_count;
                                                                                  |  static int emulated_count;
                                                                                  |
      #ifdef SHOW_RESULTS                                                         |  #ifdef SHOW_RESULTS
      /* some meta-stats: static (not per-execution) */                           |  /* some meta-stats: static (not per-execution) */
      static int bbs_eflags_saved;                                                |  static int bbs_eflags_saved;
      static int bbs_no_eflags_saved;                                             |  static int bbs_no_eflags_saved;
      #endif                                                                      |  #endif
                                                                                  |
      static void                                                                 |  static void
      event_exit(void)                                                            |  event_exit(void)
      {                                                                           |  {
      #ifdef SHOW_RESULTS                                                         |  #ifdef SHOW_RESULTS
          char msg[512];                                                          |      char msg[512];
          int len;                                                                |      int len;
          len = dr_snprintf(msg, sizeof(msg) / sizeof(msg[0]),                    |      len = dr_snprintf(msg, sizeof(msg) / sizeof(msg[0]),
                            "Instrumentation results:\n"                          |                        "Instrumentation results:\n"
                            "%10d basic block executions\n"                       |                        "%10d native basic block executions\n"
                                                                                  |                        "%10d emulated basic block executions\n"
                            "%10d basic blocks needed flag saving\n"              |                        "%10d basic blocks needed flag saving\n"
                            "%10d basic blocks did not\n",                        |                        "%10d basic blocks did not\n",
                            global_count, bbs_eflags_saved, bbs_no_eflags_saved); |                        native_count, emulated_count,
                                                                                  |                        bbs_eflags_saved, bbs_no_eflags_saved);
          DR_ASSERT(len > 0);                                                     |      DR_ASSERT(len > 0);
          NULL_TERMINATE(msg);                                                    |      NULL_TERMINATE(msg);
          DISPLAY_STRING(msg);                                                    |      DISPLAY_STRING(msg);
      #endif /* SHOW_RESULTS */                                                   |  #endif /* SHOW_RESULTS */
          drx_exit();                                                             |      drx_exit();
          drreg_exit();                                                           |      drreg_exit();
          drmgr_exit();                                                           |      drmgr_exit();
      }                                                                           |  }
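
The effect of the extra format line in event_exit() can be tried outside DynamoRIO. The following is a minimal sketch, assuming that standard snprintf() is an acceptable stand-in for dr_snprintf(); the helper name format_results() is hypothetical and is not part of bbcount_tut2.c.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not in the sample): formats the four counters in the
 * same layout as the modified event_exit(). Standard snprintf() stands in
 * for dr_snprintf() so that this builds without DynamoRIO. */
static int
format_results(char *buf, size_t size, int native, int emulated,
               int flags_saved, int flags_not_saved)
{
    int len = snprintf(buf, size,
                       "Instrumentation results:\n"
                       "%10d native basic block executions\n"
                       "%10d emulated basic block executions\n"
                       "%10d basic blocks needed flag saving\n"
                       "%10d basic blocks did not\n",
                       native, emulated, flags_saved, flags_not_saved);
    /* Mirrors DR_ASSERT(len > 0) and also rejects truncated output. */
    assert(len > 0 && (size_t)len < size);
    return len;
}
```

Each %10d right-aligns its value in a ten-character field, which is why the counts line up in a column in the client output.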
    
  3. Add the basic block emulation counting function. Modify the instrumentation callback function event_app_instruction() to look for at least one emulated instruction in a block, and if found, increment emulated_count when the block is executed.

                                    bbcount.c                                                                                 bbcount_tut2.c
    
      static dr_emit_flags_t                                                                 |  static dr_emit_flags_t
      event_app_instruction(void *drcontext, void *tag, instrlist_t *bb, instr_t *inst,      |  event_app_instruction(void *drcontext, void *tag, instrlist_t *bb, instr_t *inst,
                            bool for_trace, bool translating, void *user_data)               |                        bool for_trace, bool translating, void *user_data)
      {                                                                                      |  {
                                                                                             |      instr_t *instr, *next_instr;
                                                                                             |
      #ifdef SHOW_RESULTS                                                                    |  #ifdef SHOW_RESULTS
          bool aflags_dead;                                                                  |      bool aflags_dead;
      #endif                                                                                 |  #endif
                                                                                             |
          /* By default drmgr enables auto-predication, which predicates all instructions wit|      /* By default drmgr enables auto-predication, which predicates all instructions wi
           * the predicate of the current instruction on ARM.                                |       * the predicate of the current instruction on ARM.
           * We disable it here because we want to unconditionally execute the following     |       * We disable it here because we want to unconditionally execute the following
           * instrumentation.                                                                |       * instrumentation.
           */                                                                                |       */
          drmgr_disable_auto_predication(drcontext, bb);                                     |      drmgr_disable_auto_predication(drcontext, bb);
          if (!drmgr_is_first_instr(drcontext, inst))                                        |      if (!drmgr_is_first_instr(drcontext, inst))
              return DR_EMIT_DEFAULT;                                                        |          return DR_EMIT_DEFAULT;
                                                                                             |
      #ifdef VERBOSE                                                                         |  #ifdef VERBOSE
          dr_printf("in dynamorio_basic_block(tag=" PFX ")\n", tag);                         |      dr_printf("in dynamorio_basic_block(tag=" PFX ")\n", tag);
      #    ifdef VERBOSE_VERBOSE                                                             |  #    ifdef VERBOSE_VERBOSE
          instrlist_disassemble(drcontext, tag, bb, STDOUT);                                 |      instrlist_disassemble(drcontext, tag, bb, STDOUT);
      #    endif                                                                             |  #    endif
      #endif                                                                                 |  #endif
                                                                                             |
      #ifdef SHOW_RESULTS                                                                    |  #ifdef SHOW_RESULTS
          if (drreg_are_aflags_dead(drcontext, inst, &aflags_dead) == DRREG_SUCCESS &&       |      if (drreg_are_aflags_dead(drcontext, inst, &aflags_dead) == DRREG_SUCCESS &&
              !aflags_dead)                                                                  |          !aflags_dead)
              bbs_eflags_saved++;                                                            |          bbs_eflags_saved++;
          else                                                                               |      else
              bbs_no_eflags_saved++;                                                         |          bbs_no_eflags_saved++;
      #endif                                                                                 |  #endif
                                                                                             |
                                                                                             |      for (instr = instrlist_first(bb); instr != NULL; instr = next_instr) {
                                                                                             |          next_instr = instr_get_next(instr);
                                                                                             |
                                                                                             |          if (drmgr_is_emulation_start(instr)) {
                                                                                             |              drx_insert_counter_update(drcontext, bb, inst,
                                                                                             |                          SPILL_SLOT_MAX + 1,
                                                                                             |                          IF_AARCHXX_(SPILL_SLOT_MAX + 1) & emulated_count, 1, 0);
                                                                                             |              return DR_EMIT_DEFAULT;
                                                                                             |          }
                                                                                             |      }
                                                                                             |
          /* racy update on the counter for better performance */                            |      /* racy update on the counter for better performance */
          drx_insert_counter_update(drcontext, bb, inst,                                     |      drx_insert_counter_update(drcontext, bb, inst,
                                    /* We're using drmgr, so these slots                     |                                /* We're using drmgr, so these slots
                                     * here won't be used: drreg's slots will be.            |                                 * here won't be used: drreg's slots will be.
                                     */                                                      |                                 */
                                    SPILL_SLOT_MAX + 1,                                      |                                SPILL_SLOT_MAX + 1,
                                    IF_AARCHXX_(SPILL_SLOT_MAX + 1) & global_count, 1, 0);   |                                IF_AARCHXX_(SPILL_SLOT_MAX + 1) & native_count, 1, 0);
                                                                                             |
      #if defined(VERBOSE) && defined(VERBOSE_VERBOSE)                                       |  #if defined(VERBOSE) && defined(VERBOSE_VERBOSE)
          dr_printf("Finished instrumenting dynamorio_basic_block(tag=" PFX ")\n", tag);     |      dr_printf("Finished instrumenting dynamorio_basic_block(tag=" PFX ")\n", tag);
          instrlist_disassemble(drcontext, tag, bb, STDOUT);                                 |      instrlist_disassemble(drcontext, tag, bb, STDOUT);
      #endif                                                                                 |  #endif
          return DR_EMIT_DEFAULT;                                                            |      return DR_EMIT_DEFAULT;
      }                                                                                      |  }
    

    There are three things to note about this change:

    1. The for() loop uses instrlist_first() and instr_get_next() to look at each instruction in a block. This is a standard DynamoRIO method used in many clients.

    2. The drmgr_is_emulation_start() function detects whether an instruction is the start of a sequence of instructions that emulates a non-native instruction. There is also a drmgr_is_emulation_end() function, which detects the end of the sequence, but it is not required in this client because we only want to know whether the block contains at least one emulated instruction. See opcodes_emulated.cpp for an example of how drmgr_is_emulation_start() and drmgr_is_emulation_end() are used together.

      Note

      The reference documentation for these functions is not yet available at the DynamoRIO web site. See Emulation Functions Reference for a full description of these functions.

    3. Instead of using dr_insert_clean_call() as in opcodes_emulated.cpp, this client uses drx_insert_counter_update() to increment native_count and emulated_count.
      The difference is that dr_insert_clean_call() inserts a call to a user-defined function, which runs each time the block is executed, whereas drx_insert_counter_update() inserts inline code that increments a variable each time the block is executed.
      See the DynamoRIO API reference documentation for more details.
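
The classification rule used in step 3 can be sketched without DynamoRIO. This is a simplified model under stated assumptions, not the client itself: mock_instr_t is a hypothetical stand-in for instr_t, its emulation_start flag models a true result from drmgr_is_emulation_start(), and count_block_execution() folds the instrumentation-time decision and the run-time counter update into one call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for instr_t: a singly linked instruction node. */
typedef struct mock_instr {
    bool emulation_start;    /* models drmgr_is_emulation_start() == true */
    struct mock_instr *next; /* models instr_get_next() */
} mock_instr_t;

static int native_count;
static int emulated_count;

/* Walks the block and reports whether any instruction starts an
 * emulation sequence. The scan stops at the first match. */
static bool
block_has_emulation(const mock_instr_t *first)
{
    for (const mock_instr_t *instr = first; instr != NULL; instr = instr->next) {
        if (instr->emulation_start)
            return true;
    }
    return false;
}

/* Models one execution of a block: exactly one counter is incremented,
 * so a block is counted as either emulated or native, never both. */
static void
count_block_execution(const mock_instr_t *first)
{
    assert(first != NULL);
    if (block_has_emulation(first))
        emulated_count++;
    else
        native_count++;
}
```

In the real client the scan runs once, when the block is instrumented, and only the inserted counter update runs on each execution. The model keeps the same either-or outcome that the early return in event_app_instruction() produces.
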
  4. Download the files bbcount.c and bbcount_tut2.c and compare them with a diff viewer to look at the modifications in full.

  5. To build the modified client, add bbcount_tut2.c to /path/to/your/arm-instruction-emulator-<xx.y>_Generic-AArch64_<OS>_aarch64-linux/samples/CMakeLists.txt:

    . . .
    add_sample_client(bbcount     "bbcount.c"       "drmgr;drreg;drx")               
    add_sample_client(bbcount_tut2 "bbcount_tut2.c" "drmgr;drreg;drx")               
    add_sample_client(bbsize      "bbsize.c"        "drmgr")
    . . .
  6. Run cmake. Note that the current version of ArmIE (18.4) requires that clients are built with GCC 7.1.0:

    cmake .

    This returns:

    -- The C compiler identification is GNU 7.1.0 
    -- The CXX compiler identification is GNU 7.1.0
    -- Check for working C compiler: /opt/arm/gcc-7.1.0_Generic-AArch64_SUSE-12_aarch64-linux/bin/cc
    -- Check for working C compiler: /opt/arm/gcc-7.1.0_Generic-AArch64_SUSE-12_aarch64-linux/bin/cc -- works
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Detecting C compile features
    -- Detecting C compile features - done
    -- Check for working CXX compiler: /opt/arm/gcc-7.1.0_Generic-AArch64_SUSE-12_aarch64-linux/bin/c++
    -- Check for working CXX compiler: /opt/arm/gcc-7.1.0_Generic-AArch64_SUSE-12_aarch64-linux/bin/c++ -- works
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    -- Configuring done
    -- Generating done
    -- Build files have been written to: /path/to/your/arm-instruction-emulator-<xx.y>_Generic-AArch64_<OS>_aarch64-linux/samples
  7. Run make:

    make

    This returns:

    . . . 
    Scanning dependencies of target bbcount_tut2
    [ 46%] Building C object CMakeFiles/bbcount_tut2.dir/bbcount_tut2.c.o
    [ 48%] Linking C shared library bin/libbbcount_tut2.so
    Usage: pass to drconfig or drrun: -c /path/to/your/arm-instruction-emulator-<xx.y>_Generic-AArch64_<OS>_aarch64-linux/samples/bin/libbbcount_tut2.so
    [ 48%] Built target bbcount_tut2
    . . .
  8. Copy the built client from /path/to/your/arm-instruction-emulator-<xx.y>_Generic-AArch64_<OS>_aarch64-linux/samples/bin to /path/to/your/arm-instruction-emulator-<xx.y>_Generic-AArch64_<OS>_aarch64-linux/samples/bin64, then check its file type:

    cp bin/libbbcount_tut2.so ./bin64/
    file bin64/libbbcount_tut2.so

    This returns:

    bin64/libbbcount_tut2.so: ELF 64-bit LSB shared object, ARM aarch64, version 1 (SYSV), dynamically linked, not stripped
  9. Run the modified client:

    armie -msve-vector-bits=128 -i libbbcount_tut2.so -- ./example

    The output now includes a count for blocks which contain at least one emulated instruction:

    Client bbcount is running
    i       a[i]    b[i]    c[i]
    =============================
    0       197     283     86
    1       262     277     15
    2       258     293     35
    . . .
    1021    165     234     69
    1022    232     295     63
    1023    204     235     31
    Instrumentation results:
        449306 native basic block executions
           256 emulated basic block executions
          1971 basic blocks needed flag saving
             0 basic blocks did not

Results

The output now includes a count for blocks which contain at least one emulated instruction.
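
To put the two counts in proportion, the share of emulated block executions can be computed from them. The helper below is a hypothetical illustration and is not part of the sample client.

```c
#include <assert.h>

/* Hypothetical helper: percentage of counted block executions that
 * belong to blocks containing at least one emulated instruction. */
static double
emulated_share_percent(long native, long emulated)
{
    assert(native >= 0 && emulated >= 0);
    long total = native + emulated;
    if (total == 0)
        return 0.0; /* nothing was counted, avoid dividing by zero */
    return 100.0 * (double)emulated / (double)total;
}
```

With the counts from step 9 (449306 native, 256 emulated), this gives a share of roughly 0.06 percent.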
