Streamline for Bare-metal Systems

Streamline for Bare-metal allows the power of Streamline to be used in exciting new areas such as Cortex-R and Cortex-M based devices. It also supports all the features of regular Streamline on devices that don’t have a Linux-based operating system. These features include:

Hardware Counters: Choose which hardware counters to sample and see the results in easy to interpret graphs, allowing the user to easily identify bottlenecks in their system.

PC Sampling: Shows the user where their application, or even the whole system, spends most of its time, whether at process level, function level or even line by line in source code. The user can then optimize their code and rerun Streamline to see whether their optimizations have had any impact on the system.

Custom Counters: Easily add counters for custom IP and have Streamline collect data from them to display in graphs. 

Annotations: Allow the user to place markers in their code, making it easy to identify in Streamline when certain steps occur and to quickly see the hardware counter data, or where CPU time is spent, during that period.

Heat Map: Shows exactly which tasks are working on which cores. This allows the user to easily see how all their tasks are being scheduled as well as which cores are free to receive more work.


Streamline Bare-metal works by compiling code generated by Streamline into the application. This code collects the performance data from the system and then transports it off the system so that it can be imported back into Streamline.


Steps for using Streamline Bare-metal

The following four steps are involved in using Streamline Bare-metal:

Generate: Guided by a wizard, Streamline will generate agent code that is unique to the system. The agent will only collect the information the user wants, for the cores in their system that they are interested in.

Instrument: The code needs to be instrumented with calls to the generated Streamline agent. At a minimum, there needs to be a call to the initialization function, and the user then needs to choose when they would like to sample the counters and PC, whether this is in an interrupt handler or at various points in the code.

Run: From there the application needs to be run to collect the data.

Import: Once the run of the application is complete, the data that has been collected needs to be imported into Streamline, after which the user can use Streamline just like they would if they had collected the data from Linux.
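The Instrument step can be pictured with a minimal C sketch. It assumes a stand-in agent written for this illustration: agent_init, agent_sample, timer_irq_handler, the sample_t record and the simulated cycle counter are all hypothetical names, not the API of the agent that Streamline generates, which depends on the options chosen in the wizard. The sketch only shows the shape of the instrumentation: one initialization call, followed by a sample call made from a periodic interrupt handler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the generated agent. The real generated
 * code has its own names, signatures and data format. */
#define SAMPLE_CAPACITY 64u

typedef struct {
    uint32_t pc;      /* program counter recorded at the sample point */
    uint32_t counter; /* one hardware counter value (simulated here) */
} sample_t;

static sample_t sample_log[SAMPLE_CAPACITY];
static size_t   sample_count;
static uint32_t fake_cycle_counter; /* stands in for a real PMU counter */
static int      agent_ready;

/* Must be called once before any sampling takes place. */
void agent_init(void)
{
    sample_count = 0;
    fake_cycle_counter = 0;
    agent_ready = 1;
}

/* Record one sample. Returns 1 if stored, 0 if the log is full. */
int agent_sample(uint32_t pc)
{
    assert(agent_ready); /* sampling before init is a usage error */
    if (sample_count >= SAMPLE_CAPACITY) {
        return 0;
    }
    sample_log[sample_count].pc = pc;
    sample_log[sample_count].counter = fake_cycle_counter;
    sample_count++;
    return 1;
}

/* Periodic timer interrupt: sample the counters and the interrupted PC. */
void timer_irq_handler(uint32_t interrupted_pc)
{
    (void)agent_sample(interrupted_pc);
}

/* Simulated application run: each tick does some "work" and then the
 * timer interrupt fires. Returns the number of samples collected. */
size_t run_instrumented_workload(unsigned ticks)
{
    agent_init();
    for (unsigned t = 0; t < ticks; t++) {
        fake_cycle_counter += 100u;              /* pretend 100 cycles of work */
        timer_irq_handler(0x08000000u + 4u * t); /* made-up code addresses */
    }
    return sample_count;
}
```

Sampling from a timer interrupt gives evenly spaced samples without touching the application code, whereas calling the sample function at chosen points in the code ties the samples to specific program phases. Which suits a given system is a judgment call for the user.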


Transporting data off the device

We understand that there is a wealth of different systems, each with different trace capabilities, so Streamline Bare-metal can get the trace and profiling information from the device in a variety of ways:

Main memory: Streamline can place all the data into main memory, from where the user can transport it off the device. This solution is perfect if the user doesn’t have access to any trace hardware on the device.

STM: Streamline can transport the data via STM, which is beneficial if the device doesn’t have much memory in which to store the data. The Streamline agent sends the data over STM to be captured by DSTREAM. The user can then take this information and import it directly into Streamline.

ITM: Transporting the data via ITM is ideal for Cortex-M based devices. When this option is selected, the data is placed in the DSTREAM buffer. In this mode, Streamline automatically captures the hardware counters without the need to manually call the sample function.
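To make the main memory option more concrete, here is a minimal C sketch of a fixed capture buffer that an agent could append records to, leaving the user to dump it off the device once the run is complete. Everything in it is an assumption made for illustration: capture_buffer, capture_write, capture_write_u32 and the drop-when-full policy are hypothetical and do not describe the buffer layout or behavior of the real Streamline agent. The buffer is kept tiny so that the full case is easy to exercise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical capture region in main memory (deliberately tiny). */
#define CAPTURE_BYTES 16u

static uint8_t capture_buffer[CAPTURE_BYTES];
static size_t  capture_used;    /* bytes written so far */
static size_t  capture_dropped; /* records that did not fit */

/* Clear the buffer and the bookkeeping before a new capture. */
void capture_reset(void)
{
    memset(capture_buffer, 0, sizeof capture_buffer);
    capture_used = 0;
    capture_dropped = 0;
}

/* Append one record. Returns 1 on success, or 0 if the record does not
 * fit, in which case it is counted as dropped and the buffer is left
 * unchanged. */
int capture_write(const void *record, size_t len)
{
    assert(record != NULL);
    if (len > CAPTURE_BYTES - capture_used) {
        capture_dropped++;
        return 0;
    }
    memcpy(&capture_buffer[capture_used], record, len);
    capture_used += len;
    return 1;
}

/* Convenience wrapper for a single 32-bit value, such as a counter. */
int capture_write_u32(uint32_t value)
{
    return capture_write(&value, sizeof value);
}
```

After the run, the user would read capture_used bytes starting at capture_buffer, for example with a debugger memory dump, and hand the resulting file to the import step. Counting dropped records rather than overwriting old ones is just one possible policy; a ring buffer that keeps only the most recent data would be an equally reasonable choice.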