Streamline Performance Analyzer

The Streamline performance analyzer in Arm DS-5 Development Studio helps you get the best out of your system’s resources and create high-performance, energy-efficient products. Its innovative user interface brings together system performance metrics, software tracing, statistical profiling and power measurement in a single system dashboard, where you can quickly identify code hotspots, system bottlenecks and other unintended effects of your code or the system architecture. Take an in-depth look at Streamline or read the Streamline FAQ.

  • Speed Up Your Code

    Find out where the CPU is spending the most time, improve code parallelization for multicore platforms and tune code for optimal cache usage.

  • Reduce Energy Footprint

    Monitor actual power consumption with the Arm Energy Probe, spot where you can improve power management and optimize compute tasks for efficiency.

  • Balance System Resources

    Analyze and optimize Mali GPU utilization, monitor CPU and GPU cache usage and system memory. Check load distribution across multiple cores.

  • Customize it for Your System

    Connect your own data to Streamline analysis views, extend the open source driver to monitor variables and instrument your code to send printf-like messages to Streamline.

The right level for modern complex systems

We are great fans of Arm CoreSight trace. It is the best technology for many use cases. But the reality is that, on today’s very fast and complex multicore SoCs, the instruction level is simply the wrong level of abstraction for system analysis. Streamline for Linux and Android uses a hassle-free architecture, based on a software agent named Gator, to collect all the statistics you need to analyze your system, at a fraction of the cost of a high-end trace unit. Gator is open source (and therefore extensible) and typically takes no more than 3% of CPU time on a single-core device. Learn more about real-life Streamline use cases in this blog series.

Multicore system optimization made simple

The potential performance gain of an extra core can easily be lost to issues such as poor thread synchronization and sub-optimal parallelization. For SMP platforms, Streamline features per-core and per-cluster statistics to help you quickly verify your system utilization. Moreover, like Superman’s X-ray vision, X-Ray mode lets you see through the thread trace to find out which core each thread was running on at any moment.
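To make the idea of per-core utilization concrete, here is a minimal sketch that derives it from two snapshots of Linux `/proc/stat` text. It is not how Streamline or Gator collect their statistics; the function names and the sample counter values are made up for illustration.

```python
def parse_proc_stat(text):
    """Return {core_name: (busy_ticks, total_ticks)} for each per-core line."""
    cores = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-core lines look like "cpu0 ..."; skip the aggregate "cpu" line
        # and unrelated lines such as "ctxt" or "btime".
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        ticks = [int(value) for value in fields[1:]]
        # Field order: user nice system idle iowait irq softirq steal ...
        # Guest time is already counted inside user time, so only the first
        # eight fields are summed.
        total = sum(ticks[:8])
        idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
        cores[fields[0]] = (total - idle, total)
    return cores


def per_core_utilization(before, after):
    """Return the busy fraction of each core between two parsed snapshots."""
    usage = {}
    for name, (busy_after, total_after) in after.items():
        busy_before, total_before = before.get(name, (0, 0))
        elapsed = total_after - total_before
        usage[name] = (busy_after - busy_before) / elapsed if elapsed > 0 else 0.0
    return usage


# Illustrative snapshots (invented numbers), taken some time apart.
SAMPLE_BEFORE = """cpu  300 0 100 1600 0 0 0 0 0 0
cpu0 100 0 50 850 0 0 0 0 0 0
cpu1 200 0 50 750 0 0 0 0 0 0
ctxt 12345
"""
SAMPLE_AFTER = """cpu  400 0 110 1890 0 0 0 0 0 0
cpu0 190 0 60 950 0 0 0 0 0 0
cpu1 210 0 50 940 0 0 0 0 0 0
ctxt 13000
"""

usage = per_core_utilization(parse_proc_stat(SAMPLE_BEFORE),
                             parse_proc_stat(SAMPLE_AFTER))
```

On a Linux target, the two strings would instead be the contents of `/proc/stat` read at the start and end of the interval of interest. A large gap between cores, as in this sample, is the kind of imbalance the per-core view is meant to expose.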

Reap the energy efficiency advantage of your Arm device

Arm Energy Probe and National Instruments DAQ unit

We are very proud of our processor technology and its indisputable energy efficiency leadership. However, between us designing our IP and your final product shipping, many things can dramatically affect energy consumption. In this respect, we know your software can be either the hero or the villain of the day.

When paired with an Arm Energy Probe or a National Instruments DAQ unit, Streamline can acquire real power data from your board and correlate it with all the other software and hardware statistics, including DVFS and cpuidle, to show you the true picture of your power management. Alternatively, Streamline can read and display these measurements directly from your Linux hwmon subsystem. Learn more about how to use the Energy Probe with the hwmon subsystem in this blog. Watch the Energy Probe introduction video.
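As a rough illustration of what an hwmon power measurement looks like, the sketch below reads a `power1_input` attribute, which the Linux hwmon sysfs interface reports in microwatts, and converts it to watts. The helper names are hypothetical, and whether a given board exposes any power sensor under `/sys/class/hwmon` is an assumption; this is not Streamline's own reader.

```python
from pathlib import Path


def read_hwmon_power_watts(hwmon_dir, channel=1):
    """Read power<channel>_input (microwatts in the hwmon sysfs ABI) as watts."""
    raw = Path(hwmon_dir, f"power{channel}_input").read_text().strip()
    return int(raw) / 1_000_000


def find_power_sensors(root="/sys/class/hwmon"):
    """List hwmon device directories that expose a power1_input attribute."""
    base = Path(root)
    if not base.is_dir():
        # No hwmon subsystem here (for example, not a Linux target).
        return []
    return sorted(p for p in base.glob("hwmon*") if (p / "power1_input").exists())
```

On a target with suitable sensors, calling `read_hwmon_power_watts` on each directory returned by `find_power_sensors()` gives one instantaneous power reading per device. Sampling those readings over time yields the kind of power trace that can be lined up against CPU activity.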

Integrated Arm Mali Graphics, OpenCL and CoreLink CCI Performance Analysis

Graphics-intensive tasks, such as sophisticated user interfaces and gaming content, do not run in isolation on just one processor. For this reason, you need visibility into performance across both the application and graphics processors. Streamline links up to Arm Mali GPU drivers to provide a wide range of statistics on OpenGL® ES 1.1 and 2.0 usage, over 300 software and hardware performance counters and samples of the frame buffer, enabling a new breed of high-performance, energy-efficient content. Learn about GPU optimization for Mali Utgard and Midgard devices with our practical guide.

Streamline also supports visualization of OpenCL dependencies, helping you to balance resources between GPU and CPU better than ever. By making clever use of processor loading, significant performance gains are possible, giving your customers a slicker, more responsive, more engaging experience on the latest generation of mobile devices.

In addition, Streamline reads and displays the performance counters of the CoreLink CCI-400, helping to highlight bottlenecks in fabric resources such as cache memories and the CCI-400 itself.

Arm Streamline on DS-5 v5.21 showing Mali counters and OpenCL support

User Annotations

Streamline also provides visibility of high-level events in your software, which is important for measuring the time between events and for understanding the relationships between events, thread activity and system resources. From tracking machine state changes on a timeline to correlating frame buffer content with performance issues, all you need to do is write (yes, printf style) into the gator driver from either user or kernel space.
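The printf-style idea can be sketched in a few lines. The helper below is purely hypothetical: it writes a timestamped, formatted message to any file-like sink, which stands in for the gator driver. The real annotation interface and its data format are defined by Streamline and gator and are not reproduced here.

```python
import io
import time


def annotate(sink, fmt, *args, clock=time.monotonic):
    """Write one timestamped, printf-style message to a file-like sink.

    Hypothetical helper for illustration only; not the Streamline API.
    """
    # Apply printf-style formatting only when arguments are supplied.
    message = fmt % args if args else fmt
    sink.write(f"{clock():.6f} {message}\n")
    sink.flush()
    return message


# An in-memory sink and a fixed clock keep the example deterministic.
log = io.StringIO()
annotate(log, "state=%s frame=%d", "RENDER", 42, clock=lambda: 1.5)
```

Each message carries a timestamp so that it can later be placed on a timeline next to thread activity and counter data, which is what makes annotations useful for measuring the time between events.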