Some modern software, particularly media codecs and graphics accelerators, operates on large amounts of data that is smaller than the word size. 16-bit data is common in audio applications, and 8-bit data is common in graphics and video.
When these operations run on a 32-bit microprocessor, parts of the computation units are unused but continue to consume power. To make better use of the available resources, SIMD (Single Instruction, Multiple Data) technology uses a single instruction to perform the same operation in parallel on multiple data elements of the same type and size. This way, the hardware that normally adds two 32-bit values instead performs four parallel additions of 8-bit values in the same amount of time.
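
To make the idea of four 8-bit additions sharing one 32-bit adder concrete, the following is a minimal sketch in portable C. It is only a software emulation for illustration: a real SIMD instruction performs the lane-wise addition directly in hardware, without the masking steps. The function names (add4x8_scalar, add4x8_packed) are hypothetical, and the sketch assumes unsigned lanes that wrap around on overflow.

```c
#include <stdint.h>

/* Reference version: unpack each 8-bit lane, add, and repack.
 * This takes four separate additions for one 32-bit word. */
uint32_t add4x8_scalar(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint8_t x = (uint8_t)(a >> (8 * lane));
        uint8_t y = (uint8_t)(b >> (8 * lane));
        uint8_t sum = (uint8_t)(x + y);          /* wraps modulo 256 */
        result |= (uint32_t)sum << (8 * lane);
    }
    return result;
}

/* Packed version: one 32-bit addition handles all four lanes.
 * The top bit of each lane is masked off first so that a carry can
 * never cross into the neighbouring lane, then restored with XOR. */
uint32_t add4x8_packed(uint32_t a, uint32_t b)
{
    const uint32_t low7 = 0x7F7F7F7Fu;   /* low 7 bits of every lane */
    const uint32_t top1 = 0x80808080u;   /* top bit of every lane    */

    uint32_t partial = (a & low7) + (b & low7);  /* carries stay in-lane */
    return partial ^ ((a ^ b) & top1);           /* add the top bits     */
}
```

For example, with a = 0x01FF7F10 and b = 0x0101017F, both functions return 0x0200808F: the lanes 0x01+0x01, 0xFF+0x01, 0x7F+0x01, and 0x10+0x7F give 0x02, 0x00, 0x80, and 0x8F, and the overflow in the second lane does not disturb its neighbour.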