All signs indicate a healthy continuing demand for technology that can support ever more demanding eye-candy and apps on very high resolution display devices

Rob Farber

Mobile technology is where the money is right now in computer technology. Current leadership class supercomputers are “wowing” the HPC world with petaflop/s performance through the combined use of several thousand GPUs or Intel Xeon Phi coprocessors, but in reality the sale of a few thousand of these devices is insignificant when compared against the 1.5 billion cellphone processors and 190 million tablet processors that are expected to be shipped by the end of 2013. Heavy demand for retina quality displays and long battery life in laptops, tablets and cellphones provides a marketing paradise for technology vendors, as better looking devices easily entice consumers to abandon still-working technology to purchase new products. The market will undoubtedly remain hot in mobile computing for many more device generations as technology companies struggle to deliver interactive performance for high-resolution, touch-enabled, true-color displays while hobbled by fixed power budgets dictated by limited space and current battery technology. Mobile technology winners will be those companies that can avoid being crushed between the rock of high-pixel displays and the unyielding limitations of fixed power budgets.

To get a sense of the display resolution numbers game, laptop vendors currently offer 3200x1800 touch-enabled laptops, while newer 3840x2160 15.6-inch displays are being considered for next-generation devices. Smaller 10-inch screens in the tablet market can be purchased with 2560x1600 resolution screens. Meanwhile, high-end cellphones are able to natively display HD television content on their small, yet still 1080p compatible, 1920x1080 resolution screens. From a performance perspective, all of these devices need to be able to move and render potentially hundreds of megabytes of display data per second without overheating or decimating battery life.
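
To put rough numbers on that last point, the following Python sketch estimates the raw framebuffer traffic for the resolutions mentioned above. The 4 bytes per pixel (32-bit color) and the 60 Hz full-screen redraw are assumptions made purely for illustration, not vendor figures; real devices use compression, partial updates and varying refresh rates, so treat the results as back-of-the-envelope values only.

```python
# Back-of-the-envelope estimate of raw display traffic.
# Assumptions (not from any vendor data): 4 bytes per pixel and
# every pixel rewritten 60 times per second.
BYTES_PER_PIXEL = 4
FRAMES_PER_SECOND = 60

# Resolutions quoted in the text; the labels are informal.
DISPLAYS = {
    "1080p cellphone": (1920, 1080),
    "10-inch tablet": (2560, 1600),
    "touch-enabled laptop": (3200, 1800),
    "next-generation laptop": (3840, 2160),
}


def bytes_per_second(width, height,
                     bytes_per_pixel=BYTES_PER_PIXEL,
                     fps=FRAMES_PER_SECOND):
    """Raw bytes needed to redraw every pixel of the screen each frame."""
    return width * height * bytes_per_pixel * fps


def megabytes_per_second(width, height):
    """Same estimate expressed in megabytes (10**6 bytes) per second."""
    return bytes_per_second(width, height) / 1e6


if __name__ == "__main__":
    for name, (w, h) in DISPLAYS.items():
        print(f"{name:24s} {w}x{h}: {megabytes_per_second(w, h):8.1f} MB/s")
```

Under these assumptions, the 1080p cellphone screen already needs roughly 500 MB/s, and the 3840x2160 panel roughly 2 GB/s, simply to repaint the display before any rendering work is counted.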

Currently tablet and cellphone devices appear to be using specialized GPU hardware that has been power optimized to perform common image and “eye candy” operations, such as rendering, scaling, translating and warping 2-D images. (These GPUs tend to be earlier generation GPU architectures that are not generally programmable in languages such as CUDA or OpenCL.) Additional on-device logic helps provide power-efficient video decoding. Battery life remains good so long as the user performs tasks that map to these optimized operations. This approach appears to be working well, as this relatively small set of operations is all that many users require for Web browsing, reading and viewing pictures — or 90 percent of their device usage.

More aggressive users require the capabilities of multi-core processors and hybrid GPU combinations that provide an additional high-performance discrete GPU for 3-D and accelerated gaming performance. The caveat with the current technology is that battery life will be significantly shortened when using these additional computational capabilities.

The promise of a programmable GPU pipeline is on both NVIDIA and Intel roadmaps, which should greatly accelerate the development of large numbers of advanced applications that can run in a power-efficient fashion. Potential markets include augmented reality, real-time signal and image processing, better speech recognition, plus dazzling eye-candy and console quality gaming.

In response to intense competition by the various ARM powered chipsets for the billion unit mobile market, both tablet processors and SoC (System On a Chip) designs are now first-class citizens in the Intel marketing and published roadmaps. The quote most often bandied about by the press is Intel’s goal to “boost CPU processing by five times by 2016, and GPU performance by as much as 15 times.”

Traditionally, Intel has been refining and releasing new designs according to an alternating “tick tock” cycle where manufacturing and process improvements are introduced during a “tick” in the cycle, after which new architectures are introduced during a “tock” in the cycle. This alternating process improvement followed by architecture improvements reduces risk while leveraging both Intel’s strong architecture and manufacturing process expertise.

We are currently in a “tick” phase, during which Intel is refining its manufacturing capabilities, which is why we are hearing about upcoming 14 nm devices. (Most current mobile processors are built using a 28 nm fabrication process.) The “tock” portion of the cycle will occur in the 2015 timeframe, when we can look for the introduction of the next-generation Skylake processor family for laptops (along with workstation and HPC systems) and the Broxton tablet and smartphone chips.

According to roadmaps released by Intel, the Skylake processors will support enhanced 512-bit AVX 3.2 instructions to speed both vector and GPU-like SPMD applications. These processors contain integrated GPUs that are purported to support a form of unified memory addressing (described as similar to AMD’s Heterogeneous System Architecture) that will simplify the use of integrated graphics engines for general-purpose applications (i.e., general programmability). Thus, we see Intel moving to blur the line between programming a CPU and a GPU for consumer devices, much as the Intel Xeon Phi family is attempting to blur the distinction between CPU and massively parallel coprocessor for workstations and CPU clusters. There appears to be debate over whether the Skylake processor family will be fully integrated into an SoC during the 2015 timeframe. If so, perhaps the Skylake processors will be utilized in other mobile devices, such as tablets.

Intel’s chief executive Brian Krzanich admitted at the firm’s November 2013 annual investor relations day that the company had become “insular” over the past few years and failed to meet market demands. To demonstrate nimbleness and a clear change of heart, the 2014 generation of the Intel Atom mobile processor (codename SoFIA) will be developed outside of the Intel factories. In 2015, all Intel tablet and smartphone chips will converge into the Broxton family of mobile processors. The all-important GPU for driving those high-resolution displays (and thus capturing the eyes and hearts of consumers) is described as a Skylake-generation graphics processor.

While it is easy to get lost in the various processor codenames and interpretations of leaked Intel information and roadmaps, it appears that Intel is “betting the company” on the Skylake GPU architecture. The challenges are great, because any shortfall in visual appeal, interactive response, programmability, power consumption, ruggedness or heat dissipation will bring failure in the market and bring the processor design efforts to naught. While not guaranteed, features mentioned in passing, such as unified device memory, indicate that Intel is also looking to incorporate general-purpose programming in the graphics pipeline. If so, then the debate over OpenMP 4.0 versus OpenACC will likely become a hot topic in the mobile computing forums.

Intel has no choice but to play catch-up in the mobile markets. Just as it is now stepping outside of the tried and true Intel “tick tock” design cycle to bring SoFIA to market in 2014, so too is Intel leaving the door open to an accelerated design cycle during the “tick” phase. In particular, the Broxton processors have a different chip design compared to predecessors, and are to be built into what Intel executives called a “chassis” to which other components can be easily connected. For this reason, the Broxton design allows derivatives of the chip to be created at a faster pace, said Intel CEO Brian Krzanich. However, Intel executives acknowledged that SoFIA and Broxton are being released earlier than expected. “Three months ago, this wasn’t on the road map,” Krzanich said.

NVIDIA continues along its publicly announced Tegra roadmap for mobile devices with the exception of an updated schedule for Tegra 4i that adds an LTE modem. While Qualcomm still dominates the ARM marketplace, NVIDIA hopes that certification of the Tegra 4i/LTE combination by AT&T will cause a resurgence of sales for the Tegra mobile processor line in 2014. Vendors also should look for the first of the Tegra Logan mobile processors in 2014. It is certain that NVIDIA is placing big bets that the integrated Kepler GPU in the Logan SoC (the fastest graphics core ever integrated into a mobile SoC) will spur sales even further. Demos of the Tegra 5 (a.k.a. Logan) devices show that the Kepler GPU core can render lifelike yet totally machine-generated faces in real-time at 1080p resolution while consuming only one watt of power! The significance of this achievement can be appreciated by considering that the Titan desktop GPU requires 250W of power to perform the same rendering task.

NVIDIA has been challenged by the need to build the software infrastructure along with the hardware technology to make so much computational power available to the mass market. Part of the problem lies in the fact that Android and application developers need to program to the lowest common denominator of functionality. Otherwise, the operating system and customer base will become locked to the most powerful technology, thus excluding products from major players such as Qualcomm and Intel.

By integrating a Kepler GPU in the Tegra 5 (Logan) SoC, NVIDIA has provided a proven, mature GPU technology that can support high-resolution devices from cellphones to tablets and laptops. In addition, the Kepler GPU leverages the tremendous programming advantages of CUDA, OpenCL and OpenACC to make this computing capability available to a large, experienced pool of application developers. If the Logan/Kepler bet pays off for NVIDIA, customers should see console-quality gaming running on cellphones and tablets in 2014 along with very cool 3-D eye-candy added to existing applications. It is likely that new applications will start appearing by 2015 that utilize the Kepler programmable features along with the potential start of a gold rush that will bring the promise of augmented reality to the mobile market.

Apple recognized early the importance of the display in their mobile devices. Enough said about Apple’s success with their approach except to note that the term “retina display” is now ubiquitously recognized by consumers of mobile technology. Rather than being a reseller of display technology, Apple has actively invested in the supply chain and intervened in the technical direction of its displays. However, the rest of the industry appears to be catching up, which means Apple is no longer able to rely on the unique features of its displays as a long-term differentiator. The choice of next-generation GPU technology will be an important one for Apple, as the GPU and associated software will dictate the visual appeal of the Apple products. Unlike NVIDIA, Apple tightly controls the operating system and internal software of its products. While this strategy helped with being first to market by avoiding the “lowest common denominator” problem facing Android devices, it may cause issues with maintaining their status as the visual display leader now that companies like NVIDIA and Intel are making it easier to program the graphics pipelines in mobile devices.

Apple was the OpenCL innovator, but they have apparently relinquished their role as the OpenCL leader.

Qualcomm is the mobile processor market leader, with a 53-percent market share. At their financial analyst day in New York City, Qualcomm signaled their intention to become more competitive in the modem space — especially on LTE. This move by Qualcomm likely explains the accelerated integration of the LTE modem by NVIDIA. The new Snapdragon 805 mobile processor is claimed to be the first to support Ultra HD (4k) displays and user interfaces along with Ultra HD gaming, with a 40 percent increase in performance while consuming 75 percent less power. In addition, the Snapdragon 805 can support camera resolutions as high as 42 million pixels. All this is consistent with a market leader following an incremental approach to improve connectivity of their mobile devices while continuing to upgrade the visual experience and battery life of their massive customer base. (It is worth noting that analysts are skeptical of the 4k resolution gaming claim.)

In a nutshell, the market looks fantastic for consumers of mobile technology according to the published roadmaps through 2015. The Qualcomm strategy is a fairly safe incremental plan for growth while waiting to see how the price competition of NVIDIA Tegra 4i and potential of the aggressive Kepler/Logan performance technology affect customer demand. Under new leadership, Intel appears to be briskly revising its mobile technology roadmap to capture market share. Meanwhile, Apple remains a premier brand for mobile devices.

Who will capture the hearts and minds of the billion+ unit mobile market fundamentally depends on the technology utilized to avoid being crushed between the rock of high-pixel displays and the unyielding limitations of fixed power budgets. The lessons learned from the past indicate that limited, yet highly optimized hardware works well for the initial wave of customer demand, while more general-purpose programming capabilities are required as customers become educated and more demanding. For this reason, it is likely that the ability to support a higher level programming model such as OpenMP 4.0 or OpenACC (or possibly CUDA or OpenCL) will become a key factor in future mobile success. In addition, the creation of more 3-D WebGL content will highlight limitations in current devices and make them look dated. All signs indicate a very healthy continuing demand for mobile technology that can support ever more demanding eye-candy and mobile applications on very high resolution display devices.

Rob Farber is an independent HPC expert to startups and Fortune 100 companies, as well as government and academic organizations. He may be reached at