Technology That Interests Me

Below are my ramblings about technology that interests me. Use the index just below to jump down the page. Enjoy!

Processors Enter the Era of Roll Your Own
ARM® CoreLink™ CCI-500 Cache Coherent Interconnect
It's a big.LITTLE Idea
Nostalgia for a Dell System 425E
Microsoft HoloLens - Hype or A Leap Forward in Virtual Reality?
How Not To Go Psycho Over Psychometric Tests
Nvidia Is Suing Samsung and Qualcomm
Moore's Law Meets Economic Reality
Governments Choosing Futuremark® PCMark® Over BAPCo® SYSmark®
Five Applications of Video Encoders in Integrated & Discrete Graphics
"Massive Skylake Leak"
AMD's Fading Channel Presence
Denver in Cupertino
Will Broadwell GT2 Graphics Beat Kaveri?
Amazon Fire TV Has Potential But Not Yet Worth $99 as an Upgrade
Amazon Fire TV's Voice UI Doesn't Make You Work to Relax
Don't Mock the Game Console, Your 2016 Gaming PC May Be Like Them!
The 4 Real Reasons Intel's Cherry Trail Will Beat AMD's Beema
Your Viewpoint




Processors Enter the Era of Roll Your Own

June 19, 2015 - In recent weeks there has been a flurry of articles in the press about Intel® acquiring Altera® and Recon Instruments and, along with them, gaining access to ARM® processors. To be clear, they have acquired companies that have license agreements with ARM for architecture or core designs. Just which core or architecture generations are covered isn't clear, but this isn't Intel's first access to ARM IP for processor or bus architecture.

What may be more interesting is why Altera is so important to keeping Intel competitive. Recent job openings, and what interview candidates have walked away having learned, indicate there is something new in the wind that could be threatening to Intel. It seems that if you have enough resources ($$$) and sufficient motivation, you can now afford to roll your own processor and integrate it with custom accelerators to differentiate your SoC and increase your product's value proposition.

I have heard in “afterhours research” at local taverns that both Amazon® and Oracle® are currently rolling their own processors. While Amazon may be thinking of improvements in the next generation Kindle Fire to better compete against Apple™ and Samsung™, my guess is that they are just as interested in providing a better processor for their servers.  So Amazon may be doing more than one custom design. Obviously, Oracle is thinking about what’s next for their Sun server line. Both of these companies’ designs are reported to be based on ARM cores.

So what does that mean for x86 server chip providers Intel and AMD?  For AMD it means there is a lot of pressure on their semicustom business plan and on their remaining processor product plans. AMD doesn’t have very deep pockets, and they have increasingly downsized, so they don’t have a lot of engineering resources either.  Many of the programs they had under development just a year or so ago are now dead or in limbo.

Even if AMD can eventually ship a great low power x86 or ARM-based server chip, there may not be a very large market for it if big companies roll their own processors and Intel continues to grab most of the remaining market. The “analyst suggestion” reported by Fudzilla that AMD might merge with (or be bought out by) Xilinx is logical speculation, but probably not more than that.  Reportedly, others have looked and walked away.

And that leaves Intel. Buying Altera is a great response to the new threat. And I don’t say that just because Altera has ARM licenses. It is because Altera opens the door to semicustom designs with an array of custom accelerators.  Now customers with pockets not as deep as Oracle or Amazon, or maybe customers with just as deep pockets but without the desire to DIY, can turn to Intel for a rapid semicustom solution.

So, it has been almost 35 years since the first PCs and about 25 years since AMD began competing in the x86 processor supremacy wars.  Now that paradigm is over.  Not only is ARM the new competition, but the competition isn’t coming from just one company.  The art of processor design is no longer in the hands of just two.  Multiple companies can now afford to roll their own processor designs rather than depend on traditional processor suppliers like Intel, AMD, Qualcomm and Samsung, and the competitive dynamic will never be the same again.

 



ARM® CoreLink™ CCI-500 Cache Coherent Interconnect

HyperTransport Interconnect

April 16, 2015 - In 2000 I took the stage at Microsoft’s Windows Hardware Engineering Conference (WinHEC) to announce a new technology from AMD codenamed “Lightning Data Transport” or “LDT”.  Branded the HyperTransport™ Technology interconnect (HTT) when it shipped, HTT allowed multiple processor/northbridge complexes to be interconnected. With the introduction of AMD’s “K8” Opteron processors the northbridge was integrated and HTT replaced the frontside bus architecture licensed from Intel that was used in prior AMD processors.

HyperTransport Technology was part of AMD’s ambitious plans to expand into 4-way and larger servers and other markets including telecom. Since HTT allowed multiple processors to be interconnected, a protocol transmitted across the interconnect maintained cache coherency between processors. The architecture also made it possible for total memory size to scale up as processors were added.

AMD’s plan also included interconnecting special purpose coprocessors and possibly replacing the PCI interface to graphics cards.  In its first revision HTT offered significantly more bandwidth to system memory compared to PCI, which could improve graphics performance. Unfortunately, no one ever built a graphics chip with a native HTT interface.

Key HTT Benefits

  • Greater bandwidth between processors or devices (I/O, graphics, special purpose accelerators)
  • Multiple paths between processors or devices (i.e., less likelihood of bandwidth bottlenecks)
  • Memory scales up with the number of processors

Lesser Known Benefits

  • North and south pathways in each link eliminate the bus turnaround found in the traditional frontside bus
  • Scheme for prioritizing transfers (e.g., isochronous traffic gets priority)
  • Packet based (multiple “channels” can share a link)
  • Snoop filtering prevents unnecessary probe transmission when data is not in other processors’ caches (see the sketch below)
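That last point deserves a closer look. Below is a minimal, hypothetical Python sketch of the snoop-filter idea: the interconnect keeps a small directory of which caches might hold a given line and sends coherency probes only where they are needed. The class and method names are my own invention for illustration; they come from neither the HyperTransport nor the QPI specifications.

```python
class SnoopFilter:
    """Toy directory tracking which caches may hold each line.

    Real snoop filters live in interconnect hardware; this sketch only
    shows why they cut probe traffic.
    """

    def __init__(self, num_caches):
        self.num_caches = num_caches
        self.presence = {}  # line address -> set of cache IDs that may hold it

    def record_fill(self, cache_id, addr):
        """A cache fetched the line; remember it may now hold a copy."""
        self.presence.setdefault(addr, set()).add(cache_id)

    def record_evict(self, cache_id, addr):
        """A cache dropped the line; no probe is needed for it anymore."""
        self.presence.get(addr, set()).discard(cache_id)

    def probes_for(self, requester_id, addr):
        """Return only the caches that actually need a coherency probe."""
        sharers = self.presence.get(addr, set()) - {requester_id}
        return sorted(sharers)  # without a filter, every other cache gets probed


sf = SnoopFilter(num_caches=4)
sf.record_fill(0, 0x1000)          # cache 0 holds line 0x1000
print(sf.probes_for(2, 0x1000))    # -> [0]   probe one cache, not all three
print(sf.probes_for(2, 0x2000))    # -> []    line is nowhere else, no probes at all
```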

CCI-500 SoC Architecture

In 2008 Intel copied much of what made HTT an AMD advantage and added a few improvements of their own.  Intel called it the QuickPath Interconnect (QPI). Since then, both HTT and QPI have been revised multiple times to improve bandwidth and add other performance enhancements.

Today, more and more features are integrated into processors including many of the traits and benefits of HTT and QPI.  Here’s a block diagram of an ARM multiprocessor. This entire design is on one SoC. Note the Cortex-A processor clusters, the Mali™ graphics processor, other I/O and the memory controller all interconnected via the ARM CoreLink™ CCI-500 Cache Coherent Interconnect.

This block diagram could be for a processor in a tablet or phone, but it also could be the basis for a server.  With a few tweaks to support the proper external interconnect (e.g., 10 or 40 Gigabit Ethernet) and a few more Cortex-A processor clusters it could be a formidable server-on-a-chip.

Let’s Break Down the Server Analogy

Let's take a look at what HTT and QPI began providing a decade or more ago at the system level and what is available now within an SoC using ARM's CoreLink CCI-500 architecture.

 

HTT or QPI Interconnected Server vs. CCI-500 Internally Interconnected Server

Processor
  • HTT/QPI server: Processors with 4 cores or more; cores share a large cache
  • CCI-500 server: On-die 4-core cluster; cores in a cluster share a large cache

System Memory
  • HTT/QPI server: Memory size scales with the number of processors; up to 4 channels of memory per processor
  • CCI-500 server: Memory subsystem sized to serve the number of processors; supports up to 4 channels of memory

Interconnect
  • HTT/QPI server: New HT and QPI revisions add bandwidth; processors snoop other processors’ caches to ensure cache coherency; a snoop filter prevents unnecessary snoop probe traffic
  • CCI-500 server: CCI-500 has 2X the peak system bandwidth of CCI-400; clusters snoop other clusters’ caches to ensure cache coherency; a snoop filter prevents unnecessary snoop probe traffic

Graphics
  • HTT/QPI server: Graphics performance is improved by greater interconnect bandwidth to system memory
  • CCI-500 server: Graphics performance is improved by greater interconnect bandwidth to system memory

Accelerators
  • HTT/QPI server: HTT or QPI can offer high memory bandwidth and fast data exchange for I/O & accelerators
  • CCI-500 server: CCI-500 offers improved memory bandwidth and fast data exchange for I/O & accelerators (e.g., 4K video)

Customization
  • HTT/QPI server: Proprietary, integrated at the board level
  • CCI-500 server: Proprietary, integrated at the SoC level

Extra Benefits
  • HTT/QPI server: Can support on-die security coprocessors; can extend to many coherent processors
  • CCI-500 server: Supports TrustZone™ Secure Media Path; supports up to 4 coherent processor clusters; supports big.LITTLE™ clusters

            
The Client Benefit

When HTT was introduced it seemed like a lot of overhead for a client system, but by moving the interconnect onto the SoC that external complexity is gone. The benefits of large system memory bandwidth and low latency transfers remain. While a tablet, phone or notebook is unlikely to need four processor clusters, there are several two-cluster big.LITTLE devices available now from ARM partners. These devices offer an almost ideal range of performance and power efficiency for mobile devices.

Summary

ARM has brought us a new day in SoC scalability. The performance, scalability and power profiles give ARM and its partners unique opportunities to continue to innovate exciting new devices for both client and server using CCI-500.



It's a big.LITTLE Idea

big.LITTLE Update

April 3, 2015 - When you write a response to someone's questions you can expect to generate more questions. This is a good one: "Is big.LITTLE ARM's answer to Intel's 14nm process?" The short answer is "Yes, but..." It is true that Intel has the best semiconductor process in the world. It is also true that the Intel processes that use 3D transistors (aka FinFET transistors) have exceptionally low leakage and the ability to deliver high frequencies of operation with low dynamic power. For this reason, Intel can get a wider performance vs. power profile for its processor cores than if they were built in any competitor's foundry process (e.g., TSMC or Samsung). Hence, big.LITTLE helps ARM partners design and build, in foundries like TSMC, very competitive products with a performance vs. power range that is very good and perhaps equal to or better than that of Intel cores in Intel's process technology.

"Do ARM big.LITTLE cores outperform Intel cores in their respective process technologies?" That's the so called $64,000... make that $64 billion question. I don't believe ARM has quite the same execution efficiency as Intel but I cannot provide hard data to prove that at this moment. But, it is clear that ARM cores and big.LITTLE designs are being used to create products that are highly competitive in phones, tablets, servers and IoT devices.

If anything proves that ARM and its partners are succeeding, it is the fact that Intel has been spending over $1 billion per quarter in "contra revenue" to buy its way into the aforementioned markets currently dominated by ARM partners. That expenditure may sound a bit fishy, especially in light of the money Intel had to pay AMD in the past for anti-competitive actions, but it appears it's completely legal for Intel to pay for the design and development of their customers' products with contra revenue in order to get tablets, phones, etc. into the market with Intel processors inside. But even Intel, with its very deep pockets, can't keep spending like that. Eventually, the processor designs and semiconductor process technologies have to stand on their own.

One final note about competition: there are other factors at play. Intel tends to commoditize every market it enters. Soon it takes over all the technology it can within the markets it wants to play in. That ultimately guts the differentiation that manufacturers have in their products, leaving them to compete on price and software bundles. Many of ARM's partners don't want to see that happen to their markets; hence, you can expect larger, better-established ARM partners to avoid partnering with Intel if they can continue to differentiate and compete by staying with ARM-based processors. The long-term problem will be if Intel gains enough of the smaller guys, or develops products so superior, that staying with ARM stops being so attractive for a device maker. In the end, it's all about what the processor makers' customers, the companies that develop, manufacture and market the end products, have to do to compete.

Original Article

March 19, 2015 - Recently I was asked, "what's the big deal with ARM's big.LITTLE processors?" In case you aren't familiar, big.LITTLE is ARM's pairing of their high-performance Cortex-A processors, with deep pipelines and out-of-order execution, with more modestly performing but ultra-power-efficient Cortex-A processors with in-order execution and short pipelines. A couple of big.LITTLE examples would be the new Samsung Exynos 5433 with four Cortex-A57 and four Cortex-A53 cores, or MediaTek's MT6595 with four Cortex-A17 and four Cortex-A7 cores on the SoC.

big.LITTLE Power Profiles

The idea behind big.LITTLE is to dynamically match the cores with the best balance of power efficiency and performance to the threads running at any given instant. A thread that does a lot of work and demands high performance is dispatched to a high performance core. A lightweight thread that does little work is dispatched to a more power efficient core.

Here's an ARM chart for the "big" Cortex-A15 and "LITTLE" Cortex-A7 cores showing their performance vs. power profiles. The A15 and A7 were the first ARM cores with big.LITTLE capability.

Voltage and frequency modulation is used by most modern processors to scale the frequency and voltage of the processor to match its performance level to the demands of the thread it is running. When introduced over a decade ago, this concept (aka P-states) made mobile computing devices much more efficient. The processor was slowed for light loads and sped up for heavy loads. Since the voltage and frequency of the processor core were lowered together, the power consumption of the processor cores was greatly reduced for light loads. Modulating voltage and frequency, and therefore performance and power, in this way can be understood as moving up and down the power vs. performance curves in the profile graph above.

A decade ago when frequency and voltage modulation were introduced, it resulted in greatly improved battery life and enabled notebooks to play a 4 hour movie without running out of battery power. Today's tablets and smartphones need an even greater range of performance versus efficiency to achieve the needed blend of peak performance capability and the high power efficiency that leads to longer battery life and lower surface temperatures.

Why can't a core cover a greater range? Well, a couple of things set the end points for the curve in a given process technology. The process technology has an upper clock frequency limit, and the number of gates in the core design determines the power at that frequency.  A "big" high performance core has many more transistors than a "LITTLE" high efficiency core, so it consumes more power at its maximum frequency of operation.  So, the upper point on the operating curve is set by the process technology and the complexity (i.e., number of transistors) of the core design.

The lowest point on the operating curve for a core is also influenced by the process technology. Below a certain clock frequency you can no longer reduce the operating voltage and still have the core function. Since the power is proportional to the square of the voltage and only linearly proportional to the frequency, most of the benefit of going to a slower frequency of operation is lost if the voltage cannot be lowered further.
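To put rough numbers on that, here is a back-of-the-envelope Python sketch of the classic dynamic-power relation, P roughly proportional to C x V^2 x f. The capacitance, voltage and frequency values are made up purely for illustration; the point is how much of the savings disappears once the voltage can no longer drop along with the frequency.

```python
def dynamic_power(c_eff, voltage, freq_hz):
    """Classic switching-power approximation: P ~ C * V^2 * f."""
    return c_eff * voltage ** 2 * freq_hz

C = 1e-9                                   # effective switched capacitance (made-up value)
full      = dynamic_power(C, 1.1, 2.0e9)   # full speed: 1.1 V at 2.0 GHz
dvfs      = dynamic_power(C, 0.8, 1.0e9)   # half speed with the voltage scaled down too
freq_only = dynamic_power(C, 1.1, 1.0e9)   # half speed, but voltage stuck at its floor

print(f"full speed           : {full:.2f} W")
print(f"half speed, DVFS     : {dvfs:.2f} W  ({dvfs/full:.0%} of full)")
print(f"half speed, freq only: {freq_only:.2f} W  ({freq_only/full:.0%} of full)")
```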

In addition, the process technology’s gate leakage impacts the minimum power a core design can achieve. The more complex the core design (e.g., a long pipeline), the more gates and the more leakage.  In complex SoCs with billions of transistors this “static power” can significantly contribute to the lowest power a core can achieve.  Fortunately, power switches on the SoC’s die can completely turn off a high performance core like the Cortex-A15, with its many transistors, after a thread is migrated to a highly power efficient core like the Cortex-A7.

Cortex-A15 vs A7 Pipelines

In the illustration on the right, the smaller Cortex-A7 core has fewer features than the larger Cortex-A15 core: no out-of-order execution, shorter pipelines, and a maximum issue of two instructions per clock cycle instead of three. All that makes the core's performance lower, but it also means the Cortex-A7 core uses much less power.

Of course, the reverse is true.  If a thread needs greater performance it can be migrated from a smaller core to a larger core. The process of migrating a thread from one core type to another includes the following steps (sketched in code after the list):

  • The core running the thread is stopped,
  • Its core state is saved,
  • The destination core is reset
        (the destination core's cache is invalidated)
  • The core state is restored to the destination core
  • The destination core starts running the thread
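Here is a hedged sketch of those steps as Python-flavored pseudocode; the Core class and function names are invented for illustration and do not correspond to any real kernel or firmware API.

```python
class Core:
    """Stand-in for a CPU core; real migration is done by kernel/firmware code."""

    def __init__(self, name):
        self.name = name

    def stop(self):
        print(f"{self.name}: thread stopped")

    def save_state(self):
        print(f"{self.name}: architectural state saved")
        return {"registers": "...", "pc": "..."}   # placeholder state

    def reset_and_invalidate_cache(self):
        print(f"{self.name}: reset, cache invalidated")

    def restore_state(self, state):
        print(f"{self.name}: architectural state restored")

    def start(self):
        print(f"{self.name}: thread running")


def migrate(src, dst):
    """Walk the big.LITTLE migration steps listed above, in order."""
    src.stop()
    state = src.save_state()
    dst.reset_and_invalidate_cache()
    dst.restore_state(state)
    dst.start()


migrate(Core("Cortex-A7 (LITTLE)"), Core("Cortex-A15 (big)"))
```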

The migration task takes place so quickly that little opportunity to continue working on the thread is lost.

Initially, big.LITTLE SoCs only swapped threads between the big and LITTLE cores by a migration process that relied on a driver or other means of augmenting the OS's own ability to assign threads to cores. Now that is becoming unnecessary, because newer versions of operating systems are becoming big.LITTLE aware and can assign threads to the appropriate core and migrate them without any assistance.

The OS can even use any combination of the small and large cores as needed, including using all the cores at the same time.  Eliminating the need to augment the OS for thread migration also enables SoC designs with a different number of big and LITTLE cores. An SoC with two big cores and four LITTLE cores might be more appropriate for a device that only occasionally needs to run heavy workloads. It is also smaller and cheaper to manufacture than an SoC with eight cores.

ARM's big.LITTLE is a big deal because it expands the performance vs. power dynamic range in process technologies available from foundries like TSMC. For ARM-based cores to compete against cores built in Intel's leading edge process, this is a necessity. At the same time ARM has advanced the state of the art by extending both the hardware and OS support needed for asymmetrical cores.

Finally, I would be remiss if I didn't point out that 64-bit ARM cores that support the big.LITTLE concept are now entering the market. The Cortex-A57 and the Cortex-A53 support both ARM's 32-bit and 64-bit instruction sets and have pipelines similar in length and configuration to the Cortex-A15 and Cortex-A7. The Samsung product I mentioned above uses these 64-bit cores. You may also have heard the recent announcement of the Cortex-A72, a big core with about 30% greater performance than the Cortex-A57. The Cortex-A72 can be used in SoCs with the existing Cortex-A53.

ARM 64-bit core products will open doors to new markets for ARM processors, including extending ARM-based processors into high performance servers. In a few days I'll write something about ARM core clusters and the Cache Coherent Interconnect (CCI-500), which can interconnect multiple core clusters to provide very powerful multicore computing solutions.



Dell 425E Ad

Nostalgia for a Dell System 425E

February 16, 2015 - I recently moved across Austin to a new home. While unpacking I came across a folded-up Dell ad proof from the 90s. A portion of the two-page spread is on the left. The "Power Monger" is me. And the PC that is cropped out of the picture is the Dell System 425E, a 25MHz 486 platform with EISA bus, 4MB DRAM, 80MB hard drive and Weitek floating point coprocessor. Base price was $7,899, which we thought was a great value since the equivalent Compaq was priced at just over $12,350. Adjusted for inflation, the Dell would today cost $14,307 and the Compaq $22,375. That makes you really appreciate how much you can get today for less than $500.

The ad for the Dell 425E brought a bit of nostalgia for me. I was hired into Dell as Senior Staff Product Planner for "Project Olympic" by Glenn Henry, the Senior VP of Operations and a former IBM system architect and development manager. Olympic was Glenn's idea, a Swiss Army knife with two platforms, one desktop and one tower. There were four motherboards, one EISA and one Microchannel for each chassis. The processor, memory and northbridge were on cards that plugged into any motherboard. There were three different processor cards; a desktop could take up to two and the tower could take up to three. With multiple cards, the desktop and tower were essentially Dell's first multiprocessor workstation and server. All of this complexity was supported by custom silicon being designed inside Dell. We took the project all the way to building the pilot runs for the EISA desktop and tower before this Olympic-size project was cancelled. Michael Dell wrote about the lessons learned in his book. For my version of the cancellation story you'll have to buy me a beer.

Back to the System 425E - Olympic took about 1.5 years to get to the gallows. During that time Dell fell behind the industry, with Compaq, IBM, Gateway and others shipping 486 systems. We needed to do something, and fast. The System 425E was that something, and we cranked it out in about six months, record time for a company that was used to doing almost everything from motherboards to sheet metal and plastics. To get the platform out quickly, Dell did something it hadn't done since Michael left the dorm room business that came to be Dell: we bought an off-the-shelf motherboard. The testing and regression department quickly got their hands on all the available 486 boards and picked a couple of candidates, and we struck a deal with one vendor as the supplier. Chassis, power supplies and a lot of marketing materials developed for Olympic were quickly reworked for the 425E. During this time revenue was down, so we literally reduced the number of fluorescent tubes in the ceiling lights and adjusted the thermostat to help save money until the 425E could ship. The system may have been late, but with its aggressive pricing it was a success. It also was a forerunner of the Dimension platforms that only used third-party motherboards.

I got into the ad because at the last minute they decided that VP Mike O'Dell's name was too similar to Michael's. The photo shoot was scheduled and couldn't be delayed if we were to make deadlines for PC Magazine and PC World. So, I volunteered. I went to the photo studio on South Lamar, where a makeup artist and a hair stylist worked me over for a couple of hours before the photos. The suit, which would have fit Mike O'Dell well, was pinned up the back to make it look good on me. The photo shoot took a couple of hours, with the photographer asking me to make minor adjustments for each shot. Toward the end, the keyboard with its solid metal case started to get really heavy and I had to stop for 10 minutes for my arm to stop shaking. The ad made it to the magazines on time, and my mom kept the only copy of PC Magazine she ever bought on her coffee table in Houston to show friends her son who lived in Austin.   (Full ad PDF)



Microsoft HoloLens - Hype or A Leap Forward in Virtual Reality?

January 23, 2015 - This week Microsoft made as much news with their HoloLens virtual reality goggles as they did with the rollout of the Technical Preview of Windows 10. Further, some industry writers took to their websites to say how magical the HoloLens experience was. Having a somewhat jaundiced opinion after Google Glass and Oculus Rift failed to deliver on their promise, I want to make a few comments about what it will take for HoloLens to deliver a convincing and broadly useful virtual reality experience.

First, HoloLens appears to combine the immersive experience of Oculus Rift with the connection to the real-world environment of Google Glass. How that is done appears to be unique, but at this moment it remains unclear without the ability to have a hands-on examination of the gear. So, after reading several articles and not finding details I'm going to speculate! At least until I can get my hands on a developer kit.

HoloLens Goggles

To the right is a publicity shot of the HoloLens device. I have chosen this image from several that Microsoft released because of the angle of view and the detail that can be seen when I manipulated the gamma of the photo in Photoshop. HoloLens marketing videos show it superimposing objects and annotations over real-world images. Now, HoloLens could just display that on the visor's lenses, but then the displayed objects would be transparent. If that were the case, when HoloLens flies a superhero across your field of view the superhero would be as transparent as George Reeves in those early 1950s Superman TV shows.

But the Microsoft HoloLens videos show virtual augmentation objects that look solid. To do that, HoloLens would have to obscure what is behind the object. Doing that optically might be possible, but I think my gamma-corrected image of the HoloLens goggles shows it is done differently. There appear to be two video cameras mounted above each eye to give the system a stereoscopic (3D) view of the real-world environment. If this is true, then the real world and virtual objects can be composited in the computer's graphics engine. That would permit the background behind the object to be fully obscured, and the virtual reality object would look solid.
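To make that speculation concrete, here is a minimal NumPy sketch of that kind of compositing: render the virtual object with a per-pixel opacity mask and lay it over the camera frame so the background behind the object is fully obscured. The function, array shapes and values are hypothetical illustration on my part, not anything Microsoft has disclosed about HoloLens.

```python
import numpy as np

def composite(camera_frame, virtual_rgb, virtual_mask):
    """Overlay a rendered object on a camera frame.

    camera_frame : (H, W, 3) uint8, what the (stereo) camera sees
    virtual_rgb  : (H, W, 3) uint8, the rendered virtual object
    virtual_mask : (H, W) float in [0, 1], 1.0 where the object is opaque
    """
    mask = virtual_mask[..., None]                        # broadcast over color channels
    out = camera_frame * (1.0 - mask) + virtual_rgb * mask
    return out.astype(np.uint8)

# Toy 4x4 example: a fully opaque 2x2 "object" in the upper-left corner.
frame = np.full((4, 4, 3), 200, dtype=np.uint8)           # bright "real world"
obj = np.zeros((4, 4, 3), dtype=np.uint8)                 # dark virtual object
mask = np.zeros((4, 4)); mask[:2, :2] = 1.0
print(composite(frame, obj, mask)[:2, :2, 0])             # all zeros: background obscured
```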

The world beyond the stereo screens in the HoloLens goggles would have to be obscured. So I wonder if the front of the goggles is a photochromic material that fogs over when they are in use, making the images projected on the internal screens the only thing being viewed. My guess is that this or some other means is used to make the images projected on the lenses your sole visual environment.

If my speculation about the stereoscopic cameras and image compositing is right, then HoloLens has much greater potential than Oculus Rift. Then you really could do more than game. You could use the goggles to visualize designs before they are built, and you could remotely teach someone how to fix the broken device they hold before them. You could also provide virtual tours of environments humans cannot actually visit, from heart ventricles to asteroids.

But, there is just one more technical thing that I hope Microsoft has addressed better than Oculus and other virtual reality systems to date: motion control. If you place someone in a virtual 3D environment and have them move about, you must track the user's head movements with great precision or the 3D environment seems to wander and float. So far, the VR systems I've tried only get that 80 - 90% right. That may be okay in a game, but for extensive wear or in an augmented reality environment you need the virtual objects to be rock solid in their position, movement and changes in perspective (consider what would happen walking around a virtual object inserted into the real world).

Here too, the cameras in HoloLens may provide an edge over other VR systems' gyros and accelerometers. Recognition software can track the edges of real-world objects to help provide the information needed to accurately position virtual objects. If Microsoft is doing that, I have to say they have made multiple breakthroughs with HoloLens and I look forward to trying them.

But, just don't expect me to walk around with them all day like a Gl###hole!



How Not To Go Psycho Over Psychometric Tests

   

January 9, 2015 - Last year I posted on my How To... page my September 26th presentation to Austin's Launch Pad Job Club on how to improve your score on psychometric pre-employment tests.

The Launch Pad Job Club records and posts presentations made at their meetings, and they recently posted my presentation. Here I've taken their YouTube video and inserted the slides from my presentation so it's easier to follow. You can view a PDF of the PowerPoint presentation here.

I hope that this video will be helpful for job seekers.  Also, I would like to give my special thanks to Kathy Lansford-Powell of the Texas Workforce Commission and Launch Pad Job Club for her support in making this presentation.

Note: The LPJC's video quality is below par due to poor lighting and my re-encoding of the original YouTube video.



Nvidia Is Suing Samsung and Qualcomm

September 5, 2014 - Nvidia asserts that Samsung and Qualcomm are infringing on 7 of its patents. I expect the legal departments at both Samsung and Qualcomm will soon file counter suits and eventually it will come to an undisclosed settlement where no one other than the legal firms hired gets anything. EETimes lists the patents in an article by Rick Merritt.

More interesting to me is the "I told you so" article on SemiAccurate.com. While I agree with Charlie Demerjian's take on Nvidia's actions I wonder if he realizes the irony of his article with many links to his past articles supposedly foretelling of Nvidia's motives. I say "supposedly" because you have to pay a subscription fee of $1000 annually to have received the benefit of this foreknowledge. So, the article itself seems like trolling to me. Just sayin'.



Moore’s Law Meets Economic Reality

July 12, 2014 - I have attended IDF for many years, and each year attendees are repeatedly reminded that Gordon Moore was one of the founders of Intel and are assured that Moore’s Law is not ending.  Perhaps it isn’t ending, but that doesn’t mean it isn’t slowing, or that semiconductor manufacturing won’t have to be transformed substantially for it to continue.

On Friday, Rick Merritt of EE Times gave his report “13 Things I Heard at Semicon West”.  The second thing Merritt reports hearing at Semicon is “Moore’s Law has definitely slowed”.  His reasons why are not completely new.  I listed these same factors in a report last year, and then a few months later at IDF again heard Intel deny them.  That denial may have to be somewhat more nuanced this year.  In any case, here is Merritt’s (and my) list of reasons:

  1. Extreme Ultraviolet Lithography continues to be delayed – the problem lies in the relatively low power of the currently available light source.  Dimmer light means longer exposures.  ASML’s EUV scanners are only able to process 27 – 35 wafers an hour.  That’s about 1/3 the minimum productivity needed.  That hit in productivity would definitely impact the bottom line for any semiconductor company using ASML's current EUV scanners.

    Intel has long been the first purchaser of the newest generation of lithography equipment and in the process has footed a big chunk of the bill for companies like ASML to recover their NRE.  But Intel shuttered a new fab building for the next generation of process technology before any equipment was installed.  Without Intel moving forward with EUV in a big way, the manufacturers of the equipment will be delayed in their product rollouts and the whole industry’s adoption rate for the process will be impacted.

  2. The economic advantage of the next generation technology is being further lessened.  Serge Tedesco, a lithography researcher at CEA-Leti, showed a graphic at Semicon similar to one shown in the past year by NVIDIA.  The cost improvement per node achieved in past process technology shrinks decreases significantly below 28nm.  Tedesco’s graphic expressed the number of transistors you can purchase with a dollar at each process node, back to 180nm and forward to 16nm.  Here’s part of his data in table form.  (Tedesco cites The Linley Group as the forecast source.)

Year                    2008          2010        2012        2014        2015
Process                 65nm          40nm        28nm        20nm        16nm
Transistors per dollar  11.2 million  16 million  20 million  20 million  19 million
Gain (Loss)             -             43%         25%         0%          (5%)
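The gain/(loss) row follows directly from the transistors-per-dollar figures; here is a quick Python check using the values from the table above:

```python
transistors_per_dollar = {       # millions of transistors per dollar, from the data above
    "65nm (2008)": 11.2,
    "40nm (2010)": 16.0,
    "28nm (2012)": 20.0,
    "20nm (2014)": 20.0,
    "16nm (2015)": 19.0,
}

nodes = list(transistors_per_dollar.items())
for (prev_node, prev), (node, curr) in zip(nodes, nodes[1:]):
    change = (curr - prev) / prev * 100
    print(f"{prev_node} -> {node}: {change:+.0f}% transistors per dollar")
```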

Without the financial motivation of more transistors per dollar, a product will need to gain from other advantages of processes smaller than 28nm to justify their use.  Some of those advantages include a smaller die size, leading to a smaller package or greater integration within the package, and lower power dissipation.  In phones and tablets these may be worthwhile benefits.

At Semicon, Janice Golda, Director of Strategic Sourcing for Lithography, is quoted saying Intel plans to make 10nm chips without the delayed scanners and is exploring how to make 7nm chips without EUV.  How might Intel or any other semiconductor manufacturer build a 10nm or smaller process technology without the very short wavelengths of EUV?  Merritt points to DSA, or Directed Self-Assembly, as one promising alternative discussed at Semicon.  Merritt doesn’t spend much time explaining DSA, but you can find a wealth of information online for this technology.

The concept is to spin-coat the surface of the wafer with a mixture of two copolymers and bake.  The heat causes the polymers to self-orient in such a way as to create a nanoscale pattern of either a regularly spaced array of circles or lines.  One of the polymers is etched away, leaving holes or ridges.  The openings created can be used to diffuse dopants into the wafer or to construct interconnections.  This is a new approach which is going to take time and money to refine into a mass production technique; hence, it probably won’t come to the rescue of the cadence of Moore’s Law anytime soon.

My explanation of DSA is an oversimplification.  I found the most effective way to learn more about DSA is to search Google Images rather than the web in general.  Since DSA is a three-dimensional technique for creating chip structures, this isn’t too surprising.  It’s how I found this IEEE Spectrum article, which is a good primer.

If you want to know more I'll save you the trouble of typing the topic into Google image search.  Just click here.



Governments Choosing Futuremark® PCMark® Over BAPCo® SYSmark®

PCMark 8 Box

July 7, 2014 - Today Futuremark put out a press release saying Brazil, France, Northern Ireland and the European Union are specifying PCMark to measure minimum PC performance in invitations to tender.  I am very happy to see this.  While at AMD I was a representative to BAPCo during the development of SYSmark 2012.  And in the 15 months that followed, I both participated in the competitive analysis of PCMark for AMD and wrote a white paper for use by AMD sales and marketing endorsing PCMark and setting criteria for selecting desktop and notebook PCs using that benchmark.  In the last year AMD saw a positive shift in government sales due to the strong scores AMD APUs received on PCMark 8.

You might say I've seen the inside and the outside of both BAPCo and Futuremark, and their benchmark development processes.  Iron-fisted NDAs prevent me from saying anything in detail about either, but I can say I do believe this is good for both AMD and the industry.  PCMark is a contemporary benchmark that leverages the compute power of the SoC including the capabilities of the GPU.  Futuremark strives to develop an unbiased benchmark, and while it takes input from all interested parties (chip makers, PC companies, ISVs and users of PCMark), they do seem to do a good job of keeping undue influence from any one stakeholder at arm's length.

In his post referencing the press release, Patrick Moorhead says, "This is actually a pretty big deal. Where does that leave BAPCO?"  I think the answer is an additional reduction in relevancy.



Five Applications of Video Encoders in Integrated and Discrete Graphics

July 2, 2014 - The video playback and encoding capabilities in graphics cards, and those integrated into SoCs like Intel processors and AMD APUs, often don't get much press, but they are becoming more important as new and creative uses emerge. This is a rapidly developing area where innovation quickly replaces approaches that seemed cutting-edge just a year or so ago. Let's take a look at some of what has emerged and what is coming.

Transcoding HD Video

When AMD developed their APU they were being visionaries by including graphics and hardware video decode/encode, but Intel took it much further by significantly beefing up the bandwidth of the video decode and encode blocks in their Sandy Bridge and subsequent processors. This was why Intel® Quick Sync was able to run circles around AMD's first APUs when transcoding movies to the lower resolutions used on mobile devices. Quick Sync decoded the HD movie file and then re-encoded it for the mobile device; e.g., going from 1080p to 480p. But mobile device resolution has now risen to HD levels and 480p video isn't good enough, so using Quick Sync doesn't seem so important anymore.

Casting to Larger Screens

Another use for the video encoder embedded in the same logic as the GPU is to scrape the screen buffer and encode it for transmission over Wi-Fi to a larger screen.  About 4 years back Intel created a partnership with NetGear to make receivers for HDTVs to facilitate a proprietary solution on Intel notebooks called Intel® WiDi (wireless display). Since then, other vendors have made cheaper receivers and Intel has improved the resolution and reduced the latency for WiDi. And AMD has come up with its own AMD Wireless Display (AWD). But the Wi-Fi Alliance has since developed a non-proprietary standard for doing the same thing called Miracast. Miracast is supported by multiple smartphone manufacturers, AMD's Wireless Display (AWD) and Intel's WiDi.

But again, the market impact of one set of innovations (WiDi and Miracast) has been weakened by Google's innovative Chromecast, which decouples the ability to stream from a mobile device from the hardware platform and cuts the cost to having the Chrome browser installed (free) plus a $35 Chromecast dongle for your HDTV. Google also recognized that most of what was being sent to the TV from the mobile device was content on the Internet. Chromecast bypasses the bottleneck of the notebook, tablet or smartphone and turns them into a remote control for the Chromecast dongle, which accesses content directly from the Internet.

Game Streaming

If you are hardcore into gaming, you may be so good at playing the most challenging games that you can garner an online audience to watch you play. This is true whether your preferred platform is an Xbox, PlayStation or a high-end gaming PC. Gamers could stream to Ustream or YouTube, but they have their own online video portal in Twitch.tv. The WSJ lists Twitch.tv as the fourth largest source of Internet traffic during peak hours (behind Netflix, Google and Apple).

How you broadcast your game play to Twitch.tv depends on your game and platform. Many PC games include Twitch support built-in. If the game doesn't have Twitch support built-in you can use broadcast software like Open Broadcaster Software (OBS) or XSplit.  Twitch.tv also has dedicated software for the Xbox 360, Xbox One and PS4.

You can use a video capture card inside a second PC, or an external capture device that connects through USB, to handle the streaming. Many game console users take this approach. But if you have a relatively new processor, the internal ability to scrape the screen buffer and encode the video can be successfully used instead of external hardware.

NVIDIA® Shield™ Streaming

This is a different twist. NVIDIA has a new creation called the NVIDIA Shield. It is a multi-functional handheld gaming device that can play casual games on its multicore ARM-based processor.

But Shield can also be used as a remote game controller and display for a PC with an NVIDIA GTX graphics card. The PC plays a high-end game and streams the display to the Shield. I haven't had a chance to play with the Shield yet, but this sounds like a great device to have in your desk drawer for break time. (If you can tear yourself away from it after the coffee break is over!)

I believe the key to the Shield's success will be the latency of game play. If there is too much lag you'll die (in the game) before you can get your shot off. The Shield's small screen should provide some technical advantage. The GTX graphics card in the PC won't have to encode as high a resolution image to create a satisfactory display on the Shield. That reduced image size means a smaller data stream per frame, which has inherently lower latency. But that still doesn't address all of the "lag death" problem, because the combined latency of the game controller input and display streams has to be very low for effective game play. For now this is probably the newest innovative use of graphics video encoding.



"Massive Skylake Leak"

June 27, 2014 - Often more accurate than "Semi", and without the annoying fees, the guys at WCCF Tech have aggregated recent leaks on Skylake. It's well worth the read.  Here's a teaser from their article.

Skylake Configurations



AMD's Fading Channel Presence

June 24, 2014 - Today's news feeds contain an interesting juxtaposition of articles about AMD's channel presence. First up is the news of VP Roy Taylor's tweet about the water-cooled 219W FX-9590, a processor that could well be substituted for a heater in some gamer's dorm room this winter.  Outside of that limited market, it is hard for me to fathom why any system builder would consider the FX-9590.

Next is a CRN article that talks about their recent survey of system builders showing Intel's continued lock on the channel.  CRN's analysis contains some stinging statements that make you wonder why products like the FX-9590 exist. CRN says, "Though AMD offered an improved product line, the lack of a smart GTM to tap system builders was the key reason why it failed to gain market share and mind share." Later in the article CRN says, "While AMD saw some momentum in the past few years, partners said that in FY2013-14 that momentum faded."

And AMD's marketing efforts were dissed too. "Except for a few DIY customers who demand AMD products, AMD continues to be push brand. A lack of marketing during the last fiscal further diminished its brand visibility. Even its newly launched APUs—Kaveri and Kabini—were not properly promoted and hence failed to create enthusiasm in the market. AMD made little progress in the white box server market with very few Opteron SKUs available."

The system builder market is contracting. It is down over 30 percent compared to last year. The AMD DIY market is a small microcosm of that shrinking market.  Many companies have shifted their efforts to mobility, the cloud and big data. AMD seems to be dabbling there while putting many of their eggs in the semicustom silicon (for now game console) basket. While stockholders may congratulate AMD on moving its focus away from just the PC market its focus still seems hazy and risky.  IMHO in that situation the FX-9590 just adds to the fog.

Intel uses its lucrative client and server processor businesses to finance their development in mobile and elsewhere. AMD's semicustom business relies on IP that is developed for its client PC and graphics businesses.  It remains to be seen whether the profitability found with the game console business and the milking of the remaining PC business can help finance the development of new IP that ultimately sustains AMD's client, server and semicustom businesses.  After 19 years there I hope they succeed, but I have to question their lack of focus based on today's news.



Denver in Cupertino

June 23, 2014 - Hot Chips: A Symposium on High Performance Chips is coming up August 10-12 at the Flint Center in Cupertino, and one of the most highly anticipated sessions will be about Denver, NVIDIA's long-in-development processor.

NVIDIA has a history of redefining products, sometimes swapping external project codenames between different internal projects.  In the case of Project Denver the definition seems to have changed as the market changed. A few years ago Denver was rumored to be NVIDIA's x86 processor. Now Denver is understood to be an ARMv8 processor.  And like all processors from AMD and Intel today, Denver integrates graphics, in this case NVIDIA's.

So how could Denver pivot from being an x86 processor to an ARMv8 processor in mid-life?  That is rumored to be thanks to the code-morphing IP NVIDIA acquired when Transmeta was shuttered. (Remember Transmeta? No? Then read the Wikipedia summary here.)

Rumors? On Monday afternoon, August 11, at the Flint Center we will find out how much truth there has been in the rumors about Denver, which reach all the way back to 2010.  It is possible that Denver has a fully custom core implementation of the ARMv8 instruction set without code-morphing IP.  If so, NVIDIA will be following in Qualcomm's footsteps.

Finally, a plug for my friends at AMD.  The same afternoon, AMD's Ben Sander and Dan Bouvier will be detailing applications of the new Kaveri APU's Heterogeneous Compute capabilities.



Will Broadwell GT2 Graphics Beat Kaveri?

May 7, 2014 – Up until now the performance level of Broadwell Graphics has been a matter of complete speculation or Intel-created PR buzz.  It is safe to say it will be better than Haswell Graphics, but will it beat the graphics in AMD’s new Kaveri?

First, let’s line up an apples-to-apples comparison.  AMD does not attempt to compete against Intel’s halo products.  While margins there are great, the volumes are low.  AMD sticks to the mainstream, which means they will likely position Kaveri A10 and A8 APUs against Intel Core i3 and Core i5 processors with GT2 graphics.  AMD positions their Dual Graphics solutions, which use an external budget-level graphics chip working in tandem with the integrated graphics in the APU, to compete against Intel’s GT3 and GT3e graphics.  So, sticking to the mainstream where the volume sales are, let’s do a back-of-the-envelope estimate of what we can expect from Broadwell’s GT2 graphics.

GT2 Progression

First, the Broadwell GT2 graphics’ 24 EUs is a bit of a disappointment to me, and probably a cause for celebration at AMD.  Here’s why.  When Intel went from the 32nm Sandy Bridge processor to the 22nm Ivy Bridge processor they took advantage of the process shrink to increase the size of the GT2 graphics substantially.  AMD probably feared Intel would do the same with Broadwell.

If they had, Broadwell would have more EUs.  Using the same 33% increase as the last shrink would indicate 26.7 EUs.  But EUs are generally laid out in groups of four so that could have meant Broadwell would increase the count to 28.  If Intel had wanted to be really aggressive, they could have increased the EU count to 32 but die cost/sales margin economics certainly prevented that.

The reason I’m putting so much emphasis on EU count is:

          Peak GPU GFLOPS = EU count X threads per EU X Peak GPU frequency

GPU GFLOPS is the single most important driver of 3DMark benchmark scores and GPU compute on OpenCL code performance.  Both are areas where AMD has a long term claim to superiority.

So, if Broadwell’s GT2 graphics continues to run at up to 1.2GHz:

          the Peak Double Precision GFLOPS = 1.2 X 24 X 8 = 230.4.  

If Intel increases the peak GPU frequency to 1.3GHz, then the number rises to 249.6.  If the number of EUs had been 28, the peak DP GFLOPS would have risen to 268.8.  If Intel had done both, the Broadwell GT2 graphics would have scored 291.2 peak GFLOPS, which would have been more competitive, but for now it doesn’t appear they did either.  As a caveat, Broadwell’s specs are still officially unannounced.

Now let’s look at the top of what AMD has announced, the Kaveri A10-7850K.  AMD’s graphics architecture is arranged a bit differently.  There are fewer CUs than Intel has EUs, but each CU handles many more threads.  The formula for peak DP GFLOPS is:

          Peak GPU GFLOPS = CU count X threads per CU X Peak GPU frequency

For the A10-7850K there are 8 CUs with each handling 64 threads.  The peak GPU clock is 0.720 GHz. 

          Therefore, the Peak GPU GFLOPS = 8 X 64 X 0.72 = 368.64.

Now let’s look at the more pedestrian AMD A8-7600.  The number of CU is reduced to 6 but the number of threads per CU and the peak GPU clock remain the same. 

          The Peak GPU GFLOPS = 6 X 64 X 0.72 = 276.48.
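For convenience, here is the same back-of-the-envelope math in a few lines of Python, using the formula and the EU/CU counts, threads per unit and clocks quoted above, so the comparison is easy to re-run with different assumptions:

```python
def peak_gflops(units, threads_per_unit, clock_ghz):
    """Peak GPU GFLOPS = execution unit count x threads per unit x peak clock (GHz)."""
    return units * threads_per_unit * clock_ghz

# Intel Broadwell GT2 scenarios (24 or 28 EUs, 8 threads per EU, 1.2 or 1.3 GHz)
print(peak_gflops(24, 8, 1.2))   # 230.4  expected configuration
print(peak_gflops(24, 8, 1.3))   # 249.6  faster clock only
print(peak_gflops(28, 8, 1.2))   # 268.8  more EUs only
print(peak_gflops(28, 8, 1.3))   # 291.2  hypothetical "both" case

# AMD Kaveri (8 or 6 CUs, 64 threads per CU, 0.72 GHz peak GPU clock)
print(peak_gflops(8, 64, 0.72))  # 368.64  A10-7850K
print(peak_gflops(6, 64, 0.72))  # 276.48  A8-7600
```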

GT2 Competitiveness

While many other improvements are said to be in the Broadwell GPU, this key number indicates to me that overall, AMD’s mainstream APUs should hold their own against Broadwell on both the 3DMark and OpenCL benchmarks.  The top-scoring A10-7850K should trounce the mainstream Broadwell products with GT2 graphics, with up to 33% better performance.

I expect that AMD will be crowing about this once Broadwell has launched and solid benchmark numbers are available, but meanwhile AMD will have to tolerate the rolling thunder that Intel will generate leading up to the launch.  During that time Intel will undoubtedly be talking up the GT3 graphics performance, which is expected to be twice as high as GT2.  If Intel wants, it can talk about it in terms of a single precision GFLOPS score, which is usually twice as high as the double precision score.  That will give Intel the chance to claim a groundbreaking 1 TeraFLOP performance level, well beyond AMD’s A10-7850K’s graphics performance.



Amazon Fire TV Has Potential But Not Yet Worth $99 as an Upgrade

3 out of 5 Stars

Amazon Fire TV Update – Saturday, April 5, 2014

FedEx delivered midday and I was soon checking out the OBE (out-of-box experience) for Fire TV.  In Amazon.com review style, here are my pros and cons.

Pros:

  • My Amazon Prime account was already activated
  • Voice recognition is accurate
  • Can quickly install apps for popular streaming services like Netflix and Hulu Plus
  • Apps for popular music services including Vevo and Pandora
  • Does a reasonable job of playing popular game apps like those on your phone or tablet
  • Access to your photos stored in Amazon’s cloud
  • Bluetooth remote
  • Logical installation process with on screen video showing you how to use the remote
  • Compact and easy to hide - you could Velcro it to the back of your TV to get it out of sight

Cons:

  • Voice recognition only takes you to Amazon as your streaming video source
  • Voice recognition does not work within other apps like Netflix
  • No cross provider search
  • You still have to cursor through a virtual keyboard to enter user IDs and passwords for your Wi-Fi and third-party apps like Netflix
  • Amazon cloud music player app was surprisingly not there (surely this will come later)
  • 5+ minute delay to download and install updates before Fire TV can be used

Conclusion

The voice search on Amazon’s own streaming service is a differentiator, but there are not as many services available as on competitors’ boxes like Roku.  Fire TV does have a nice interface that has great potential if Amazon can get third-party apps voice-search enabled.  The optional game controller may make this a low-cost alternative to the Xbox One or PS4 consoles for families with younger children.  But if you own a Roku, Apple TV or a recently designed Smart TV that has the services you like, it is probably not worth paying $99 to replace them just for limited voice search capability.

Photos: inside the Fire TV box, and the Fire TV voice search UI



Amazon Fire TV’s Voice UI Doesn’t Make You Work to Relax

April 3, 2014

Amazon Fire TV

A decade ago Philips did a study of why products succeed based on the user’s modality; i.e., whether the user was working, at play or relaxing.  They concluded that products that take the user out of their preferred modality would fail, while those that maintained the modality were more likely to succeed.

Generally, the success of current attempts at streaming programming off the Internet has been hampered by user interfaces that are too burdensome.  If you are home from work and too tired to make anything but a microwaved meal, you are not going to fuss with the UI of your streaming device to try to find a movie or TV program to watch.

But fuss you must by moving a cursor around a virtual keyboard matrix or typing on a tiny keyboard on the remote.  Minutes later you may find what you are looking for.  Too often you are disappointed to find that the service you searched doesn’t have that program or movie.  Grrr!  You can search another service or flip back to the satellite or cable service and realize why those companies are still getting paid so much each month.

The $99 Amazon Fire TV just introduced is the latest attempt to change that.  The Fire TV has a voice recognition system integrated with a microphone in the remote.  Like Apple with Siri, Google with OK Google and Microsoft with Cortana, Amazon believes that voice is key to simplifying the UI so users won’t have to work to relax.

My guess is that Amazon has learned from all the mobile voice recognition systems and that the recognition happens in the cloud.  That way the quality of the recognition can be steadily improved and popular searches more easily recognized.  So, I expect pretty effective recognition from Fire TV.  Since Google and Apple both have good recognition capability, you can bet this feature will quickly be implemented in their streaming platforms.

Amazon says they have partnered with Netflix to implement the search feature, but to be truly successful Amazon is going to have to make their search feature work across all the offered services at the same time.   If I want to watch Breaking Bad, I don’t want to have to check Netflix, Hulu Plus or elsewhere to know who has the program or if no one does.  How seamlessly Amazon handles this detail will have a lot to do with Fire TV’s success.  I’ll add a comment or two about that after my Fire TV arrives on Saturday.

Fire TV Game Controller

Finally, I would be remiss if I didn’t mention that Fire TV does gaming too.  For an additional $39.99 you get a wireless game controller that Amazon claims runs for 55 hours on a couple of AA batteries.  The Amazon website lists a number of casual gaming apps available for Fire TV that you are likely familiar with because they run on your smartphone or tablet.  The game remote also doubles as the media remote and can scroll through menu options, but oddly lacks the microphone for the voice UI.  So, keep the original remote handy if you don’t want to be in virtual keyboard matrix hell whenever you want to watch a movie or TV program.




Don't Mock the Game Console, Your 2016 Gaming PC May Be Like Them!

2016 PC

March 21, 2014 - Change is constant, and change is coming.  The top Core i7 PC processors from Intel are a good example.  They are Extreme Editions with six or more cores that borrow from Intel’s server architecture.  Few PC users buy or build platforms based on these expensive halo parts.  But I believe by 2016 CPUs with more than four cores won’t be considered so extreme.  I expect many high-end mainstream PCs, especially those for gaming enthusiasts, will have eight or more cores.  Here’s why…

Intel’s tick-tock strategy is continuing, although it has been recently delayed a bit by sluggish market uptake of new generations of technology, and it may be delayed in the future by lack of a strong enough extreme ultraviolet light source used in manufacturing.  However, be assured that 14nm processors are coming this year, and 10nm processors will probably arrive sometime in 2016.

As the process technology shrinks, so does the maximum possible power per core.  The maximum TDP for quad core processors with air cooling will be falling as we go through the next few processor generations in a way that hasn’t happened before.  Enthusiasts accustomed to owning desktops with quad core 130W processors will find the TDP dropping to 65W or less over the next three years.  Along with the drop in TDP, there will be a drop in the CPU core clock frequencies.

So, where will the performance come from?  Some of it will come from better cores, but much of it will come from more cores.  Put eight CPU cores in the package, and you’re back up to 130W; but will you get the level of performance expected from twice the cores?  That will depend on the software you will be running in the future.  For years, processor manufacturers have been adding more cores to server processors, and that’s justified because servers tend to run multiple instances of the same code.  Hence, all the cores can stay busy.

But client PC applications aren’t the same.  Client users mostly run one application at a time.  Yes, they may have several small apps and Windows services running, but for the most part those threads aren’t a very heavy workload, even for one core.  So it is up to the application developers themselves to write code with greater parallelism.  That means more threads running in parallel to effectively use the available resources of eight CPU cores.
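To make that concrete, here is a minimal, hypothetical sketch (my own illustration in C++; update_entity() is an invented placeholder for per-frame game work, not a real API) of the kind of restructuring involved: spreading independent tasks across however many hardware threads the machine reports, so that extra cores actually get used.

```cpp
// Sketch only: split per-frame work across all available hardware threads.
// update_entity() stands in for whatever independent work a game has,
// such as physics, AI, or animation. It is an assumed placeholder.
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

void update_entity(std::size_t /*index*/) {
    // placeholder for per-entity game logic
}

int main() {
    const std::size_t entity_count = 100000;
    const unsigned cores = std::max(1u, std::thread::hardware_concurrency());

    std::vector<std::thread> workers;
    for (unsigned c = 0; c < cores; ++c) {
        workers.emplace_back([=] {
            // each worker takes an interleaved slice of the entities
            for (std::size_t i = c; i < entity_count; i += cores) {
                update_entity(i);
            }
        });
    }
    for (auto& t : workers) t.join();
    return 0;
}
```

The interleaved slicing is just the simplest possible partitioning; a real engine would use a job system with work stealing.  But the principle is the same: the work has to be expressed as many independent tasks before an eighth core is worth anything.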

In 2009, Tom’s Hardware ran a suite of benchmarks on a Core i7 processor operated as a single core, dual core, triple core and quad core processor to find out what types of applications benefited from more cores.  While synthetic benchmarks scaled well with more cores, they concluded that most games did as well with three cores as with four.  The games tested were not sufficiently threaded to keep all four cores busy.  While it would be reasonable to say games may have become more threaded in the last five years, you have to ask whether games have become sufficiently multithreaded to fully utilize eight cores.  My guess is that they haven’t yet, but I also see why they will.
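Amdahl’s law puts a rough number on that question.  As an illustration (my own arithmetic, not from the Tom’s Hardware article), the speedup on n cores is limited by the fraction p of the work that actually runs in parallel:

```cpp
// Back-of-the-envelope Amdahl's law: speedup = 1 / ((1 - p) + p / n),
// where p is the parallel fraction and n is the core count.
// The example values of p are assumptions for illustration only.
#include <cstdio>

double amdahl(double p, int n) { return 1.0 / ((1.0 - p) + p / n); }

int main() {
    std::printf("p = 0.60, 8 cores: %.2fx\n", amdahl(0.60, 8));  // ~2.1x
    std::printf("p = 0.95, 8 cores: %.2fx\n", amdahl(0.95, 8));  // ~5.9x
    return 0;
}
```

A game that is only about 60% parallel barely clears 2x on eight cores, while a 95% parallel engine gets close to 6x.  That gap is exactly why the threading habits developers pick up matter so much, which brings me to the consoles.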

While many game PC enthusiasts like to make fun of the Xbox One and PS4 game consoles, the answer to what will make their future 8-core game PC perform well in 2016 may have its roots in those game consoles.  Both have 8-core processors.  Both have CPU cores that are relatively weak compared to expectations for Intel’s mainstream CPU cores in 2016.  That means software developers have good reason to multithread their games to use all the performance of the eight available CPU cores.  Much of what developers learn from Xbox One and PS4 code development will end up in the PC versions of those games and in future games.

So, game PC enthusiasts shouldn’t be so snooty!  They may well enjoy superior performance on their platforms, but in the future it may be due to the very game consoles they mock.




The 4 Real Reasons Intel's Cherry Trail Will Beat AMD's Beema

March 17, 2014 – Most pundits focus on the technical details.  Generally, Beema is a good product that should hold its own, but there are other issues.

Last week Ashraf Eassa blogged on Motley Fool, “Does AMD Stand a Chance Against Intel’s Cherry Trail?”.  I am sure that Intel’s PR department was thrilled to see the Intel slides and statements that reinforced the idea that Cherry Trail is the superior chip.

But it could be easily argued that echoing Intel’s statements is just that, echoing PR that is intended to put Intel and its products in the best possible light.  Rightly, you probably think every company does that and it’s no surprise.

So, you might think I am going to mount a big technical defense of AMD.  No.  I’ll just do a small one to set up why I think what most authors focus on isn’t the real reason Cherry Trail will be a strong contender in the fourth quarter of 2014.

Cherry Trail has a lot going for it: a new 14nm process, a graphics engine four times wider than Bay Trail’s, and an improved CPU.  I expect about a 20% improvement in CPU performance and up to a 3.2X improvement in GPU performance.  The “up to” depends on the benchmark and whether Intel maintains or improves the boosted GPU clock.
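One hedged way to arrive at a figure like “up to 3.2X” (my own arithmetic, not Intel’s) is to multiply the ratio of execution units by the ratio of sustained GPU clocks:

```cpp
// Illustrative only: both ratios below are assumptions, not Intel figures.
#include <cstdio>

int main() {
    const double eu_ratio    = 4.0;  // Cherry Trail GPU width vs. Bay Trail (per above)
    const double clock_ratio = 0.8;  // assumed sustained GPU clock vs. Bay Trail
    std::printf("theoretical GPU gain: %.1fx\n", eu_ratio * clock_ratio);  // 3.2x
    return 0;
}
```

If Intel holds or raises the boost clock, the multiplier moves toward the full 4X; if real workloads can’t keep the wider engine fed, it falls short of 3.2X.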

In the second half of 2014 AMD’s Beema will compete with Cherry Trail-M and Mullins will compete with Cherry Trail-T.  (Mr. Eassa follows Intel’s lead by comparing Cherry Trail to last year’s Kabini.)  Like the two Cherry Trails, Beema and Mullins are cousins with much the same logic, but with power and I/O tuned to two different profiles, notebook and tablet.  Like Cherry Trail vs. Bay Trail, AMD is claiming a 20% improvement in CPU performance and a significant improvement in GPU performance.

So, where’s the Intel advantage?  It isn’t really CPU or GPU performance.  Beema’s predecessor, Kabini, actually held its own against Bay Trail.  And Kabini kicked the Bay Trail GPU’s butt.  So, it is reasonable to expect Beema to hold its own against Cherry Trail on CPU performance and probably maintain a lead, albeit a smaller one, over Cherry Trail’s graphics.

Many would say that Intel’s advantage is much lower power.  Admittedly, Intel’s 22nm and 14nm processes are superior to either the TSMC or the GlobalFoundries 28nm processes being used by AMD.  But I’ll point out that in mobile designs the processor cores are powered off the great majority of the time; hence, other components like the display and I/O actually use most of the battery’s power.  The difference between AMD and Intel power usually results in less than an hour’s difference in run time.
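A rough back-of-the-envelope calculation (all numbers below are assumptions for illustration, not measured figures) shows why the gap usually stays under an hour: when the rest of the platform draws most of the power, even a full watt of difference in average SoC power barely moves total run time.

```cpp
// Illustrative battery-life math with assumed numbers; not AMD or Intel data.
#include <cstdio>

int main() {
    const double battery_wh = 40.0;  // assumed notebook battery capacity
    const double platform_w = 6.0;   // display, I/O, memory, etc. (assumed)
    const double soc_a_w    = 2.0;   // average SoC power, platform A (assumed)
    const double soc_b_w    = 3.0;   // average SoC power, platform B (assumed)

    const double hours_a = battery_wh / (platform_w + soc_a_w);  // 5.0 h
    const double hours_b = battery_wh / (platform_w + soc_b_w);  // ~4.4 h
    std::printf("A: %.1f h  B: %.1f h  delta: %.1f h\n",
                hours_a, hours_b, hours_a - hours_b);
    return 0;
}
```

With those assumptions the difference works out to a bit over half an hour, which is why I don’t think process superiority alone decides the mobile matchup.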

So how does Intel’s Cherry Trail win?  Here’s my answer, by the numbers…

  1. AMD is positioned by PC manufacturers as the value solution, and the smaller battery and the power usage of the other components in those value platforms result in lower battery life and a less impressive user experience with AMD-based systems.  Unfortunately, the positioning leaves the impression that AMD has lesser capabilities.  It doesn’t hurt the PC manufacturers because they sell more upscale systems, too.

  2. AMD is focused elsewhere.  They are making money with their semicustom division with Xbox and PS4.  They say they have other semicustom deals in the works.  They have steadily laid off people on both the marketing and engineering sides of their client PC business.  That certainly makes it harder for AMD to compete in this segment.

  3. Intel has lots of cash in the bank and divisions that generate lots of revenue, so it has been willing to throw a few billion dollars of “contra revenue” to manufacturers to help design and implement platforms using Bay Trail.  That will continue with Cherry Trail.  AMD has debt and tight cash flow.  They simply cannot buy market share in that way.

  4. Intel’s fan base that publishes on Seeking Alpha, Forbes, Motley Fool and elsewhere is far more energized than AMD’s.  No one seemed interested enough in the cited Motley Fool article last week to post a contrarian view.  AMD seems to have lost the fan base that once defended the scrappy underdog of the PC processor industry.  In a world driven by social media and by impressions of what is trending up or down, that’s a truly bad thing and it gives Intel a significant edge.

Charles W Mitchell is a consultant who focuses on the client PC industry and its suppliers.  Charles has been a leader in the development of PC software and hardware, and worked in the semiconductor industry for over a decade.  Charles owns stock in both Intel and AMD, has worked for AMD in the past, and has been a customer of Intel.  Charles is familiar with both companies’ products, manufacturing and marketing.


Your Viewpoint

Have a comment to make about something I've said on the left side of the page?  Then send me an e-mail to...

   comments@charleswmitchell.com

And I'll read it. If it is civil and relevant I'll post it on this side of the page.  Your name will be included with your comment.