AMD’s New Heavyweight: Opteron 275 Dual Core Reviewed
by Article Admin
Published: 04/26/2005
It’s been just over two years since AMD launched the Opteron family and debuted its 64-bit architecture. Initially greeted with cautious optimism and guarded praise, the Opteron has steadily gained both market share and Tier 1 OEM support, putting AMD on the map for the first time in the server / workstation market.
Author’s Note: I apologize for the lateness of this review; my girlfriend was in the hospital for emergency (and unexpected) heart surgery the week I received my test kit, which threw something of a wrench in my testing. This has changed the scope (and timeframe) of the article somewhat. Laurie herself has made a full recovery; look for more in-depth coverage of dual core products in the near future.
Under the Hood: Opteron vs. Smithfield
Now that both Intel and AMD have thrown their hats into the ring with actual products, we can take a concrete look at the designs of each. The Pentium D / Pentium EE 840 core (codename: Smithfield) is simply two Prescott CPUs sitting side by side. Communication between the two CPUs is handled by an arbiter built into the Northbridge. The arbiter runs at 800 MHz (identical to the memory controller and FSB), and has a maximum available bandwidth of 6.4 GB/s. Both cores share this FSB, just as they would in a standard 2P system.
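
The 6.4 GB/s figure follows from simple arithmetic. Here is a minimal Python sketch, assuming the 800 MHz figure is the effective transfer rate of a 64-bit (8-byte) wide front-side bus; the function name is purely illustrative.

```python
def peak_bandwidth_gb_s(transfers_per_second: float, bus_width_bytes: int) -> float:
    """Peak theoretical bandwidth in GB/s, taking 1 GB as 10**9 bytes."""
    return transfers_per_second * bus_width_bytes / 1e9


# Assumption: 800 million transfers per second on an 8-byte-wide data bus.
fsb_gb_s = peak_bandwidth_gb_s(800e6, 8)
print(f"Shared FSB peak: {fsb_gb_s:.1f} GB/s")

# Both cores sit on the same bus, so an even split leaves each core at most half.
print(f"Per core, if split evenly: {fsb_gb_s / 2:.1f} GB/s")
```

This is a theoretical ceiling; sustained throughput on a real bus would be lower.
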
The diagram of the dual core Opteron shows how its two cores are connected by a system request interface, allowing them to communicate directly with each other without need for an arbiter. The interface connects to a crossbar switch, allowing either CPU access to the memory controller or the HyperTransport links. Total external I/O and memory bandwidth available from this setup varies depending on what type of Opteron is used: the 8xx series offers 30.4 GB/s of total bandwidth, the 2xx series gives 22.4 GB/s, and even the 1xx series offers 14.4 GB/s, more than double what’s available from an Intel-based solution at the moment.
Cache Coherency
One of the challenges of scaling above 1P systems is keeping all cached information synchronized across all CPUs, preferably without saturating the main memory bus with snoop requests as CPU0 keeps tabs on what CPU1 is doing (and vice versa). Smithfield uses a cache coherency protocol called MESI (Modified, Exclusive, Shared, Invalid). Under this protocol, if CPU0 requests a chunk of data CPU1 does not have, that data is marked as Exclusive. If CPU1 later requests the same piece of data, it’s now marked as Shared: each CPU “knows” that the other CPU has the same information stored in cache. If CPU0 writes to the chunk of information, its classification changes to Modified for CPU0, while CPU1 is ordered to flag the data as Invalid. If CPU1 needs that same chunk of information again, it will have to request the new data from CPU0, which would then change the data flag from Modified (CPU0) / Invalid (CPU1) to Shared (CPU0) / Shared (CPU1) again.
Although MESI might seem complex, it’s actually a major improvement over its predecessor, called write-through. Under write-through, every time CPU0 wrote to main memory, CPU1 snooped the same data and either updated or invalidated its own cache.
Write-through required a tremendous number of read / write updates and greatly reduced the main memory bandwidth available for other types of operations.
AMD’s coherency protocol, however, is a step beyond MESI. Called MOESI, it uses the same four flags as MESI, but adds a fifth: Owned. The Owned state allows CPU0 to change data in its own L2, then make that data directly available to CPU1. Because the two Opteron cores are linked directly via the system request interface, this transfer can happen at full CPU speed, without the need to interface with main memory at all. This eliminates the need to load main memory with snoop requests between processors, and should allow dual core Opteron systems to scale extremely well. In contrast, dual core Smithfield remains dependent on the shared FSB and main memory to pass cache coherency information and requests.
A simple analogy for understanding the difference between AMD’s and Intel’s cache coherency would be this: imagine two neighbors living in houses next to each other. AMD’s MOESI protocol and system interface are equivalent to a phone line installed from one house to the other. If Neighbor #1 has something to tell Neighbor #2 (or vice versa), he simply picks up the phone and calls him. Under Intel’s system, Neighbor #1 and Neighbor #2 still communicate, but only by mail. If Neighbor #1 has something to say to Neighbor #2, he writes it in a letter and hands it to the postman, who then delivers it after taking it to the post office for sorting. There is no way for Neighbor #1 to communicate directly with Neighbor #2 in a Smithfield implementation.
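
The state changes described above can be traced with a small simulation. What follows is a toy Python sketch, not a model of either vendor's actual hardware: it tracks a single cache line across two cores and counts main memory transfers against direct core-to-core transfers. The class name `TwoCoreLine`, its counters, and the choice to treat every MESI hand-off of modified data as a memory write-back are assumptions made for illustration, following the article's description.

```python
from dataclasses import dataclass, field

DIRTY = {"M", "O"}  # states in which a cache holds data newer than main memory


@dataclass
class TwoCoreLine:
    """One cache line tracked across two cores (toy model, not cycle-accurate)."""
    protocol: str                     # "MOESI", anything else is treated as MESI
    state: list = field(default_factory=lambda: ["I", "I"])
    memory_transfers: int = 0         # reads from or write-backs to main memory
    cache_to_cache: int = 0           # direct transfers between the two cores

    def _fetch(self, cpu: int) -> None:
        """Bring current data into a cache that holds the line as Invalid."""
        other = 1 - cpu
        if self.state[other] in DIRTY:
            if self.protocol == "MOESI":
                # Dirty data is forwarded directly; the supplier becomes Owned.
                self.cache_to_cache += 1
                self.state[other] = "O"
            else:
                # Assumed MESI behavior: write back to memory, then share.
                self.memory_transfers += 1
                self.state[other] = "S"
            self.state[cpu] = "S"
        elif self.state[other] in {"E", "S"}:
            # Clean copy elsewhere: read from memory, both end up Shared.
            self.memory_transfers += 1
            self.state[cpu] = self.state[other] = "S"
        else:
            # Nobody else has it: read from memory, mark Exclusive.
            self.memory_transfers += 1
            self.state[cpu] = "E"

    def read(self, cpu: int) -> None:
        if self.state[cpu] == "I":
            self._fetch(cpu)

    def write(self, cpu: int) -> None:
        if self.state[cpu] == "I":
            self._fetch(cpu)
        # Writing needs sole ownership: invalidate the other core's copy.
        self.state[1 - cpu] = "I"
        self.state[cpu] = "M"


def run_article_scenario(protocol: str) -> TwoCoreLine:
    line = TwoCoreLine(protocol)
    line.read(0)    # CPU0 loads the data: Exclusive
    line.read(1)    # CPU1 loads it too: Shared / Shared
    line.write(0)   # CPU0 modifies it: Modified / Invalid
    line.read(1)    # CPU1 needs the new value
    return line


for protocol in ("MESI", "MOESI"):
    result = run_article_scenario(protocol)
    print(protocol, result.state, result.memory_transfers, result.cache_to_cache)
```

In this model the two protocols only diverge on the final read. MESI ends at Shared / Shared after one extra memory transfer, while MOESI ends at Owned / Shared with one direct transfer and no additional memory traffic. That is the phone line versus the postman in the analogy.
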