What are Parity, ECC, and EOS?


An important thing to understand about ECC and parity memory is that the memory modules DO NOT perform any error detection or correction themselves. Error checking and correction are carried out on the system board, not on the memory module. The modules simply provide the space required to store the extra bits that describe the condition of the real data. The computer calculates the parity or ECC bits on every write and checks them on every read.

Parity

'Parity' is a form of error detection that uses a single bit to record whether the number of '1's in the data is odd or even. Parity usually consists of one parity bit for each eight bits of data.

On most systems, a detected parity error freezes the computer entirely and displays a 'Parity Error' message on the screen. The system must then be restarted.

Because parity only distinguishes odd from even counts, two incorrect bits covered by the same parity bit cancel each other out and go undetected.
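The behavior described above can be illustrated with a short sketch. It assumes even parity over a single 8-bit value; the function names `parity_bit` and `check` are made up for this illustration and do not correspond to any particular hardware or library.

```python
def parity_bit(byte: int) -> int:
    """Even-parity bit for an 8-bit value: 1 when the count of 1 bits is odd."""
    return bin(byte & 0xFF).count("1") % 2


def check(byte: int, stored_parity: int) -> bool:
    """True when the byte still agrees with the parity bit stored alongside it."""
    return parity_bit(byte) == stored_parity


data = 0b10110010                 # four 1 bits, so the stored parity bit is 0
stored = parity_bit(data)

one_flip = data ^ 0b00000100      # one bit corrupted
two_flips = data ^ 0b00010100     # two bits corrupted

print(check(data, stored))        # True: data is intact
print(check(one_flip, stored))    # False: single-bit error detected
print(check(two_flips, stored))   # True: double-bit error slips through
```

Note that the check can only say that something is wrong; it gives no indication of which bit changed, which is why parity cannot correct anything.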

Parity has lost favor over the years for several reasons:

  1. Parity can't correct errors.
  2. Some errors go undetected.
  3. Errors result in a complete system shutdown, which can damage data.
  4. DRAM has become significantly more reliable.
  5. Current operating systems don't handle parity errors as smoothly as older ones did.
  6. For many users, the cost of the extra chips isn't justified by the low level of protection.

ECC

'ECC', or Error Checking and Correcting, permits error detection as well as correction of certain errors. Typically, ECC can detect single-bit and double-bit errors, and can correct single-bit errors.
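As a rough illustration of single-bit correction with double-bit detection, here is a minimal sketch of one common way to build such a code: a Hamming code extended with an overall parity bit. This is an assumption chosen for illustration, not a description of any particular memory controller. It protects 8 data bits with 5 check bits, and the names `encode`, `decode`, `DATA_POS` and `CHECK_POS` are hypothetical.

```python
from typing import List, Optional, Tuple

DATA_POS = [3, 5, 6, 7, 9, 10, 11, 12]   # non-power-of-two positions hold data
CHECK_POS = [1, 2, 4, 8]                 # power-of-two positions hold check bits


def encode(byte: int) -> List[int]:
    """Build a 13-bit word: index 0 is overall parity, 1..12 is the Hamming code."""
    word = [0] * 13
    for i, pos in enumerate(DATA_POS):
        word[pos] = (byte >> i) & 1
    for c in CHECK_POS:
        # Each check bit makes parity even over the positions whose index has bit c set.
        word[c] = sum(word[pos] for pos in range(1, 13) if pos & c) % 2
    word[0] = sum(word[1:]) % 2          # makes parity of the whole word even
    return word


def decode(word: List[int]) -> Tuple[str, Optional[int]]:
    """Return (status, data) where status is 'ok', 'corrected' or 'uncorrectable'."""
    syndrome = 0
    for pos in range(1, 13):
        if word[pos]:
            syndrome ^= pos              # XOR of set positions is 0 for a valid word
    overall_even = sum(word) % 2 == 0
    if syndrome == 0 and overall_even:
        status, fixed = "ok", word
    elif not overall_even and syndrome <= 12:
        # Odd number of flips: treat as one error located at index `syndrome`.
        fixed = list(word)
        fixed[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two bits are wrong.
        return "uncorrectable", None
    byte = sum(fixed[pos] << i for i, pos in enumerate(DATA_POS))
    return status, byte


word = encode(0xA7)
one = list(word); one[6] ^= 1
two = list(word); two[6] ^= 1; two[11] ^= 1

print(decode(word))   # ('ok', 167)
print(decode(one))    # ('corrected', 167)
print(decode(two))    # ('uncorrectable', None)
```

In this sketch a single flipped bit is repaired and the original value is returned, while two flipped bits are reported but not repaired.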

Corrected errors are usually transparent to the operating system. The memory controller chip on the system board performs the correction and always sends corrected data to the CPU. The memory controller can inform the OS when errors are corrected, but most operating systems have no means of logging the corrections or informing the user, so the user may never know an error occurred.

Multi-bit errors are so rare that further detection and correction capabilities are required only in extreme cases, and would require custom memory components. For that level of protection, 100% redundancy, using single-bit-correction memory modules on two separate memory subsystems, would probably be less expensive.

The double-bit detection, single-bit correction ECC functions may require more or fewer extra bits than parity depending on the data path. For example:

  1. 8-bit data path requires 5 bits for ECC or 1 bit for parity.
  2. 16-bit data path requires 6 bits for ECC or 2 bits for parity.
  3. 32-bit data path requires 7 bits for ECC or 4 bits for parity.
  4. 64-bit data path requires 8 bits for either ECC or parity.
  5. 128-bit data path requires 9 bits for ECC or 16 bits for parity.

As you can see, there's a break-even point at 64 bits. At this point, the security of ECC costs the same (in DRAM) as the less capable parity. In our opinion, the efficiency of ECC versus parity in today’s 64-bit processors and the inevitably wider data paths of future processors makes continued use of parity highly improbable.
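The figures in the list above can be reproduced with a short calculation. The sketch assumes the ECC is a Hamming code plus one overall parity bit (a common single-correct, double-detect construction) and that parity uses one bit per eight data bits; `secded_bits` and `parity_bits` are names made up for this example.

```python
def secded_bits(data_bits: int) -> int:
    """Check bits for single-bit correction plus double-bit detection.

    A Hamming code needs the smallest r with 2**r >= data_bits + r + 1;
    one more overall parity bit adds double-bit detection.
    """
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r + 1


def parity_bits(data_bits: int) -> int:
    """One parity bit for each eight bits of data."""
    return data_bits // 8


for width in (8, 16, 32, 64, 128):
    print(width, secded_bits(width), parity_bits(width))
# 8 5 1
# 16 6 2
# 32 7 4
# 64 8 8
# 128 9 16
```

The ECC cost grows by one bit each time the data path doubles, while the parity cost doubles, which is why the two meet at 64 bits.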

EOS

'EOS' stands for ECC-On-SIMM. Like ECC and parity, EOS memory modules contain extra data storage for error correction data. Unlike ECC and parity, EOS memory modules actually perform the error detecting and correcting on the module using customized logic chips on each module. EOS corrects errors before they're sent to the system so that the ECC function is transparent to both the OS and the system itself. EOS memory modules can be used on systems that normally would not be capable of ECC, thus giving them most of the security advantages of ECC.

For systems designed to use EOS, the EOS module uses normally unused pins on the memory module to inform the computer system of error corrections.

The disadvantages are many:

  1. EOS uses significantly more DRAM devices, which increases cost.
  2. EOS uses specialized logic chips that increase cost.
  3. Each additional DRAM chip and logic chip is another component that can fail.
  4. Systems designed to use EOS generally can't run with EOS and non-EOS modules at the same time.
  5. EOS offers no added protection over normal ECC.
  6. Hewlett Packard and IBM are the only sources for EOS, which limits availability and adds to the cost.

That means EOS will cost much more, take longer to get or replace, and be statistically less reliable than normal ECC.


Copyright (©) 1997 Advantage Memory Corp.