ISO/IEC JTC1 SC22 WG21 N2745 = 08-0255 - 2008-08-22 (REVISED for POWER 5/5+)
Paul E. McKenney, paulmck@linux.vnet.ibm.com
This document presents an implementation of the proposed C/C++ memory-order model for the POWER 5/5+ family of computer systems, which require either usage restrictions or special code sequences to implement the proposed C/C++ sequentially consistent atomic operations.
The POWER 5/5+ family of computer systems successfully runs parallel programs containing atomic operations as long as at least one of the following conditions is met:
Please note that other members of the Power family, for example, Power 6 and Power 7, need not adhere to any of the above conditions.
| Operation | POWER 5/5+ Implementation | 
|---|---|
| Load Relaxed | ld | 
| Load Consume | ld | 
| Load Acquire | ld; cmp; bc; isync | 
| Load Seq Cst (POWER5/5+) | hwsync; larx; cmp; bc; isync | 
| Store Relaxed | st | 
| Store Release | lwsync; st | 
| Store Seq Cst | hwsync; st | 
| Cmpxchg Relaxed,Relaxed (32 bit) | _loop: lwarx; cmp; bc _exit; stwcx.; bc _loop; _exit: | 
| Cmpxchg Acquire,Relaxed (32 bit) | _loop: lwarx; cmp; bc _exit; stwcx.; bc _loop; isync; _exit: | 
| Cmpxchg Release,Relaxed (32 bit) | lwsync; _loop: lwarx; cmp; bc _exit; stwcx.; bc _loop; _exit: | 
| Cmpxchg AcqRel,Relaxed (32 bit) | lwsync; _loop: lwarx; cmp; bc _exit; stwcx.; bc _loop; isync; _exit: | 
| Cmpxchg SeqCst,Relaxed (32 bit) | hwsync; _loop: lwarx; cmp; bc _exit; stwcx.; bc _loop; isync; _exit: | 
| Acquire Fence | lwsync | 
| Release Fence | lwsync | 
| AcqRel Fence | lwsync | 
| SeqCst Fence (POWER5/5+) | for (i=0;i<8;i++) { dcbf junk; hwsync; ld junk; } | 
The variable junk may be any memory location. It is permissible to use junk as the loop control variable, as long as that loop control variable is assigned to a memory location.
It is legitimate (but usually unnecessary) to replace sync (written hwsync in the table above), lwsync, and eieio instructions with the code sequence shown above for “SeqCst Fence (POWER5/5+)”.
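
For orientation, the following is a minimal C++ sketch of source-level operations that correspond to several rows of the table above. All identifiers (payload, ready, x, y, producer, consumer, fetch_increment, run_message_passing, run_store_buffering) are hypothetical and do not come from this document. The comments name the table row that each operation corresponds to. The instructions a given compiler actually emits may differ, so the sketch illustrates only the source-level semantics, not a particular code generator.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// All identifiers below are hypothetical; none appear in the paper.
std::atomic<int>  payload{0};
std::atomic<bool> ready{false};
std::atomic<int>  x{0}, y{0};

// Publishes a value, then sets a flag with release ordering.
void producer(int value) {
    payload.store(value, std::memory_order_relaxed);  // "Store Relaxed" row
    ready.store(true, std::memory_order_release);     // "Store Release" row
}

// Waits for the flag with acquire ordering, then reads the value.
int consumer() {
    while (!ready.load(std::memory_order_acquire))    // "Load Acquire" row
        std::this_thread::yield();
    return payload.load(std::memory_order_relaxed);   // "Load Relaxed" row
}

// Atomic increment built from a compare-exchange loop
// ("Cmpxchg AcqRel,Relaxed" row). Returns the value seen before the increment.
int fetch_increment(std::atomic<int>& counter) {
    int expected = counter.load(std::memory_order_relaxed);
    while (!counter.compare_exchange_weak(expected, expected + 1,
                                          std::memory_order_acq_rel,
                                          std::memory_order_relaxed)) {
        // On failure, expected now holds the current value; retry.
    }
    return expected;
}

// Message passing: the consumer must observe the published value.
int run_message_passing(int value) {
    payload.store(0, std::memory_order_relaxed);
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;
    std::thread t1(producer, value);
    std::thread t2([&seen] { seen = consumer(); });
    t1.join();
    t2.join();
    return seen;
}

// Store buffering: with a seq_cst fence between each thread's store and load
// ("SeqCst Fence" row), both loads cannot return 0.
bool run_store_buffering() {
    x.store(0, std::memory_order_relaxed);
    y.store(0, std::memory_order_relaxed);
    int r1 = -1, r2 = -1;
    std::thread a([&r1] {
        x.store(1, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_relaxed);
    });
    std::thread b([&r2] {
        y.store(1, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_relaxed);
    });
    a.join();
    b.join();
    assert(r1 != -1 && r2 != -1);  // both threads ran their loads
    return r1 == 1 || r2 == 1;     // true when the forbidden outcome is absent
}
```

Calling run_message_passing(42) should return 42, and run_store_buffering() should return true on any conforming implementation. The store-buffering case is the one that depends on the sequentially consistent fence, which is the row requiring the special loop on POWER 5/5+.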