Why is InterlockedAdd much faster than InterlockedCompareStore under high contention?

Hey guys,

I have a compute shader with a thread group size of 8*8*8 in which every thread tries to update a single 32-bit uint (a UAV that is set to 0 before the compute shader runs).

    // Update the brick structure. u3BlockIdx is the same for every thread in a thread group.
    if (true || fSDF < vParam.fTruncDist) { // Forced to true for testing, so every thread performs the update.
        //tex_uavFlagVol[u3BlockIdx] = 1;                              // Case 1: fastest. Not atomic; only useful for knowing whether any thread updated it.
        //InterlockedAdd(tex_uavFlagVol[u3BlockIdx], 1);               // Case 2: takes twice as long as case 1. Useful for counting how many threads updated it.
        //InterlockedCompareStore(tex_uavFlagVol[u3BlockIdx], 0, 1);   // Case 3: takes almost twice as long as case 2. Not very useful; just a test for fun.
    }

The result looks straightforward, but it is not what I expected. Under this much contention (512 threads trying to update the same value), I expected case 3 to be the fastest, since it does 512 serialized reads and compares but only 1 write. Case 1 is not atomic, but it still performs 512 writes, and with bank conflicts it should not be much faster than case 3. Case 2 is what really confuses me: it does 512 serialized writes, so why is it twice as fast as case 3, which does only 1 write?

My understanding is that a serialized read should be no slower than a serialized write, and the time for the extra 512 compares should be negligible, so case 3 really shouldn't be that much slower than case 2.
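In case it helps anyone reproduce this, here is a minimal sketch of the test reduced to a stand-alone kernel. It is not the actual shader: the `RWTexture3D<uint>` type, the `u0` register slot, the `main` entry point, the `TEST_CASE` macro, and using the group ID as `u3BlockIdx` are all assumptions made for illustration. The comments note the value each texel should hold afterwards if it starts at 0.

```hlsl
// Hypothetical stand-alone repro. The resource type and register slot are assumptions.
RWTexture3D<uint> tex_uavFlagVol : register(u0);

// Hypothetical switch: 1 = plain store, 2 = InterlockedAdd, 3 = InterlockedCompareStore.
#ifndef TEST_CASE
#define TEST_CASE 2
#endif

[numthreads(8, 8, 8)]
void main(uint3 u3GroupId : SV_GroupID)
{
    // Assumption: one texel per thread group, so all 512 threads
    // of a group target the same location.
    uint3 u3BlockIdx = u3GroupId;

#if TEST_CASE == 1
    // Non-atomic store: every thread writes 1, so the texel ends at 1.
    tex_uavFlagVol[u3BlockIdx] = 1;
#elif TEST_CASE == 2
    // Atomic add: every thread increments once, so the texel ends at 512.
    InterlockedAdd(tex_uavFlagVol[u3BlockIdx], 1);
#else
    // Atomic compare-and-store: only the first thread sees 0 and writes 1;
    // the other 511 comparisons fail, so the texel ends at 1.
    InterlockedCompareStore(tex_uavFlagVol[u3BlockIdx], 0, 1);
#endif
}
```

Compiling it once per `TEST_CASE` value and timing identical dispatches should isolate the three cases from the rest of the shader.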

But apparently I have something wrong. Thanks to anyone who can enlighten me on this.
