Hi Forum!
I need to reduce a 2D texture.
On each Dispatch(), a compute shader reduces it by 16x in both directions.
The biggest problem arises at the last step, say when the texture is 8x5: out-of-bounds reads return 0, but the sum is still divided by 16*16. For the earlier steps the error is not as big.
I found the following solution: for the last, incomplete tiles in each direction, store their real denominators in a constant buffer and use them instead of dividing by 16*16.
The CB data changes only on Resize(), so I don't need to update it every frame: I just store a vector of these values, one entry per reduction target.
But I wonder: is there a more elegant solution?
Here is a sketch:
static const uint gLumReductionTGSize = 16;

cbuffer CB
{
    uint cb_xGroupId;
    uint cb_xDenominator; // if GroupID.x == cb_xGroupId, use it; otherwise use gLumReductionTGSize
    uint cb_yGroupId;
    uint cb_yDenominator; // if GroupID.y == cb_yGroupId, use it; otherwise use gLumReductionTGSize
}

// Each time reduce by 16x16
[numthreads(gLumReductionTGSize, gLumReductionTGSize, 1)]
void main(uint3 GroupID : SV_GroupID,
          uint3 DispatchThreadId : SV_DispatchThreadID,
          uint ThreadIndex : SV_GroupIndex)
{
    // Will read 0 in case "out of bounds"
    float pixelLuminance = InputLumMap[DispatchThreadId.xy];

    // Store in shared memory
    LumSamples[ThreadIndex] = pixelLuminance;
    GroupMemoryBarrierWithGroupSync();

    // Reduce
    [unroll]
    for (uint s = NumThreads / 2; s > 0; s >>= 1)
    {
        if (ThreadIndex < s)
        {
            LumSamples[ThreadIndex] += LumSamples[ThreadIndex + s];
        }
        GroupMemoryBarrierWithGroupSync();
    }

    if (ThreadIndex == 0)
    {
        uint divX = (GroupID.x == cb_xGroupId) ? cb_xDenominator : gLumReductionTGSize;
        uint divY = (GroupID.y == cb_yGroupId) ? cb_yDenominator : gLumReductionTGSize;
        OutputLumMap[GroupID.xy] = LumSamples[0] / (divX * divY);
    }
}
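For context, here is a rough sketch of how the per-pass CB values described above could be precomputed on the host side on Resize(). It assumes a C++ host. The names `ReductionConstants`, `ReductionPass`, `BuildReductionPasses` and `kTileSize` are hypothetical and only illustrate the idea; they are not part of my actual code.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical CPU-side mirror of the shader's cbuffer CB for one pass.
struct ReductionConstants {
    uint32_t xGroupId;      // index of the last group in X
    uint32_t xDenominator;  // number of real texels that group covers in X
    uint32_t yGroupId;      // index of the last group in Y
    uint32_t yDenominator;  // number of real texels that group covers in Y
};

// Hypothetical description of one reduction pass.
struct ReductionPass {
    uint32_t groupsX;       // Dispatch() size in X, also the output width
    uint32_t groupsY;       // Dispatch() size in Y, also the output height
    ReductionConstants cb;  // constants to upload for this pass
};

// Assumed to match gLumReductionTGSize in the shader.
constexpr uint32_t kTileSize = 16;

// Builds the chain of passes that reduces a width x height texture to 1x1.
std::vector<ReductionPass> BuildReductionPasses(uint32_t width, uint32_t height) {
    std::vector<ReductionPass> passes;
    if (width == 0 || height == 0) {
        return passes;  // nothing to reduce
    }
    while (true) {
        // Round up so that partially covered tiles still get a thread group.
        const uint32_t groupsX = (width + kTileSize - 1) / kTileSize;
        const uint32_t groupsY = (height + kTileSize - 1) / kTileSize;

        ReductionPass pass{};
        pass.groupsX = groupsX;
        pass.groupsY = groupsY;
        pass.cb.xGroupId = groupsX - 1;
        pass.cb.yGroupId = groupsY - 1;
        // Texels left for the last group; equals kTileSize when the size
        // is an exact multiple of the tile size.
        pass.cb.xDenominator = width - (groupsX - 1) * kTileSize;
        pass.cb.yDenominator = height - (groupsY - 1) * kTileSize;
        passes.push_back(pass);

        if (groupsX == 1 && groupsY == 1) {
            break;  // the output of this pass is the final 1x1 target
        }
        // The output of this pass is the input of the next one.
        width = groupsX;
        height = groupsY;
    }
    return passes;
}
```

With something like this, each pass would upload its `cb` once after Resize() and then call Dispatch(groupsX, groupsY, 1) every frame.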
Thanks in advance!