Hey Guys,
We know that a shader constant buffer has to follow the 16-byte packing rule (a member may not straddle a 16-byte register boundary):
cbuffer MyStruct : register(b0)
{
    matrix mWVP;
    float2 f2ColorReso;
    float2 f2DepthReso;
    // float fThisCausesProblem;
    float4 f4LightPos;
}
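To make the rule concrete, here is a sketch of the same buffer with the register assignment written out by hand, as far as I understand the packing. The packoffset annotations and the name MyStructExplicit are my own additions for illustration; they only spell out what I believe the compiler does by default for the cbuffer above.

```hlsl
// Sketch only: same members as MyStruct, with the default packing made explicit.
// Each constant register (c0, c1, ...) is 16 bytes, i.e. four 32-bit components.
cbuffer MyStructExplicit : register(b0)
{
    matrix mWVP        : packoffset(c0);   // float4x4, occupies c0..c3 (bytes 0..63)
    float2 f2ColorReso : packoffset(c4.x); // bytes 64..71
    float2 f2DepthReso : packoffset(c4.z); // bytes 72..79, shares c4 with f2ColorReso
    float4 f4LightPos  : packoffset(c5);   // bytes 80..95
}

// If fThisCausesProblem is enabled, it lands in c5.x (bytes 80..83).
// The float4 that follows no longer fits in the 12 bytes left in c5,
// so it moves to the next register:
//
//     float  fThisCausesProblem : packoffset(c5.x); // bytes 80..83
//     float4 f4LightPos         : packoffset(c6);   // bytes 96..111
//
// A CPU-side struct then needs 12 bytes of padding after the float to match.
```

So the extra float is legal in HLSL; the problem only appears when the CPU-side struct is tightly packed and no longer matches these offsets.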
What about the structs passed between shader stages? It seems the alignment rule does not apply to data going from the VS to the PS, so the following is perfectly valid:
struct TexColPos
{
    float2 Tex : TEXCOORD0;
    float3 Col : COLOR0;
    float4 Pos : SV_Position;
};
However, I was told that GPU memory hardware generally prefers aligned reads and writes, so maybe the following struct would be faster?
struct TexColPos
{
    float4 Tex : TEXCOORD0;   // Tex.zw is not used
    float4 Col : COLOR0;      // Col.w is not used
    float4 Pos : SV_Position;
};

I was also wondering whether the following makes any difference:

struct TexColPos
{
    float2 Tex    : TEXCOORD0;
    float2 Dummy0 : TEXCOORD1;  // not used
    float3 Col    : COLOR0;
    float  Dummy1 : COLOR1;     // not used
    float4 Pos    : SV_Position;
};
This seems to consume more bandwidth, but since the data only travels between the VS and the PS, it shouldn't have any impact on device memory bandwidth (the reads and writes should happen entirely in cache, right? Or am I wrong?). I have tested this and didn't notice any difference (probably because my test workload is very light and my GPU is pretty old), but it would be great if anyone could provide more insight on this.
Thanks in advance