AFAIK, in DirectX 11 there are two ways to get the total number of samples that pass both the depth test and the stencil test:
1. If we use an unordered access view with an internal counter, we can increment the counter for each processed sample and then copy it to a staging buffer to read it back on the CPU.
2. We can use a hardware occlusion query (D3D11_QUERY_OCCLUSION).
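
To make option 2 concrete, here is a minimal sketch of what I mean, assuming an already created device and immediate context. `CountPassedSamples` and `drawScene` are hypothetical names used only for illustration; the D3D11 calls themselves (`CreateQuery`, `Begin`, `End`, `GetData`) are the standard ones.

```cpp
#include <d3d11.h>

// Hypothetical helper: counts the samples that pass the depth and stencil
// tests for all draw calls issued inside drawScene.
// Note: this version blocks the CPU until the GPU has produced the result.
template <typename DrawFn>
HRESULT CountPassedSamples(ID3D11Device* device,
                           ID3D11DeviceContext* context,
                           DrawFn drawScene,
                           UINT64* samplesPassed)
{
    D3D11_QUERY_DESC desc = {};
    desc.Query = D3D11_QUERY_OCCLUSION;   // result is a UINT64 sample count

    ID3D11Query* query = nullptr;
    HRESULT hr = device->CreateQuery(&desc, &query);
    if (FAILED(hr))
        return hr;

    context->Begin(query);
    drawScene(context);                   // caller issues its draw calls here
    context->End(query);

    // GetData returns S_FALSE while the result is not yet available,
    // so this loop spins until the GPU has finished the bracketed work.
    do {
        hr = context->GetData(query, samplesPassed, sizeof(UINT64), 0);
    } while (hr == S_FALSE);

    query->Release();
    return hr;
}
```

The spin on `GetData` right after `End` is where the CPU ends up waiting for the GPU in this sketch.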
But both approaches slow down performance, so is there another way to get the total number of samples that pass the depth test and the stencil test?