Warning
Executions of the same FFT (same size, direction, type, and precision) are not guaranteed to produce bit-identical results if they differ in any of the following:
number of elements per thread (ElementsPerThread),
number of FFTs calculated per CUDA block (FFTsPerBlock), or
block dimension (BlockDim).
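
In practice this means that results obtained with two different configurations are better compared with a tolerance than bit for bit. The following host-side C++ sketch illustrates one way to do that, assuming the two outputs have already been copied back to the host. The helpers `relative_l2_error` and `outputs_match` are hypothetical names that are not part of the library, and the default tolerance of `1e-5` is only an assumed starting point for single precision that should be tuned to the FFT size and data.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cfloat = std::complex<float>;

// Hypothetical helper (not a library function): relative L2 error of
// `result` with respect to `reference`, accumulated in double precision.
inline double relative_l2_error(const std::vector<cfloat>& result,
                                const std::vector<cfloat>& reference) {
    assert(result.size() == reference.size());
    double err = 0.0;  // sum of squared differences
    double ref = 0.0;  // sum of squared reference magnitudes
    for (std::size_t i = 0; i < result.size(); ++i) {
        const std::complex<double> r(result[i]);
        const std::complex<double> e(reference[i]);
        err += std::norm(r - e);  // std::norm returns |z|^2
        ref += std::norm(e);
    }
    // Fall back to the absolute error when the reference is all zeros.
    return ref > 0.0 ? std::sqrt(err / ref) : std::sqrt(err);
}

// Hypothetical helper: true when the two outputs agree within `rel_tol`.
// The default tolerance is an assumption, not a documented bound.
inline bool outputs_match(const std::vector<cfloat>& a,
                          const std::vector<cfloat>& b,
                          double rel_tol = 1e-5) {
    return relative_l2_error(a, b) <= rel_tol;
}
```

A validation test would then call `outputs_match(out_config_a, out_config_b)` on the outputs of the two configurations instead of comparing the raw bytes.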