It looks a bit over-complicated at first glance (I have no idea why it provides classes for random tensor generation), but it has a few very nice features:
- Automated num_threads handling
- Automated CUDA synchronization
- Report generation, storing the results, comparing the results
But I suppose there is nothing wrong with just using %%timeit and setting num_threads manually.
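
For illustration, here is a rough sketch of both approaches, assuming the module in question is `torch.utils.benchmark`. The tensor size, the thread counts, and the `label` / `sub_label` / `description` strings are made up for the example, and plain `timeit` stands in for the `%%timeit` magic so that it runs as a script.

```python
import timeit

import torch
import torch.utils.benchmark as benchmark

# Hypothetical workload: a small matrix multiply on CPU.
x = torch.rand(64, 64)

# Approach 1: let the benchmark Timer handle num_threads for each run.
results = []
for num_threads in (1, 2):
    timer = benchmark.Timer(
        stmt="torch.mm(x, x)",
        globals={"x": x, "torch": torch},
        num_threads=num_threads,
        label="matmul",
        sub_label="64x64",
        description="torch.mm",
    )
    # blocked_autorange picks the number of runs and returns a Measurement.
    results.append(timer.blocked_autorange(min_run_time=0.2))

# Compare collects the measurements into one table.
benchmark.Compare(results).print()

# Approach 2: do it by hand. Set the thread count yourself and restore it after.
previous_threads = torch.get_num_threads()
torch.set_num_threads(1)
try:
    # On a GPU you would also need torch.cuda.synchronize() around the call.
    manual_seconds = timeit.timeit(lambda: torch.mm(x, x), number=100)
finally:
    torch.set_num_threads(previous_threads)

print(f"manual timing, 1 thread, 100 runs: {manual_seconds:.6f} s")
```

The manual version is shorter, but you have to remember the thread count and (on CUDA) the synchronization yourself, which is exactly what the first approach takes care of.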