News

the implementation achieves a speedup of 4.16× on one NVIDIA Tesla M2050 GPU compared to a 2.93 GHz six-core Intel Xeon X5670 CPU. In addition, it runs 2.41× faster on 256 compute nodes of Tianhe-1A ...