Nvidia announces Tesla GPUs built on the Kepler architecture
- 15 SMX units
- six 64-bit memory controllers
- TSMC 28nm
- 192 single-precision CUDA cores (K10) and 64 double-precision units (K20)
- 1 TFLOPS double precision
- SMX Streaming Multiprocessor —
- 3x the performance per watt of the Fermi SM
- one petaflop with 10 server racks.
- 4X of CUDA®.
- Dynamic Parallelism —
- enables GPU threads to dynamically spawn new threads, allowing the GPU to adapt to the data on the fly (see the Dynamic Parallelism sketch after this list).
- Hyper-Q —
- enables multiple CPU cores to simultaneously use the CUDA cores on a single Kepler GPU.
- ideal for cluster applications that use MPI (see the Hyper-Q sketch after this list).
- Grid Management Unit (GMU)
- manages and prioritizes grids to be executed on the GPU.
- can pause the dispatch of new grids and queue pending and suspended grids until they are ready to execute, providing the flexibility to enable powerful runtimes, such as Dynamic Parallelism.
- ensures both CPU‐ and GPU‐generated workloads are properly managed and dispatched.
- NVIDIA GPUDirect
- enables GPUs within a single computer, or GPUs in different servers located across a network, to directly exchange data without needing to go to CPU/system memory.
- The RDMA feature in GPUDirect allows third-party devices such as SSDs, NICs, and IB adapters to directly access memory on multiple GPUs within the same system, significantly decreasing the latency of MPI send and receive messages to/from GPU memory.
- It also reduces demands on system memory bandwidth and frees the GPU DMA engines for use by other CUDA tasks. Kepler GK110 also supports other GPUDirect features, including Peer-to-Peer and GPUDirect for Video (see the peer-to-peer sketch after this list).
- spec
- white paper
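Dynamic Parallelism sketch. A minimal example of a parent kernel launching child kernels from the device, assuming a compute capability 3.5+ GPU and compilation with relocatable device code (`nvcc -arch=sm_35 -rdc=true dynpar.cu -lcudadevrt`). The kernel names and the chunked-data layout are made up for illustration, not taken from NVIDIA's materials.

```cuda
#include <cstdio>

// Child kernel: processes one chunk of data.
__global__ void childKernel(float *data, int offset, int n)
{
    int i = offset + blockIdx.x * blockDim.x + threadIdx.x;
    if (i < offset + n)
        data[i] *= 2.0f;   // trivial per-element work
}

// Parent kernel: each thread inspects its chunk and, if the chunk is
// non-empty, launches a child grid sized to that chunk -- the GPU
// adapts its own work without a round trip to the CPU.
__global__ void parentKernel(float *data, const int *chunkSizes,
                             int numChunks, int chunkStride)
{
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    if (c < numChunks && chunkSizes[c] > 0) {
        int n = chunkSizes[c];
        int threads = 128;
        int blocks  = (n + threads - 1) / threads;
        childKernel<<<blocks, threads>>>(data, c * chunkStride, n);
    }
}

int main()
{
    const int numChunks = 4, chunkStride = 1024;
    int h_sizes[numChunks] = {1024, 512, 0, 256};  // hypothetical chunk sizes
    float *d_data;
    int *d_sizes;

    cudaMalloc(&d_data, numChunks * chunkStride * sizeof(float));
    cudaMalloc(&d_sizes, numChunks * sizeof(int));
    cudaMemset(d_data, 0, numChunks * chunkStride * sizeof(float));
    cudaMemcpy(d_sizes, h_sizes, sizeof(h_sizes), cudaMemcpyHostToDevice);

    parentKernel<<<1, numChunks>>>(d_data, d_sizes, numChunks, chunkStride);
    cudaDeviceSynchronize();
    printf("done: %s\n", cudaGetErrorString(cudaGetLastError()));

    cudaFree(d_data);
    cudaFree(d_sizes);
    return 0;
}
```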
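Hyper-Q sketch. A minimal example of the kind of workload Hyper-Q helps: several independent CUDA streams (or, in an MPI setting, several ranks) feed the same GPU, and Kepler's 32 hardware work queues keep them from serializing behind a single queue as they could on Fermi. The stream count and the kernel body are arbitrary choices for illustration.

```cuda
#include <cstdio>

// A small, independent kernel; each stream gets its own copy of the work.
__global__ void busyKernel(float *x, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        for (int k = 0; k < 1000; ++k)
            x[i] = x[i] * 0.999f + 1.0f;
}

int main()
{
    const int numStreams = 8, n = 1 << 16;
    cudaStream_t streams[numStreams];
    float *buffers[numStreams];

    for (int s = 0; s < numStreams; ++s) {
        cudaStreamCreate(&streams[s]);
        cudaMalloc(&buffers[s], n * sizeof(float));
        // Independent work submitted to separate streams; in an MPI setting
        // each rank could issue its own work to the same GPU in the same way.
        busyKernel<<<(n + 255) / 256, 256, 0, streams[s]>>>(buffers[s], n);
    }

    cudaDeviceSynchronize();
    printf("all streams finished: %s\n", cudaGetErrorString(cudaGetLastError()));

    for (int s = 0; s < numStreams; ++s) {
        cudaStreamDestroy(streams[s]);
        cudaFree(buffers[s]);
    }
    return 0;
}
```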
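Peer-to-peer sketch. A minimal example of the GPUDirect Peer-to-Peer path using standard CUDA runtime calls (`cudaDeviceEnablePeerAccess`, `cudaMemcpyPeer`); it assumes two GPUs in the same system that can address each other's memory. The RDMA path to NICs and IB adapters involves driver-level support beyond what fits in a short sketch.

```cuda
#include <cstdio>

int main()
{
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);
    if (deviceCount < 2) {
        printf("need at least two GPUs for a peer-to-peer copy\n");
        return 0;
    }

    // Check that GPUs 0 and 1 can access each other's memory.
    int canAccess01 = 0, canAccess10 = 0;
    cudaDeviceCanAccessPeer(&canAccess01, 0, 1);
    cudaDeviceCanAccessPeer(&canAccess10, 1, 0);
    if (!canAccess01 || !canAccess10) {
        printf("GPUs 0 and 1 cannot access each other's memory\n");
        return 0;
    }

    const size_t bytes = 1 << 20;
    float *buf0, *buf1;

    // Allocate a buffer on each GPU and enable bidirectional peer access.
    cudaSetDevice(0);
    cudaMalloc(&buf0, bytes);
    cudaDeviceEnablePeerAccess(1, 0);

    cudaSetDevice(1);
    cudaMalloc(&buf1, bytes);
    cudaDeviceEnablePeerAccess(0, 0);

    // Copy directly from GPU 0 memory to GPU 1 memory; with peer access
    // enabled the transfer avoids a staging copy through CPU/system memory.
    cudaMemcpyPeer(buf1, 1, buf0, 0, bytes);
    cudaDeviceSynchronize();
    printf("peer copy done: %s\n", cudaGetErrorString(cudaGetLastError()));

    cudaFree(buf1);
    cudaSetDevice(0);
    cudaFree(buf0);
    return 0;
}
```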