Programming/openCL & CUDA | 2011. 3. 2. 18:23
I haven't even tried 3.2 yet and 4.0 is already here. When will I ever catch up?! ㅠ.ㅠ

Release Highlights

Easier Application Porting

  • Share GPUs across multiple threads
  • Use all GPUs in the system concurrently from a single host thread
  • No-copy pinning of system memory, a faster alternative to cudaMallocHost()
  • C++ new/delete and support for virtual functions
  • Support for inline PTX assembly
  • Thrust library of templated performance primitives such as sort, reduce, etc.
  • NVIDIA Performance Primitives (NPP) library for image/video processing
  • Layered Textures for working with same size/format textures at larger sizes and higher performance
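To get a feel for the first three items, here is a rough sketch (my own, not from the release notes) of one host thread driving every GPU in the system while using cudaHostRegister() to pin an ordinary host buffer in place instead of allocating it with cudaMallocHost(). The kernel addOne, the helper addOneOnAllGpus and the CHECK macro are hypothetical names made up for illustration. I am also assuming a POSIX system (posix_memalign) and a 4 KB page size, since as far as I know the early releases wanted the registered range page-aligned.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with a message on any CUDA runtime error.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical kernel for illustration: adds 1.0f to every element.
__global__ void addOne(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

// Gives each GPU a chunk of perDevice floats, all issued from this one thread.
// Returns the number of elements that did not end up as 1.0f.
// Assumption: perDevice is a multiple of 1024 so each chunk is a whole 4 KB page.
int addOneOnAllGpus(int perDevice) {
    int deviceCount = 0;
    CHECK(cudaGetDeviceCount(&deviceCount));
    const int n = perDevice * deviceCount;
    const size_t chunkBytes = perDevice * sizeof(float);
    const size_t bytes = chunkBytes * deviceCount;

    // Ordinary page-aligned allocation (POSIX), pinned in place afterwards.
    float *host = NULL;
    if (posix_memalign(reinterpret_cast<void **>(&host), 4096, bytes) != 0)
        exit(EXIT_FAILURE);
    for (int i = 0; i < n; ++i) host[i] = 0.0f;
    // Portable flag: the range is treated as pinned by every device.
    CHECK(cudaHostRegister(host, bytes, cudaHostRegisterPortable));

    std::vector<float *> dev(deviceCount);
    std::vector<cudaStream_t> streams(deviceCount);
    for (int d = 0; d < deviceCount; ++d) {
        CHECK(cudaSetDevice(d));  // switch GPUs without leaving this thread
        CHECK(cudaMalloc(&dev[d], chunkBytes));
        CHECK(cudaStreamCreate(&streams[d]));
        float *chunk = host + static_cast<size_t>(d) * perDevice;
        // Async calls return immediately, so work on the GPUs can overlap.
        CHECK(cudaMemcpyAsync(dev[d], chunk, chunkBytes,
                              cudaMemcpyHostToDevice, streams[d]));
        addOne<<<(perDevice + 255) / 256, 256, 0, streams[d]>>>(dev[d], perDevice);
        CHECK(cudaGetLastError());
        CHECK(cudaMemcpyAsync(chunk, dev[d], chunkBytes,
                              cudaMemcpyDeviceToHost, streams[d]));
    }
    for (int d = 0; d < deviceCount; ++d) {
        CHECK(cudaSetDevice(d));
        CHECK(cudaStreamSynchronize(streams[d]));
        CHECK(cudaStreamDestroy(streams[d]));
        CHECK(cudaFree(dev[d]));
    }
    CHECK(cudaHostUnregister(host));

    int wrong = 0;
    for (int i = 0; i < n; ++i)
        if (host[i] != 1.0f) ++wrong;
    free(host);
    return wrong;
}

int main() {
    int wrong = addOneOnAllGpus(1 << 20);
    printf("mismatched elements: %d\n", wrong);
    return wrong == 0 ? 0 : 1;
}
```

Something like `nvcc sketch.cu -o sketch` should be enough to build it. Before 4.0 the usual approach was one host thread per GPU, so the cudaSetDevice() loop is the part that is actually new here.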

Faster Multi-GPU Programming

  • Unified Virtual Addressing
  • GPUDirect v2.0 support for Peer-to-Peer Communication
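A rough sketch of how these two items might look in code, assuming a machine with at least two GPUs. The helper copyBetweenGpus and the CHECK macro are hypothetical names for illustration only. When both devices report unified addressing, a plain cudaMemcpy() with cudaMemcpyDefault lets the runtime work out where each pointer lives; otherwise the sketch falls back to cudaMemcpyPeer(). Whether the copy really goes directly between the cards depends on the hardware, so peer access is only enabled when cudaDeviceCanAccessPeer() says it is possible.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with a message on any CUDA runtime error.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Copies n floats host -> GPU 0 -> GPU 1 -> host and reports whether the
// data survived the round trip. Assumes devices 0 and 1 both exist.
bool copyBetweenGpus(size_t n) {
    const size_t bytes = n * sizeof(float);
    std::vector<float> in(n), out(n, 0.0f);
    for (size_t i = 0; i < n; ++i) in[i] = static_cast<float>(i);

    float *buf0 = NULL;
    float *buf1 = NULL;
    CHECK(cudaSetDevice(0));
    CHECK(cudaMalloc(&buf0, bytes));
    CHECK(cudaMemcpy(buf0, in.data(), bytes, cudaMemcpyHostToDevice));
    CHECK(cudaSetDevice(1));
    CHECK(cudaMalloc(&buf1, bytes));

    // Ask whether device 1 can address device 0's memory directly.
    int canAccess = 0;
    CHECK(cudaDeviceCanAccessPeer(&canAccess, 1, 0));
    if (canAccess) {
        // Current device is 1; grant it access to device 0's allocations.
        CHECK(cudaDeviceEnablePeerAccess(0, 0));
    }

    cudaDeviceProp prop0, prop1;
    CHECK(cudaGetDeviceProperties(&prop0, 0));
    CHECK(cudaGetDeviceProperties(&prop1, 1));
    if (prop0.unifiedAddressing && prop1.unifiedAddressing) {
        // UVA: the runtime infers source and destination from the pointers.
        CHECK(cudaMemcpy(buf1, buf0, bytes, cudaMemcpyDefault));
    } else {
        // No UVA: name both devices explicitly.
        CHECK(cudaMemcpyPeer(buf1, 1, buf0, 0, bytes));
    }

    CHECK(cudaMemcpy(out.data(), buf1, bytes, cudaMemcpyDeviceToHost));

    if (canAccess) CHECK(cudaDeviceDisablePeerAccess(0));
    CHECK(cudaFree(buf1));
    CHECK(cudaSetDevice(0));
    CHECK(cudaFree(buf0));
    return in == out;
}

int main() {
    int deviceCount = 0;
    CHECK(cudaGetDeviceCount(&deviceCount));
    if (deviceCount < 2) {
        printf("This sketch needs at least two GPUs.\n");
        return 0;
    }
    bool ok = copyBetweenGpus(1 << 20);
    printf("round trip %s\n", ok ? "matched" : "did not match");
    return ok ? 0 : 1;
}
```

The nice part, at least on paper, is that the GPU-to-GPU step no longer needs a hand-written staging buffer in host memory.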

New & Improved Developer Tools

  • Automated Performance Analysis in Visual Profiler
  • C++ debugging in cuda-gdb
  • GPU binary disassembler for Fermi architecture (cuobjdump)

Please refer to the Release Notes and Getting Started Guides for more information.


[Link: http://developer.nvidia.com/object/cuda_4_0_RC_downloads.html]


Posted by 구차니