Fermi HW profiling support for CUDA C and OpenCL in Visual Profiler
C++ Class Inheritance and Template Inheritance support for increased programmer productivity
A new unified interoperability API for Direct3D and OpenGL, with support for:
OpenGL texture interop
Direct3D 11 interop support
CUDA Driver / Runtime Buffer Interoperability, which allows applications using the CUDA Driver API to also use libraries
implemented using the CUDA C Runtime such as CUFFT and CUBLAS.
CUBLAS now supports all BLAS1, 2, and 3 routines including those for single and double precision complex numbers
Up to 100x performance improvement while debugging applications with cuda-gdb
cuda-gdb hardware debugging support for applications that use the CUDA Driver API
cuda-gdb support for JIT-compiled kernels
New CUDA Memory Checker reports misalignment and out-of-bounds errors, available as a stand-alone utility and as a debugging
mode within cuda-gdb
CUDA Toolkit libraries are now versioned, enabling applications to require a specific version, support multiple versions
explicitly, etc.
CUDA C/C++ kernels are now compiled to standard ELF format
Support for device emulation mode has been packaged in a separate version of the CUDA C Runtime (CUDART), and is deprecated
in this release. Now that more sophisticated hardware debugging tools are available and more are on the way, NVIDIA will
be focusing on supporting these tools instead of the legacy device emulation functionality.
On Windows, use the new Parallel Nsight development environment for Visual Studio, with integrated GPU debugging and
profiling tools (formerly code-named "Nexus"). Please see www.nvidia.com/nsight for details.
On Linux, use cuda-gdb and cuda-memcheck, and check out the solutions from Allinea and TotalView that will be available
soon.
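As an illustration of the C++ class inheritance and template inheritance item in the list above, here is a minimal sketch of the kind of code this makes possible. The types `Scale`, `ScaleOffset`, and `Clamped` and the `HD` macro are hypothetical names made up for this example; they are not part of the toolkit. The `__CUDACC__` guard is an assumption about how one might share the same source between nvcc and a plain host compiler, and the sketch deliberately uses only non-virtual member functions.

```cpp
#include <cassert>

// Under nvcc, make the functions callable from host and device code;
// with a plain C++ compiler the qualifier expands to nothing.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Base class (hypothetical): multiplies its input by a factor.
struct Scale {
    float factor;
    HD explicit Scale(float f) : factor(f) {}
    HD float apply(float x) const { return factor * x; }
};

// Class inheritance: reuses Scale::apply and adds an offset.
struct ScaleOffset : Scale {
    float offset;
    HD ScaleOffset(float f, float o) : Scale(f), offset(o) {}
    HD float apply(float x) const { return Scale::apply(x) + offset; }
};

// Template inheritance: a class template that derives from its parameter
// and clamps whatever the wrapped operation returns to [lo, hi].
template <typename Op>
struct Clamped : Op {
    float lo, hi;
    HD Clamped(const Op& op, float l, float h) : Op(op), lo(l), hi(h) {}
    HD float apply(float x) const {
        float y = Op::apply(x);
        return y < lo ? lo : (y > hi ? hi : y);
    }
};

// Host-side usage check: 2*3+1 = 7 is inside the range, 2*100+1 is clamped.
inline void demo() {
    Clamped<ScaleOffset> op(ScaleOffset(2.0f, 1.0f), 0.0f, 10.0f);
    assert(op.apply(3.0f) == 7.0f);
    assert(op.apply(100.0f) == 10.0f);
}
```

Because every member carries the `HD` qualifier, the same `Clamped<ScaleOffset>` object could in principle be constructed inside a kernel when the file is built with nvcc; the host-only `demo()` is here just to show the expected results.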
Support for all the OpenCL features in the latest R195 production driver package:
Double Precision
Graphics Interoperability with OpenGL, Direct3D9, Direct3D10, and Direct3D11 for high performance visualization
Query for Compute Capability, so you can target optimizations for GPU architectures (cl_nv_device_attribute_query)
Ability to control compiler optimization settings via support for pragma unroll in OpenCL kernels and an extension that
allows programmers to set compiler flags (cl_nv_compiler_options)
OpenCL Images support, for better/faster image filtering
32-bit global and local atomics for fast, convenient data manipulation
Byte Addressable Stores, for faster video/image processing and compression algorithms
Support for the latest OpenCL spec revision 1.0.48 and latest official Khronos OpenCL headers as of 2010-02-17
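To make the "Query for Compute Capability" item above more concrete, the following sketch shows how an application might act on the two values it reads through `clGetDeviceInfo` with `CL_DEVICE_COMPUTE_CAPABILITY_MAJOR_NV` and `CL_DEVICE_COMPUTE_CAPABILITY_MINOR_NV`. The query itself is left out so the block stays self-contained. The struct, the helper functions, and the `KernelVariant` enum are hypothetical names invented for this example. The thresholds reflect two general facts: double precision requires compute capability 1.3 or higher, and Fermi devices report major revision 2.

```cpp
#include <cassert>

// Compute capability as reported by the device (both values are unsigned
// integers). Hypothetical holder type; not part of any NVIDIA header.
struct ComputeCapability {
    unsigned cc_major;
    unsigned cc_minor;
};

// True if cc is at least want_major.want_minor.
inline bool at_least(ComputeCapability cc, unsigned want_major, unsigned want_minor) {
    return cc.cc_major > want_major ||
           (cc.cc_major == want_major && cc.cc_minor >= want_minor);
}

// Double precision is available from compute capability 1.3 upward.
inline bool supports_double(ComputeCapability cc) { return at_least(cc, 1, 3); }

// Fermi-class devices report major revision 2.
inline bool is_fermi_class(ComputeCapability cc) { return cc.cc_major == 2; }

// Hypothetical set of kernel builds an application might ship.
enum KernelVariant { kGeneric, kDoublePrecision, kFermiTuned };

// Pick the most specific variant the device can run.
inline KernelVariant pick_variant(ComputeCapability cc) {
    if (is_fermi_class(cc)) return kFermiTuned;
    if (supports_double(cc)) return kDoublePrecision;
    return kGeneric;
}

// Small self-check covering one device from each bucket.
inline void demo_pick_variant() {
    ComputeCapability cc11 = {1, 1};
    ComputeCapability cc13 = {1, 3};
    ComputeCapability cc20 = {2, 0};
    assert(pick_variant(cc11) == kGeneric);
    assert(pick_variant(cc13) == kDoublePrecision);
    assert(pick_variant(cc20) == kFermiTuned);
}
```

Keeping the decision logic separate from the device query makes it easy to test without a GPU; a real application would fill `ComputeCapability` from the extension query and fall back to `kGeneric` when the extension is not reported.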
For more information on general purpose computing features of the Fermi architecture, see: www.nvidia.com/fermi.
Please review the release notes for additional important information about this release.
Note: The developer driver packages below provide baseline support for the widest number of NVIDIA products in the smallest
number of installers. More recent production driver packages for end users are available at www.nvidia.com/drivers.