Deep Learning Frameworks
Deep learning frameworks offer building blocks for designing, training and validating deep neural networks, through a high level programming interface. Widely used deep learning frameworks such as MXNet, PyTorch, TensorFlow and others rely on GPU-accelerated libraries such as cuDNN, NCCL and DALI to deliver high-performance multi-GPU accelerated training.
Developers, researchers and data scientists can get easy access to NVIDIA optimized deep learning framework containers with deep learning examples, that are performance tuned and tested for NVIDIA GPUs. This eliminates the need to manage packages and dependencies or build deep learning frameworks from source. Visit NVIDIA NGC to learn more and get started.
Following is a list of popular deep learning frameworks we support, including learning resources to get started.
PyTorch is a Python package that provides two high-level features:
- Tensor computation (like numpy) with strong GPU acceleration
- Deep Neural Networks built on a tape-based autograd system
You can reuse your favorite Python packages such as numpy, scipy and Cython to extend PyTorch when needed.
MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to mix the flavors of symbolic programming and imperative programming to maximize efficiency and productivity.
In its core is a dynamic dependency scheduler that automatically parallelizes both symbolic and imperative operations on the fly. A graph optimization layer on top of that makes symbolic execution fast and memory efficient. The library is portable and lightweight, and it scales to multiple GPUs and multiple machines.
TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture lets you deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code. For visualizing TensorFlow results, TensorFlow offers TensorBoard, suite of visualization tools.
For high-performance inference deployment for TensorFlow trained models, you can:
- Use TensorFlow-TensorRT integration to optimize models within TensorFlow and deploy with TensorFlow
- Export TensorFlow models and import, optimize and deploy with NVIDIA TensorRT's built in TensorFlow model importer.
- Deep Learning Documentation: TensorFlow User Guide
- Deep Learning Documentation: TensorFlow Best Practices
- TensorFlow Getting Started Guide
MATLAB makes deep learning easy for engineers, scientists and domain experts. With tools and functions for managing and labeling large data sets, MATLAB also offers specialized toolboxes for working with machine learning, neural networks, computer vision, and automated driving. With just a few lines of code, MATLAB lets you create and visualize models, and deploy models to servers and embedded devices without being an expert. MATLAB also enables users to generate high-performance CUDA code for deep learning and vision applications automatically from MATLAB code.
For high-performance inference deployment of MATLAB trained models, use MATLAB GPU Coder to automatically generate TensorRT optimized inference engines from cloud to embedded deployment environments.
Caffe is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors. NVIDIA Caffe, also known as NVCaffe, is an NVIDIA-maintained fork of BVLC Caffe tuned for NVIDIA GPUs, particularly in multi-GPU configurations.
For high-performance inference deployment for Caffe trained models, import, optimize and deploy with NVIDIA TensorRT's built-in Caffe model importer.
- Deep Learning Documentation: NVCaffe User Guide
- Deep Learning Documentation: NVCaffe Best Practices
- Blog:Deep Learning for Computer Vision with Caffe and cuDNN
Chainer is a Python-based deep learning framework aiming at flexibility. It provides automatic differentiation APIs based on the define-by-run approach, also known as dynamic computational graphs, as well as object-oriented high-level APIs to build and train neural networks. It supports CUDA and cuDNN using CuPy for high performance training and inference.
PaddlePaddle provides an intuitive and flexible interface for loading data and specifying model structures. It supports CNN, RNN, multiple variants and configures complicated deep models easily.
It also provides extremely optimized operations, memory recycling, and network communication. PaddlePaddle makes it easy to scale heterogeneous computing resources and storage to accelerate the training process.