TensorFlow is a wonderful tool for Differentiable Neural Computing (DNC) and has enjoyed great success and market share in the deep learning arena. We usually use it from Python as a prebuilt package installed from the Anaconda or pip repositories. What we miss that way is the chance to enable optimizations that make better use of our hardware, as well as the option to do some lower-level computing in C/C++.
The purpose of this post is to be a guide for compiling TensorFlow r1.4 on Linux with CUDA GPU support and the high-performance AVX and SSE CPU extensions.
This guide is largely based on the official TensorFlow Guide and this snippet, with some bug fixes from my side.
1. Install Python dependencies:
sudo apt-get install python-numpy python-dev python-pip python-wheel python-setuptools
2. Install GPU prerequisites:
- CUDA developer and drivers
- CUDNN developer and runtime
- CUBLAS
Make sure the cuDNN libs are copied into the cuda/lib64 directory, usually found under /usr/local/cuda.
sudo apt-get install libcupti-dev
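A quick way to confirm that the cuDNN libraries ended up in the right place is to look for them under the CUDA directory. The sketch below assumes the default /usr/local/cuda location and the usual libcudnn.so* file naming; the check_cudnn helper is a hypothetical name, not part of any NVIDIA tooling.

```shell
# Hypothetical helper: report whether cuDNN shared libraries are present
# under a CUDA install directory (assumed default: /usr/local/cuda).
check_cudnn() {
    cuda_home="${1:-/usr/local/cuda}"
    if ls "$cuda_home"/lib64/libcudnn.so* >/dev/null 2>&1; then
        echo "cuDNN found in $cuda_home/lib64"
    else
        echo "cuDNN missing from $cuda_home/lib64"
        return 1
    fi
}

# Check the default location; "|| true" keeps the script going either way.
check_cudnn /usr/local/cuda || true
```

If CUDA lives somewhere else on your machine, pass that directory as the first argument instead.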
3. Install Bazel, Google's custom build tool:
sudo apt-get install openjdk-8-jdk
echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
sudo apt-get update && sudo apt-get install bazel
sudo apt-get upgrade bazel
4. Configure TensorFlow:
git clone https://github.com/tensorflow/tensorflow
cd tensorflow
git checkout r1.4
## don't use clang for nvcc backend [https://github.com/tensorflow/tensorflow/issues/11807]
## when asked for the path to the gcc compiler, make sure it points to a version <= 5
./configure
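Since ./configure asks for a gcc path and the note above calls for a version <= 5, it may help to check the compiler before answering. This is a minimal sketch: gcc_major and gcc_ok_for_nvcc are hypothetical helper names, and it assumes the gcc on your PATH is the one you intend to give to ./configure.

```shell
# Extract the major version from a string such as "5.4.0" or "7".
gcc_major() {
    echo "${1%%.*}"
}

# Succeeds when the given gcc version is 5 or older.
gcc_ok_for_nvcc() {
    [ "$(gcc_major "$1")" -le 5 ]
}

if command -v gcc >/dev/null 2>&1; then
    version="$(gcc -dumpversion)"
    if gcc_ok_for_nvcc "$version"; then
        echo "gcc $version should work"
    else
        echo "gcc $version is too new; point ./configure at gcc 5 or older"
    fi
fi
```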
5. Compile with the SSE and AVX flags and install using pip:
# set locale to en_US (locale names are case sensitive) [https://github.com/tensorflow/tensorflow/issues/36]
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.2 --incompatible_load_argument_is_label=false --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
sudo pip install /tmp/tensorflow_pkg/tensorflow-1.4.1*
If you get a nasm broken link error, edit tensorflow/tensorflow/workspace.bzl and add an extra link:
urls = [
"https://mirror.bazel.build/www.nasm.us/pub/nasm/releasebuilds/2.12.02/nasm-2.12.02.tar.bz2",
"http://www.nasm.us/pub/nasm/releasebuilds/2.12.02/nasm-2.12.02.tar.bz2",
"http://pkgs.fedoraproject.org/repo/pkgs/nasm/nasm-2.12.02.tar.bz2/d15843c3fb7db39af80571ee27ec6fad/nasm-2.12.02.tar.bz2",
]
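The -mavx, -mavx2, -mfma and -msse4.2 flags in the build command only make sense if the CPU actually implements those extensions, otherwise the resulting binary can crash with an illegal instruction. As a rough check on Linux, the sketch below looks for the matching entries (avx, avx2, fma, sse4_2) in /proc/cpuinfo; has_cpu_flag is a hypothetical helper name.

```shell
# Succeeds when the flag (e.g. "avx2") appears as a whole word in the flags text.
has_cpu_flag() {
    flag="$1"
    flags_text="$2"
    case " $flags_text " in
        *" $flag "*) return 0 ;;
        *) return 1 ;;
    esac
}

if [ -r /proc/cpuinfo ]; then
    # Take the flags of the first listed core.
    cpu_flags="$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2 || true)"
    for f in avx avx2 fma sse4_2; do
        if has_cpu_flag "$f" "$cpu_flags"; then
            echo "$f: supported"
        else
            echo "$f: NOT supported, drop the matching --copt"
        fi
    done
fi
```

Matching on whole words matters here, because a plain substring search for avx would also succeed on a CPU that only lists avx2 or avx512f.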
6. Test that everything works:
cd ~/
python
>>> import tensorflow as tf
>>> session = tf.InteractiveSession()
>>> init = tf.global_variables_initializer()
## At this point, if you get a malloc.c assertion failure, it is due to a wrong CUDA configuration (i.e. not using the runtime version)
At this point there should not be any CPU warning (the "Your CPU supports instructions that this TensorFlow binary was not compiled to use" message) and the GPU should be initialized.
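For a non-interactive version of the same check, one option is to ask TensorFlow for its local devices and count the GPUs. This is only a sketch: it assumes that python on your PATH is the interpreter you installed the wheel into, and count_gpus is a hypothetical helper name. The device_lib module is an internal TensorFlow path, so it may move in other releases.

```shell
# Count how many lines of the given text are exactly "GPU".
count_gpus() {
    printf '%s\n' "$1" | grep -c -x 'GPU' || true
}

if python -c 'import tensorflow' >/dev/null 2>&1; then
    # Print one device type per line (e.g. CPU, GPU); TensorFlow logs go to stderr.
    types="$(python -c 'from tensorflow.python.client import device_lib
for d in device_lib.list_local_devices():
    print(d.device_type)' 2>/dev/null)"
    echo "GPUs visible to TensorFlow: $(count_gpus "$types")"
else
    echo "tensorflow is not importable from this Python"
fi
```

Run it from outside the source tree (hence the cd ~/ above), since importing tensorflow from inside the cloned repository fails.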