Differences between revisions 14 and 44 (spanning 30 versions)

Setup for deep learning workstation

This page covers the steps for setting up a machine primarily intended for deep learning analyses. It assumes Ubuntu has already been installed.

Ubuntu installation notes:

Should probably tweak the BIOS to match that of Agnew/Calculon/Lrrr/Ndnd if we set up any more of these. Just see one of those guys for the good options.
Can/should tick on the options for auto-installing updates and installing third-party software while installing Ubuntu
For some reason, on the rack-mount machines, when it tells you to hit enter to restart after installing, hitting enter doesn't do anything. You just have to power off the machine.
Another weird thing: On the rack-mount machines with two graphics cards, you have to switch back and forth between the two graphics cards during Ubuntu installation vs running vs using the BIOS or whatnot. It's weird. Just keep going back and forth... one or the other will work for any given scenario.

Other tips/notes on running analyses: KerasTips

Initial setup

Early on, presumably right after installation of OS, remember to update all packages:

sudo apt-get update
sudo apt-get upgrade

Package installation

These two are definite necessities. In particular, need to install openssh-server before basically anything else because otherwise we can't get SSH access.

sudo apt-get install openssh-server
sudo apt-get install tightvncserver

The following may not be necessary anymore -- it was for our old VNC setup. But it shouldn't hurt to install these packages anyway, just in case we want to use something like the old setup again.

sudo apt-get install ubuntu-desktop gnome-panel gnome-settings-daemon metacity nautilus gnome-terminal

OLD VNC server configuration

#!/bin/sh
[ -x /etc/vnc/startup ] && exec /etc/vnc/startup
[ -r $HOME/.Xresources ] && xrdb $HOME/.Xresources
xsetroot -solid grey
vncconfig -iconic &
x-terminal-emulator -geometry 80x24+10+10 -ls -title "$VNCDESKTOP Desktop" &
x-window-manager &
gnome-panel &
gnome-settings-daemon &
metacity &
nautilus

NEW VNC server configuration

Steps roughly follow https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-vnc-on-ubuntu-16-04. Exact instructions below:

Install the following packages:

sudo apt-get install xfce4 xfce4-goodies
sudo apt-get install autocutsel

Next, start up VNC:

vncserver :yourvncnumber

It will prompt you to set and confirm a password; do so. Then end the session:

vncserver -kill :yourvncnumber

This creates ~/.vnc/xstartup.

Either edit ~/.vnc/xstartup, or delete it and make a new file. If you make a new file, also make it executable:

chmod 755 ~/.vnc/xstartup

The new contents of ~/.vnc/xstartup should be:

#!/bin/bash
xrdb $HOME/.Xresources
startxfce4 &
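
The delete-and-recreate route above can be scripted. The following is a minimal sketch, not part of TightVNC itself: write_xstartup is a hypothetical helper name, and the default target path is simply the file named in the steps above.

```shell
# Sketch: recreate the Xfce xstartup file and make it executable.
# write_xstartup is a hypothetical helper name chosen for this example.
write_xstartup() {
    local target="${1:-$HOME/.vnc/xstartup}"
    mkdir -p "$(dirname "$target")"
    # The quoted 'EOF' keeps $HOME literal, so it is expanded when VNC
    # runs the file rather than when this function writes it.
    cat > "$target" <<'EOF'
#!/bin/bash
xrdb $HOME/.Xresources
startxfce4 &
EOF
    # Same permissions as the chmod step above.
    chmod 755 "$target"
}
```

Calling write_xstartup with no arguments overwrites ~/.vnc/xstartup, so it is best run after the first vncserver session has been killed.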

After you've edited (or deleted/recreated) xstartup, start a new VNC desktop:

vncserver :yourvncnumber -geometry 1280x800

When you open the VNC viewer, you might get a "Welcome to first start" message; select "Use default config". (You may also get an error message saying Ubuntu had a problem but it doesn't appear to cause issues.)

That should be the basic VNC setup. Other convenience functions/packages/etc below:

Adding users

sudo adduser newusername
sudo usermod -aG sudo newusername

Install additional packages

Sublime Text
FileZilla

Enable copy/paste on VNC

Allows copy/paste between VNC windows and your computer. Run this at the beginning of every VNC session; you should only need to do it once per session, i.e. again only after you kill your VNC session, Agnew/Calculon/etc. restart, and so on.

autocutsel -fork

Enable the Tab key

Reference: https://www.starnet.com/xwin32kb/tab-key-not-working-when-using-xfce-desktop/

Open the Xfce Application Menu > Settings > Window Manager
Click on the Keyboard Tab
Clear the "Switch window for same application" setting

CUDA/CUDNN/Keras/etc. setup

Install and run Anaconda

First, download the Anaconda installer from their website. (Just Google it.) We want Linux version, x86, 64-bit, Python 3.6 edition. Then:

sudo bash [name of anaconda .sh installer file]
# when prompted, install into: /opt/anaconda3

Next, do the CUDA setup:

Download the CUDA installer from NVIDIA (or just copy it from Agnew/Calculon/etc.). For reference, the version we're running on Agnew/Calculon/Lrrr/Ndnd as of June 2017 is 8.0.44. Before we can install it, though, we need to shut down the display manager and blacklist Nouveau, as described on the pages linked below. A short summary of what we actually have to do follows the links.

http://askubuntu.com/questions/788323/change-runlevel-on-16-04

http://askubuntu.com/questions/481414/install-nvidia-driver-instead-nouveau (top solution)

(First askubuntu page:) To disable starting up in graphical mode:

sudo systemctl isolate multi-user.target
sudo systemctl enable multi-user.target
sudo systemctl set-default multi-user.target

(Second askubuntu page:) Now keep nouveau from running by editing the blacklist:

sudo nano /etc/modprobe.d/blacklist.conf

Add the following lines to that blacklist file (see Agnew/Calculon's blacklist files if you want to confirm you got it right)

blacklist amd76x_edac #this might not be required for x86 32 bit users.
blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist nvidiafb
blacklist rivatv
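
If you would rather not edit the file by hand, the entries above can be appended by a small script. This is a sketch under assumptions: add_blacklist_entries is a hypothetical helper name, and it only skips an entry when an identical line is already present (a line with a trailing comment counts as different).

```shell
# Sketch: append the blacklist entries listed above, skipping any that
# are already present as an exact line.
# add_blacklist_entries is a hypothetical helper name.
add_blacklist_entries() {
    local conf="${1:-/etc/modprobe.d/blacklist.conf}"
    local module
    touch "$conf"
    for module in amd76x_edac vga16fb nouveau rivafb nvidiafb rivatv; do
        # -x matches whole lines only, -F treats the pattern as a fixed string
        if ! grep -qxF "blacklist $module" "$conf"; then
            printf 'blacklist %s\n' "$module" >> "$conf"
        fi
    done
}
```

Because the default target is owned by root, call the function from a root shell (for example after sudo -s). Running it twice does not duplicate entries.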

Now the following is probably not necessary (as there shouldn't be any proprietary Nvidia drivers on the system yet) but not a bad idea to run anyway:

sudo apt-get remove --purge 'nvidia*'

Now, restart the system. When restarted, you should be able to run the CUDA installer.

Running the CUDA installer

In most cases, you can just run the cuda_8.0.44_linux.run (or whatever version) installer file and accept most of the defaults. However, note the following exception:

GTX 1080Ti GPU (current as of June 2017): None of the CUDA installers currently have the right drivers for this card. So when you install CUDA, do not let it install the GPU driver! Instead, do everything else normally but don't install any driver. Then download the current driver for a 1080Ti card from the Nvidia website. (Lrrr is using NVIDIA-Linux-x86_64-381.22.run as of June 2017.) Install the driver. If it says there is already a driver installed (e.g., from a past failed CUDA installation attempt) and asks to overwrite the old driver, allow it to overwrite! Otherwise, CUDA installation and everything following it should be the same as written below.

Detour over; back to CUDA installation. When asked if you want to install samples, say yes and put them in /opt/cuda_samples.

CUDA should now be installed. Next up is CUDNN -- need to download that from Nvidia developer program or just get from Agnew/Calculon/etc. We are currently using version 5.1 on all machines (even Lrrr, with the weird driver) as of June 2017.

Extract the CUDNN archive (e.g. cudnn-8.0-linux-x64-v5.1.tar). This should yield a cuda directory with lib64 and include subdirectories. Copy the files in each of those to the corresponding /usr/local/cuda subdirectories (requires sudo), e.g. sudo cp lib64/* /usr/local/cuda/lib64/ and sudo cp include/* /usr/local/cuda/include/ (assuming you are already in the cuda directory).
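
The copy step can be wrapped in one function so that both subdirectories are always handled together. A minimal sketch, assuming the archive has already been extracted: install_cudnn is a hypothetical helper name, and the -P flag (keep symbolic links as links rather than copying their targets) is our addition, not something the original steps specify.

```shell
# Sketch: copy the extracted CUDNN headers and libraries into the CUDA tree.
# install_cudnn is a hypothetical helper name.
install_cudnn() {
    local src="${1:-.}"                  # the extracted "cuda" directory
    local dest="${2:-/usr/local/cuda}"   # CUDA installation root
    mkdir -p "$dest/include" "$dest/lib64"
    # -P copies symbolic links as links instead of following them
    cp -P "$src"/include/* "$dest/include/"
    cp -P "$src"/lib64/* "$dest/lib64/"
}
```

Writing into /usr/local/cuda needs root, so call it from a root shell, e.g. install_cudnn ./cuda after sudo -s.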

Now we need to put CUDA in the path and set up its environment variable(s) -- should just need to add the following to each user's .bashrc file (and exit shell / re-enter shell to take effect):

export PATH="/usr/local/cuda/bin:$PATH"
export CUDA_ROOT=/usr/local/cuda

Also, it seems you need to run sudo ldconfig /usr/local/cuda/lib64 at some point after installing all of this. ldconfig rebuilds the dynamic linker's cache of shared libraries, so this makes the system aware of the CUDA and CUDNN libraries. We seem to need to rerun it periodically, but it's not clear when (maybe after each restart?).
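
A likely explanation for having to repeat this, which we have not verified on these machines: a directory passed on the ldconfig command line is only included in that one cache rebuild, so the next plain ldconfig run (for example one triggered by a package upgrade) drops it again. Listing the directory in a file under /etc/ld.so.conf.d should make it stick. The sketch below assumes that explanation is right; register_lib_dir and the file name cuda.conf are hypothetical choices for this example.

```shell
# Sketch: record a library directory in an ld.so.conf.d file so that
# every later ldconfig run picks it up.
# register_lib_dir and the cuda.conf file name are hypothetical.
register_lib_dir() {
    local lib_dir="${1:-/usr/local/cuda/lib64}"
    local conf="${2:-/etc/ld.so.conf.d/cuda.conf}"
    touch "$conf"
    # Only append the directory if it is not already listed.
    grep -qxF "$lib_dir" "$conf" || printf '%s\n' "$lib_dir" >> "$conf"
}
```

Call it from a root shell, then run ldconfig once with no arguments to rebuild the cache. ldconfig -p | grep cudnn should then list the CUDNN library.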

Installing Theano and Keras (and Git, which apparently isn't installed by default)

We are currently (June 2017) using old-ish versions of Keras and Theano for compatibility reasons. Use the commands below to install the right versions. Note the full path to pip is necessary even if you have Python 3 in your path, because it isn't in the super-user's path by default.

sudo /opt/anaconda3/bin/pip install theano==0.9.0
sudo /opt/anaconda3/bin/pip install keras==1.2
sudo apt-get install git

You'll need to create a .keras folder in your home directory and put the following file, named keras.json, inside it. Or just copy the .keras folder from another computer.

{
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "theano",
    "image_dim_ordering": "tf"
}

Critical things that are changed from the defaults are backend (default tensorflow, needs to be theano) and image_dim_ordering (default th, should be tf)
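
For setting up several user accounts, the folder and file can be created in one step. A minimal sketch: write_keras_config is a hypothetical helper name, and the JSON it writes is exactly the file shown above.

```shell
# Sketch: create the .keras folder and write keras.json with the
# settings shown above. write_keras_config is a hypothetical helper name.
write_keras_config() {
    local keras_dir="${1:-$HOME/.keras}"
    mkdir -p "$keras_dir"
    cat > "$keras_dir/keras.json" <<'EOF'
{
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "theano",
    "image_dim_ordering": "tf"
}
EOF
}
```

Run write_keras_config with no arguments as the user who will run the analyses. It overwrites any existing keras.json in that folder.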

You'll also need to make a .theanorc file in your home directory (or copy it from another system). That file should contain the following:

[global]
floatX = float32
device = gpu0

[lib]
cnmem = 1

[dnn]
enabled = True

(Note that if you have your GPUs configured in a different order, you may want to change gpu0 to gpu1, or even to cpu if you want to live life in the slow lane.)
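
Since the device is the one setting that differs between setups, a small generator that takes it as a parameter can be handy. This is only a sketch: write_theanorc is a hypothetical helper name, and everything except the device line is copied from the file above.

```shell
# Sketch: write .theanorc with a selectable device (gpu0, gpu1, cpu, ...).
# write_theanorc is a hypothetical helper name.
write_theanorc() {
    local device="${1:-gpu0}"
    local target="${2:-$HOME/.theanorc}"
    # Unquoted EOF so that $device is substituted into the file.
    cat > "$target" <<EOF
[global]
floatX = float32
device = $device

[lib]
cnmem = 1

[dnn]
enabled = True
EOF
}
```

For example, write_theanorc gpu1 targets the second GPU, and write_theanorc with no arguments reproduces the file shown above.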

At this point, you will hopefully be ready to try a sample analysis. If you feel up to it, maybe grab some Keras examples and try to run them! Best way, since we are using old code, is to steal an old Keras testing installation from Agnew/Calculon/Lrrr/Ndnd (e.g., the 'testing' folder in Matt's home directory). Go into the keras directory, then the examples directory, and try something like python mnist_mlp.py and hopefully it will run on the chosen GPU!

Mount Farnsworth

Only needs to be done once. Will only unmount if we do so explicitly or if Agnew/Calculon/Lrrr/Ndnd gets rebooted (or if their network connection dies)

Install the cifs-utils package and create the mount point (eeg_data_analysis, or whatever the share is named):

sudo apt-get install cifs-utils
sudo mkdir /mnt/eeg_data_analysis

Then mount the share, replacing matt with your Farnsworth username:

sudo mount -t cifs -o username=matt //farnsworth/eeg_data/analysis /mnt/eeg_data_analysis/
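
Because the share silently disappears after a reboot or network drop, it can help to check whether it is still mounted before starting an analysis. A minimal sketch: is_mounted is a hypothetical helper name, and it assumes the kernel's mount table at /proc/mounts, where the second field of each line is the mount point.

```shell
# Sketch: succeed (exit status 0) if the given directory is currently
# a mount point. is_mounted is a hypothetical helper name.
is_mounted() {
    local mount_point="${1%/}"        # strip one trailing slash, if any
    local table="${2:-/proc/mounts}"  # mount table to inspect
    # Field 2 of each line in the mount table is the mount point.
    awk -v mp="$mount_point" '$2 == mp { found = 1 } END { exit !found }' "$table"
}
```

For example: is_mounted /mnt/eeg_data_analysis || echo "Farnsworth is not mounted; rerun the mount command"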

Start VNC session

Enter in Terminal:

ssh yourusername@agnew.local    # or calculon.local, etc.
vncserver :yourvncnumber -geometry 1280x800    # or whatever geometry you prefer

Enter in VNC:

agnew.local:yourvncnumber    (or calculon.local:yourvncnumber, etc.)

End VNC session

vncserver -kill :yourvncnumber

-  ⇤ ← Revision 14 as of 2017-01-31 22:29:32 → 
  Size: 2463
  Editor: ChengLim
  Comment:
+   ← Revision 44 as of 2017-06-22 03:26:43 → ⇥
  Size: 11034
  Editor: MattJohnson
  Comment:
-Deletions are marked like this.
+Additions are marked like this.
 Line 8:
+Ubuntu installation notes:

 * Should probably tweak the BIOS to match that of Agnew/Calculon/Lrrr/Ndnd if we set up any more of these. Just see one of those guys for the good options.
 * Can/should tick on the options for auto-installing updates and installing third-party software while installing Ubuntu
 * For some reason, on the rack-mount machines, when it tells you to hit enter to restart after installing, hitting enter doesn't do anything. You just have to power off the machine.
 * Another weird thing: On the rack-mount machines with two graphics cards, you have to switch back and forth between the two graphics cards during Ubuntu installation vs running vs using the BIOS or whatnot. It's weird. Just keep going back and forth... one or the other will work for any given scenario.
-Line 11:
+Line 18:
+Early on, presumably right after installation of OS, remember to update all packages:

{{{
sudo apt-get update
sudo apt-get upgrade
}}}
-Line 13:
+Line 28:
-{{{
openssh-server
+These two are definite necessities. In particular, need to install openssh-server before basically anything else because otherwise we can't get SSH access.
{{{
sudo apt-get openssh-server
-Line 16:
+Line 32:
+}}}

The following may not be necessary anymore -- it was for our old VNC setup. But it shouldn't hurt to install these packages anyway, just in case we want to use something like the old setup again.
{{{
-Line 18:
+Line 38:
-'''VNC server configuration'''

{{{
# ! /bin/sh [no spaces]
+'''OLD VNC server configuration'''

{{{
 #!/bin/sh
-Line 27:
+Line 47:
+'''NEW VNC server configuration'''

Steps roughly follow [[https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-vnc-on-ubuntu-16-04]]. Exact instructions below:

Install the following packages:
{{{
sudo apt-get install xfce4 xfce4-goodies
sudo apt-get install autocutsel
}}}


Next, start up VNC:
{{{
vncserver:yourvncnumber
}}}
It will prompt you to set and confirm a password; do so. 
Then end the session:
{{{
vncserver -kill :yourvncnumber
}}}

This creates `~/.vnc/xstartup`.

Either edit `~/.vnc/xstartup` , or delete it and make a new file. 
IF MAKING A NEW FILE, enter this command as well: 
{{{
chmod 755 ~/.vnc/xstartup
}}}
The new contents of `~/.vnc/xstartup` should be:
{{{
 #!/bin/bash
xrdb $HOME/.Xresources
startxfce4 &
}}}


After you've edited (or deleted/recreated) `xstartup`, start a new VNC desktop:
{{{
vncserver :yourvncnumber -geometry 1280x800
}}}

When you open the VNC viewer, you might get a "Welcome to first start" message; select "Use default config". 
(You may also get an error message saying Ubuntu had a problem but it doesn't appear to cause issues.)

That should be the basic VNC setup. Other convenience functions/packages/etc below:
-Line 33:
+Line 99:
-== Keras setup ==
Install and run Anaconda:

{{{
ls /opt/anaconda3
sudo bash [name of anaconda .sh file]

pip install theano
pip install keras
apt-get install git
}}}
Edit `keras.json` (in home folder)

Change `backend:` from `tensorflow` -> `theano`

Change `image_dim_ordering:` from `tf` -> `th`

Install CUDA 8 and CUDNN:
+'''Install additional packages'''
 * Sublime Text
 * FileZilla

'''Enable copy/paste on VNC'''

Allows copy/paste between VNC windows and your computer. This has to be done at the beginning of every VNC session (so you should only need to do it once, unless you kill your VNC session, Agnew/Calculon/etc restart, etc).
{{{
run autocutsel -fork
}}}


'''Enable the Tab key'''
[[https://www.starnet.com/xwin32kb/tab-key-not-working-when-using-xfce-desktop/]]
 * Open the Xfce Application Menu > Settings > Window Manager
 * Click on the Keyboard Tab
 * Clear the "Switch window for same application" setting

----

----

----

----


== CUDA/CUDNN/Keras/etc. setup ==

'''Install and run Anaconda'''

First, download the Anaconda installer from their website. (Just Google it.) We want Linux version, x86, 64-bit, Python 3.6 edition. Then:

{{{
sudo bash [name of anaconda .sh installer file]
when prompted, install into: /opt/anaconda3
}}}

'''Next, do the CUDA setup:'''

Download CUDA installer from NVidia (or actually, just get from Agnew/Calculon/etc.) For reference, the version we're running on Agnew/Calculon/Lrrr/Ndnd as of June 2017 is 8.0.44. Before we can actually install it though, we need to follow the following pages' instructions for shutting down display manager and blacklisting Nouveau. The links follow immediately, but see below them for the short summary of what we actually have to do.
-Line 56:
+Line 146:
-Add CUDA directory `cuda/bin` to path

Copy `cudnn.h` to cuda include directory

Copy shared libraries to cuda library

Make `.theanorc` in home directory

Set CUDA/root environment variable

Change ldconfig in cuda lib directory: `sudo ldconfig /usr/local/cuda/lib64` ''(unclear how often we have to do this -- each restart? Per user?)''
+'''(First askubuntu page:) To disable starting up in graphical mode:'''

{{{
sudo systemctl isolate multi-user.target
sudo systemctl enable multi-user.target
sudo systemctl set-default multi-user.target
}}}

'''(Second askubuntu page:) Now keep nouveau from running by editing the blacklist:'''

{{{
sudo nano /etc/modprobe.d/blacklist.conf
}}}

Add the following lines to that blacklist file (see Agnew/Calculon's blacklist files if you want to confirm you got it right)

{{{
blacklist amd76x_edac #this might not be required for x86 32 bit users.
blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist nvidiafb
blacklist rivatv
}}}

Now the following is probably not necessary (as there shouldn't be any proprietary Nvidia drivers on the system yet) but not a bad idea to run anyway:

{{{
sudo apt-get remove --purge nvidia*
}}}

Now, restart the system. When restarted, you should be able to run the CUDA installer.

'''Running the CUDA installer'''

In most cases, you can just run the {{{cuda_8.0.44_linux.run}}} (or whatever version) installer file and accept most of the defaults. However, note the following exception:

'''''GTX 1080Ti GPU (current as of June 2017''''' None of the CUDA installers currently have the right drivers for this card. So when you install CUDA, do '''not''' let it install the GPU driver! Instead, do everything else normally but don't install any driver. Then download the current driver for a 1080Ti card from the Nvidia website. (Lrrr is using NVIDIA-Linux-x86_64-381.22.run as of June 2017.) Install the driver -- if it says there is already a driver installed (e.g., maybe from a past failed CUDA installation attempt or something), and asks to overwrite the old driver, allow it to overwrite! Otherwise, CUDA installation and everything following it should be the same as written below.

Detour over; back to CUDA installation. When asked if you want to install samples, say yes and put them in /opt/cuda_samples.

CUDA should now be installed. Next up is CUDNN -- need to download that from Nvidia developer program or just get from Agnew/Calculon/etc. We are currently using version 5.1 on all machines (even Lrrr, with the weird driver) as of June 2017.

Unzip/untar/whatever the CUDNN files e.g. `cudnn-8.0-linux-x64-v5.1.tar`. Should yield a `cuda` directory with `lib64` and `include` subdirectories. Copy the files in each of those to the corresponding `/usr/local/cuda` subdirectories (will require `sudo`), e.g. `sudo cp lib64/* /usr/local/cuda/lib64/` (assuming you are in the `cuda` directory already).

Now we need to put CUDA in the path and set up its environment variable(s) -- should just need to add the following to each user's `.bashrc` file (and exit shell / re-enter shell to take effect):

{{{
export PATH="/usr/local/cuda/bin:$PATH"
export set CUDA_ROOT=/usr/local/cuda
}}}

Also, it seems you need to enter `sudo ldconfig /usr/local/cuda/lib64` at some point after installing all this stuff -- we think this has to do with making the system aware of the shared libraries? Seems like we need to enter it periodically but it's not clear when -- maybe after each restart???

'''Installing Theano and Keras, and apparently Git which isn't installed by default???'''

We are currently (June 2017) using old-ish versions of Keras and Theano for compatibility reasons. Use the commands below to install the right versions. Note the full path to {{{pip}}} is necessary even if you have Python 3 in your path, because it isn't in the super-user's path by default.

{{{
sudo /opt/anaconda3/bin/pip install theano==0.9.0
sudo /opt/anaconda3/bin/pip install keras==1.2
sudo apt-get install git
}}}

You'll need to create a {{{.keras}}} folder in your home directory and put the following file, named {{{keras.json}}}, inside it. Or just copy the `.keras` folder from another computer.

{{{

{
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "theano",
    "image_dim_ordering": "tf"
}

}}}

Critical things that are changed from the defaults are `backend` (default `tensorflow`, needs to be `theano`) and `image_dim_ordering` (default `th`, should be `tf`)

You'll also need to make a `.theanorc` file in your home directory (or copy it from another system). That file should contain the following:

{{{
[global]
floatX = float32
device = gpu0

[lib]
cnmem = 1

[dnn]
enabled = True
}}}

''(Note that if you have your GPUs configured in a different order, you may want to change `gpu0` to `gpu1`, or even to `cpu` if you want to live life in the slow lane.)''

At this point, you will hopefully be ready to try a sample analysis. If you feel up to it, maybe grab some Keras examples and try to run them! Best way, since we are using old code, is to steal an old Keras testing installation from Agnew/Calculon/Lrrr/Ndnd (e.g., the 'testing' folder in Matt's home directory). Go into the `keras` directory, then the `examples` directory, and try something like `python mnist_mlp.py` and hopefully it will run on the chosen GPU!
-Line 69:
+Line 245:
-Only needs to be done once. Will only unmount if we do so explicitly or if Agnew/Farnsworth gets rebooted (or if their network connection dies)
+Only needs to be done once. Will only unmount if we do so explicitly or if Agnew/Calculon/Lrrr/Ndnd gets rebooted (or if their network connection dies)
-Line 76:
+Line 252:
+Enter in Terminal:
-Line 78:
+Line 256:
-vncserver -kill :17
vncserver :17 -geometry  (whatever, e.g.) 1280x800
}}}
+vncserver :yourvncnumber -geometry  (whatever, e.g.) 1280x800
}}}
Enter in VNC:

{{{
agnew/calculon.local :yourvncnumber
}}}
== End VNC session ==
{{{
vncserver -kill :yourvncnumber
}}}