One of the first things you learn when tapping into a powerful machine is that there are constraints on what you can do. Some are frustrating. For example, you can’t just plug in a kbd and a screen and hack away!
One such family of machines is the DGX from NVIDIA. Right off the bat, the manual “greets” you with this text:
Excerpt from NVIDIA’s DGX manual
What can we do if we are used to having a visual IDE, debugging, all that jazz? Can we pause an 8-GPU training run and examine whether the batch norm is doing its thing, without being able to plug our kbd/screen into said machine?
What we want is something similar to the figure below.
The code resides inside a Docker container, inside a remote machine, inside a, well, secure network, usually with only one SSH port open to the world.
We reside in a nice location, with our laptop, and we want to execute said code step by step, with access, of course, to memory and variable contents, adding/removing breakpoints, everything one can do with a piece of code on a local machine.
A kind reminder: executing the code in batch mode is trivial once SSH access is granted.
How we want to work (red line)
JetBrains recently introduced into their ecosystem the ability to control a remote IDE from a local IDE. The remote IDE is “headless” and the local IDE is “thin”.
The catch? You need the Professional edition. IMHO, money well spent.
The remote setup
Enough chit-chat, let’s see how we set up things.
We will not focus on setting up accounts, SSH tunnels, Docker containers, or Docker port mappings; we assume all of these can be easily ChatGPT-ed. [ProTip: seriously validate ChatGPT’s hallucinations before opening ports on remote machines]
So, Docker works, SSH works, port forwarding works; let’s roll up our sleeves and get PyCharm working!
Get the binaries
The first step is to go to the PyCharm download page and find the download link:
Where to download from
Just examine the “direct link” section and you will be fine.
Now, inside the Docker container [either run an interactive session or write the Dockerfile], download and unpack the file. There is no need to be under a root account.
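The download-and-unpack step could look roughly like the sketch below. It assumes the Linux download is a .tar.gz archive and that curl and tar exist inside the container. PYCHARM_URL, INSTALL_DIR and the two function names are made up for illustration; the URL is deliberately left empty, so paste in the direct link you found on the download page.

```shell
# Assumption: PYCHARM_URL is the "direct link" copied from the download page.
# It is left empty here on purpose; no version or URL is implied.
PYCHARM_URL="${PYCHARM_URL:-}"
# Hypothetical install location; any directory the current user can write to works.
INSTALL_DIR="${INSTALL_DIR:-$HOME/opt}"

fetch_pycharm() {
    # Download the archive as the current (non-root) user.
    mkdir -p "$INSTALL_DIR"
    curl -fL "$PYCHARM_URL" -o "$INSTALL_DIR/pycharm.tar.gz"
}

unpack_pycharm() {
    # Unpack next to the archive; this creates the pycharm-xx.yy folder.
    tar -xzf "$INSTALL_DIR/pycharm.tar.gz" -C "$INSTALL_DIR"
}

# Usage, once PYCHARM_URL is set:
#   fetch_pycharm && unpack_pycharm
```

The same two commands can go into a Dockerfile RUN step if you prefer to bake the IDE backend into the image.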
After the files are unpacked, go to the pycharm-xx.yy/bin folder. Let’s say it is:
Here, run the backend server:
./remote-dev-server.sh run ~/src/ -l 0.0.0.0 -p 9400
- Change the project path from ~/src/ to your project path inside the container.
- Change the listening port and/or address if yours are different.
Observe the generated link and copy it! You will probably want to change one or two things in it, too. For example, if the forwarded port on the local machine is not 9400, you must change the port. The address will probably be 127.0.0.1; if it is not, change it to 127.0.0.1.
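If you end up editing the link often, a tiny helper can do it for you. This is only a sketch and it assumes the link has the shape scheme://HOST:PORT#rest; check that against the link your server actually prints. The function name rewrite_link and the sample link are invented for illustration, and the part after the # is a placeholder, not a real token.

```shell
# Rewrite the host and port of a join link, keeping everything after '#'.
# Assumption: the link looks like scheme://HOST:PORT#rest
rewrite_link() {
    link="$1"                    # the link copied from the remote server output
    port="$2"                    # the port forwarded on the local machine
    scheme="${link%%://*}"       # text before "://"
    rest="${link#*://}"          # HOST:PORT#rest
    fragment="${rest#*\#}"       # text after the first '#'
    printf '%s://127.0.0.1:%s#%s\n' "$scheme" "$port" "$fragment"
}

# Made-up link, local port 9401 instead of 9400:
rewrite_link 'tcp://0.0.0.0:9400#PLACEHOLDER' 9401
# prints tcp://127.0.0.1:9401#PLACEHOLDER
```

Only the host and port are touched; the rest of the link is passed through as-is.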
And that’s about it for the remote machine. Make sure SSH and Docker have the ports properly forwarded, and that the source code is mounted inside the container.
Reading through remote-dev-server.sh, it might be possible to configure the link so that it is predictable.
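For the Docker side, the port mapping and the source mount might look like the sketch below. Everything here is an assumption: the image name, both source paths and the docker_cmd helper are hypothetical, and your container may well be started differently. The helper only prints the command, so you can review it before running anything on the remote machine.

```shell
# All values below are hypothetical; adapt them to your setup.
IMAGE="my-training-image"        # made-up image name
IDE_PORT=9400                    # port the remote IDE backend listens on
HOST_SRC="$HOME/src"             # source code on the remote machine
CONTAINER_SRC="/workspace/src"   # where it shows up inside the container

docker_cmd() {
    # -p publishes the IDE port on the host's loopback interface only,
    # so it is reachable through the SSH tunnel and not from the network.
    # -v mounts the source code inside the container.
    printf 'docker run --gpus all -it -p 127.0.0.1:%s:%s -v %s:%s %s bash\n' \
        "$IDE_PORT" "$IDE_PORT" "$HOST_SRC" "$CONTAINER_SRC" "$IMAGE"
}

docker_cmd   # print the command for review
```

With a mount like this, the project path passed to remote-dev-server.sh would be the container-side path.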
The local setup
- Make sure the SSH tunnel is established. You can do it from the command line or from the IDE.
- Open your IDE, either PyCharm [Professional] or JetBrains Gateway.
- Locate the Remote Development window:
- Paste the modified URL into the “Connect to Running IDE” field.
- Let it settle.
The Remote Development window. Paste the link in the text box and hit Connect.
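For the first step in the list above, a command-line tunnel could be built like this. It is a sketch under assumptions: the SSH target is a made-up user and host, both ports are examples, and tunnel_cmd is a hypothetical helper that only prints the command so you can check it before running it.

```shell
# All values below are hypothetical; adapt them to your setup.
LOCAL_PORT=9400                    # port on the laptop
REMOTE_PORT=9400                   # port published by Docker on the remote machine
SSH_TARGET="user@dgx.example.com"  # made-up user and host

tunnel_cmd() {
    # -N: do not run a remote command, just forward ports.
    # -L: forward LOCAL_PORT on the laptop to REMOTE_PORT on the remote machine.
    printf 'ssh -N -L %s:127.0.0.1:%s %s\n' "$LOCAL_PORT" "$REMOTE_PORT" "$SSH_TARGET"
}

tunnel_cmd   # print the command for review
```

If LOCAL_PORT differs from the port in the generated link, remember to change the port in the link before pasting it.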
One small bonus is that you get some info about the remote machine, too, like /home/ disk space [always a problem on these machines] and RAM. Note the available RAM 😉 Slow day, that was.
Remote development works
Now there is no excuse for not hacking around inside a thick and wide neural network. Coding while debugging the tensor shapes doesn’t require installing all the libraries locally in CPU mode, or working out the dimensions with pen and paper.
JupyterLab is an alternative for visual development, especially since one can see the pictures, but debugging is a bit of a pain.