Hi -
Is there a way to restrict a deep learning network to execute on a specific GPU ID on a multi-GPU Linux host? Currently, when I use the DL Network Executor (TensorFlow) node on a host with 4 GPUs, the DL model allocates memory on all available GPUs (see the nvidia-smi output below).
One can work around this in Python Scripting by setting environment variables, but is there an option to do this directly in the "DL Network Executor" node?
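For reference, the environment-variable workaround I mean is setting `CUDA_VISIBLE_DEVICES` before TensorFlow initializes CUDA, e.g. in a Python Scripting node (a minimal sketch; the GPU ID `"0"` here is just an example):

```python
import os

# Must be set BEFORE TensorFlow is imported / initializes CUDA,
# otherwise TensorFlow has already claimed memory on every GPU.
# "0" makes only GPU 0 visible; "1,3" would expose GPUs 1 and 3.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# Hiding all GPUs entirely would be: os.environ["CUDA_VISIBLE_DEVICES"] = ""
```

The limitation is that this only works when I control the Python process; the DL Network Executor node starts TensorFlow itself, so I have no place to set the variable first.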
nvidia-smi output (apologies for the formatting):
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.37 Driver Version: 396.37 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla P100-PCIE… On | 00000000:0D:00.0 Off | Off |
| N/A 31C P0 31W / 250W | 16056MiB / 16280MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla P100-PCIE… On | 00000000:13:00.0 Off | Off |
| N/A 27C P0 31W / 250W | 15485MiB / 16280MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 Tesla P100-PCIE… On | 00000000:8E:00.0 Off | Off |
| N/A 28C P0 31W / 250W | 15485MiB / 16280MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 Tesla P100-PCIE… On | 00000000:91:00.0 Off | Off |
| N/A 30C P0 32W / 250W | 15485MiB / 16280MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 11757 C …linux.x86_64_1.8.0.152-01/jre/bin/java 15475MiB |
| 1 11757 C …linux.x86_64_1.8.0.152-01/jre/bin/java 15475MiB |
| 2 11757 C …linux.x86_64_1.8.0.152-01/jre/bin/java 15475MiB |
| 3 11757 C …linux.x86_64_1.8.0.152-01/jre/bin/java 15475MiB |
+-----------------------------------------------------------------------------+
Thanks,
Reddy