Problem description
I want to use the GPU on our HPC cluster, so I tried module add CUDA ... but an error occurs. The error is:
Lmod has detected the following error: Unable to load module
because of error when evaluating modulefile:
/trinity/shared/easybuild/modules/all/CUDA/11.1.1-GCC-10.2.0.lua: Empty or
non-existant file
Please check the modulefile and especially if there is a the line number
specified in the above message
While processing the following module(s):
Module fullname         Module Filename
---------------         ---------------
CUDA/11.1.1-GCC-10.2.0  /trinity/shared/easybuild/modules/all/CUDA/11.1.1-GCC-10.2.0.lua
The error is confusing. I ran rm -rf ~/.lmod.d/.cache
but that did not help. How can I solve this problem?
Reference solution
Method 1:
Can you run cat /trinity/shared/easybuild/modules/all/CUDA/11.1.1-GCC-10.2.0.lua? As the error message says, the modulefile may be empty or may not exist.
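To tell "missing" apart from "empty" before deciding what to do, a small check like the one below may help. This is only a sketch: check_modulefile is a hypothetical helper name, not an Lmod command, and the path is copied from the error message above.

```shell
#!/bin/sh
# Path copied from the Lmod error message.
MODFILE="/trinity/shared/easybuild/modules/all/CUDA/11.1.1-GCC-10.2.0.lua"

# check_modulefile is a hypothetical helper, not part of Lmod.
# It prints one word describing the state of the given path.
check_modulefile() {
    if [ ! -e "$1" ]; then
        echo "missing"      # no such file (or a dangling symlink)
    elif [ ! -s "$1" ]; then
        echo "empty"        # file exists but has zero bytes
    elif [ ! -r "$1" ]; then
        echo "unreadable"   # file exists but your user cannot read it
    else
        echo "ok"
    fi
}

echo "$MODFILE: $(check_modulefile "$MODFILE")"
```

If this reports missing or empty for a file under a shared EasyBuild tree, that is probably something for the cluster administrators to repair, since users normally cannot write there.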
If the modulefile does not exist: in general, you can write a modulefile yourself, either as a Lua file or as a Tcl file. Try creating a file like this (Tcl syntax):
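#%Module
# Root of the CUDA installation; adjust this if CUDA lives elsewhere on your system.
set s /usr/local/cuda-11.1
# Make nvcc and the other CUDA tools available on the command line.
prepend-path PATH $s/bin
# Library search paths used at link time and at run time.
prepend-path LIBRARY_PATH $s/lib
prepend-path LD_LIBRARY_PATH $s/lib
prepend-path LIBRARY_PATH $s/lib64
prepend-path LD_LIBRARY_PATH $s/lib64
# Header search paths for the compiler.
prepend-path CPATH $s/include
prepend-path INCLUDE $s/include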
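One possible way to try a modulefile like this without touching the shared tree is to put it in a directory you own and point Lmod at it with module use. The sketch below makes several assumptions: $HOME/modulefiles and the module name cuda/11.1 are hypothetical choices, and /usr/local/cuda-11.1 is the install path used in the answer, which may differ on your cluster. The file is saved without a .lua extension because it is written in Tcl.

```shell
#!/bin/sh
# Hypothetical location for personal modulefiles; any directory you own works.
MODROOT="${MODROOT:-$HOME/modulefiles}"
mkdir -p "$MODROOT/cuda"

# Write the Tcl modulefile. The quoted 'EOF' keeps $s literal in the file.
cat > "$MODROOT/cuda/11.1" <<'EOF'
#%Module
set s /usr/local/cuda-11.1
prepend-path PATH $s/bin
prepend-path LIBRARY_PATH $s/lib
prepend-path LD_LIBRARY_PATH $s/lib
prepend-path LIBRARY_PATH $s/lib64
prepend-path LD_LIBRARY_PATH $s/lib64
prepend-path CPATH $s/include
prepend-path INCLUDE $s/include
EOF

# Only call the module command where it is actually defined in this shell.
if command -v module >/dev/null 2>&1; then
    module use "$MODROOT"      # add the personal directory to MODULEPATH
    module load cuda/11.1      # load the modulefile written above
    command -v nvcc || echo "nvcc not found; check the CUDA path in the modulefile"
else
    echo "module command not available in this shell; wrote $MODROOT/cuda/11.1"
fi
```

Note that module use only affects the current shell session, so it would need to be repeated (or added to a startup file) for new logins and batch jobs.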
(by 郭曼珊, William Mou)