Attempt to run ANSYS in a distributed mode failed

Running ANSYS 13 across multiple nodes fails (a single node works without any problem):
[chenwei@node32 ~]$ ansys130 -dis -machines node32:12:node33:12 -i bmd-4.dat
HP-MPI licensed for ANSYS.
Host 0 -- ip 11.11.16.32 -- ranks 0 - 11
Host 1 -- ip 11.11.16.33 -- ranks 12 - 23

host | 0    1
======|===========
0 : SHM  IBV
1 : IBV  SHM

Prot -  All Intra-node communication is: SHM
Prot -  All Inter-node communication is: IBV

*** FATAL ***
Attempt to run ANSYS in a distributed mode failed.
Distributed ANSYS was unable to write into the working directory (  )
for process with MPI Rank ID =   -1 which is located on machine:
Please verify that the working directory exists on
this machine and that you have write permissions
for the directory.
[chenwei@node32 ~]$

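This problem is caused by HP-MPI. When a job runs across multiple nodes, HP-MPI (now Platform MPI) does not pass the submitting node's working directory to the other compute nodes as an environment variable, so the processes there cannot find a directory to write into. HP-MPI is also inconsistent about what it falls back to: sometimes it treats the home directory as the working directory, and sometimes it even uses /.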
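The error message asks you to verify that the working directory exists and is writable on each machine. A minimal sketch of that check follows. It assumes passwordless ssh between the nodes and that the directory has the same path everywhere; `machines_to_hosts` and `check_workdir` are hypothetical helper names, not part of ANSYS or HP-MPI.

```shell
#!/bin/bash
# Hypothetical helpers; not part of ANSYS or HP-MPI.

# Turn a -machines string such as "node32:12:node33:12" into host names,
# one per line (the fields alternate: host, core count, host, core count).
machines_to_hosts() {
    echo "$1" | tr ':' '\n' | awk 'NR % 2 == 1'
}

# Report whether the directory exists and is writable on every host.
# Assumes passwordless ssh to each host.
check_workdir() {
    local machines="$1"
    local dir="$2"
    local host
    local status=0
    for host in $(machines_to_hosts "$machines"); do
        if ssh -n "$host" "test -d '$dir' && test -w '$dir'"; then
            echo "$host: $dir is writable"
        else
            echo "$host: $dir is missing or not writable"
            status=1
        fi
    done
    return "$status"
}

# Example (not run here):
# check_workdir node32:12:node33:12 /public/users/chenwei
```

If the check passes on every node and the job still fails, the directory itself is probably fine and the problem is that the remote processes were never told which directory to use.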

Solution: specify the working directory explicitly:
[chenwei@node32 ~]$ ansys130 -dir /public/users/chenwei -dis -machines node32:12:node33:12 -i bmd-4.dat
HP-MPI licensed for ANSYS.
Host 0 -- ip 11.11.16.32 -- ranks 0 - 11
Host 1 -- ip 11.11.16.33 -- ranks 12 - 23

host | 0    1
======|===========
0 : SHM  IBV
1 : IBV  SHM

Prot -  All Intra-node communication is: SHM
Prot -  All Inter-node communication is: IBV

ANSYS Multiphysics

*-------------------------------------------------------------*
|                                                             |
|   W E L C O M E   T O   T H E   A N S Y S   P R O G R A M   |
|                                                             |
*-------------------------------------------------------------*
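To avoid forgetting -dir, the command line can be assembled by a small wrapper. This is only a sketch: `ansys_dis_cmd` is a hypothetical name, it uses only the flags shown in this post, and its default of `$PWD` assumes the submit directory is on a filesystem shared by all nodes under the same path.

```shell
#!/bin/bash
# Hypothetical wrapper: print an ansys130 command line that always carries
# an explicit -dir. Only flags that appear in this post are used.
ansys_dis_cmd() {
    local machines="$1"
    local input="$2"
    # Default to the directory the job is submitted from; this assumes the
    # same path is visible on every compute node.
    local workdir="${3:-$PWD}"
    echo "ansys130 -dir $workdir -dis -machines $machines -i $input"
}

# Example (prints the command; paste it or pass it to eval to run it):
# ansys_dis_cmd node32:12:node33:12 bmd-4.dat /public/users/chenwei
```

Printing the command instead of running it keeps the wrapper easy to inspect before a long job is launched.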

A more thorough fix is to drop HP-MPI (Platform MPI) altogether and use Intel MPI:
[chenwei@node32 ~]$ ansys130 -mpi intelmpi -dis -machines node32:12:node33:12 -i bmd-4.dat 

ANSYS Multiphysics

*-------------------------------------------------------------*
|                                                             |
|   W E L C O M E   T O   T H E   A N S Y S   P R O G R A M   |
|                                                             |
*-------------------------------------------------------------*

Incidentally, startup with Intel MPI is lightning fast compared with the crawl of HP-MPI.