MPI_Comm_split not working with MPI_Bcast


Problem Description


With the code below, I split 4 processes into column groups and then broadcast within each column from the diagonal (0,3). Process 0 broadcasts to 2, and 3 should broadcast to 1. But it does not work as expected. Can anyone see what is wrong?

0 1
2 3

#include <stdio.h>
#include <stdlib.h>
#include <iostream>
#include <mpi.h>
using namespace std;

int main(int argc, char **argv){
    MPI_Comm col_comm,row_comm;
    int myrank, size, even, value=0;
    int localRank=0;
    MPI_Init (&argc, &argv);
    MPI_Comm_rank (MPI_COMM_WORLD, &myrank);
    MPI_Comm_size (MPI_COMM_WORLD, &size);      
    MPI_Comm_split(MPI_COMM_WORLD, myrank%2, myrank, &col_comm); 
    MPI_Comm_rank (col_comm, &localRank);

    if(myrank%3==0){
        value = myrank*5+1;
        MPI_Bcast(&value, 1, MPI_INT, localRank, col_comm);
    }

    printf("Rank=%d | LocalRank=%d | Got broadcast value of %d\n", myrank, localRank, value);
    MPI_Finalize();
    return 0;
}


Output

ubuntu@root:~/matrixmult$ mpirun comtest -np 4
Rank=0 | LocalRank=0 | Got broadcast value of 1
Rank=1 | LocalRank=0 | Got broadcast value of 0
Rank=2 | LocalRank=1 | Got broadcast value of 0
Rank=3 | LocalRank=1 | Got broadcast value of 16


Reference Solutions

Solution 1:

MPI_Bcast

Broadcasts a message from the process with rank "root" to all other processes of the communicator



is a collective communication routine, so it must be called by all processes of a given communicator. Therefore, you need to remove the condition if(myrank%3==0) and adapt the root of the broadcast accordingly, instead of using localRank.
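
As a minimal, self-contained sketch of that rule (this snippet is not part of the original answer; it simply broadcasts from root 0 over MPI_COMM_WORLD), note that every rank executes the same call and only the root fills the buffer beforehand:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv){
    int myrank, value = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    if (myrank == 0)        // only the root initializes the data
        value = 42;

    // every rank of the communicator makes the same MPI_Bcast call and
    // names the same root; afterwards all ranks hold value == 42
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Rank=%d | value=%d\n", myrank, value);
    MPI_Finalize();
    return 0;
}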

In your current code, only the processes with myrank 0 and 3 call MPI_Bcast, and they belong to different communicators. So process 0 calls

MPI_Bcast(&value, 1, MPI_INT, localRank, col_comm);

which means it broadcasts the value to itself. The same happens with process 3. Hence the output:

Rank=0 | LocalRank=0 | Got broadcast value of 1
Rank=1 | LocalRank=0 | Got broadcast value of 0
Rank=2 | LocalRank=1 | Got broadcast value of 0
Rank=3 | LocalRank=1 | Got broadcast value of 16

Rank=0 and Rank=3 communicated with themselves, while the other processes were not part of any MPI_Bcast call, hence the value of 0 for both of them.

Try the following:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv){
    MPI_Comm col_comm,row_comm;
    int myrank, size, even, value=0;
    int localRank=0;
    MPI_Init (&argc, &argv);
    MPI_Comm_rank (MPI_COMM_WORLD, &myrank);
    MPI_Comm_size (MPI_COMM_WORLD, &size);
    MPI_Comm_split(MPI_COMM_WORLD, myrank%2, myrank, &col_comm);
    MPI_Comm_rank (col_comm, &localRank);

    // only the diagonal processes (0 and 3) fill the buffer ...
    if(myrank == 0 || myrank == 3)
       value = myrank*5+1;

    // ... but every process calls the broadcast; the root is local rank 0
    // in the even column (global rank 0) and local rank 1 in the odd
    // column (global rank 3), which is exactly what myrank%2 != 0 yields
    MPI_Bcast(&value, 1, MPI_INT, myrank%2 != 0, col_comm);

    printf("Rank=%d | LocalRank=%d | Got broadcast value of %d\n", myrank, localRank, value);
    MPI_Finalize();
    return 0;
}

Now every rank in each column communicator takes part in the broadcast: process 0 broadcasts to 2, and 3 broadcasts to 1, as intended.

Output:

Rank=0 | LocalRank=0 | Got broadcast value of 1
Rank=1 | LocalRank=0 | Got broadcast value of 16
Rank=2 | LocalRank=1 | Got broadcast value of 1
Rank=3 | LocalRank=1 | Got broadcast value of 16
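
A possible alternative (a sketch, not taken from the original answer) is to name the broadcast root by its global rank on the diagonal and translate it into the corresponding rank inside col_comm with MPI_Group_translate_ranks, instead of hard-coding the myrank%2 != 0 expression:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv){
    MPI_Comm col_comm;
    int myrank, value = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    // same split as above: ranks {0,2} and {1,3} form the two columns
    MPI_Comm_split(MPI_COMM_WORLD, myrank % 2, myrank, &col_comm);

    // diagonal of the 2x2 grid: rank 0 is the root of column 0, rank 3 of column 1
    int global_root = (myrank % 2 == 0) ? 0 : 3;

    // translate the global root rank into its rank inside col_comm
    MPI_Group world_group, col_group;
    int local_root;
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);
    MPI_Comm_group(col_comm, &col_group);
    MPI_Group_translate_ranks(world_group, 1, &global_root, col_group, &local_root);

    if (myrank == global_root)
        value = myrank * 5 + 1;

    // every rank in col_comm calls the broadcast with the same root
    MPI_Bcast(&value, 1, MPI_INT, local_root, col_comm);

    printf("Rank=%d | Got broadcast value of %d\n", myrank, value);

    MPI_Group_free(&world_group);
    MPI_Group_free(&col_group);
    MPI_Comm_free(&col_comm);
    MPI_Finalize();
    return 0;
}

Run with 4 processes, this should print the same broadcast values as the answer above.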

(by Pit Digger, dreamcrash)

References

  1. MPI_Comm_split not working with MPI_Bcast (CC BY-SA 2.5/3.0/4.0)

#mpi #C++ #openmpi #parallel-processing #C






Related Questions

MPI gather array on root process

how to install Openmpi for xcode?

Problems running MPI (OpenMPI) app on Linux on ARM

independent parallel writing into files in C++ and MPI

MPI_Bcast hanging after some data transferred

Multiple mpiruns from one file vs multiple file runs

Isend/Irecv doesn't work but Send/Recv does

MPI asks authentication on localhost

Issue with MPI spawn and merge

mpiexec throws error "mkstemp failed No such file or directory"

Segmentation Fault when using MPI_Isend

MPI_Comm_split not working with MPI_Bcast







Comments