C: NFS performance problems when using fseek() in client code
Tags: c, optimization, parallel-processing, mpi, nfs

I am developing a simple parallel application with MPI that loads a file into memory. The file is exported over NFS to the nodes of a computer cluster. I noticed that in some cases NFS performance drops significantly, with thousands of extra TCP packets being transferred from the server to the clients, and I have traced the problem to the use of fseek() in my code:
//Seek to data and load them to array
fseek ( fp, ( unsigned int ) dec_number + start, SEEK_SET );
for ( i = 0; i < n * mpi_n; i++ ) {
    if ( ! feof ( fp ) )
        text[i] = fgetc ( fp );
    if ( i > 0 && n > mpi_n && i % mpi_n == 0 )
        fseek ( fp, n - mpi_n, SEEK_CUR );
}
fclose ( fp );
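nfswatch snapshot for a cluster of 10 nodes, a 300MB file, cold NFS cache, with fseek():
nfswatch snapshot for a cluster of 10 nodes, a 2GB file, cold NFS cache, without fseek():
The clients are mounted with the following mount options: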
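/nfs on /nfs type nfs (rw,rsize=8192,wsize=8192,timeo=14,intr)

Comment: Your code may simply behave differently without fseek(). Can you pin the problem precisely on fseek()? It would also help to state your platform (for example, RHEL 5 had a slight regression that was not fixed until a kernel update in RHEL 5.7).

Comment: I have seen the remarks about the RHEL issue. Both the server and the clients of the cluster run Ubuntu 12.04.2; it is a homogeneous setup.

Comment: You are reading a single byte at a time and then seeking, which is a pathologically bad pattern (fseek tends to cause the stdio buffer to be flushed). Mount with noatime. Switch to open & read & lseek (or open, mmap, and plain pointer access).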
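The last comment's suggestion to bypass stdio could look roughly like the sketch below. It uses POSIX pread(), which combines the lseek and read steps into one call, to fetch each chunk directly into the destination array. The function name read_strided and its parameters are hypothetical, not from the original code; in terms of the question's loop, chunk would correspond to mpi_n, stride to n, and start to dec_number + start. Whether this actually reduces NFS traffic on a given cluster is something to measure rather than assume.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read `count` chunks of `chunk` bytes each, where
 * chunk k begins at file offset start + k * stride. The bytes are packed
 * back to back into `out`, which must hold at least count * chunk bytes.
 * Returns the number of bytes stored (short if EOF is reached), or -1 on
 * an I/O error. */
ssize_t read_strided(int fd, off_t start, size_t chunk, size_t stride,
                     size_t count, char *out)
{
    size_t total = 0;

    for (size_t k = 0; k < count; k++) {
        off_t pos = start + (off_t)(k * stride);
        size_t got = 0;

        /* pread() may return fewer bytes than requested, so loop until
         * the whole chunk has arrived or EOF is hit. */
        while (got < chunk) {
            ssize_t r = pread(fd, out + total + got, chunk - got,
                              pos + (off_t)got);
            if (r < 0) {
                if (errno == EINTR)
                    continue;           /* interrupted, retry */
                return -1;              /* real I/O error */
            }
            if (r == 0)
                return (ssize_t)(total + got);  /* end of file */
            got += (size_t)r;
        }
        total += chunk;
    }
    return (ssize_t)total;
}
```

A call would be along the lines of read_strided(fd, offset, mpi_n, n, n, text) with fd obtained from open(path, O_RDONLY). If the chunks are small relative to the stride, one large read of the whole span followed by in-memory copying may be cheaper still; that trade-off depends on the file size and the available memory.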
Time with cold NFS cache, without fseek(): ~4 sec
Time with hot NFS cache, without fseek(): ~3 sec
Time with cold NFS cache, with fseek(): ~12 sec
Time with hot NFS cache, with fseek(): ~3 sec
Total packets:
1903459 (network) 544803 (to host) 0 (dropped)
Packet counters:
NFS3 Read: 116290 21%
NFS3 Write: 10 0%
NFS Read: 0 0%
NFS Write: 0 0%
NFS Mount: 0 0%
Port Mapper: 0 0%
RPC Authorization: 29 0%
Other RPC Packets: 0 0%
TCP Packets: 544386 100%
UDP Packets: 17 0%
ICMP Packets: 0 0%
Routing Control: 0 0%
Address Resolution: 0 0%
Reverse Addr Resol: 0 0%
Ethernet Broadcast: 0 0%
Other Packets: 49 0%
Total packets:
251804 (network) 102650 (to host) 0 (dropped)
Packet counters:
NFS3 Read: 37039 36%
NFS3 Write: 1 0%
NFS Read: 0 0%
NFS Write: 0 0%
NFS Mount: 0 0%
Port Mapper: 0 0%
RPC Authorization: 2 0%
Other RPC Packets: 0 0%
TCP Packets: 102543 100%
UDP Packets: 30 0%
ICMP Packets: 1 0%
Routing Control: 0 0%
Address Resolution: 0 0%
Reverse Addr Resol: 0 0%
Ethernet Broadcast: 0 0%
Other Packets: 41 0%