Hadoop: adding a new NameNode data directory to an existing cluster
What is the correct procedure for adding a new NameNode data directory (dfs.name.dir / dfs.namenode.name.dir) to an existing production cluster? I have added the new path to the comma-separated list in hdfs-site.xml, but when I try to start the NameNode I get the following error:

Directory /data/nfs/dfs/nn is in an inconsistent state: storage directory does not exist or is not accessible.

In my case I have two directories already in place and working (/data/1/dfs/nn, /data/2/dfs/nn). With the new directory added, the NameNode fails to start; after removing the new path, it starts normally. My fstab entry for the new directory looks like this:

backupserver:/hadoop_nn /data/nfs/dfs nfs tcp,soft,intr,timeo=10,retrans=10 1 2

Inside the mount point above I created a folder named nn. That folder has the same ownership and permissions as the other two existing folders:

drwx------ 2 hdfs hadoop 64 Jan 22 16:30 nn
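For reference, the change described above amounts to appending the new path to the value of this property in hdfs-site.xml (dfs.namenode.name.dir is the current property name; dfs.name.dir is the deprecated older alias):

```xml
<!-- hdfs-site.xml: comma-separated list of NameNode metadata directories -->
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/data/1/dfs/nn,/data/2/dfs/nn,/data/nfs/dfs/nn</value>
</property>
```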
Do I need to manually copy all the files from an existing NameNode directory, or should the NameNode service replicate them automatically on startup?

In Cloudera CDH 4.5.0, that error occurs only when the following function (from Storage.java, around line 418) returns NON_EXISTENT. In each such case a warning is logged with more detail, so look for log lines from org.apache.hadoop.hdfs.server.common.Storage.

In short, the NameNode appears to believe the directory does not exist, is not a directory, is not writable, or a SecurityException was thrown:
/**
 * Check consistency of the storage directory
 *
 * @param startOpt a startup option.
 *
 * @return state {@link StorageState} of the storage directory
 * @throws InconsistentFSStateException if directory state is not
 *           consistent and cannot be recovered.
 * @throws IOException
 */
public StorageState analyzeStorage(StartupOption startOpt, Storage storage)
    throws IOException {
  assert root != null : "root is null";
  String rootPath = root.getCanonicalPath();
  try { // check that storage exists
    if (!root.exists()) {
      // storage directory does not exist
      if (startOpt != StartupOption.FORMAT) {
        LOG.warn("Storage directory " + rootPath + " does not exist");
        return StorageState.NON_EXISTENT;
      }
      LOG.info(rootPath + " does not exist. Creating ...");
      if (!root.mkdirs())
        throw new IOException("Cannot create directory " + rootPath);
    }
    // or is inaccessible
    if (!root.isDirectory()) {
      LOG.warn(rootPath + "is not a directory");
      return StorageState.NON_EXISTENT;
    }
    if (!root.canWrite()) {
      LOG.warn("Cannot access storage directory " + rootPath);
      return StorageState.NON_EXISTENT;
    }
  } catch(SecurityException ex) {
    LOG.warn("Cannot access storage directory " + rootPath, ex);
    return StorageState.NON_EXISTENT;
  }
  this.lock(); // lock storage if it exists
  if (startOpt == HdfsServerConstants.StartupOption.FORMAT)
    return StorageState.NOT_FORMATTED;
  if (startOpt != HdfsServerConstants.StartupOption.IMPORT) {
    storage.checkOldLayoutStorage(this);
  }
  // check whether current directory is valid
  File versionFile = getVersionFile();
  boolean hasCurrent = versionFile.exists();
  // check which directories exist
  boolean hasPrevious = getPreviousDir().exists();
  boolean hasPreviousTmp = getPreviousTmp().exists();
  boolean hasRemovedTmp = getRemovedTmp().exists();
  boolean hasFinalizedTmp = getFinalizedTmp().exists();
  boolean hasCheckpointTmp = getLastCheckpointTmp().exists();
  if (!(hasPreviousTmp || hasRemovedTmp
      || hasFinalizedTmp || hasCheckpointTmp)) {
    // no temp dirs - no recovery
    if (hasCurrent)
      return StorageState.NORMAL;
    if (hasPrevious)
      throw new InconsistentFSStateException(root,
          "version file in current directory is missing.");
    return StorageState.NOT_FORMATTED;
  }
  if ((hasPreviousTmp?1:0) + (hasRemovedTmp?1:0)
      + (hasFinalizedTmp?1:0) + (hasCheckpointTmp?1:0) > 1)
    // more than one temp dirs
    throw new InconsistentFSStateException(root,
        "too many temporary directories.");
  // # of temp dirs == 1 should either recover or complete a transition
  if (hasCheckpointTmp) {
    return hasCurrent ? StorageState.COMPLETE_CHECKPOINT
        : StorageState.RECOVER_CHECKPOINT;
  }
  if (hasFinalizedTmp) {
    if (hasPrevious)
      throw new InconsistentFSStateException(root,
          STORAGE_DIR_PREVIOUS + " and " + STORAGE_TMP_FINALIZED
          + "cannot exist together.");
    return StorageState.COMPLETE_FINALIZE;
  }
  if (hasPreviousTmp) {
    if (hasPrevious)
      throw new InconsistentFSStateException(root,
          STORAGE_DIR_PREVIOUS + " and " + STORAGE_TMP_PREVIOUS
          + " cannot exist together.");
    if (hasCurrent)
      return StorageState.COMPLETE_UPGRADE;
    return StorageState.RECOVER_UPGRADE;
  }
  assert hasRemovedTmp : "hasRemovedTmp must be true";
  if (!(hasCurrent ^ hasPrevious))
    throw new InconsistentFSStateException(root,
        "one and only one directory " + STORAGE_DIR_CURRENT
        + " or " + STORAGE_DIR_PREVIOUS
        + " must be present when " + STORAGE_TMP_REMOVED
        + " exists.");
  if (hasCurrent)
    return StorageState.COMPLETE_ROLLBACK;
  return StorageState.RECOVER_ROLLBACK;
}
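The first three NON_EXISTENT branches can be reproduced from the shell before ever starting the service. A minimal sketch, mirroring the exists/isDirectory/canWrite pre-checks (run it as the user the NameNode runs as, typically hdfs, against the new directory):

```shell
# Mirrors analyzeStorage()'s pre-checks: each failure below corresponds
# to one of the NON_EXISTENT branches in the Java code above.
check_storage_dir() {
  local d="$1"
  [ -e "$d" ] || { echo "NON_EXISTENT: $d does not exist"; return 1; }
  [ -d "$d" ] || { echo "NON_EXISTENT: $d is not a directory"; return 1; }
  [ -w "$d" ] || { echo "NON_EXISTENT: $d is not writable"; return 1; }
  echo "OK: $d passes the exists/isDirectory/canWrite checks"
}

# Example (path from the question):
#   sudo -u hdfs bash -c "$(declare -f check_storage_dir); check_storage_dir /data/nfs/dfs/nn"
```

Note the fourth branch (SecurityException) has no direct shell analogue, and even when all three checks pass, an empty directory still leads to an InconsistentFSStateException later, as discussed below.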
I think I may have just answered my own question. In the end, I copied the entire contents of one of the existing NameNode directories into the new NFS NameNode directory, and I was then able to start the NameNode. (Note that, to avoid problems, I stopped the NameNode before copying.)

So my assumption that the NameNode would automatically replicate the existing metadata into the new directory appears to have been incorrect.
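The copy step can be sketched as follows (the helper name is mine, and the paths are the ones from the question; run this only with the NameNode stopped, as a user that can read the source and write the target):

```shell
# Seed a new NameNode storage directory from an existing one.
# -a preserves ownership, permissions, and timestamps of the metadata
# files (fsimage, edits, VERSION under current/).
seed_storage_dir() {
  local src="$1" dst="$2"
  cp -a "$src/." "$dst/"
}

# Example (NameNode stopped first):
#   seed_storage_dir /data/1/dfs/nn /data/nfs/dfs/nn
```

Afterwards, checking that the new directory contains a current/VERSION file is a quick sanity test before restarting.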
Does the ownership of /data/nfs/dfs (your mount point) allow the hdfs user to enter the directory? — The mount point is owned by root (user and group), and the permissions on it are 700. This follows the same structure as the other data directories (/data/1/dfs is owned by root with permissions 700, while /data/1/dfs/nn is owned by hdfs/hadoop). — Fair enough. One more silly question: do the hdfs user and the hadoop group have the same uid (gid) on the NFS server and on the client? — The NFS server is a Windows 2012 R2 box. I have enabled unmapped UNIX username access for this particular share. As I understand it, UUUA makes Windows not really care about the UNIX uid/gid and simply trust whatever the UNIX client tells it.
See my answer to this question. My guess is that, because the new path was empty, it looked to the NameNode as though it did not exist. I had assumed the contents would be copied over automatically when the new directory was added, but it seems I was wrong.