Using the Scala API, how do I copy all files from one HDFS location to another HDFS location?
Using Scala, I want to copy all files from srcFilePath to destFilePath, but the code below throws an error. Can someone help me fix the error and provide a solution for copying the files?
scala> import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.conf.Configuration
scala> import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.fs.{FileSystem, Path}
scala> val srcFilePath = "/development/staging/b8baf3f4-abce-11eb-8592-0242ac110032/"
srcFilePath: String = /development/staging/b8baf3f4-abce-11eb-8592-0242ac110032/
scala> val destFilePath = "/development/staging/dest_b8baf3f4-abce-11eb-8592-0242ac110032/"
destFilePath: String = /development/staging/dest_b8baf3f4-abce-11eb-8592-0242ac110032/
scala> val hadoopConf = new Configuration()
hadoopConf: org.apache.hadoop.conf.Configuration = Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml
scala> val hdfs = FileSystem.get(hadoopConf)
hdfs: org.apache.hadoop.fs.FileSystem = DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-1792011619_1, ugi=be9dusr@INTERNAL.IMSGLOBAL.COM (auth:KERBEROS)]]
scala>
scala> val srcPath = new Path(srcFilePath)
srcPath: org.apache.hadoop.fs.Path = /development/staging/b8baf3f4-abce-11eb-8592-0242ac110032
scala> val destPath = new Path(destFilePath)
destPath: org.apache.hadoop.fs.Path = /development/staging/dest_b8baf3f4-abce-11eb-8592-0242ac110032
scala>
scala> hdfs.copy(srcPath, destPath)
<console>:52: error: value copy is not a member of org.apache.hadoop.fs.FileSystem
hdfs.copy(srcPath, destPath)
You might want to take a look at the answer below.
Yes, this code is working. I also want to unzip files and put them into dstPath. Basically, I need to read a zip file from srcPath, unzip it, and then place the unzipped files in dstPath. Is there an option for doing that?
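For the unzip follow-up, one approach is to stream the archive with java.util.zip.ZipInputStream and write each entry out through the HDFS client. A sketch below: the helper name unzipTo is hypothetical, and the HDFS wiring at the bottom (the archive file name in particular) is an assumption, since the actual zip name isn't given in the question.

```scala
import java.io.{InputStream, OutputStream}
import java.util.zip.ZipInputStream
import scala.collection.mutable.ListBuffer

// Hypothetical helper: walk a zip stream entry by entry and write each
// file entry through a caller-supplied sink (for HDFS, hdfs.create(...)).
// Returns the names of the extracted entries.
def unzipTo(in: InputStream, openSink: String => OutputStream): List[String] = {
  val zis = new ZipInputStream(in)
  val names = ListBuffer[String]()
  var entry = zis.getNextEntry
  while (entry != null) {
    if (!entry.isDirectory) {
      val out = openSink(entry.getName)
      try {
        val buf = new Array[Byte](8192)
        var n = zis.read(buf)
        while (n != -1) { out.write(buf, 0, n); n = zis.read(buf) }
      } finally out.close()
      names += entry.getName
    }
    entry = zis.getNextEntry
  }
  zis.close()
  names.toList
}

// With HDFS (assumes the hdfs handle and paths from the question's session;
// "archive.zip" is a placeholder file name):
// val in = hdfs.open(new Path(srcFilePath, "archive.zip"))
// unzipTo(in, name => hdfs.create(new Path(destFilePath, name)))
```

The helper is deliberately written against plain java.io streams so the same code works for HDFS, the local filesystem, or an in-memory test.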
Try Hadoop's FileUtil.copy() method, as described here: https://hadoop.apache.org/docs/r2.8.5/api/org/apache/hadoop/fs/FileUtil.html#copy(org.apache.hadoop.fs.FileSystem,%20org.apache.hadoop.fs.Path,%20org.apache.hadoop.fs.FileSystem,%20org.apache.hadoop.fs.Path,%20boolean,%20org.apache.hadoop.conf.Configuration)
val conf = new org.apache.hadoop.conf.Configuration()
val srcPath = new org.apache.hadoop.fs.Path("hdfs://my/src/path")
val dstPath = new org.apache.hadoop.fs.Path("hdfs://my/dst/path")
org.apache.hadoop.fs.FileUtil.copy(
  srcPath.getFileSystem(conf), // source FileSystem
  srcPath,
  dstPath.getFileSystem(conf), // destination FileSystem
  dstPath,
  true,                        // deleteSource: true deletes the source (a move); use false to copy
  conf
)
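Note that FileUtil.copy with a directory source copies the directory itself under the destination. If you want only the files inside srcPath to land directly in destPath (as the question asks), one option is to list the directory and copy each file individually. A sketch, assuming the hdfs, srcPath, destPath, and hadoopConf values from the question's session (untested here, since it needs a live HDFS):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, FileUtil, Path}

val hadoopConf = new Configuration()
val hdfs = FileSystem.get(hadoopConf)
val srcPath = new Path("/development/staging/b8baf3f4-abce-11eb-8592-0242ac110032/")
val destPath = new Path("/development/staging/dest_b8baf3f4-abce-11eb-8592-0242ac110032/")

// Copy every file directly under srcPath into destPath, keeping each name.
hdfs.listStatus(srcPath)
  .filter(_.isFile)
  .foreach { status =>
    FileUtil.copy(
      hdfs, status.getPath,                             // source FS and file
      hdfs, new Path(destPath, status.getPath.getName), // destination FS and file
      false,                                            // deleteSource = false: copy, not move
      hadoopConf)
  }
```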