Python: How to move a file from one folder to another on Databricks

python apache-spark pyspark

I am trying to move a file from one folder to another using a Databricks Python notebook. My source is Azure Data Lake Gen1.

Suppose my file is at adl://testdatalakegen12021.azuredatalakestore.net/source/test.csv. I am trying to move the file from adl://testdatalakegen12021.azuredatalakestore.net/demo/test.csv to adl://testdatalakegen12021.azuredatalakestore.net/destination/movedtest.csv

I have tried several approaches, but none of the code works:

# Move a file by renaming its path
import os
import shutil
os.rename('adl://testdatalakegen12021.azuredatalakestore.net/demo/test.csv', 'adl://testdatalakegen12021.azuredatalakestore.net/demo/renamedtest.csv')

# Move a file from the directory d1 to d2
shutil.move('adl://testdatalakegen12021.azuredatalakestore.net/demo/test.csv', 'adl://testdatalakegen12021.azuredatalakestore.net/destination/renamedtest.csv')

Please let me know whether this is the right logic, given that I am running it on Databricks rather than locally.

To move a file in a Databricks notebook, you can use dbutils as follows:

dbutils.fs.mv('adl://testdatalakegen12021.azuredatalakestore.net/demo/test.csv', 'adl://testdatalakegen12021.azuredatalakestore.net/destination/renamedtest.csv')
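
For context on why the attempts in the question fail: os.rename and shutil.move work on the local filesystem of the driver node, so an adl:// URI is most likely treated as an ordinary (nonexistent) local path. The sketch below only illustrates that behaviour; try_local_rename is a hypothetical helper name, not part of any library.

```python
import os


def try_local_rename(src, dst):
    """Attempt a local rename; return the error text, or None on success."""
    try:
        # os.rename knows nothing about remote stores: the URI is read as a
        # local path whose first directory component is literally "adl:".
        os.rename(src, dst)
    except OSError as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


# Paths taken from the question; no local file exists under them.
print(try_local_rename(
    "adl://testdatalakegen12021.azuredatalakestore.net/demo/test.csv",
    "adl://testdatalakegen12021.azuredatalakestore.net/demo/renamedtest.csv",
))
```

dbutils.fs.mv, by contrast, goes through the cluster's filesystem layer, which is what understands adl:// and dbfs:/ paths.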

Here are the steps to move a file from one folder to another:

Mount Azure Data Lake Storage Gen1 to the Databricks workspace:

configs = {"<prefix>.oauth2.access.token.provider.type": "ClientCredential",
           "<prefix>.oauth2.client.id": "<application-id>",
           "<prefix>.oauth2.credential": dbutils.secrets.get(scope = "<scope-name>", key = "<key-name-for-service-credential>"),
           "<prefix>.oauth2.refresh.url": "https://login.microsoftonline.com/<directory-id>/oauth2/token"}

# Optionally, you can add <directory-name> to the source URI of your mount point.
dbutils.fs.mount(
  source = "adl://<storage-resource>.azuredatalakestore.net/<directory-name>",
  mount_point = "/mnt/<mount-name>",
  extra_configs = configs)
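
Calling dbutils.fs.mount a second time on the same mount point raises an error, so it can help to check first. The following is a minimal sketch under that assumption; mount_if_needed is a hypothetical helper, and dbutils is passed in explicitly so the function can also be exercised outside a notebook.

```python
def mount_if_needed(dbutils, source, mount_point, configs):
    """Mount `source` at `mount_point` unless something is already mounted there.

    Returns True if a new mount was created, False if it already existed.
    """
    # dbutils.fs.mounts() lists current mounts; each entry has a mountPoint.
    existing = {m.mountPoint for m in dbutils.fs.mounts()}
    if mount_point in existing:
        return False
    dbutils.fs.mount(source=source, mount_point=mount_point, extra_configs=configs)
    return True


# In a notebook, with `configs` defined as above (placeholders left as-is):
# mount_if_needed(
#     dbutils,
#     "adl://<storage-resource>.azuredatalakestore.net/<directory-name>",
#     "/mnt/<mount-name>",
#     configs,
# )
```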

Use the dbutils command to move the file:

dbutils.fs.mv('dbfs:/mnt/adlsgen1/test/data.csv', 'dbfs:/mnt/adlsgen1/test1/dataone.csv')
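
If you want to confirm the move, one possible approach is to create the destination folder first and list it afterwards. This is a sketch only; move_file is a hypothetical wrapper around the dbutils.fs calls mkdirs, mv and ls, and dbutils is passed in as a parameter.

```python
import posixpath


def move_file(dbutils, src, dst):
    """Move `src` to `dst` and report whether `dst` shows up in its folder."""
    dst_dir = posixpath.dirname(dst)
    # Create the destination folder (and any missing parents).
    dbutils.fs.mkdirs(dst_dir)
    dbutils.fs.mv(src, dst)
    # Verify by listing the destination folder for the new file name.
    names = [f.name for f in dbutils.fs.ls(dst_dir)]
    return posixpath.basename(dst) in names


# In a notebook, using the paths from the answer:
# move_file(dbutils,
#           "dbfs:/mnt/adlsgen1/test/data.csv",
#           "dbfs:/mnt/adlsgen1/test1/dataone.csv")
```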

Hi, I get the following error when following the same steps: py4j.security.Py4JSecurityException: Constructor public com.databricks.backend.daemon.dbutils.DBUtilsCore(org.apache.spark.SparkContext,org.apache.spark.sql.SQLContext) is not whitelisted.