
Pyspark: How to upload a text file from a Databricks notebook to FTP


I have tried to find a solution but came up with nothing. I am new to this, so if you know a solution, please help me.
Thanks

In Databricks you can access files stored in ADLS using any of the methods described below. There are three ways of accessing Azure Data Lake Storage Gen2:

  • Mount an Azure Data Lake Storage Gen2 filesystem to DBFS using a service principal and OAuth 2.0
  • Use a service principal directly
  • Use the Azure Data Lake Storage Gen2 storage account access key directly (see the sketch after the mount example below)

Steps to mount and access the files in your filesystem as if they were local files:

    configs = {"fs.azure.account.auth.type": "OAuth",
           "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
           "fs.azure.account.oauth2.client.id": "<appId>",
           "fs.azure.account.oauth2.client.secret": "<password>",
           "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<tenant>/oauth2/token",
           "fs.azure.createRemoteFileSystemDuringInitialization": "true"}
    
    dbutils.fs.mount(
    source = "abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/folder1",
    mount_point = "/mnt/flightdata",
    extra_configs = configs)
    
To mount an Azure Data Lake Storage Gen2 filesystem or a folder inside a container, use the dbutils.fs.mount call shown above, substituting your own application (client) ID, client secret, tenant ID, container name, and storage account name. Once mounted, the files under /mnt/flightdata can be read and written like any other DBFS path.
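For the third option above (accessing ADLS Gen2 directly with the storage account access key), a minimal sketch might look like the following; the storage account name, container name, secret scope and key, and file path are placeholders, not values from the original question:

    # Access ADLS Gen2 directly with the storage account access key
    # (all <...> values and the secret scope/key names are placeholders)
    spark.conf.set(
        "fs.azure.account.key.<storage-account-name>.dfs.core.windows.net",
        dbutils.secrets.get(scope = "<secret-scope>", key = "<storage-account-key>"))

    # Read a file straight from the container through its abfss:// URI
    df = spark.read.text("abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/folder1/myfile.txt")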
    

Hope this helps.

Could you elaborate on your question? What do you mean by "upload a text file from a Databricks notebook to FTP"?
Yes. I have text files in ADLS and I want to write (upload) them to FTP from a Databricks notebook using pyspark.
Ok, thanks, but the question is not how to access ADLS; it is how to upload a file from Databricks to FTP. I found a solution, which I will post as an answer. Thanks anyway.
Hi @MilosTodosijevic, please share your findings. It would be helpful to other community members. Thank you.
    Ok, I found a solution.
    
    # Copy a file from ADLS to an FTP server over TLS (FTPS)
    from ftplib import FTP_TLS
    from azure.datalake.store import core, lib

    keyVaultName = "yourkeyvault"
    # the Key Vault must be configured as a Databricks secret scope for ADLS

    # set up authentication for ADLS (service principal credentials from Key Vault)
    tenant_id = dbutils.secrets.get(scope = keyVaultName, key = "tenantId")
    username = dbutils.secrets.get(scope = keyVaultName, key = "appRegID")
    password = dbutils.secrets.get(scope = keyVaultName, key = "appRegSecret")
    store_name = 'ADLSStoridge'
    token = lib.auth(tenant_id = tenant_id, client_id = username, client_secret = password)
    adl = core.AzureDLFileSystem(token, store_name = store_name)

    # create a secure connection to the FTP server
    ftp = FTP_TLS('ftp.xyz.com')
    # add credentials
    ftp.login(user = '', passwd = '')
    ftp.prot_p()  # secure the data connection
    # set the target directory on the FTP server
    ftp.cwd('folder path on FTP')

    # open the file in ADLS
    f = adl.open('adls path of your file')
    # upload it to the FTP server
    ftp.storbinary('STOR myfile.csv', f)

    f.close()
    ftp.quit()
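
As a side note, if the container is already mounted to DBFS as shown in the first answer, a simpler variant of the same upload can read the file through the local /dbfs path instead of the ADLS SDK. This is only a sketch; the FTP host, credentials, remote folder, and file names are placeholders:

    from ftplib import FTP_TLS

    # connect and authenticate (placeholders)
    ftp = FTP_TLS('ftp.xyz.com')
    ftp.login(user = '<user>', passwd = '<password>')
    ftp.prot_p()
    ftp.cwd('<folder path on FTP>')

    # DBFS mount points are exposed to local Python file APIs under /dbfs
    with open('/dbfs/mnt/flightdata/myfile.csv', 'rb') as f:
        ftp.storbinary('STOR myfile.csv', f)

    ftp.quit()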