How to create dynamic DataFrame names in PySpark (tags: python, dataframe, apache-spark, pyspark)


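How can I create dynamic DataFrame names in PySpark? With the code below I cannot create a new DataFrame for each file: it only gives me the last DataFrame name, but I need all of them.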

for prime2 in pdf2:
    ol2 =  Bucket_path + prime2['S3_File_with_Path']
    t = 1
    sd = {}  
    testR = "df" + str(t)
    print("testR",testR)
    sd[testR] = spark.read.format("parquet").load(ol2).cache() 
    t = t + 1 


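It looks like you are creating the dict inside the loop, so you end up with a dict that has only one (the last) entry. Try changing your code to the following: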

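# Create the dict and the counter once, before the loop.
# Resetting either one inside the loop would leave only a single entry.
sd = {}
t = 1
for prime2 in pdf2:
    ol2 = Bucket_path + prime2['S3_File_with_Path']
    testR = "df" + str(t)  # "df1", "df2", ...
    print("testR", testR)
    df = spark.read.format("parquet").load(ol2).cache()
    sd[testR] = df
    t = t + 1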

# sd dict is available here, all the dataframes are inside
print(len(sd))
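
As a variation on the same idea, here is a minimal sketch that uses `enumerate` instead of a manual counter, so the key cannot be reset by accident. The names `build_named_frames`, `loader`, `sample_rows`, and the bucket string are hypothetical and not from the original post. The loader is a stand-in so the snippet runs without a Spark session; with Spark you would presumably pass something like `lambda p: spark.read.format("parquet").load(p).cache()`.

```python
from typing import Any, Callable, Dict, Iterable, Mapping


def build_named_frames(
    rows: Iterable[Mapping[str, str]],
    bucket_path: str,
    loader: Callable[[str], Any],
) -> Dict[str, Any]:
    """Return {"df1": ..., "df2": ...} with one entry per row (hypothetical helper)."""
    frames: Dict[str, Any] = {}
    # enumerate(..., start=1) numbers the rows 1, 2, 3, ... without a manual counter.
    for index, row in enumerate(rows, start=1):
        path = bucket_path + row["S3_File_with_Path"]
        frames[f"df{index}"] = loader(path)
    return frames


# Stand-in rows and loader (assumed values) so the sketch runs without Spark.
sample_rows = [
    {"S3_File_with_Path": "a.parquet"},
    {"S3_File_with_Path": "b.parquet"},
]
frames = build_named_frames(sample_rows, "s3://example-bucket/", lambda p: p)
print(sorted(frames))
```

Each DataFrame is then looked up by key, for example `frames["df1"]`, rather than through a dynamically created variable name.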

Hey @Ajinkya, was this answer helpful?