Python: Modin pandas with Dask does nothing, it just hangs

I'm trying to figure out why this script just sits there under Modin, while it works fine with regular pandas:

import modin.pandas as pd

infile1 = 'D:\\test_files\\curves_crosstab.csv'
infile2 = 'D:\\test_files\\8760_crosstab.csv'
infilenames = [infile1, infile2]

outfile1 = 'D:\\test_files\\curves_sample_output.csv'
outfile2 = 'D:\\test_files\\8760_sample_output.csv'

for i in range(len(infilenames)) :
    if 'curves' in infilenames[i] :
        print("in curves")
        df = pd.read_csv(infilenames[i], header=[0,1,2,3])
        print("read curves")
        df.columns = df.columns.to_flat_index()
        print("indexed columns")
        df.columns = ['_'.join(i) for i in df.columns]
        print("joined columns")
        df2 = df.melt(id_vars=['Unnamed: 0_level_0_Unnamed: 0_level_1_Unnamed: 0_level_2_Year',
            'Unnamed: 1_level_0_Unnamed: 1_level_1_Unnamed: 1_level_2_Month',
            'Unnamed: 2_level_0_Unnamed: 2_level_1_Unnamed: 2_level_2_Day',
            'Unnamed: 3_level_0_Unnamed: 3_level_1_Unnamed: 3_level_2_Hour'])
        print("melted")
        df2 = pd.concat([df2,df2.variable.str.split('_',expand=True)],axis=1)
        del df2['variable']
        print("deleted variable column")
        df2.rename(columns={'Unnamed: 0_level_0_Unnamed: 0_level_1_Unnamed: 0_level_2_Year' : 'Year' ,
                            'Unnamed: 1_level_0_Unnamed: 1_level_1_Unnamed: 1_level_2_Month' : 'Month',
                            'Unnamed: 2_level_0_Unnamed: 2_level_1_Unnamed: 2_level_2_Day' : 'Day',
                            'Unnamed: 3_level_0_Unnamed: 3_level_1_Unnamed: 3_level_2_Hour' : 'Hour',
                            0 : 'currency', 
                            1 : 'consultant_or_case', 
                            2 : 'name', 
                            3 : 'hub', 
                            'value' : 'rate_in_local_currency'}, inplace = True)
        print("renamed")
        pd.DataFrame.to_csv(df2, path_or_buf=outfile1,index=False,encoding='utf-8')
        print("created csv")
    else :

        df = pd.read_csv(infilenames[i], encoding='cp1252')
        df2 = df.melt(id_vars=['Month','Day','Hour'])

        pd.DataFrame.to_csv(df2, path_or_buf=outfile2,index=False,encoding='utf-8')
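When I run this under pandas it completes, but because of the size of the curves file (about 36.5 MB in and about 395 MB out) it takes 87 seconds on average, and I was hoping Modin would cut that time down. After switching to Modin, the script starts but then just sits there. It doesn't even give me this:

Waiting for redis server at 127.0.0.1:14618 to respond...
Waiting for redis server at 127.0.0.1:31410 to respond...
Starting local scheduler with the following resources: {'CPU': 4, 'GPU': 0}.

I don't know whether that is supposed to show up on the console, but it doesn't. The script gets as far as the first read of the CSV, since I see "in curves" printed. Then it just sits and never does anything else. How can I find out what is going on?

The OS is Windows 10, in case that matters.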
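
One generic way to see where a silent script is stuck is the standard-library faulthandler module, which can dump the Python stack after a timeout without killing the process. The sketch below is only an illustration of that idea: slow_step and run_with_watchdog are hypothetical names, slow_step stands in for the read_csv call that never returns, and the dump only covers threads of the calling process, not any worker processes an engine may have started.

```python
import faulthandler
import sys
import time


def slow_step(seconds):
    """Hypothetical stand-in for a call that may hang, e.g. the first read_csv."""
    time.sleep(seconds)
    return "done"


def run_with_watchdog(step, timeout, log_file, *args):
    """Run step(*args). If it is still running after `timeout` seconds,
    write every thread's traceback to log_file; the step keeps running."""
    # log_file must be backed by a real file descriptor (it needs fileno()).
    faulthandler.dump_traceback_later(timeout, exit=False, file=log_file)
    try:
        return step(*args)
    finally:
        # Cancel the pending dump if the step finished before the timeout.
        faulthandler.cancel_dump_traceback_later()


if __name__ == "__main__":
    print("in curves", flush=True)
    # The step takes 1 s but the watchdog fires after 0.5 s,
    # so a traceback pointing into slow_step is written to stderr.
    run_with_watchdog(slow_step, 0.5, sys.stderr, 1.0)
    print("read curves", flush=True)
```

In the script from the question, the same wrapper could be placed around the first read_csv call with a timeout of a minute or so; the traceback would then show which call the main thread is blocked in.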