Python: refactoring NumPy arrays and a DataFrame into a dictionary

Tags: python, pandas, numpy, dictionary


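I'm trying to refactor some code so that there is less repetition. What I want to do is create the input for a multi-channel (multi-input) neural network. The features under consideration come from two entirely different sources, and one of the inputs is a 2D array that must stay in that format.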

I have the following code:

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Create input values
# (word_embeddings and df are defined earlier and are not shown here)
inputA = word_embeddings.numpy()
inputB = df['Features'].values
y = df['Target'].values

full_model_inputs = [inputA, inputB]

# Create dictionary
original_model_inputs = dict(inputA=inputA, inputB=inputB)

# Create train and validation data from inputs
# Preserve data dimensionality for the data split:
# wrapping the 2D array in list() stores one row per DataFrame cell
df = pd.DataFrame({"inputA": original_model_inputs["inputA"],
                   "inputB": list(original_model_inputs["inputB"])})

# Data split
x_train, x_valid, y_train, y_valid = train_test_split(df, y, test_size=0.25)

# Convert back to the original format
x_train = x_train.to_dict("list")
x_valid = x_valid.to_dict("list")

# Format dictionary items as arrays so the model can consume them
x_train = {k: np.array(v) for k, v in x_train.items()}
x_valid = {k: np.array(v) for k, v in x_valid.items()}
Are there any suggestions for improving this code? I'm just looking for some insight from the community.

What the dictionary looks like:

{'inputA': array([40., 68., 46., ..., 60., 42., 50.]),
 'inputB': array([[-1.915694  , -2.39863253, -1.75456583, ...,  2.11158562,
          2.42145038,  1.0996474 ],
        [-1.99583805, -2.38059568, -1.94454968, ...,  2.14585209,
          2.56227231,  1.2808286 ],
        [-2.1607585 , -2.29914975, -1.85722673, ...,  2.04741383,
          2.34712863,  1.77104282],
        ...,
        [-2.1576829 , -2.28505015, -1.71492636, ...,  2.05909061,
          2.43704724,  1.90647388],
        [-1.81904769, -2.74457788, -2.15936947, ...,  2.31333733,
          2.50243115,  1.75907826],
        [-2.01300311, -2.32310271, -2.00470185, ...,  2.09641671,
          2.53372359,  1.22000134]])}

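Comments:

This might be a better fit for CodeReview. While SO likes to tackle "numpy vectorization" and related optimization problems, refactoring and overall code organization are not a good fit for SO. But first, please take some time to look at the CR requirements and typical answers. They are fairly picky about code being complete and runnable.

Thanks! I didn't know that :) I appreciate it!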
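One possible way to cut the repetition, offered as a sketch rather than a definitive answer: the round trip through a DataFrame and back to a dictionary exists only so that both inputs get shuffled the same way. The same effect can be had by drawing one set of shuffled row indices and applying it to every array in the dictionary. The helper name `split_inputs`, its `seed` parameter, and the stand-in data below are all made up for illustration; nothing here comes from the original code. (`train_test_split` also accepts several arrays in one call, which may be an even shorter route if scikit-learn is already in use.)

```python
import numpy as np


def split_inputs(inputs, y, test_size=0.25, seed=None):
    """Split every array in `inputs` (and `y`) along axis 0 with one shared shuffle.

    Hypothetical helper for illustration; not part of any library.
    """
    y = np.asarray(y)
    n_samples = len(y)
    for name, arr in inputs.items():
        if len(arr) != n_samples:
            raise ValueError(f"{name!r} has {len(arr)} rows, expected {n_samples}")

    # One permutation shared by all arrays keeps the rows aligned.
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_valid = int(np.ceil(n_samples * test_size))
    valid_idx, train_idx = order[:n_valid], order[n_valid:]

    # Fancy indexing on axis 0 leaves the remaining dimensions untouched,
    # so a 2D input stays 2D.
    x_train = {name: np.asarray(arr)[train_idx] for name, arr in inputs.items()}
    x_valid = {name: np.asarray(arr)[valid_idx] for name, arr in inputs.items()}
    return x_train, x_valid, y[train_idx], y[valid_idx]


# Stand-in data with the shapes shown in the question (values are made up).
inputA = np.arange(8, dtype=float)                     # 1D, one value per sample
inputB = np.arange(8 * 3, dtype=float).reshape(8, 3)   # 2D, one row per sample
y = np.arange(8)

x_train, x_valid, y_train, y_valid = split_inputs(
    {"inputA": inputA, "inputB": inputB}, y, test_size=0.25, seed=0
)
print(x_train["inputB"].shape, x_valid["inputB"].shape)  # (6, 3) (2, 3)
```

Because the dictionaries are built directly from arrays, there is no need for the `to_dict("list")` step or the final conversion back to `np.array`, and adding a third input only means adding another key.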