Python 在数据帧中混合三维和二维数据
将这种数据结构存储在数据库中的惯用方法是什么:Python 在数据帧中混合三维和二维数据,python,pandas,Python,Pandas,将这种数据结构存储在数据库中的惯用方法是什么: ### Option 1 df = pd.DataFrame(data = [ {'kws' : np.array([0,0,0]), 'x' : i, 'y', i} for i in range(10) ]) # df.x and df.y works as expected # the list and array casting is required because df.kws is # an array of arrays
### Option 1
df = pd.DataFrame(data = [
{'kws' : np.array([0,0,0]), 'x' : i, 'y', i} for i in range(10)
])
# df.x and df.y works as expected
# the list and array casting is required because df.kws is
# an array of arrays
np.array(list(df.kws))
# this causes problems when trying to assign as well though:
# for any other data type, this would set all kws in df to the rhs [1,2,3]
# but since the rhs is a list, it tried to do an element-wise assignment and
# errors saying that the length of df and the length of the rhs do not match
df.kws = [1,2,3]
### Option 2
df = pd.DataFrame(data = [
{'kw_0' : 0, 'kw_1' : 0, 'kw_2' : 0, 'x' : i, 'y', i} for i in range(10)
])
# retrieving 2d array:
df[sorted([c for c in df if c.startswith('kw_')])].values
# batch set :
kws = [1,2,3]
for i, kw in enumerate(kws) :
df['kw_'+i] = kw
我觉得这两种解决方案都不对。首先,它们都不允许在不复制所有数据的情况下检索2d矩阵。有没有更好的方法来处理这种混合维度数据,或者这只是pandas目前没有完成的任务?只需使用列多索引 选择很容易
In [35]: df['kw']
Out[35]:
0 1 2
0 0 0 0
1 0 0 0
2 0 0 0
3 0 0 0
4 0 0 0
5 0 0 0
6 0 0 0
7 0 0 0
8 0 0 0
9 0 0 0
设置太多
In [36]: df.loc[1,'kw'] = [4,5,6]
In [37]: df
Out[37]:
kw value
0 1 2 x y
0 0 0 0 0 0
1 4 5 6 1 1
2 0 0 0 2 2
3 0 0 0 3 3
4 0 0 0 4 4
5 0 0 0 5 5
6 0 0 0 6 6
7 0 0 0 7 7
8 0 0 0 8 8
9 0 0 0 9 9
或者,您可以使用2个数据帧,索引相同,并在需要时合并/合并
In [36]: df.loc[1,'kw'] = [4,5,6]
In [37]: df
Out[37]:
kw value
0 1 2 x y
0 0 0 0 0 0
1 4 5 6 1 1
2 0 0 0 2 2
3 0 0 0 3 3
4 0 0 0 4 4
5 0 0 0 5 5
6 0 0 0 6 6
7 0 0 0 7 7
8 0 0 0 8 8
9 0 0 0 9 9