Python for loop to print the unique values from a DataFrame

python pandas for-loop

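I have a pandas DataFrame with about 52 columns. I want to get the unique values of each column. Suppose the column names are col1, col2, ... col52. To get the unique values of the first column, I can use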

df.col1.unique()

That would be painful if I had to do it for all 50 or so columns. So I tried something like the following, but it does not work:

for i in df.columns:
    print 'df.'+i+'.unque()'

Any suggestions?

If you only want to print the values, the simplest way is:
for col in df:
    print (col, df[col].unique())

A [1 2 3]
B [4 6]
C [9]
D [1 5]
E [5 3]
F [7 4]
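
If the values are needed later rather than just printed, one possible sketch is to collect them in a dict keyed by column name. The name unique_by_col is made up for illustration, and the frame is the sample used in this answer.

```python
import pandas as pd

# Sample frame from this answer.
df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 4, 6],
                   'C': [9, 9, 9],
                   'D': [1, 1, 5],
                   'E': [5, 3, 3],
                   'F': [7, 4, 7]})

# Iterating over a DataFrame yields its column labels, so df[col] looks
# each column up by name. Building the string 'df.' + col + '.unique()'
# only produces text; it never calls the method.
unique_by_col = {col: df[col].unique().tolist() for col in df}

for col, values in unique_by_col.items():
    print(col, values)
```

A dict is used here because the columns can have different numbers of unique values, which a rectangular structure cannot hold without padding.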

Alternatively, use apply, which collects the unique values of each column into a new DataFrame:
df1 = df.apply(lambda x: pd.Series(x.unique()))

Sample:
import pandas as pd

df = pd.DataFrame({'A':[1,2,3],
                   'B':[4,4,6],
                   'C':[9,9,9],
                   'D':[1,1,5],
                   'E':[5,3,3],
                   'F':[7,4,7]})

print (df)
   A  B  C  D  E  F
0  1  4  9  1  5  7
1  2  4  9  1  3  4
2  3  6  9  5  3  7

df1 = df.apply(lambda x: pd.Series(x.unique()))
print (df1)
   A    B    C    D    E    F
0  1  4.0  9.0  1.0  5.0  7.0
1  2  6.0  NaN  5.0  3.0  4.0
2  3  NaN  NaN  NaN  NaN  NaN
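
The float values in that result come from NaN padding. As a rough sketch, assuming a pandas version that supports the nullable Int64 dtype, the padded frame can be converted back to integers for display. The name df1_int is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 4, 6],
                   'C': [9, 9, 9],
                   'D': [1, 1, 5],
                   'E': [5, 3, 3],
                   'F': [7, 4, 7]})

df1 = df.apply(lambda x: pd.Series(x.unique()))

# Columns with fewer unique values than the longest one are padded with
# NaN, and NaN forces those columns to float64.
print(df1.dtypes)

# Optional: the nullable integer dtype keeps whole numbers as integers
# and shows the padding as <NA> instead of NaN.
df1_int = df1.astype('Int64')
print(df1_int)
```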

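The output differs from the loop version: columns with fewer unique values are padded with NaN, so those columns are upcast to float.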

Here is another approach. Consider the DataFrame df:

import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.random.randint(1, 100, (10, 5)),
    columns=['col{}'.format(i) for i in range(1, 6)]
)

print(df)

   col1  col2  col3  col4  col5
0     4    58    89    15    75
1    66    89    38     6    11
2    85    48    32    60    97
3    28     3    27    12    66
4    88    60    99    11    19
5    30    71     2    53    10
6    38    23    29     2    22
7    50     7    68    87     8
8    25    50     1    10    20
9    58    94    67    54     1


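Option 1

Combine pd.Series, np.unique, and a list comprehension (df.items() replaces the older df.iteritems(), which was removed in pandas 2.0):
pd.Series([np.unique(x) for _, x in df.items()], df.columns)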

col1    [4, 25, 28, 30, 38, 50, 58, 66, 85, 88]
col2     [3, 7, 23, 48, 50, 58, 60, 71, 89, 94]
col3     [1, 2, 27, 29, 32, 38, 67, 68, 89, 99]
col4     [2, 6, 10, 11, 12, 15, 53, 54, 60, 87]
col5     [1, 8, 10, 11, 19, 20, 22, 66, 75, 97]
dtype: object
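
One detail worth knowing when choosing between the two functions: np.unique returns its result sorted, while Series.unique keeps the order of first appearance. A minimal sketch, using a made-up column with repeated values:

```python
import numpy as np
import pandas as pd

# Hypothetical column with repeats, chosen only to show the ordering.
s = pd.Series([4, 66, 85, 28, 4, 66], name='col1')

in_order = s.unique()      # order of first appearance
in_sorted = np.unique(s)   # ascending order

print(in_order)
print(in_sorted)
```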

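Option 2

Use groupby with np.unique. Note that groupby(axis=1) is deprecated in recent pandas releases (2.1 and later), so this form may warn or fail there: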

df.groupby(axis=1, level=0).apply(np.unique)

col1    [4, 25, 28, 30, 38, 50, 58, 66, 85, 88]
col2     [3, 7, 23, 48, 50, 58, 60, 71, 89, 94]
col3     [1, 2, 27, 29, 32, 38, 67, 68, 89, 99]
col4     [2, 6, 10, 11, 12, 15, 53, 54, 60, 87]
col5     [1, 8, 10, 11, 19, 20, 22, 66, 75, 97]
dtype: object

Thanks. Your solution also works well, but I can only accept one.