Python: How do I expand the output display to see more columns of a pandas DataFrame?


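Is there a way to widen the display of output in either interactive or script-execution mode?

Specifically, I am using the describe() function on a pandas DataFrame. When the DataFrame is five columns (labels) wide, I get the descriptive statistics that I want. However, if the DataFrame has any more columns, the statistics are suppressed and something like this is returned: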

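>> Index: 8 entries, count to max
>> Data columns:
>> x1    8  non-null values
>> x2    8  non-null values
>> x3    8  non-null values
>> x4    8  non-null values
>> x5    8  non-null values
>> x6    8  non-null values
>> x7    8  non-null values

The "8" value is given whether there are 6 or 7 columns. What does the "8" refer to?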

I have already tried dragging the IDLE window larger, as well as increasing the "Configure IDLE" width options, to no avail.


My purpose in using pandas and describe() is to avoid using a second program such as Stata for basic data manipulation and investigation.

You can use
print(df.describe().to_string())
to force it to show the whole table. (You can use to_string() like this on any DataFrame. The result of describe() is just a DataFrame itself.)
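As a minimal sketch of the to_string() approach, the snippet below builds a hypothetical 30-column frame (the names x0..x29 and the values are made up for illustration) and renders its describe() table in full. It assumes a reasonably recent pandas and Python 3.

```python
import numpy as np
import pandas as pd

# Hypothetical wide frame: 8 rows, 30 numeric columns named x0..x29
df = pd.DataFrame(
    np.arange(240, dtype=float).reshape(8, 30),
    columns=[f"x{i}" for i in range(30)],
)

# to_string() renders every column, regardless of display.max_columns
full_text = df.describe().to_string()
print(full_text)
```

Because to_string() returns an ordinary string, it can also be written to a file or a log instead of being printed.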


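The 8 is the number of rows in the DataFrame holding the "description" (because describe() computes 8 statistics: count, mean, std, min, the three quartiles, and max).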
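To illustrate why the "8" does not depend on the number of columns, here is a small sketch with two made-up frames (narrow and wide are hypothetical names). For all-numeric data, describe() returns one row per statistic and one column per input column.

```python
import numpy as np
import pandas as pd

# Two hypothetical frames with different numbers of numeric columns
narrow = pd.DataFrame(np.random.rand(50, 3), columns=list("abc"))
wide = pd.DataFrame(np.random.rand(50, 12))

narrow_desc = narrow.describe()
wide_desc = wide.describe()

# Both results have 8 rows: one per summary statistic
print(narrow_desc.shape)
print(wide_desc.shape)
print(list(narrow_desc.index))
```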

You can adjust pandas print options with set_printoptions:

In [3]: df.describe()
Out[3]:
<class 'pandas.core.frame.DataFrame'>
Index: 8 entries, count to max
Data columns:
x1    8  non-null values
x2    8  non-null values
x3    8  non-null values
x4    8  non-null values
x5    8  non-null values
x6    8  non-null values
x7    8  non-null values
dtypes: float64(7)

In [4]: pd.set_printoptions(precision=2)

In [5]: df.describe()
Out[5]:
            x1       x2       x3       x4       x5       x6       x7
count      8.0      8.0      8.0      8.0      8.0      8.0      8.0
mean   69024.5  69025.5  69026.5  69027.5  69028.5  69029.5  69030.5
std       17.1     17.1     17.1     17.1     17.1     17.1     17.1
min    69000.0  69001.0  69002.0  69003.0  69004.0  69005.0  69006.0
25%    69012.2  69013.2  69014.2  69015.2  69016.2  69017.2  69018.2
50%    69024.5  69025.5  69026.5  69027.5  69028.5  69029.5  69030.5
75%    69036.8  69037.8  69038.8  69039.8  69040.8  69041.8  69042.8
max    69049.0  69050.0  69051.0  69052.0  69053.0  69054.0  69055.0

Furthermore, the API for setting pandas options has changed:

In [4]: pd.set_option('display.precision', 2)

In [5]: df.describe()
Out[5]:
            x1       x2       x3       x4       x5       x6       x7
count      8.0      8.0      8.0      8.0      8.0      8.0      8.0
mean   59832.4  27356.7  49317.3  51214.8  51254.8  41863.0  33950.2
std    22600.7  26867.2  28071.7  21012.4  33831.5  38709.5  29075.7
min    31906.7   1648.4     56.4  16278.3     43.7   3591.0   1833.5
25%    45264.6  12799.5  41429.6  40374.3  29789.6  15145.8   6879.5
50%    56340.2  18666.5  51995.7  54894.6  47667.7  22139.2  33706.0
75%    75587.0  31375.6  61069.2  67811.9  76014.9  72039.0  51449.9
max    98136.5  84544.5  91744.0  75154.6  99012.7  98601.2  83309.1

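Update: pandas 0.23.4 onwards

Setting an explicit width is not necessary. If you set
pd.options.display.width = 0
pandas detects the size of the terminal window automatically. (For older versions, see the bottom.)

pandas.set_printoptions(...) is deprecated. Instead, use pandas.set_option(optname, val), or equivalently pd.options.<option name> = val. For example: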

import pandas as pd
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)

Here is the help for set_option:

set_option(pat,value) - Sets the value of the specified option

Available options:
display.[chop_threshold, colheader_justify, column_space, date_dayfirst,
         date_yearfirst, encoding, expand_frame_repr, float_format, height,
         line_width, max_columns, max_colwidth, max_info_columns, max_info_rows,
         max_rows, max_seq_items, mpl_style, multi_sparse, notebook_repr_html,
         pprint_nest_depth, precision, width]
mode.[sim_interactive, use_inf_as_null]

Parameters
----------
pat - str/regexp which should match a single option.

Note: partial matches are supported for convenience, but unless you use the
full option name (e.g., *x.y.z.option_name*), your code may break in future
versions if new options with similar names are introduced.

value - new value of option.

Returns
-------
None

Raises
------
KeyError if no such option exists

display.chop_threshold: [default: None] [currently: None]
: float or None
    if set to a float value, all float values smaller then the given
    threshold will be displayed as exactly 0 by repr and friends.
display.colheader_justify: [default: right] [currently: right]
: 'left'/'right'
    Controls the justification of column headers. used by DataFrameFormatter.
display.column_space: [default: 12] [currently: 12]No description available.
display.date_dayfirst: [default: False] [currently: False]
: boolean
    When True, prints and parses dates with the day first, eg 20/01/2005
display.date_yearfirst: [default: False] [currently: False]
: boolean
    When True, prints and parses dates with the year first, e.g., 2005/01/20
display.encoding: [default: UTF-8] [currently: UTF-8]
: str/unicode
    Defaults to the detected encoding of the console.
    Specifies the encoding to be used for strings returned by to_string,
    these are generally strings meant to be displayed on the console.
display.expand_frame_repr: [default: True] [currently: True]
: boolean
    Whether to print out the full DataFrame repr for wide DataFrames
    across multiple lines, `max_columns` is still respected, but the output
    will wrap-around across multiple "pages" if it's width exceeds
    `display.width`.
display.float_format: [default: None] [currently: None]
: callable
    The callable should accept a floating point number and return
    a string with the desired format of the number. This is used
    in some places like SeriesFormatter.
    See core.format.EngFormatter for an example.
display.height: [default: 60] [currently: 1000]
: int
    Deprecated.
    (Deprecated, use `display.height` instead.)
display.line_width: [default: 80] [currently: 1000]
: int
    Deprecated.
    (Deprecated, use `display.width` instead.)
display.max_columns: [default: 20] [currently: 500]
: int
    max_rows and max_columns are used in __repr__() methods to decide if
    to_string() or info() is used to render an object to a string. In case
    python/IPython is running in a terminal this can be set to 0 and Pandas
    will correctly auto-detect the width the terminal and swap to a smaller
    format in case all columns would not fit vertically. The IPython notebook,
    IPython qtconsole, or IDLE do not run in a terminal and hence it is not
    possible to do correct auto-detection.
    'None' value means unlimited.
display.max_colwidth: [default: 50] [currently: 50]
: int
    The maximum width in characters of a column in the repr of
    a Pandas data structure. When the column overflows, a "..."
    placeholder is embedded in the output.
display.max_info_columns: [default: 100] [currently: 100]
: int
    max_info_columns is used in DataFrame.info method to decide if
    per column information will be printed.
display.max_info_rows: [default: 1690785] [currently: 1690785]
: int or None
    max_info_rows is the maximum number of rows for which a frame will
    perform a null check on its columns when repr'ing To a console.
    The default is 1,000,000 rows. So, if a DataFrame has more
    1,000,000 rows there will be no null check performed on the
    columns and thus the representation will take much less time to
    display in an interactive session. A value of None means always
    perform a null check when repr'ing.
display.max_rows: [default: 60] [currently: 500]
: int
    This sets the maximum number of rows Pandas should output when printing
    out various output. For example, this value determines whether the repr()
    for a dataframe prints out fully or just a summary repr.
    'None' value means unlimited.
display.max_seq_items: [default: None] [currently: None]
: int or None
    when pretty-printing a long sequence, no more then `max_seq_items`
    will be printed. If items are ommitted, they will be denoted by the
    addition of "..." to the resulting string.
    If set to None, the number of items to be printed is unlimited.
display.mpl_style: [default: None] [currently: None]
: bool
    Setting this to 'default' will modify the rcParams used by matplotlib
    to give plots a more pleasing visual style by default.
    Setting this to None/False restores the values to their initial value.
display.multi_sparse: [default: True] [currently: True]
: boolean
    "sparsify" MultiIndex display (don't display repeated
    elements in outer levels within groups)
display.notebook_repr_html: [default: True] [currently: True]
: boolean
    When True, IPython notebook will use html representation for
    Pandas objects (if it is available).
display.pprint_nest_depth: [default: 3] [currently: 3]
: int
    Controls the number of nested levels to process when pretty-printing
display.precision: [default: 7] [currently: 7]
: int
    Floating point output precision (number of significant digits). This is
    only a suggestion
display.width: [default: 80] [currently: 1000]
: int
    Width of the display in characters. In case python/IPython is running in
    a terminal this can be set to None and Pandas will correctly auto-detect
    the width.
    Note that the IPython notebook, IPython qtconsole, or IDLE do not run in a
    terminal and hence it is not possible to correctly detect the width.
mode.sim_interactive: [default: False] [currently: False]
: boolean
    Whether to simulate interactive mode for purposes of testing
mode.use_inf_as_null: [default: False] [currently: False]
: boolean
    True means treat None, NaN, INF, -INF as null (old way),
    False means None and NaN are null, but INF, -INF are not null
    (new way).

Call def: pd.set_option(self, *args, **kwds)
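The help text above describes set_option; a short sketch of the round trip may help. It sets two options with their full names, reads them back with get_option, and restores the defaults with reset_option. The values 500 and 1000 are just the ones used earlier in this thread, not recommendations.

```python
import pandas as pd

# Set options using their full names, as the help text recommends
pd.set_option("display.max_columns", 500)
pd.set_option("display.width", 1000)

# Read the current values back
current = (pd.get_option("display.max_columns"), pd.get_option("display.width"))
print(current)

# Restore the library defaults when the wide display is no longer needed
pd.reset_option("display.max_columns")
pd.reset_option("display.width")
restored_width = pd.get_option("display.width")
print(restored_width)
```

pd.describe_option("display.max_columns") prints the description of a single option, which is usually easier to read than the full help listing.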
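# Print wide frames on one line instead of wrapping them across several "pages"
pd.set_option('display.expand_frame_repr', False)
# Match the display width to the terminal (pd.util.terminal is gone from recent pandas)
import shutil
pd.set_option('display.width', shutil.get_terminal_size().columns)
# Lift the row and column limits only inside the with-block
with pd.option_context('display.max_rows', None, 'display.max_columns', None):
    print(df)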
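A small sketch of why option_context can be preferable to set_option: the changed options apply only inside the with-block and revert afterwards. The frame below is a made-up 3 x 40 example, and the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame that is wider than the default column limit
df = pd.DataFrame(np.zeros((3, 40)))

before = pd.get_option("display.max_columns")

with pd.option_context("display.max_rows", None, "display.max_columns", None):
    inside = pd.get_option("display.max_columns")  # None means unlimited
    print(df)  # every column is printed here

after = pd.get_option("display.max_columns")  # back to the earlier value
```

This keeps the session-wide settings untouched, which is convenient when only one frame needs to be shown in full.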
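# Let long cell values show up to 800 characters before being cut off
pd.set_option('display.max_colwidth', 800)
# Truncate frames that exceed the limits instead of switching to the info() summary
pd.set_option('display.large_repr', 'truncate')
# In a terminal, 0 lets pandas detect how many columns fit
pd.set_option('display.max_columns', 0)
# None lets pandas detect the terminal width
pd.options.display.width = None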
In [1]: import pandas as pd

In [2]: pd.options.display.max_rows
Out[2]: 15

In [3]: pd.options.display.max_rows = 999

In [4]: pd.options.display.max_rows
Out[4]: 999
pd.set_option('display.max_columns', None)
pd.set_option('display.expand_frame_repr', False)
# None removes the column-width limit (pandas 1.0+; older releases used -1)
pd.set_option('display.max_colwidth', None)
# Environment settings:
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.max_seq_items', None)
pd.set_option('display.max_colwidth', 500)
pd.set_option('display.expand_frame_repr', True)
import pandas as pd
pd.set_option('display.max_columns', 100)
pd.set_option('display.width', 1000)

SentenceA = "William likes Piano and Piano likes William"
SentenceB = "Sara likes Guitar"
SentenceC = "Mamoosh likes Piano"
SentenceD = "William is a CS Student"
SentenceE = "Sara is kind"
SentenceF = "Mamoosh is kind"


bowA = SentenceA.split(" ")
bowB = SentenceB.split(" ")
bowC = SentenceC.split(" ")
bowD = SentenceD.split(" ")
bowE = SentenceE.split(" ")
bowF = SentenceF.split(" ")

# Creating a set consisting of all words

wordSet = set(bowA).union(set(bowB)).union(set(bowC)).union(set(bowD)).union(set(bowE)).union(set(bowF))
print("Set of all words is: ", wordSet)

# Initiating dictionary with 0 value for all BOWs

wordDictA = dict.fromkeys(wordSet, 0)
wordDictB = dict.fromkeys(wordSet, 0)
wordDictC = dict.fromkeys(wordSet, 0)
wordDictD = dict.fromkeys(wordSet, 0)
wordDictE = dict.fromkeys(wordSet, 0)
wordDictF = dict.fromkeys(wordSet, 0)

for word in bowA:
    wordDictA[word] += 1
for word in bowB:
    wordDictB[word] += 1
for word in bowC:
    wordDictC[word] += 1
for word in bowD:
    wordDictD[word] += 1
for word in bowE:
    wordDictE[word] += 1
for word in bowF:
    wordDictF[word] += 1

# Printing term frequency

print("SentenceA TF: ", wordDictA)
print("SentenceB TF: ", wordDictB)
print("SentenceC TF: ", wordDictC)
print("SentenceD TF: ", wordDictD)
print("SentenceE TF: ", wordDictE)
print("SentenceF TF: ", wordDictF)

print(pd.DataFrame([wordDictA, wordDictB, wordDictC, wordDictD, wordDictE, wordDictF]))
   CS  Guitar  Mamoosh  Piano  Sara  Student  William  a  and  is  kind  likes
0   0       0        0      2     0        0        2  0    1   0     0      2
1   0       1        0      0     1        0        0  0    0   0     0      1
2   0       0        1      1     0        0        0  0    0   0     0      1
3   1       0        0      0     0        1        1  1    0   1     0      0
4   0       0        0      0     1        0        0  0    0   1     1      0
5   0       0        1      0     0        0        0  0    0   1     1      0
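# To see only the column names, without printing the frame itself:
df.columns.values
# Or print one column name per line
for col in df.columns:
    print(col)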
pd.set_option('display.max_columns', None)
import pandas as pd
pd.options.display.max_columns = 10
pd.options.display.max_rows = 999
pd.options.display.max_columns = 100
from IPython.display import display  # already available inside Jupyter

def display_all(df):  # For any DataFrame df
    # Adjust the row and column limits as needed
    with pd.option_context('display.max_rows', 1000, 'display.max_columns', 1000):
        display(df)
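import math
col_range = 5  # number of columns to describe at a time
for i in range(math.ceil(len(df_data.columns) / col_range)):
    idx1 = i * col_range
    idx2 = idx1 + col_range
    # describe() on a five-column slice always fits the default display
    print(df_data.iloc[:, idx1:idx2].describe())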
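The column-chunking idea above can also be wrapped in a reusable helper. This is only a sketch: describe_in_chunks is a hypothetical name, and the 12-column frame is made-up data.

```python
import numpy as np
import pandas as pd

def describe_in_chunks(df, cols_per_chunk=5):
    """Yield describe() results for successive groups of columns."""
    for start in range(0, df.shape[1], cols_per_chunk):
        yield df.iloc[:, start:start + cols_per_chunk].describe()

# Hypothetical frame with 12 numeric columns
df_data = pd.DataFrame(
    np.random.rand(20, 12),
    columns=[f"x{i}" for i in range(12)],
)

chunks = list(describe_in_chunks(df_data))
for chunk in chunks:
    print(chunk)
```

Stepping range() by the chunk size avoids the explicit ceiling division, and the last chunk simply holds whatever columns remain.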
import numpy as np
# Widen the line length NumPy uses when printing arrays
np.set_printoptions(linewidth=160)
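Note that NumPy's linewidth setting governs how NumPy arrays are printed; the width of a DataFrame repr is controlled by the pandas display options discussed above. A minimal sketch of the NumPy side, using the np.printoptions context manager (available in NumPy 1.15 and later) so the change stays local:

```python
import numpy as np

arr = np.arange(40)

# With the default line width the array wraps onto several lines
default_text = str(arr)

# With a wider line the same array fits on a single line
with np.printoptions(linewidth=160):
    wide_text = str(arr)

print(default_text)
print(wide_text)
```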