Python frequency table using pandas
I'm using pandas lineSer.value_counts() to build a frequency table, but it doesn't show all of my items. I have more than 100 data points and I need to see all of them.
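import pandas as pd

def freqTable():
    fileIn = open('data.txt', 'r')
    fileOut = open('dataOut.txt', 'w')
    # keep non-blank lines that do not start with 'com'
    lines = [line.strip() for line in fileIn if line.strip() and not line.startswith('com')]
    lineSer = pd.Series(lines)
    # str() follows the pandas display settings, so long output is abbreviated
    freq = str(lineSer.value_counts())
    for line in freq:
        fileOut.write(line)
    fileIn.close()
    fileOut.close()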
This is the code I'm using. I need to get rid of the "..." in the result and see all of the data points. What can I do differently?
Madding. 57
Crowning. 47
My. 8
And. 8
Thy. 7
Thou. 7
The. 5
To. 5
For. 5
I. 4
That. 4
In. 4
Love. 4
Is. 3
Not. 3
...
Did. 1
Shadows. 1
Of. 1
Mind,. 1
O'erlook. 1
Sometime. 1
Fairer. 1
Monsters,. 1
23. 1
Defect,. 1
Show,. 1
What's. 1
Wood. 1
So. 1
Lov'st,. 1
Length: 133, dtype: int64
Try this:
pd.options.display.max_rows = 999
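To see what this setting changes, here is a minimal sketch. The 133-value Series is made-up data standing in for the question's lines, and to_string() is an alternative not mentioned in this thread; it should render every row regardless of the display options.

```python
import pandas as pd

# Hypothetical data: 133 distinct values, more than the default display.max_rows
line_ser = pd.Series(["word{}".format(i) for i in range(133)])
freq = line_ser.value_counts()

# str() follows the display options, so the middle rows are elided
truncated = str(freq)

# Raising display.max_rows inside a with block affects only this block
with pd.option_context("display.max_rows", 999):
    full = str(freq)

# Alternative (an assumption, not from the answers): to_string() ignores display.max_rows
everything = freq.to_string()

print(len(truncated.splitlines()), len(full.splitlines()))
```

Setting pd.options.display.max_rows as in the answer above changes the option for the rest of the session, while option_context restores the previous value when the block exits.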
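If you want to write the table to a file, don't convert it to a string and write that. Pandas has built-in functions for writing to files. Just do lineSer.value_counts().to_csv('dataOut.txt'). If you want to adjust the format of the output, read the documentation for to_csv to see how to customize it. (You could also read the data in more efficiently with something like pandas.read_csv, but that is a separate topic.)
If you only need to display the data temporarily, try display.max_rows with option_context: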
# temporarily print up to 999 rows
with pd.option_context('display.max_rows', 999):
    print(freq)
More information is available in the pandas documentation.
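The to_csv suggestion above can be sketched as follows. The sample values, the temporary output path, and the tab separator are assumptions chosen for illustration; sep and header are the to_csv parameters that control the field separator and the column-name row.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample standing in for the lines read from data.txt
line_ser = pd.Series(["Madding.", "Madding.", "Crowning.", "Thy."])
freq = line_ser.value_counts()

# Write to a temporary directory so the sketch does not touch real files
out_path = os.path.join(tempfile.mkdtemp(), "dataOut.txt")

# Every row is written, whatever display.max_rows is set to
freq.to_csv(out_path, sep="\t", header=False)

with open(out_path) as f:
    written = f.read()

print(written)
```

Because to_csv serializes the Series itself rather than its printed form, the "..." abbreviation never reaches the file.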
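I tried modifying your solution to use pandas functions for reading the data, working with the string data, and writing the output to a file: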
import pandas as pd
import io
temp=u"""Madding.
Madding.
Madding.
Madding.
Crowning.
Crowning.
com Crowning.
com My.
com And.
Thy.
Thou.
The."""
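# after testing, replace io.StringIO(temp) with 'data.txt'
# note: the first line is read as the header and becomes the Series name;
# pass header=None to keep it as data
# squeeze("columns") replaces the squeeze=True argument removed in pandas 2.0
s = pd.read_csv(io.StringIO(temp), sep="|").squeeze("columns")
print(s)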
0 Madding.
1 Madding.
2 Madding.
3 Crowning.
4 Crowning.
5 com Crowning.
6 com My.
7 com And.
8 Thy.
9 Thou.
10 The.
Name: Madding., dtype: object
# strip surrounding whitespace
s = s.str.strip()
# flag the rows that start with 'com'
print(s.str.startswith('com'))
0 False
1 False
2 False
3 False
4 False
5 True
6 True
7 True
8 False
9 False
10 False
Name: Madding., dtype: bool
# keep only the rows that do not start with 'com'
s = s[~s.str.startswith('com')]
print(s)
0 Madding.
1 Madding.
2 Madding.
3 Crowning.
4 Crowning.
8 Thy.
9 Thou.
10 The.
Name: Madding., dtype: object
# count frequencies
freq = s.value_counts()
# temporarily print up to 999 rows
with pd.option_context('display.max_rows', 999):
    print(freq)
Madding. 3
Crowning. 2
Thou. 1
Thy. 1
The. 1
Name: Madding., dtype: int64
#write series to file by to_csv
freq.to_csv('dataOut.txt', sep=';')
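Putting the steps above together, here is one possible end-to-end sketch. The function name freq_table, its prefix parameter, and the column name "line" are hypothetical. It passes header=None so the first line is counted as data (in the walkthrough above the first "Madding." is consumed as the header), and it assumes that no line contains the "|" separator or a double quote.

```python
import io

import pandas as pd


def freq_table(source, prefix="com"):
    """Count stripped, non-blank lines that do not start with `prefix`."""
    lines = pd.read_csv(
        source,
        sep="|",                # assumption: '|' never appears in the data
        header=None,            # treat the first line as data, not as a header
        names=["line"],
        dtype=str,              # keep numeric-looking lines such as "23" as text
        keep_default_na=False,  # do not turn words like "NA" into missing values
    )["line"]
    lines = lines.str.strip()
    lines = lines[(lines != "") & ~lines.str.startswith(prefix)]
    return lines.value_counts()


# Sample data in the spirit of the answer above
temp = u"""Madding.
Madding.
Madding.
Madding.
Crowning.
Crowning.
  com Crowning.
  com My.
Thy."""

freq = freq_table(io.StringIO(temp))
print(freq.to_string())
```

To use it on the real file, pass 'data.txt' instead of io.StringIO(temp) and write the result with freq.to_csv('dataOut.txt', sep=';') as in the last step above.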