Python: reading statistics from a .txt file and outputting them

python python-3.x

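I'm supposed to get certain information from a .txt file and output it. This is the information I need:

The state with the highest population
The state with the lowest population
The average state population
The population of Texas

The file looks like this sample: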

Alabama
AL
4802982
Alaska
AK
721523
Arizona
AZ
6412700
Arkansas
AR
2926229
California
CA
37341989

Here is my code, which doesn't really do any of what I need it to do:

def main():
    # Open the StateCensus2010.txt file.
    census_file = open('StateCensus2010.txt', 'r')
    # Read the state name
    state_name = census_file.readline()

    while state_name != '':
        state_abv = census_file.readline()
        population = int(census_file.readline())

        state_name = state_name.rstrip('\n')
        state_abv = state_abv.rstrip('\n')

        print('State Name: ', state_name)
        print('State Abv.: ', state_abv)
        print('Population: ', population)
        print()

        state_name = census_file.readline()
    census_file.close()
main()

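All it does is read the state name and abbreviation and convert the population to an integer. I don't need it to do that; I'm just not sure how to do what the assignment asks. Any hints would be greatly appreciated! I've been trying things for the past few hours and none of them have worked.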

Update:

With my updated code I get the following error:

Traceback (most recent call last):
  File "main.py", line 13, in <module>
    if population > max_population:
TypeError: unorderable types: str() > int()
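
The traceback says that `population` is still a `str` when it is compared with the integer `max_population`. Lines read from a file are strings until they are converted. Below is a minimal sketch of the conversion. The updated code is not shown, so the loop here is only a guess at its shape, and `find_max` is a hypothetical helper name:

```python
import io

def find_max(lines):
    """Return (name, population) of the largest state in a 3-lines-per-state stream."""
    max_name = ''
    max_population = 0
    name = lines.readline()
    while name != '':
        lines.readline()                     # skip the abbreviation line
        population = int(lines.readline())   # convert str -> int before comparing
        if population > max_population:
            max_population = population
            max_name = name.rstrip('\n')
        name = lines.readline()
    return max_name, max_population

# io.StringIO stands in for the real file in this sketch
sample = io.StringIO("Alabama\nAL\n4802982\nAlaska\nAK\n721523\n")
print(find_max(sample))
```

Without the `int(...)` call the comparison mixes `str` and `int`, which Python 3 refuses to order.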

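Because the data is in a consistent order (state name, state abbreviation, population), you only need to read the lines once and then display all three pieces of information. Here is some sample code: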

total = 0
state_min = None
state_max = None
statename_min = ''
statename_max = ''
texas_population = 0
with open('StateCensus2010.txt', 'r') as file:
    # splitlines() splits on newlines without leaving an empty
    # string at the end when the file ends with a newline
    data = file.read().splitlines()

# there are 50 states in the text file and each state has
# 3 pieces of information stored:
# state name, state abbreviation, population
# that's why the length of data is 150, and 150 // 3 = 50 states
state_total = len(data) // 3

# count is the index of the first line of each state, so it moves
# in steps of 3: the first three lines are the Alabama info,
# the next three lines are the Alaska info, and so on
for count in range(0, state_total * 3, 3):
    statename = data[count]
    state_abv = data[count + 1]
    population = int(data[count + 2])

    print('Statename : ', statename)
    print('State Abv : ', state_abv)
    print('Population: ', population)
    print()

    # sum all state populations
    total += population

    # the first state seen initialises both extremes
    if state_max is None or population > state_max:
        state_max = population
        statename_max = statename

    if state_min is None or population < state_min:
        state_min = population
        statename_min = statename

    if statename == 'Texas':
        texas_population = population

# divide the total population by the number of states
average = total / state_total
print('Average population :', average)

print('Lowest population state :', statename_min)
print('Highest population state :', statename_max)
print('Texas population :', texas_population)
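
Since every state takes exactly three lines, the list can also be cut into records with slicing, and the built-ins `max`, `min` and `sum` can do the bookkeeping. This is a sketch under the same assumptions as the code above (three lines per state, no blank lines). `summarize` is a hypothetical helper name, not part of the original answer:

```python
def summarize(lines):
    """Compute the requested statistics from a flat list of lines."""
    names = lines[0::3]                           # every third line, starting at index 0
    populations = [int(p) for p in lines[2::3]]   # every third line, starting at index 2
    records = list(zip(names, populations))
    largest = max(records, key=lambda r: r[1])    # compare records by population
    smallest = min(records, key=lambda r: r[1])
    average = sum(populations) / len(populations)
    return largest, smallest, average, dict(records)

sample = ["Alabama", "AL", "4802982",
          "Alaska", "AK", "721523",
          "Arizona", "AZ", "6412700"]
largest, smallest, average, by_name = summarize(sample)
print(largest[0], smallest[0], average)
print(by_name.get("Texas"))   # None here, because the sample has no Texas
```

With the real file, the list passed in would be the result of `file.read().splitlines()`, and `by_name["Texas"]` would give the Texas population.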

This problem is easy to solve.

Code:

import pandas as pd

# data is the file-like object defined under "Test data" below;
# with the real file, use data = open('StateCensus2010.txt') instead
states = []
for line in data:
    states.append(
        dict(state=line.strip(),
             abbrev=next(data).strip(),
             pop=int(next(data)),
             )
    )

# fix the column order so the output matches the result below
df = pd.DataFrame(states, columns=['abbrev', 'pop', 'state'])
print(df)

# df.ix was removed from pandas; df.loc does the label lookup
print('\nmax population:\n', df.loc[df['pop'].idxmax()])
print('\nmin population:\n', df.loc[df['pop'].idxmin()])
print('\navg population:\n', df['pop'].mean())
print('\nAZ population:\n', df[df.abbrev == 'AZ'])

Test data:

from io import StringIO
data = StringIO('\n'.join([x.strip() for x in """
    Alabama
    AL
    4802982
    Alaska
    AK
    721523
    Arizona
    AZ
    6412700
    Arkansas
    AR
    2926229
    California
    CA
    37341989
""".split('\n')[1:-1]]))

Result:

  abbrev       pop       state
0     AL   4802982     Alabama
1     AK    721523      Alaska
2     AZ   6412700     Arizona
3     AR   2926229    Arkansas
4     CA  37341989  California

max population:
abbrev            CA
pop         37341989
state     California
Name: 4, dtype: object

min population:
abbrev        AK
pop       721523
state     Alaska
Name: 1, dtype: object

avg population:
10441084.6

AZ population:
  abbrev      pop    state
2     AZ  6412700  Arizona
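
The loop in this answer relies on `data` being an iterator: each `next(data)` inside the `for` body consumes one more line, so a single pass of the loop handles a whole three-line record. Here is a small sketch of that pattern without pandas, using `io.StringIO` to stand in for the file. `read_records` is a hypothetical name chosen for illustration:

```python
import io

def read_records(stream):
    """Yield one dict per state from a stream with three lines per state."""
    for line in stream:                       # the for loop pulls the state name
        yield {
            'state': line.strip(),
            'abbrev': next(stream).strip(),   # pulls the following line: abbreviation
            'pop': int(next(stream)),         # pulls the line after that: population
        }

data = io.StringIO("Alabama\nAL\n4802982\nAlaska\nAK\n721523\n")
records = list(read_records(data))
print(records)
```

The pattern assumes the stream always holds complete records. If it ends partway through one, `next` has nothing left to return and the loop fails with an exception instead of quietly producing a partial record.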

Please try this instead, because the previous code was not compatible with Python 3; it only supported Python 2.7.

def extract_data(state):
    total_population = 0
    highest = lowest = None
    highest_state = lowest_state = None

    for abv, stats in state.items():
        population = stats.get('population')
        total_population += population

        # the first state seen initialises both extremes
        if highest is None or population > highest:
            highest = population
            highest_state = abv

        if lowest is None or population < lowest:
            lowest = population
            lowest_state = abv

    print(highest_state, highest)
    print(lowest_state, lowest)
    print(len(state))
    print(int(total_population / len(state)))
    print(state.get('TX').get('population'))

def main():
    state = {}
    # Open the census file (named states.txt here).
    with open('states.txt', 'r') as census_file:
        # Read the state name
        state_name = census_file.readline()

        while state_name != '':
            state_abv = census_file.readline()
            population = int(census_file.readline())
            state_name = state_name.rstrip('\n')
            state_abv = state_abv.rstrip('\n')

            # one entry per state, keyed by its abbreviation
            state[state_abv] = {'population': population, 'state_name': state_name}

            state_name = census_file.readline()
    return state

state = main()
extract_data(state)
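
With the data stored as a dictionary keyed by abbreviation, as in this answer, the extremes can also be found by passing a `key` function to `max` and `min`. This is a sketch that assumes the same `{'population': ..., 'state_name': ...}` layout. `describe` is a hypothetical helper that returns the values instead of printing them:

```python
def describe(state):
    """Return the abbreviations of the largest and smallest states, plus the average."""
    def population_of(abv):
        return state[abv]['population']

    highest = max(state, key=population_of)   # iterates over the dict's keys
    lowest = min(state, key=population_of)
    average = sum(s['population'] for s in state.values()) / len(state)
    return highest, lowest, average

state = {
    'AL': {'population': 4802982, 'state_name': 'Alabama'},
    'AK': {'population': 721523, 'state_name': 'Alaska'},
    'CA': {'population': 37341989, 'state_name': 'California'},
}
highest, lowest, average = describe(state)
print(state[highest]['state_name'], state[lowest]['state_name'], average)
```

Returning the abbreviations keeps the lookup of the full name and population as a plain dictionary access, the same way `state.get('TX')` is used above.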
