Python: computing averages from a dataset that mixes text and numbers

I was asked to write a Python program that reads a file and computes each country's average GDP over 10 years.

Basically, the result I expect is:

Australia: 1248467214849.1
Azerbaijan: 55506365440.0
Bangladesh: 139036345780.9
Brazil: 2057882976008.9
Brunei Darussalam: 14817756697.0
Burkina Faso: 10081729086.1
Cabo Verde: 1719693752.3
Cambodia: 13779735437.1
Chile: 229246627569.0
China: 7784747168448.6
Czech Republic: 207328405561.6
Dominica: 499171357.0
Egypt, Arab Rep.: 247614743339.3
France: 2702817149305.2
Germany: 3582562859622.3
Greece: 270091322197.4
Guam: 5115700000.0
India: 1726508317353.4
Iran, Islamic Rep.: 454617559842.3
Iraq: 169480789377.9
Japan: 5217301203153.5
Jordan: 29469864942.1
Kazakhstan: 168198946242.6
Kenya: 48807995178.8
Korea, Rep.: 1205755199135.1
Latvia: 28908355369.8
Lebanon: 40455121214.3
Lithuania: 42763449721.2
Madagascar: 9486935333.5
Malaysia: 274833978374.2
Mali: 11894695436.7
Mongolia: 9207583282.1
Mozambique: 12838623643.4
Myanmar: 50703575766.4
Nicaragua: 10212597587.4
Nigeria: 375494148527.7
Paraguay: 23250819867.6
Philippines: 231981575952.4
Qatar: 149455747118.1
Singapore: 257026873704.2
Spain: 1404296966483.9
Sweden: 519174481541.8
Tanzania: 36731725995.3
Tunisia: 44118349316.0
Turkmenistan: 29383204467.2
United Kingdom: 2736682446205.8
United States: 16108231800000.0
Vietnam: 144579453846.2
Zambia: 21393965950.9
Zimbabwe: 11907947332.3

The provided text file looks like this:

853764622753
1055334825425
927168311000
1142876772659
1390557034408
1538194473087
1567178619062
1459597906913
1345383143356
1204616439828
Australia
33050343783
48852482960
44291490421
52902703376
65951627200
69684317719
74164435946
75244166773
53074370486
37847715736
Azerbaijan
79611888213
91631278239
102477791472
115279077465
128637938711
133355749482
149990451022
172885454931
195078665828
221415162446
Bangladesh
1397084381901
1695824517396
1667019605882
2208871646203
2616201578192
2465188674415
2472806919902
2455993200170
1803652649614
1796186586414
Brazil
12247694247
14393099069
10732366286
13707370737
18525319978
19048495519
18093829923
17098342541
12930394938
11400653732
Brunei Darussalam
6771277871
8369637065
8369175126
8979966766
10724063458
11166063467
11947176342
12377391463
10419303761
11693235542
Burkina Faso
1513934037
1789333749
1711817182
1664310770
1864824081
1751888562
1850951315
1858121723
1574288668
1617467436
Cabo Verde
8639235842
10351914093
10401851851
11242275199
12829541141
14038383450
15449630419
16777820333
18049954289
20016747754
Cambodia
173605968179
179638496279
172389498445
218537551220
252251992029
267122320057
278384332694
260990299051
242517905162
247027912574
Chile
3552182311653
4598206091384
5109953609257
6100620488868
7572553836875
8560547314679
9607224481533
10482372109962
11064666282626
11199145157649
China
189227050760
235718586901
206179982164
207477857919
227948349666
207376427021
209402444996
207818330724
186829940546
195305084919
Czech Republic
421375852
458190185
489074333
493824407
501025303
485997988
501979277
523666347
535095846
581484032
Dominica
130478960092
162818181818
188982374701
218888324505
236001858960
279372758362
288586231502
305529656458
332698041031
332791045964
Egypt, Arab Rep.
2663112510266
2923465651091
2693827452070
2646837111795
2862680142625
2681416108537
2808511203185
2849305322685
2433562015516
2465453975282
France
3439953462907
3752365607148
3418005001389
3417094562649
3757698281118
3543983909148
3752513503278
3890606893347
3375611100742
3477796274497
Germany
318497936901
354460802549
330000252153
299361576558
287797822093
245670666639
239862011450
237029579261
195541761243
192690813127
Greece
4375000000
4621000000
4781000000
4895000000
4928000000
5199000000
5337000000
5531000000
5697000000
5793000000
Guam
1201111768409
1186952757636
1323940295875
1656617073124
1823049927771
1827637859136
1856722121395
2035393459979
2089865410868
2263792499341
India
349881601459
406070949554
414059094949
487069570464
583500357530
598853401276
467414852231
434474616832
385874474399
418976679729
Iran, Islamic Rep.
88840050497
131613661510
111660855043
138516722650
185749664444
218000986223
234648370497
234648370497
179640210726
171489001692
Iraq
4515264514431
5037908465114
5231382674594
5700098114744
6157459594824
6203213121334
5155717056271
4848733415524
4383076298082
4940158776617
Japan
17110587447
21972004086
23820230000
26425379437
28840263380
30937277606
33593843662
35826925775
37517410282
38654727746
Jordan
104849886826
133441612247
115308661143
148047348241
192626507972
207998568866
236634552078
221415572820
184388432149
137278320084
Kazakhstan
31958195182
35895153328
37021512049
39999659234
41953433591
50412754822
55097343448
61445345999
63767539357
70529014778
Kenya
1122679154632
1002219052968
901934953365
1094499338703
1202463682634
1222807284485
1305604981272
1411333926201
1382764027114
1411245589977
Korea, Rep.
30901399261
35596016664
26169854045
23757368290
28223552825
28119996053
30314363219
31419072948
27009231911
27572698482
Latvia
24577114428
29227350570
35477118070
38419626628
40075674163
43868565282
46014226808
47833413749
49459296463
49598825982
Lebanon
39738180077
47850551149
37440673478
37120517694
43476878139
42847900766
46473646002
48545251796
41402022148
42738875963
Lithuania
7342923489
9413002921
8550363975
8729936136
9892702358
9919780071
10601690872
10673516673
9744243420
10001193420
Madagascar
193547824063
230813597938
202257586268
255016609233
297951960784
314443149443
323277158907
338061963396
296434003329
296535930381
Malaysia
8145694632
9750822511
10181021770
10678749467
12978107561
12442747897
13246412031
14388360064
13100058100
14034980334
Mali
4234999823
5623216449
4583850368
7189481824
10409797649
12292770631
12582122604
12226514722
11749620620
11183458131
Mongolia
9366742309
11494837053
10911698208
10154238250
13131168012
14534278446
16018848991
16961127046
14798439527
11014858592
Mozambique
20182477481
31862554102
36906181381
49540813342
59977326086
59937797559
60269734045
65446402659
59687373958
63225097051
Myanmar
7423377429
8496965842
8298695145
8758622329
9774316692
10532001130
10982972256
11880438824
12747741540
13230844687
Nicaragua
166451213396
208064753766
169481317540
369062464570
411743801712
460953836444
514966287207
568498937588
481066152889
404652720165
Nigeria
13794910634
18504130753
15929902138
20030528043
25099681461
24595319574
28965906502
30881166852
27282581336
27424071383
Paraguay
149359920006
174195135053
168334599538
199590775190
224143083707
250092093548
271836123724
284584522899
292774099014
304905406845
Philippines
79712087912
115270054945
97798351648
125122306346
167775274725
186833516484
198727747253
206224725275
164641483516
152451923077
Qatar
179981288567
192225881688
192408387762
236421782178
275599459374
289162118909
302510668904
308142766948
296840704102
296975678610
Singapore
1479341637011
1635015380108
1499099749931
1431616749640
1488067258325
1336018949806
1361854206549
1376910811041
1197789902774
1237255019654
Spain
487816328342
513965650650
429657033108
488377689565
563109663291
543880647757
578742001488
573817719109
497918109302
514459972806
Sweden
21501741757
27368386358
28573777052
31407908612
33878631649
39087748240
44333456245
48197218327
45628320606
47340071107
Tanzania
38908069299
44856586316
43454935940
44050929160
45810626509
45044112939
46251061734
47587913059
43156708809
42062549395
Tunisia
12664165103
19271523179
20214385965
22583157895
29233333333
35164210526
39197543860
43524210526
35799628571
36179885714
Turkmenistan
3074359743898
2890564338235
2382825985356
2441173394730
2619700404733
2662085168499
2739818680930
3022827781881
2885570309161
2647898654635
United Kingdom
14477635000000
14718582000000
14418739000000
14964372000000
15517926000000
16155255000000
16691517000000
17393103000000
18120714000000
18624475000000
United States
77414425532
99130304099
106014659770
115931749697
135539438560
155820001920
171222025117
186204652922
193241108710
205276172135
Vietnam
14056957976
17910858638
15328342304
20265556274
23460098340
25503370699
28045460442
27150630607
21154394546
21063989683
Zambia
5291950100
4415702800
8621573608
10141859710
12098450749
14242490252
15451768659
15891049236
16304667807
16619960402
Zimbabwe
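
What I have come up with so far: use an accumulating loop that checks whether the current line is a GDP value or a country name. When the loop reaches a country name, it should compute the average and print the result, then reset the per-country accumulator variables and keep looping to accumulate the next country's GDP values.

So, to handle the mixed nature of the input file, I would either use the str.isnumeric() method, or keep a counter to detect when 10 GDP values have been read (since the next line will then be the corresponding country name).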
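
The accumulating loop described above can be sketched roughly as follows. This is only one possible reading of the idea, not a definitive solution: the function name average_gdp_lines and the sample data are made up for illustration, and str.isdigit() is used for the numeric check, which assumes every GDP value is a non-negative whole number with no sign or decimal point.

```python
def average_gdp_lines(lines):
    """Return 'Country: average' strings from GDP lines followed by a country name."""
    results = []
    total = 0   # running sum of the GDP values seen for the current country
    count = 0   # how many GDP values went into that sum
    for line in lines:
        line = line.strip()
        if not line:
            continue                      # ignore blank lines
        if line.isdigit():
            total += int(line)            # a GDP value: accumulate it
            count += 1
        elif count > 0:
            # A country name closes the group: record the mean, then reset.
            results.append('{0}: {1}'.format(line, total / count))
            total = 0
            count = 0
    return results


# Hypothetical sample data in the same layout as the real file.
sample = ['10\n', '20\n', '30\n', 'Aland\n', '5\n', '6\n', 'Bland\n']
for row in average_gdp_lines(sample):
    print(row)
```

Because an open file can be iterated line by line, the same function should also accept the result of open('10year-gdp.txt') directly. Dividing by count rather than a fixed 10 means the loop does not depend on every country having exactly ten values.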


Something like this might work in Python 3:

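import statistics

with open('10year-gdp.txt') as f:
    items = []  # GDP values collected for the current country
    for line in f:
        line = line.strip()
        if line.isdigit():
            items.append(float(line))
        elif line and items:
            # A country name closes the group: print its mean, then reset.
            # Blank lines and names with no preceding values are skipped.
            print('{0}: {1}'.format(line, statistics.mean(items)))
            items = []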
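
It produces the following averages (shown here collected into a dict, so the order differs from the file):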

{'Guam': 5115700000.0, 'Lithuania': 42763449721.2, 'Azerbaijan': 55506365440.0, 'Bangladesh': 139036345780.9, 'Egypt, Arab Rep.': 247614743339.3, 'Burkina Faso': 10081729086.1, 'Chile': 229246627569.0, 'Mongolia': 9207583282.1, 'Nicaragua': 10212597587.4, 'Brazil': 2057882976008.9, 'Kenya': 48807995178.8, 'Dominica': 499171357.0, 'Japan': 5217301203153.5, 'India': 1726508317353.4, 'Cabo Verde': 1719693752.3, 'United States': 16108231800000.0, 'Greece': 270091322197.4, 'Myanmar': 50703575766.4, 'Madagascar': 9486935333.5, 'Tunisia': 44118349316.0, 'Mozambique': 12838623643.4, 'Cambodia': 13779735437.1, 'Iraq': 169480789377.9, 'Korea, Rep.': 1205755199135.1, 'Kazakhstan': 168198946242.6, 'Turkmenistan': 29383204467.2, 'Germany': 3582562859622.3, 'Iran, Islamic Rep.': 454617559842.3, 'France': 2702817149305.2, 'Paraguay': 23250819867.6, 'United Kingdom': 2736682446205.8, 'Malaysia': 274833978374.2, 'Philippines': 231981575952.4, 'Qatar': 149455747118.1, 'Lebanon': 40455121214.3, 'Jordan': 29469864942.1, 'Mali': 11894695436.7, 'Zambia': 21393965950.9, 'Australia': 1248467214849.1, 'Singapore': 257026873704.2, 'Zimbabwe': 11907947332.3, 'Sweden': 519174481541.8, 'Nigeria': 375494148527.7, 'China': 7784747168448.6, 'Tanzania': 36731725995.3, 'Czech Republic': 207328405561.6, 'Vietnam': 144579453846.2, 'Latvia': 28908355369.8, 'Spain': 1404296966483.9, 'Brunei Darussalam': 14817756697.0}
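
Since the averages above are shown as a dict, here is a minimal sketch of how they might be collected that way and then printed with one decimal place, like the expected output. The function name average_by_country and the sample data are assumptions made for illustration; only statistics.mean() comes from the answer itself.

```python
import statistics


def average_by_country(lines):
    """Map each country name to the mean of the numeric lines that precede it."""
    averages = {}
    values = []
    for line in lines:
        line = line.strip()
        if line.isdigit():
            values.append(float(line))
        elif line and values:
            # A non-numeric, non-blank line is treated as the country name.
            averages[line] = statistics.mean(values)
            values = []
    return averages


# Hypothetical sample data in the same layout as the real file.
sample = ['4\n', '8\n', 'Aland\n', '1\n', '2\n', '6\n', 'Bland\n']
averages = average_by_country(sample)
for country in sorted(averages):
    # '.1f' rounds each mean to one decimal place.
    print('{0}: {1:.1f}'.format(country, averages[country]))
```

Sorting the keys before printing gives alphabetical output regardless of the order in which the dict stores them.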

You can also try this:

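with open("10year-gdp.txt", "r") as infile:
    content = infile.readlines()

# Split the file into groups of 11 lines: 10 GDP values, then the country name.
content = [content[i:i+11] for i in range(0, len(content), 11)]
# For each group, join the name and the mean of its 10 values into one string.
results = [": ".join([c[10], str(sum(map(float, c[0:10])) / 10)]).replace("\n", "") for c in content]
for result in results:
    print(result)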

Output:

Australia: 1248467214849.1
Azerbaijan: 55506365440.0
Bangladesh: 139036345780.9
Brazil: 2057882976008.9
Brunei Darussalam: 14817756697.0
Burkina Faso: 10081729086.1
Cabo Verde: 1719693752.3
Cambodia: 13779735437.1
Chile: 229246627569.0
China: 7784747168448.6
Czech Republic: 207328405561.6
Dominica: 499171357.0
Egypt, Arab Rep.: 247614743339.3
France: 2702817149305.2
Germany: 3582562859622.3
Greece: 270091322197.4
Guam: 5115700000.0
India: 1726508317353.4
Iran, Islamic Rep.: 454617559842.3
Iraq: 169480789377.9
Japan: 5217301203153.5
Jordan: 29469864942.1
Kazakhstan: 168198946242.6
Kenya: 48807995178.8
Korea, Rep.: 1205755199135.1
Latvia: 28908355369.8
Lebanon: 40455121214.3
Lithuania: 42763449721.2
Madagascar: 9486935333.5
Malaysia: 274833978374.2
Mali: 11894695436.7
Mongolia: 9207583282.1
Mozambique: 12838623643.4
Myanmar: 50703575766.4
Nicaragua: 10212597587.4
Nigeria: 375494148527.7
Paraguay: 23250819867.6
Philippines: 231981575952.4
Qatar: 149455747118.1
Singapore: 257026873704.2
Spain: 1404296966483.9
Sweden: 519174481541.8
Tanzania: 36731725995.3
Tunisia: 44118349316.0
Turkmenistan: 29383204467.2
United Kingdom: 2736682446205.8
United States: 16108231800000.0
Vietnam: 144579453846.2
Zambia: 21393965950.9
Zimbabwe: 11907947332.3

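Comments:

What is your question? You haven't asked a question or stated a problem. Stack Overflow is not a code-writing service. Please try writing the program yourself, then use Stack Overflow when you hit a specific problem with your code. If your current problem is not knowing how to tell text from numeric data, see the existing Q&A on that topic.

OK, gentlemen :)

Welcome to Stack Overflow! Please don't vandalize your posts. By posting on the Stack Exchange network, you have granted SE an irrevocable right to distribute that content (under the applicable license). By SE policy, any vandalism will be reverted.

This works! But I'm not very proficient in Python yet... Is there a really simple way?

List comprehensions are a Python feature, but you can point out the parts that look complicated and I'll try to make them more readable. Another approach would be to use a third-party package.

The asker said every country has 10 values, which I guess is why you hardcoded the average. Using statistics.mean() would simplify the code and remove the limit on the number of items.

Indeed, that's a mistake. I'll edit my answer accordingly.

In content = [content[i:i+11] ...] you split the input file into groups of 11 lines (10 GDP lines plus the country name), and the calculation happens in results: every number is converted from a string to a float, added up with sum, and divided by 10 to get the mean. Then join combines the name and the mean into a single string.

I'm pretty sure it's correct, but it says it doesn't produce the same output as the expected answer.

So when you approach the problem, is there a way to do some initialization before the loop, for example gdp = 0 and then for value in open(text):?

Some of the methods you used, such as readlines, I haven't learned yet.

I don't think I've learned ".mean" yet, but we have learned to add up all the numbers and divide by the number of data points to get the mean, since I'm very new to Python... haha... and so is my class. I also haven't worked with "items = []".

I checked the output and it shows the same results you posted. Perhaps some rounding happens in statistics.mean(), which could lead to slightly different values.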
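
The chunking steps explained in the comments can be spelled out as an ordinary loop, which may be easier to follow than the one-line comprehension. This is a sketch under assumptions, not the answerer's code: the function name chunk_averages and its values_per_country parameter are hypothetical, and the mean divides by the number of values in each group instead of a hardcoded 10.

```python
def chunk_averages(lines, values_per_country=10):
    """Average fixed-size groups: N value lines followed by one name line."""
    # Strip newlines and drop blank lines so they cannot shift the groups.
    cleaned = [line.strip() for line in lines if line.strip()]
    step = values_per_country + 1
    results = []
    for start in range(0, len(cleaned), step):
        chunk = cleaned[start:start + step]
        if len(chunk) < step:
            break                         # incomplete trailing group: skip it
        numbers = [float(text) for text in chunk[:-1]]   # strings to floats
        name = chunk[-1]                                 # last line is the name
        average = sum(numbers) / len(numbers)
        results.append(': '.join([name, str(average)]))
    return results


# Hypothetical sample data: two values per country instead of ten.
sample = ['1\n', '2\n', 'Aland\n', '10\n', '20\n', 'Bland\n']
print(chunk_averages(sample, values_per_country=2))
```

Unlike the line-by-line approaches, this one relies on every country having the same number of values, so a missing or extra line would misalign all the groups after it.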