Creating a list of words from a JSON/CSV file with Python

json list csv python-3.x twitter

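Sorry for asking this, but I have searched and could not find an answer. To be honest, I am a beginner. I am trying to build a list of all the individual words in a CSV file that contains JSON. I have created a list of lines, but I cannot use split() to produce a new list containing the separate words (later I need to count how often each word occurs). My input file contains Twitter data. I tried writing some simple code: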

myfile=open('fileName','r')
words=[]
for line in myfile:
    words.append(line.split())

This gives len(words) == 82.

I also tried reader = csv.reader(myFile) and reader = csv.DictReader(myFile).

In general I can get each line, but how do I go on to split each string/line into individual words? Sorry, and thanks in advance.
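A note on the snippet above: append adds each line's word list as a single element, so the result is a list of lists with one entry per line. Using extend adds the words themselves. A minimal sketch of the difference, assuming two made-up sample lines in place of the real file:

```python
# Hypothetical sample lines standing in for the lines of the file.
lines = ["no local retail release", "means no eShop either"]

nested = []
flat = []
for line in lines:
    nested.append(line.split())  # adds one list per line
    flat.extend(line.split())    # adds each word individually

print(len(nested))  # 2, one entry per line
print(len(flat))    # 8, one entry per word
```

This only splits on whitespace. For the CSV file shown below, the tweet text still has to be pulled out of the JSON column first.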

My data (I switched to a different example because the previous one may have been badly formatted):

id,flags,expiration,cas,value
493926581610364928,0,0,2635740904247446,"{""contributors"":null,""truncated"":false,""text"":""@xaaronh @blueredandgold If Namco Bandai's One Piece Unlimited World is anything to go by, no local retail release means no eShop either =\\"",""in_reply_to_status_id"":493925918998425600,""id"":493926581610364928,""favorite_count"":0,""source"":""<a href=\""hp://twitter.com\"" rel=\""nofollow\"">Twitter Web Client</a>"",""retweeted"":false,""coordinates"":null,""entities"":{""symbols"":[],""user_mentions"":[{""id"":139852376,""indices"":[0,8],""id_str"":""139852376"",""screen_name"":""xaaronh"",""name"":""Aaron""},{""id"":74393990,""indices"":[9,24],""id_str"":""74393990"",""screen_name"":""blueredandgold"",""name"":""Leigh""}],""hashtags"":[],""urls"":[]},""in_reply_to_screen_name"":""xaaronh"",""in_reply_to_user_id"":139852376,""retweet_count"":0,""id_str"":""493926581610364928"",""favorited"":false,""user"":{""follow_request_sent"":false,""profile_use_background_image"":true,""default_profile_image"":false,""id"":42302246,""profile_background_image_url_hp"":""hp://pbs.twimg.com/profile_background_images/464279459932020736/v1xnMcrV.jpeg"",""verified"":false,""profile_text_color"":""333333"",""profile_image_url_https"":""hp://pbs.twimg.com/profile_images/490791031487463424/udSldTQ3_normal.png"",""profile_sidebar_fill_color"":""DDEEF6"",""entities"":{""description"":{""urls"":[{""url"":""hp:tttt"",""indices"":[67,89],""expanded_url"":""hp://infernalmonkey.com"",""display_url"":""infernalmonkey.com""}]}},""followers_count"":506,""profile_sidebar_border_color"":""000000"",""id_str"":""42302246"",""profile_background_color"":""1A1B1F"",""listed_count"":22,""is_translation_enabled"":false,""utc_offset"":36000,""statuses_count"":8676,""description"":""I probably tweet about video games and onaholes. Let's be friends! 
(NSFW)"",""friends_count"":261,""location"":""Sydney, Australia"",""profile_link_color"":""2FC2EF"",""profile_image_url"":""hp://pbs.twimg.com/profile_images/490791031487463424/udSldTQ3_normal.png"",""following"":false,""geo_enabled"":false,""profile_banner_url"":""hp://pbs.twimg.com/profile_banners/42302246/1406105444"",""profile_background_image_url"":""hp://pbs.twimg.com/profile_background_images/464279459932020736/v1xnMcrV.jpeg"",""screen_name"":""infernal_monkey"",""lang"":""en"",""profile_background_tile"":false,""favourites_count"":2018,""name"":""Lance McGill"",""notifications"":false,""url"":null,""created_at"":""Sun May 24 23:20:25 +0000 2009"",""contributors_enabled"":false,""time_zone"":""Sydney"",""protected"":false,""default_profile"":false,""is_translator"":false},""geo"":null,""in_reply_to_user_id_str"":""139852376"",""lang"":""en"",""_id"":""493926581610364928"",""created_at"":""Tue Jul 29 01:10:48 +0000 2014"",""in_reply_to_status_id_str"":""493925918998425600"",""place"":null,""metadata"":{""iso_language_code"":""en"",""result_type"":""recent""}}"

This is not the best solution, just a beginner's (my) attempt, and it certainly needs further editing to produce better output. I am on Windows.

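import csv
import json

myList = []  # one tweet text per CSV row
# newline='' lets the csv module handle line breaks inside quoted fields
with open('fileName.csv', 'r', encoding='utf-8', newline='') as myFile:
    myReader = csv.reader(myFile)
    header = next(myReader)  # skip the header row: id,flags,expiration,cas,value
    for line in myReader:
        myDict = json.loads(line[4])  # the fifth column holds the tweet as JSON
        myList.append(myDict['text'])

# count how often each whitespace-separated word occurs
dct = {}
for eachLine in myList:
    for one in eachLine.split():
        if one in dct:
            dct[one] += 1
        else:
            dct[one] = 1

finalList = list(dct.items())
finalList.sort()  # sort the (word, count) pairs alphabetically by word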
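
The counting loop above could probably be shortened with collections.Counter, and a regular expression can strip punctuation before counting. This is only a sketch: the sample texts, the function name count_words, and the token pattern are assumptions for illustration, not part of the original answer.

```python
import re
from collections import Counter

# Hypothetical tweet texts standing in for myList from the code above.
tweet_texts = [
    "@xaaronh no local retail release means no eShop either",
    "No eShop release? That is a shame.",
]

def count_words(texts):
    """Count case-insensitive word occurrences across all texts."""
    counts = Counter()
    for text in texts:
        # Keep runs of letters, digits, underscores, @, # and apostrophes;
        # everything else (spaces, punctuation) acts as a separator.
        counts.update(re.findall(r"[\w@#']+", text.lower()))
    return counts

word_counts = count_words(tweet_texts)
print(word_counts.most_common(3))
```

Lowercasing merges "No" and "no" into one entry. Whether that is wanted depends on the analysis, so treat it as a choice rather than a requirement.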

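Can you post the data you are trying to parse as text? You can edit your question to add it. I see you have an image of it, which is better than nothing, but text is much easier to work with.

Sorry for the bad formatting. Thanks @Igor. I don't know why json.loads(line) returns an error every time I use it.

My JSON parsing skills are weak this morning. There must be examples of this on the web, and plenty of people doing something similar. I think there is a related example here:

Many thanks @Igor for the earlier comment, which pointed me to the JSON column. After a few attempts here and there, and after learning more about lists and dicts in Python, I finally managed to get the number of occurrences of every word in the "text" strings. My code may be a bit long, but I hope it helps me understand Python better. Thanks again @Igor, and please comment on my first attempt. I will edit it further to apply regular-expression filtering.

I will edit this to include your imports and to fix the syntax error on line 6, where the closing ) is missing. Even after that fix, though, it does not work for me.

@Igor: I have added a few extra lines. Thanks for your comment. For additional information, I am on Windows. Cheers!

I still get this error:

ValueError: Expecting value: line 1 column 1 (char 0)

I think that to fix this, the code has to be edited to exclude the first line of the file.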
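
One plausible cause of the error quoted above is that json.loads is given the header row (or a whole CSV line) instead of the JSON column. The sketch below reproduces that failure and then skips the header by letting csv.DictReader consume it. The two-row sample is made up for illustration and only mimics the layout of the data in the question.

```python
import csv
import io
import json

# Hypothetical two-row sample in the same layout as the data in the question:
# a header row, then a row whose "value" column holds JSON.
sample = (
    'id,flags,expiration,cas,value\n'
    '1,0,0,2,"{""text"":""no eShop either"",""lang"":""en""}"\n'
)

# The header row is plain text, not a JSON document, so json.loads rejects it.
header_error = None
try:
    json.loads(sample.splitlines()[0])
except json.JSONDecodeError as err:  # a subclass of ValueError
    header_error = str(err)
print("header row is not JSON:", header_error)

# DictReader consumes the header row itself and exposes each column by name.
texts = []
for row in csv.DictReader(io.StringIO(sample, newline='')):
    tweet = json.loads(row['value'])  # parse only the JSON column
    texts.append(tweet['text'])

print(texts)
```

With a real file, io.StringIO(...) would be replaced by open('fileName.csv', encoding='utf-8', newline=''), where the file name is the placeholder used in the answer.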