Parsing dates in JSON format with Python

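I have a set of news articles in JSON format and I am having trouble parsing the dates. The problem is that when the articles were converted to JSON, the date was carried over successfully, but the edition was carried over with it in the same field. Here is an example: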

{"date": "December 31, 1995, Sunday, Late Edition - Final", "body": "AFTER a year of dizzying new heights for the market, investors may despair of finding any good stocks left. Navistar plans to slash costs by $112 million in 1996. Advanced Micro Devices has made a key acquisition. For the bottom-fishing investor, therefore, the big nail-biter is: Will the changes be enough to turn a company around? ", "title": "INVESTING IT;"}
{"date": "December 31, 1995, Sunday, Late Edition - Final", "body": "Few issues stir as much passion in so many communities as the simple act of moving from place to place: from home to work to the mall and home again.  It was an extremely busy and productive year for us, said Frank J. Wilson, the State Commissioner of Transportation. There's a sense of urgency to get things done. ", "title": "ROAD AND RAIL;"}
{"date": "December 31, 1996, Sunday, Late Edition - Final", "body": "Widespread confidence in the state's economy prevailed last January as many businesses celebrated their most robust gains since the recession. And Steven Wynn, the chairman of Mirage Resorts, who left Atlantic City eight years ago because of local and state regulations, is returning to build a $1 billion two-casino complex. ", "title": "NEW JERSEY & CO.;"}
Since my goal is to count the number of articles that contain certain words, I loop over the articles as follows:

import json
import re
import pandas

for i in range(1995,2017):
    df = pandas.DataFrame([json.loads(l) for l in open('USAT_%d.json' % i)])
    # Parse dates and set index
    df.date = pandas.to_datetime(df.date)  # is giving me a problem
    df.set_index('date', inplace=True)
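I am trying to work out the most efficient way to solve this. When parsing the dates, I was thinking of something like "ignore anything after the day of the week". Is there such a thing?

Thanks in advance.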

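You can split the date column on ', ' with str.split, join the first two pieces, month and day (December 31) plus year (1995), and finally call to_datetime. The full loop and its output follow the comments below.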
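To illustrate what the split produces on a single string, here is a minimal plain-Python sketch (no pandas). The variable names are made up for illustration, and the second string is an assumed example of a date with no comma after the year.

```python
from datetime import datetime

raw = "December 31, 1995, Sunday, Late Edition - Final"

# Splitting on ', ' separates month/day, year, weekday and edition
parts = raw.split(', ')

# Keep only the first two pieces and parse them
cleaned = parts[0] + ' ' + parts[1]
parsed = datetime.strptime(cleaned, '%B %d %Y')

# Assumed variant without a comma after the year: the weekday
# stays attached to the year, so the second piece is no longer a bare year
no_comma = "December 31, 1995 Sunday, Late Edition - Final"
no_comma_parts = no_comma.split(', ')
```

In the second case the piece that should hold the year also contains the weekday, which is why this approach relies on the comma after the year being present.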


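Glad I could help!

Hi @jezrael, I have run into some cases where there is no comma after the year: "December 31, 1995" instead of "December 31, 1995,". Parsing seems to get stuck in that scenario. Any ideas for a workaround? Thanks!

Hmm, do you mean something like {"date": "December 31, 1995 Sunday, Late Edition - Final", "body": "AFTER a year of dizzying new heights for the market, investors may despair of finding any good stocks left. Navistar plans to slash costs by $112 million in 1996. Advanced Micro Devices has made a key acquisition. For the bottom-fishing investor, therefore, the big nail-biter is: Will the changes be enough to turn a company around? ", "title": "INVESTING IT;"}?
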
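for i in range(1995, 2017):
    # Each line of USAT_<year>.json holds one JSON object (one article)
    with open('USAT_%d.json' % i) as f:
        df = pandas.DataFrame([json.loads(l) for l in f])
    # Split on ', ' and keep the first two pieces: "December 31" and "1995"
    a = df.date.str.split(', ', expand=True)
    df.date = a.iloc[:, 0] + ' ' + a.iloc[:, 1]
    # Parse the cleaned strings and use them as the index
    df.date = pandas.to_datetime(df.date)
    df.set_index('date', inplace=True)
    print(df)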

                                                    body  \
date                                                            
1995-12-31  AFTER a year of dizzying new heights for the m...   
1995-12-31  Few issues stir as much passion in so many com...   
1996-12-31  Widespread confidence in the state's economy p...   

                        title  
date                           
1995-12-31      INVESTING IT;  
1995-12-31     ROAD AND RAIL;  
1996-12-31  NEW JERSEY & CO.;
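
For the missing-comma case raised in the comments, one possible workaround is to pull out only the leading "Month day, year" text with a regular expression instead of splitting on commas. This is a sketch rather than part of the original answer: the helper name parse_leading_date and the pattern are assumptions, and it expects English month names in the default locale.

```python
import re
from datetime import datetime

# Matches e.g. "December 31, 1995" at the start of the string,
# whether or not a comma follows the year
LEADING_DATE = re.compile(r'^\s*([A-Za-z]+ \d{1,2}, \d{4})')


def parse_leading_date(text):
    """Return the leading date as a datetime, or None if none is found."""
    match = LEADING_DATE.match(text)
    if match is None:
        return None
    return datetime.strptime(match.group(1), '%B %d, %Y')


with_comma = parse_leading_date(
    "December 31, 1995, Sunday, Late Edition - Final")
without_comma = parse_leading_date(
    "December 31, 1995 Sunday, Late Edition - Final")
```

Inside the loop, something like df.date = pandas.to_datetime(df.date.map(parse_leading_date)) could then replace the split-and-join step. Rows that do not start with a date come back as None, so they can be inspected rather than raising an error.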