Filtering on only one element/array of a Twitter JSON file

Tags: json, twitter, filter, grep

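I captured Twitter JSON from the streaming API and ended up with a file containing thousands of lines of JSON data. Each line contains many elements, such as "created at", "source", "tweet text", and so on. What I actually want is to filter for the word "iphone" in the tweet text. However, if I filter with UNIX grep, it matches not only the "tweet text" field but also the "source" field. That means a tweet that does not contain the word "iphone", but was posted from Twitter for iPhone as recorded in the "source" field, is matched as well.

Is there any way to filter this JSON on only one specific field (in my case, the "tweet text" field)?

Here is an example of one JSON line: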

{"created_at":"Tue Aug 20 03:48:27 +0000 2013","id":369667218608369666,"id_str":"369667218608369666","text":"@Mattyb_chyeah_ yeah I'm only watching him! :)","source":"\u003ca href=\"http:\/\/twitter.com\/download\/iphone\" rel=\"nofollow\"\u003eTwitter for iPhone\u003c\/a\u003e","truncated":false,"in_reply_to_status_id":369666992334073856,"in_reply_to_status_id_str":"369666992334073856","in_reply_to_user_id":1557571363,"in_reply_to_user_id_str":"1557571363","in_reply_to_screen_name":"Mattyb_chyeah_","user":{"id":1325959333,"id_str":"1325959333","name":"MattyBRapsTexas","screen_name":"MattyBRapsTexas","location":"Atlanta,Georgia","url":"http:\/\/www.instagram.com\/mattybrapstexas","description":"3 RT 6 Mentions He followed me on 4\/15\/13 6\/17\/13 Maddi Jane followed me on 6\/18\/13 @8:25pm! Cimorelli also follows Pizza Hut mentioned me 2 times on 7\/26\/13","protected":false,"followers_count":1095,"friends_count":426,"listed_count":8,"created_at":"Thu Apr 04 02:34:56 +0000 2013","favourites_count":226,"utc_offset":-14400,"time_zone":"Eastern Time (US & Canada)","geo_enabled":false,"verified":false,"statuses_count":3447,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"C0DEED","profile_background_image_url":"http:\/\/a0.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_image_url_https":"https:\/\/si0.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_tile":false,"profile_image_url":"http:\/\/a0.twimg.com\/profile_images\/378800000313651225\/afee0cc2286882eeb15f21ed7fae334a_normal.jpeg","profile_image_url_https":"https:\/\/si0.twimg.com\/profile_images\/378800000313651225\/afee0cc2286882eeb15f21ed7fae334a_normal.jpeg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/1325959333\/1376759786","profile_link_color":"0084B4","profile_sidebar_border_color":"C0DEED","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"default_profile":true,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[],"symbols":[],"urls":[],"user_mentions":[{"screen_name":"Mattyb_chyeah_","name":"MattyB (\u2661_\u2661\u2740)","id":1557571363,"id_str":"1557571363","indices":[0,15]}]},"favorited":false,"retweeted":false,"filter_level":"medium","lang":"en"}

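What regex are you using with grep? If you are just using "iphone" as the regex, then yes, you will get multiple hits. You can expand the regex so that it matches iphone only in the text portion, which comes right before source: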

grep '"text":".*iphone.*","source":"' myfile.txt


This searches for the pattern iphone after "text" but before "source". It ignores iphone in the rest of the line.
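If the regex gets fragile (for example when a line holds more than one "text"/"source" pair, or when you also want to catch "iPhone"), another option is to parse each line as JSON and test only the one field. The following is a minimal Python sketch, not part of the answer above. It assumes one complete JSON object per line and a top-level "text" key as in the sample line; the function name tweets_mentioning and the sample data are made up for illustration.

```python
import json


def tweets_mentioning(lines, word="iphone", field="text"):
    """Yield the raw lines whose top-level `field` contains `word`.

    Hypothetical helper: the comparison is case-insensitive, and only the
    named field is inspected, so "source" is never looked at.
    """
    needle = word.lower()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            tweet = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed lines
        if not isinstance(tweet, dict):
            continue  # skip JSON values that are not objects
        value = tweet.get(field)
        if isinstance(value, str) and needle in value.lower():
            yield line


# Made-up sample lines: only the first mentions the word in "text".
sample = [
    '{"text":"love my iPhone","source":"web"}',
    '{"text":"hello","source":"Twitter for iPhone"}',
]
for match in tweets_mentioning(sample):
    print(match)
```

To run it over the real file, pass an open file object instead of the sample list, e.g. tweets_mentioning(open("myfile.txt", encoding="utf-8")). Because an open file is read one line at a time, the whole file never has to fit in memory.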
