Java: storing the JSON from Twitter's API in MongoDB


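I am using Java and the Twitter search API to store tweets in MongoDB. Is there a way to store the JSON returned by the Twitter API as-is?

I am also considering Python. Is there a way to do this in Python?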

I don't know whether you are still interested in this question, but from a pure Python point of view, this is how I store the raw tweet JSON:

import tweetstream  # Twitter API capture (make sure to use the modified version with proxy support)
import argparse     # Command-line argument parsing
import gzip         # Compresses the output
import json         # Serializes each tweet as JSON for easier DB import

collector = argparse.ArgumentParser(description='Collect a lot of Tweets')                       # Set up the argument parser
collector.add_argument('--username', dest='username', action="store", required=True)             # Twitter username
collector.add_argument('--password', dest='password', action="store", required=True)             # Twitter password
collector.add_argument('--outputfilename', dest='outputfilename', action="store", required=True) # Output file path

args = collector.parse_args()                                                     # Parsed command-line arguments

output = gzip.open(args.outputfilename, "ab")                                     # Open the output file for appending gzip-compressed bytes

try:
    with tweetstream.TweetStream(args.username, args.password) as stream:         # Open the Twitter stream
        for tweet in stream:                                                      # Each tweet arrives as a Python dictionary
            line = json.dumps(tweet)                                              # Serialize the dictionary as valid JSON
            output.write((line + "\n").encode("utf-8"))                           # Write one JSON object per line
finally:
    output.close()                                                                # Flush and close the gzip file when the stream ends

To run it, just use: "python myscript.py --username yourusername --password yourpassword --outputfilename yourpathandfilename"

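You will need the tweetstream, argparse, gzip, json and ast modules. Only tweetstream has to be installed separately, via pip, easy_install, or most Ubuntu/Fedora package managers; argparse (since Python 2.7), gzip, json and ast ship with the Python standard library.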

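The output file the script creates is a plain gzip-compressed text file in which each line is a new JSON string containing one complete tweet JSON object. The script runs until it hits the rate limit, so if it is killed abruptly the gzip file may end without a proper EOF marker. In my experience that is not a problem: Python can still open the file from another script, and 7zip or WinRAR do not mind either.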
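
To get from that file into MongoDB, something along the following lines should work. This is only a minimal Python 3 sketch, not part of the original script: the helper names read_tweets and load_tweets and the batch size are made up for illustration, and collection is assumed to be any object with an insert_many(documents) method, such as a pymongo collection. Stopping on EOFError is an assumption about how a gzip file that was cut off surfaces when read.

```python
import gzip
import json


def read_tweets(path):
    """Yield one tweet dictionary per non-empty line of a gzip-compressed JSON-lines file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        try:
            for line in handle:
                line = line.strip()
                if line:
                    yield json.loads(line)
        except EOFError:
            # Assumption: the archive ended before its end-of-stream marker; stop reading.
            return


def load_tweets(collection, path, batch_size=500):
    """Insert the tweets stored at `path` into `collection` in batches; return how many were inserted."""
    batch, total = [], 0
    for tweet in read_tweets(path):
        batch.append(tweet)
        if len(batch) >= batch_size:
            collection.insert_many(batch)  # store the parsed tweet documents unchanged
            total += len(batch)
            batch = []
    if batch:
        collection.insert_many(batch)      # flush the final partial batch
        total += len(batch)
    return total
```

With pymongo installed and a MongoDB server running, the call could look like load_tweets(MongoClient()["twitter"]["tweets"], "tweets.json.gz"), where the database name, collection name and file name are placeholders. Because each line is parsed back into a dictionary and inserted without picking out fields, the tweet is stored with the structure Twitter returned.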


I hope this helps.
