Python: streaming data from Amazon S3 with smart_open raises a TypeError
I am trying to stream data from a large text file in Amazon S3 into my AWS Lambda function, and I am using smart_open to do it. Here is my test code:
import smart_open


def stream_data():
    my_bucket = 'monkey-business-dev'
    my_key = 'incoming_monkey_data/banana/banana'
    uri = 's3://{}/{}'.format(my_bucket, my_key)
    total_lines = 0
    total_records = 0
    # Iterate over the S3 object line by line instead of loading it whole
    for line in smart_open.smart_open(uri):
        total_records += 1


if __name__ == '__main__':
    stream_data()
I am using Python 3.x, and I get this exception:
/usr/local/lib/python3.6/site-packages/odo/backends/pandas.py:94: FutureWarning: pandas.tslib is deprecated and will be removed in a future version.
You can access NaTType as type(pandas.NaT)
@convert.register((pd.Timestamp, pd.Timedelta), (pd.tslib.NaTType, type(None)))
Traceback (most recent call last):
File "/Users/xxxx/PycharmProjects/monkey_lambda/datastream_from_s3.py", line 16, in <module>
stream_data()
File "/Users/xxxx/PycharmProjects/monkey_lambda/datastream_from_s3.py", line 11, in stream_data
for line in smart_open.smart_open(uri):
File "/usr/local/lib/python3.6/site-packages/smart_open/smart_open_lib.py", line 163, in smart_open
bucket = s3_connection.get_bucket(parsed_uri.bucket_id)
File "/usr/local/lib/python3.6/site-packages/boto/s3/connection.py", line 509, in get_bucket
return self.head_bucket(bucket_name, headers=headers)
File "/usr/local/lib/python3.6/site-packages/boto/s3/connection.py", line 528, in head_bucket
response = self.make_request('HEAD', bucket_name, headers=headers)
File "/usr/local/lib/python3.6/site-packages/boto/s3/connection.py", line 671, in make_request
retry_handler=retry_handler
File "/usr/local/lib/python3.6/site-packages/boto/connection.py", line 1071, in make_request
retry_handler=retry_handler)
File "/usr/local/lib/python3.6/site-packages/boto/connection.py", line 913, in _mexe
self.is_secure)
File "/usr/local/lib/python3.6/site-packages/boto/connection.py", line 705, in get_http_connection
return self.new_http_connection(host, port, is_secure)
File "/usr/local/lib/python3.6/site-packages/boto/connection.py", line 747, in new_http_connection
connection = self.proxy_ssl(host, is_secure and 443 or 80)
File "/usr/local/lib/python3.6/site-packages/boto/connection.py", line 796, in proxy_ssl
sock.sendall("CONNECT %s HTTP/1.0\r\n" % host)
TypeError: a bytes-like object is required, not 'str'
When I pass the URI as bytes instead (uri.encode()), I get a different error:
/usr/local/lib/python3.6/site-packages/odo/backends/pandas.py:94: FutureWarning: pandas.tslib is deprecated and will be removed in a future version.
You can access NaTType as type(pandas.NaT)
@convert.register((pd.Timestamp, pd.Timedelta), (pd.tslib.NaTType, type(None)))
Traceback (most recent call last):
File "/Users/xxxxx/PycharmProjects/monkey_lambda/datastream_from_s3.py", line 16, in <module>
stream_data()
File "/Users/xxxxx/PycharmProjects/monkey_lambda/datastream_from_s3.py", line 11, in stream_data
for line in smart_open.smart_open(uri.encode()):
File "/usr/local/lib/python3.6/site-packages/smart_open/smart_open_lib.py", line 208, in smart_open
raise TypeError('don\'t know how to handle uri %s' % repr(uri))
TypeError: don't know how to handle uri b's3://monkey-business-dev/incoming_monkey_data/banana/banana'
Process finished with exit code 1
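The last frame of the first traceback is a call to sock.sendall with a str argument inside boto's proxy_ssl, which Python 3 rejects because sockets only accept bytes. The sketch below reproduces that same TypeError outside boto and shows the encoded form that works. It is only an illustration: the send_connect helper and the example.com host are hypothetical and are not part of boto or smart_open.

```python
import socket


def send_connect(sock, host):
    """Send a proxy CONNECT line, encoded to bytes as Python 3 sockets require."""
    request = "CONNECT %s HTTP/1.0\r\n" % host
    sock.sendall(request.encode("ascii"))


# A connected pair of local sockets stands in for the proxy connection.
left, right = socket.socketpair()
try:
    try:
        # Same pattern as the failing line in the traceback: a str is passed.
        left.sendall("CONNECT %s HTTP/1.0\r\n" % "example.com:443")
        error_message = None
    except TypeError as exc:
        error_message = str(exc)

    # Encoding the request before sending avoids the TypeError.
    send_connect(left, "example.com:443")
    received = right.recv(1024)
finally:
    left.close()
    right.close()

print(error_message)
print(received)
```

This also suggests why calling uri.encode() does not help: the str that fails is built inside boto from the host name, not from the URI passed to smart_open, and smart_open itself then refuses the bytes URI, as the second traceback shows.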