An efficient way to iterate over rows in pandas

sql performance python-2.7 pandas oracle11g

I have a DataFrame in pandas containing the following information.

For each entry in Transaction_ID, I call the following function in a for loop:

def checkForImages(TransNum):
    """Return the camera types stored for a transaction number as a sorted,
    space-separated string, or 'No image available' if there are none."""
    try:
        # `cursor` is an open database cursor defined at module level.
        cursor.execute('select CAMERA_TYPE from VEHICLE_IMAGE where TRANSACTION_ID=' + str(TransNum))
        # Collect and sort the camera types returned for this transaction.
        types = sorted(img_type[0] for img_type in cursor)
        result = ' '.join(types) if types else 'No image available'
        print('Images found: ' + str(TransNum) + ' ' + result)
        # The shared cursor is not closed here, so the next call can reuse it.
        return result
    except Exception as e:
        print('Error occurred while getting image references: ' + str(e))
        return None

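This function returns a string that is either 'No image available' or the image information, if any was found. I need to create a new column in the DataFrame filled with this result, so my final DataFrame should look like the following.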

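My question is: how can I speed this process up? Using a for loop over 100k+ rows is very slow and painful. I have looked into functions like dataframe.map and dataframe.apply, but have not been able to get them to work. Other options I have seen are Cython and multithreading. Which option should I invest my time in? Thanks for any help.

You are querying Oracle once per transaction and then aggregating the fetched data for each transaction in a second loop. That is very inefficient: with 100k+ rows it means 100k+ round trips to the database.
First, I would create a "mapping" DataFrame like this: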

transaction_id  images
111             No image available
112             FRONT REAR
113             OVERVIEW
This can be done as follows:
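One possible way to build such a mapping is sketched below. It is an illustration under stated assumptions, not necessarily the exact code meant here: the helper name `build_image_map` is hypothetical, the aggregation is done in pandas rather than in the database, and an in-memory SQLite table stands in for the Oracle `VEHICLE_IMAGE` table so that the snippet runs on its own. With Oracle you would pass your own open connection instead.

```python
import sqlite3

import pandas as pd


def build_image_map(conn, transaction_ids):
    """Hypothetical helper: fetch all camera types with ONE query and
    aggregate them per transaction into a sorted, space-separated string."""
    rows = pd.read_sql(
        "select TRANSACTION_ID, CAMERA_TYPE from VEHICLE_IMAGE", conn
    )
    # Normalise column-name case (Oracle typically returns upper case).
    rows.columns = [c.lower() for c in rows.columns]
    images = rows.groupby("transaction_id")["camera_type"].agg(
        lambda s: " ".join(sorted(s))
    )
    # Add the transactions that have no image rows at all.
    wanted = pd.Index(transaction_ids).unique()
    images = images.reindex(wanted).fillna("No image available")
    images.index.name = "transaction_id"
    return images.rename("images").to_frame()


# Stand-in data (assumption): SQLite replaces Oracle for this demo only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table VEHICLE_IMAGE (TRANSACTION_ID integer, CAMERA_TYPE text)"
)
conn.executemany(
    "insert into VEHICLE_IMAGE values (?, ?)",
    [(112, "REAR"), (112, "FRONT"), (113, "OVERVIEW")],
)

cam = build_image_map(conn, [111, 112, 113])
print(cam)
```

The point of the design is that the database is queried once instead of once per transaction. If the image table is much larger than the set of transactions you care about, restricting the query with a `where` clause, or aggregating on the server (Oracle 11g Release 2 provides `LISTAGG` for this), may be preferable.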
After that, we can use the Series.map() method:

df['Image_Found'] = df.transaction_id.map(cam.images)
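As a small worked example of the lookup, the sketch below uses made-up data: both `df` and `cam` are built by hand here, whereas in practice `cam` would come from the database. It also shows one way to handle IDs that are missing from the mapping, which `map` leaves as NaN.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"transaction_id": [111, 112, 113, 112, 114]})
cam = pd.DataFrame(
    {"images": ["No image available", "FRONT REAR", "OVERVIEW"]},
    index=pd.Index([111, 112, 113], name="transaction_id"),
)

# Vectorised lookup: each transaction_id is looked up in cam's index.
df["Image_Found"] = df.transaction_id.map(cam.images)

# IDs absent from the mapping (114 here) come back as NaN; fill them in.
df["Image_Found"] = df["Image_Found"].fillna("No image available")
print(df)
```

Note that the index of `cam` must be unique for `map` to work with a Series as the lookup table.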

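Please post actual data, not images. Providing data that we can copy and paste directly makes it much easier for us to help you. Thank you very much!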