Python: selecting 7, 14, 20, 50, and 200-day prices from OHLC data in an SQL database

Suppose you have data like this:

AC-057|Ethanol CBOT (Pit) Liq Cont|20050329|0.121|0.123|0.121|0.123|47|233|32|219
AC-057|Ethanol CBOT (Pit) Liq Cont|20050330|0.124|0.124|0.122|0.122|68|233|0|219
AC-057|Ethanol CBOT (Pit) Liq Cont|20050331|0.123|0.123|0.123|0.123|68|246|57|226
AC-057|Ethanol CBOT (Pit) Liq Cont|20050401|0.122|0.122|0.122|0.122|5|241|5|221
AC-057|Ethanol CBOT (Pit) Liq Cont|20050404|0.12|0.12|0.12|0.12|1|240|0|220
AC-057|Ethanol CBOT (Pit) Liq Cont|20050405|0.12|0.12|0.12|0.12|5|241|0|220
AC-057|Ethanol CBOT (Pit) Liq Cont|20050406|0.12|0.12|0.12|0.12|4|241|2|220
AC-057|Ethanol CBOT (Pit) Liq Cont|20050407|0.119|0.119|0.116|0.116|30|233|23|209
AC-057|Ethanol CBOT (Pit) Liq Cont|20050408|0.115|0.115|0.115|0.115|35|217|34|194
AC-057|Ethanol CBOT (Pit) Liq Cont|20050411|0.117|0.117|0.117|0.117|5|217|0|194
AC-057|Ethanol CBOT (Pit) Liq Cont|20050412|0.117|0.117|0.117|0.117|5|217|2|194
AC-057|Ethanol CBOT (Pit) Liq Cont|20050413|0.117|0.117|0.117|0.117|9|217|0|194
AC-057|Ethanol CBOT (Pit) Liq Cont|20050414|0.117|0.117|0.117|0.117|9|217|0|194
AC-057|Ethanol CBOT (Pit) Liq Cont|20050415|0.117|0.117|0.117|0.117|9|218|4|190
AC-057|Ethanol CBOT (Pit) Liq Cont|20050418|0.117|0.117|0.117|0.117|5|218|0|190
AC-057|Ethanol CBOT (Pit) Liq Cont|20050419|0.119|0.119|0.119|0.119|5|218|5|190
AC-057|Ethanol CBOT (Pit) Liq Cont|20050420|0.119|0.119|0.119|0.119|0|218|0|190
AC-057|Ethanol CBOT (Pit) Liq Cont|20050421|0.119|0.119|0.119|0.119|5|218|0|190
AC-057|Ethanol CBOT (Pit) Liq Cont|20050422|0.119|0.119|0.119|0.119|5|223|0|190
AC-057|Ethanol CBOT (Pit) Liq Cont|20050425|0.119|0.119|0.119|0.119|0|223|0|190
AC-057|Ethanol CBOT (Pit) Liq Cont|20050426|0.119|0.119|0.119|0.119|0|223|0|190
SYMBOL|DESCRIPTION                |yyyymmdd|OPEN |HIGH |LOW  |CLOSE|.|.  |.|...
... with a variety of different symbols.

And a schema like this:

CREATE TABLE IF NOT EXISTS ma (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    symbol TEXT,
    description TEXT,
    year INTEGER,
    month INTEGER,
    day INTEGER,

    open REAL,
    high REAL,
    low  REAL,
    close REAL
);

CREATE INDEX ma_id_idx  ON ma(id);
CREATE INDEX ma_sym_idx ON ma(symbol);
CREATE INDEX ma_yea_idx ON ma(year);
CREATE INDEX ma_mon_idx ON ma(month);
CREATE INDEX ma_day_idx ON ma(day);

CREATE INDEX ma_open_idx  ON ma(open);
CREATE INDEX ma_high_idx  ON ma(high);
CREATE INDEX ma_low_idx   ON ma(low);
CREATE INDEX ma_close_idx ON ma(close);
And a Python script that imports the data into the database, like so:

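import csv
import sqlite3 as lite

__infile__  = 'ma.csv'
__outfile__ = 'ma3.db'

# Assumes the ma table from the schema above already exists in ma3.db.
conn = lite.connect(__outfile__)

ssql = """
    PRAGMA JOURNAL_MODE = MEMORY;

"""

isql = """
    INSERT INTO ma (
        symbol,
        description,
        year,
        month,
        day,
        open,
        high,
        low,
        close
    ) VALUES (
        ?, ?, ?, ?, ?, ?, ?, ?, ?
    )
"""

conn.executescript(ssql)

# Python 3: the csv module wants a text-mode file opened with newline=''.
with open(__infile__, newline='') as infile:
    reader = csv.reader(infile, delimiter='|')  # avoid shadowing the built-in input()
    for row in reader:
        # row[2] is the date as yyyymmdd; split it into integer parts.
        year  = int(row[2][0:4])
        month = int(row[2][4:6])
        day   = int(row[2][6:8])
        tup   = (row[0], row[1], year, month, day, row[3], row[4], row[5], row[6])
        conn.execute(isql, tup)

conn.commit()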
How would I gather a set of records to produce this schema:

CREATE TABLE trends (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    symbol TEXT,
    date DATE,
    p1 REAL,
    p20 REAL,
    p50 REAL,
    p100 REAL,
    p200 REAL
);
at each date point for that particular symbol?

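I have tried a lot of things. The last one in particular takes so long that I can't tell whether it would work. (Well, it doesn't work, because it would need a week of compute time.) The original csv data is 250 MB now, but in the future it will grow to 2.5 GB or more, and I may have to move to a bigger database.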

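Here are the other things I have tried (or am trying):

Thanks

tl;dr: For each entry in the original list, I want the 7, 14, 20, 50, 100, and 200-day prices for each symbol at each date, using the closing price, and I want to put them in a table. I would prefer pure SQL, but Python would work too.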

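You will probably be relieved to hear that there is already a Python library optimized for rolling financial calculations... it's called pandas.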

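I don't think ...; however, it will read from csv. I took the liberty of using your csv data (which you seem to have stored in ma.csv). Once you have done that, getting a rolling 7-day mean on your close is simple: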

>>> import pandas as pn
>>> from datetime import date
>>> df = pn.read_csv('fut.csv', index_col=2, parse_dates=[2])
>>> pn.rolling_mean(df['CLOSE'], window=7)
yyyymmdd
2005-03-29         NaN
2005-03-30         NaN
2005-03-31         NaN
2005-04-01         NaN
2005-04-04         NaN
2005-04-05         NaN
2005-04-06    0.121429
2005-04-07    0.120429
2005-04-08    0.119429
2005-04-11    0.118571
2005-04-12    0.117857
2005-04-13    0.117429
2005-04-14    0.117000
2005-04-15    0.116571
2005-04-18    0.116714
2005-04-19    0.117286
2005-04-20    0.117571
2005-04-21    0.117857
2005-04-22    0.118143
2005-04-25    0.118429
2005-04-26    0.118714
>>>
>>> pn.rolling_mean(df['CLOSE'], window=7)[date(2005,4,26)]
0.11871428571428572
>>>
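df above is a special structure for holding a table of time-indexed values associated with an object. In this case, the DataFrame holds your high, low, close, and so on.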

Besides making your job easier, this also offloads most of the heavy lifting to Cython, which makes running thousands of these calculations fairly fast.
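
A note for readers on a recent pandas: the module-level rolling_mean function used above was removed in later releases, and the equivalent call is now the rolling method on a Series. The sketch below is not from the original answer. It assumes that an "N-day price" means the N-day rolling mean of the close (as in the session above), that the file has no header row, and that the column names in COLUMNS are placeholders (the last four fields are unnamed in the question). rolling_closes is a hypothetical helper name.

```python
import io

import pandas as pd

# First eight rows of the sample data from the question (pipe-delimited, no header).
RAW = """\
AC-057|Ethanol CBOT (Pit) Liq Cont|20050329|0.121|0.123|0.121|0.123|47|233|32|219
AC-057|Ethanol CBOT (Pit) Liq Cont|20050330|0.124|0.124|0.122|0.122|68|233|0|219
AC-057|Ethanol CBOT (Pit) Liq Cont|20050331|0.123|0.123|0.123|0.123|68|246|57|226
AC-057|Ethanol CBOT (Pit) Liq Cont|20050401|0.122|0.122|0.122|0.122|5|241|5|221
AC-057|Ethanol CBOT (Pit) Liq Cont|20050404|0.12|0.12|0.12|0.12|1|240|0|220
AC-057|Ethanol CBOT (Pit) Liq Cont|20050405|0.12|0.12|0.12|0.12|5|241|0|220
AC-057|Ethanol CBOT (Pit) Liq Cont|20050406|0.12|0.12|0.12|0.12|4|241|2|220
AC-057|Ethanol CBOT (Pit) Liq Cont|20050407|0.119|0.119|0.116|0.116|30|233|23|209
"""

# Assumed column names; x1..x4 stand in for the four unnamed trailing fields.
COLUMNS = ["symbol", "description", "date", "open", "high", "low", "close",
           "x1", "x2", "x3", "x4"]


def rolling_closes(df, windows=(7, 14, 20, 50, 100, 200)):
    """Return one row per (symbol, date) with a rolling mean of close per window."""
    df = df.sort_values(["symbol", "date"])
    out = df[["symbol", "date"]].copy()
    closes = df.groupby("symbol")["close"]
    for w in windows:
        # Rows with fewer than w observations for that symbol stay NaN.
        out[f"p{w}"] = closes.transform(lambda s, w=w: s.rolling(window=w).mean())
    return out


df = pd.read_csv(io.StringIO(RAW), sep="|", header=None, names=COLUMNS,
                 dtype={"date": str})
df["date"] = pd.to_datetime(df["date"], format="%Y%m%d")

# Small windows here only because the sample is eight rows long.
trends = rolling_closes(df, windows=(3, 7))
print(trends)
```

Grouping by symbol keeps one instrument's window from running into the next one's rows. On this sample, the p7 value for 2005-04-06 agrees with the 0.121429 shown in the session above. For the real file, replace io.StringIO(RAW) with the csv path and use the default windows.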


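This is really helpful. I didn't know there were other math libraries. Perhaps you could add some input on other libraries, or add some notes about the SQL code? There is some set logic here that I would really like a solution for.

I guess I don't know what else to say about your SQL, other than that you might consider using ..., since it is better optimized for searching within date ranges than SQL. As for the set logic, it is probably best to ask another question, since we try to scope questions to a single topic.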
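
On the pure-SQL side raised in the question and comments: SQLite 3.25 and later support window functions, which can express a per-symbol moving average directly. This is only a sketch, not something from the original thread. It assumes a new enough SQLite behind Python's sqlite3 module, uses a cut-down version of the ma table with just the columns it needs, and again reads an "N-day price" as the N-row rolling mean of close. moving_average_sql is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down ma table: only the columns the query needs.
conn.execute(
    "CREATE TABLE ma (symbol TEXT, year INTEGER, month INTEGER, day INTEGER, close REAL)"
)

# Closing prices of the first eight sample rows from the question.
rows = [
    ("AC-057", 2005, 3, 29, 0.123),
    ("AC-057", 2005, 3, 30, 0.122),
    ("AC-057", 2005, 3, 31, 0.123),
    ("AC-057", 2005, 4, 1, 0.122),
    ("AC-057", 2005, 4, 4, 0.12),
    ("AC-057", 2005, 4, 5, 0.12),
    ("AC-057", 2005, 4, 6, 0.12),
    ("AC-057", 2005, 4, 7, 0.116),
]
conn.executemany("INSERT INTO ma VALUES (?, ?, ?, ?, ?)", rows)


def moving_average_sql(window):
    """Build a query giving the rolling mean of close over `window` rows per symbol."""
    n = int(window)  # int() keeps the interpolated value a plain number
    return f"""
        SELECT symbol,
               printf('%04d-%02d-%02d', year, month, day) AS date,
               -- NULL until the symbol has n rows, like the NaN values in pandas
               CASE WHEN COUNT(close) OVER w >= {n}
                    THEN AVG(close) OVER w END AS p{n}
        FROM ma
        WINDOW w AS (
            PARTITION BY symbol
            ORDER BY year, month, day
            ROWS BETWEEN {n - 1} PRECEDING AND CURRENT ROW
        )
        ORDER BY symbol, year, month, day
    """


result = conn.execute(moving_average_sql(7)).fetchall()
for symbol, date, p7 in result:
    print(symbol, date, p7)
```

The window counts rows, not calendar days, so weekends and holidays are skipped in the same way as in the pandas session. To fill a table such as trends, the same SELECT could feed an INSERT INTO ... SELECT, with one windowed column per period.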