Data storage to ease data interpolation in Python


I have 20+ tables similar to Table 1, where all the letters represent actual values.

Table 1:
$ / cars |<1 | 2 | 3 | 4+
<10,000  | a | b | c | d
20,000   | e | f | g | h
30,000   | i | j | k | l
40,000+  | m | n | o | p

How should I store the data from Table 1 (file, dict, tuple of tuples, or dict of lists) so that I can perform bilinear interpolation most efficiently and correctly?

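There's nothing special about bilinear interpolation that makes your use case particularly odd; you just have to do two lookups (if you store whole rows or columns as units) or four lookups (for array-type storage). The most efficient method depends on your access patterns and on the structure of the data.

If your example is truly representative, with 16 entries in total, you can store it however you like and it will be fast enough for any sane load.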

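I'd keep a sorted list of the first column and use the bisect module from the standard library to look up values; it's the best way to get the immediately-lower and immediately-higher indices. Every other column can be kept as another list parallel to this one.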
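A minimal sketch of that idea, using only the standard library. The names PRICES, CARS, TABLE, _bracket and bilinear are hypothetical, and the values 0..15 stand in for a..p. Clamping out-of-range inputs to the table edges is an assumption about how the open-ended "<10,000" and "40,000+" buckets should behave, not something the question states.

```python
from bisect import bisect_right

# Hypothetical sample table: rows follow PRICES, columns follow CARS.
PRICES = [10000.0, 20000.0, 30000.0, 40000.0]   # sorted row axis
CARS = [1.0, 2.0, 3.0, 4.0]                     # sorted column axis
TABLE = [
    [0.0, 1.0, 2.0, 3.0],      # a b c d
    [4.0, 5.0, 6.0, 7.0],      # e f g h
    [8.0, 9.0, 10.0, 11.0],    # i j k l
    [12.0, 13.0, 14.0, 15.0],  # m n o p
]


def _bracket(axis, x):
    """Return (lower index, fraction in [0, 1]) locating x on a sorted axis."""
    x = min(max(x, axis[0]), axis[-1])  # assumption: clamp to the table edges
    # bisect_right is O(log n); cap the index so that i + 1 stays valid.
    i = min(bisect_right(axis, x) - 1, len(axis) - 2)
    t = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, t


def bilinear(price, cars):
    """Bilinear interpolation from the four surrounding table cells."""
    i, t = _bracket(PRICES, price)
    j, u = _bracket(CARS, cars)
    lower_row = TABLE[i][j] * (1 - u) + TABLE[i][j + 1] * u
    upper_row = TABLE[i + 1][j] * (1 - u) + TABLE[i + 1][j + 1] * u
    return lower_row * (1 - t) + upper_row * t


print(bilinear(22000, 2))
```

With 20+ tables, one option is to keep each table as such a triple (row axis, column axis, values) in a dict keyed by table name; the lookup code stays the same.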

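If you want the most computationally efficient solution I can think of, and you are not restricted to the standard library, I'd recommend scipy/numpy. First, store the a..p values as a 2D numpy array, then store the price axis ($10k to $40k) and the cars axis (1 to 4) as 1D numpy arrays. Use scipy's interpolate.interp1d if both 1D arrays are monotonically increasing, or interpolate.bisplrep (bivariate spline representation) if they are not and your arrays are as small as in your example. Or simply write your own and don't bother with scipy. Here is an example: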

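# This follows your pseudocode most closely, but it is *not*
# the most efficient, since it creates the interpolation
# functions on each call to bilinterp.
from scipy import interpolate
import numpy

data = numpy.arange(0., 16.).reshape((4, 4))   # 2D array: rows = prices, columns = cars
prices = numpy.arange(10000., 50000., 10000.)  # row axis: 10000 .. 40000
cars = numpy.arange(1., 5.)                    # column axis: 1 .. 4

def bilinterp(price, car):
    # Interpolate along the price axis first (axis=0), which yields one value
    # per cars column, then interpolate across the cars axis.
    by_car = interpolate.interp1d(prices, data, axis=0)(price)
    return float(interpolate.interp1d(cars, by_car)(car))

print(bilinterp(22000, 2))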

(The last time I checked, with a scipy version from around 2007, interp1d only worked for monotonically increasing x and y arrays.)

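For small arrays like this 4x4 one, I think you want the bivariate spline routine (the link is missing here; presumably the spline representation function mentioned earlier). It will handle more interestingly shaped surfaces, and the function only needs to be created once. For larger arrays, I think you want a different routine (link also missing; I'm not sure whether it has the same restrictions as interp1d).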

But both of them require a different and more verbose data structure than the three arrays in the example above.
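As an alternative that the answer above does not mention, scipy also provides RegularGridInterpolator, which performs linear interpolation on a rectilinear grid and can be built once and reused. The sketch below assumes the same three sample arrays as the earlier example and a reasonably recent scipy; the wrapper name bilinterp is only illustrative.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Same sample data as before: rows follow prices, columns follow cars.
prices = np.arange(10000., 50000., 10000.)   # 10000 .. 40000
cars = np.arange(1., 5.)                     # 1 .. 4
data = np.arange(0., 16.).reshape((4, 4))

# Build the interpolator once; the default method is linear, which is
# bilinear interpolation on a 2D grid. By default, points outside the
# grid raise ValueError (bounds_error=True).
interp = RegularGridInterpolator((prices, cars), data)


def bilinterp(price, car):
    """Evaluate the prebuilt interpolator at a single (price, car) point."""
    return float(interp([[price, car]])[0])


print(bilinterp(22000, 2))
```

Unlike the per-call interp1d version, the setup cost here is paid once, so repeated lookups only pay for the evaluation itself.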

Please give some examples; I have a similar problem but can't solve it in O(log n).

I like this, since I already use numpy in my application :D Thanks
