Python 2.7: Selecting data from an HDFStore by a float data column

python-2.7 pandas

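I have a table in an HDFStore in which a float column, f, is stored as a data_column. I want to select a subset of rows where, for example, f==0.6. I am running into trouble, which I assume is related to a floating-point precision mismatch somewhere. Here is an example: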

In [1]: f = np.arange(0, 1, 0.1)

In [2]: s = f.astype('S')

In [3]: df = pd.DataFrame({'f': f, 's': s})

In [4]: df
Out[4]: 
     f    s
0  0.0  0.0
1  0.1  0.1
2  0.2  0.2
3  0.3  0.3
4  0.4  0.4
5  0.5  0.5
6  0.6  0.6
7  0.7  0.7
8  0.8  0.8
9  0.9  0.9

[10 rows x 2 columns]

In [5]: with pd.get_store('test.h5', mode='w') as store:
   ...:     store.append('df', df, data_columns=True)
   ...:     

In [6]: with pd.get_store('test.h5', mode='r') as store:
   ...:     selection = store.select('df', 'f=f')
   ...:     

In [7]: selection
Out[7]: 
     f    s
0  0.0  0.0
1  0.1  0.1
2  0.2  0.2
4  0.4  0.4
5  0.5  0.5
8  0.8  0.8
9  0.9  0.9

[7 rows x 2 columns]

I expected the query to return all of the rows, but several are missing. A query with the condition f=0.3 returns an empty table:

In [8]: with pd.get_store('test.h5', mode='r') as store:
   ...:     selection = store.select('df', 'f=0.3')
   ...:     

In [9]: selection
Out[9]: 
Empty DataFrame
Columns: [f, s]
Index: []

[0 rows x 2 columns]

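I am wondering whether this is the intended behavior and, if so, whether there is a simple workaround, such as setting a precision limit for float queries in pandas. I am using version 0.13.1:
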
In [10]: pd.__version__
Out[10]: '0.13.1-55-g7d3e41c'
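
A quick check outside the store suggests where the mismatch may come from. This is only a sketch: it assumes np.arange(0, 1, 0.1) computes each element as i * 0.1, and it uses plain Python floats so it runs without numpy or pandas. The indices whose value differs from the one-decimal literal are exactly the rows missing from the 'f=f' selection above.

```python
# Assumption: np.arange(0, 1, 0.1) yields the same doubles as i * 0.1.
values = [i * 0.1 for i in range(10)]

# Indices where the stored double is not the same as the literal 0.1, 0.2, ...
mismatched = [i for i, v in enumerate(values) if v != round(v, 1)]

print(mismatched)       # [3, 6, 7]
print(repr(values[3]))  # 0.30000000000000004
```

So a query for f=0.3 would be comparing against a value that is displayed as 0.3 but is not equal to it.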

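I don't think so, no. Pandas is built around numpy, and I have never seen any tools for approximate float equality other than testing utilities such as assert_allclose, which won't help here.
The best you can do is something like:

In [17]: with pd.get_store('test.h5', mode='r') as store:
   ....:     selection = store.select('df', '(f > 0.2) & (f < 0.4)')
   ....:     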

In [18]: selection
Out[18]: 
     f    s
3  0.3  0.3

If this is a common idiom for you, write a function for it. You could even get fancy with it.
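
One way such a function could look is sketched below. The name float_eq_where and the default tolerance are hypothetical choices, not part of pandas; the function only builds the where string for a closed interval around the target value, in the same form as the range query above.

```python
def float_eq_where(column, value, tol=1e-6):
    """Build a where string matching rows with |column - value| <= tol.

    Hypothetical helper, not a pandas API; tol=1e-6 is an arbitrary default.
    """
    if tol < 0:
        raise ValueError("tol must be non-negative")
    lo, hi = value - tol, value + tol
    # repr() writes the bounds with enough digits to round-trip as floats
    return "({c} >= {lo!r}) & ({c} <= {hi!r})".format(c=column, lo=lo, hi=hi)


where = float_eq_where('f', 0.3)
print(where)

# Intended use with the store from the question (not run here):
# with pd.get_store('test.h5', mode='r') as store:
#     selection = store.select('df', float_eq_where('f', 0.3))
```

The tolerance has to be smaller than the spacing between the values you want to tell apart (0.1 in the example) and larger than the rounding error in the stored data; if rows are still missed, a wider tol may be needed.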