Python: choosing the right data type for machine learning


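I have been very curious about this for a while, and I have been using it to learn.

I was able to run the code without any problems and produce the graph.

I would like to use a different data source. At the moment they use stock prices: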

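import datetime
import numpy as np
from matplotlib import finance  # only present in older matplotlib releases

# No leading zeros: 01 is a syntax error in Python 3
d1 = datetime.datetime(2003, 1, 1)
d2 = datetime.datetime(2008, 1, 1)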

symbol_dict = {
        'TOT': 'Total',
        'XOM': 'Exxon',
        'CVX': 'Chevron',
        'COP': 'ConocoPhillips',
     ...
...
    }

# list() is needed on Python 3, where dict.items() returns a view
symbols, names = np.array(list(symbol_dict.items())).T

quotes = [finance.quotes_historical_yahoo(symbol, d1, d2, asobject=True)
          for symbol in symbols]

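# np.float was removed from NumPy; the built-in float is equivalent.
# Renamed from open/close to avoid shadowing the built-in open().
open_prices = np.array([q.open for q in quotes]).astype(float)
close_prices = np.array([q.close for q in quotes]).astype(float)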

  • What does quotes return? I know it holds the prices for each stock, but what I get looks like this:

    [rec.array([(datetime.date(2003,1,2),2003,1,2,731217.0, 28.12235692134198, 28.5, 28.564279672963064, 28.09825204398083, 12798800.0, 28.5),
     (datetime.date(2003,1,3),2003,1,3,731218.0,28.329084507042257,28.53,28.634476056338034,28.2889014408450719221900.0,28.53),
     (datetime.date(2003,1,6),2003,1,6,731221.0,28.482778999450247,29.23,29.406761957119297,28.4506404617921911925100.0,29.23),

  • I want to feed in my own dataset. Can you give me an example of a dataset that I could pass in as quotes?


    If you run finance.quotes_historical_yahoo? in IPython, it will tell you:

    In [53]: finance.quotes_historical_yahoo?
    Type:       function
    String Form:<function quotes_historical_yahoo at 0x10f311d70>
    File:       /Users/dvelkov/src/matplotlib/lib/matplotlib/finance.py
    Definition: finance.quotes_historical_yahoo(ticker, date1, date2, asobject=False, adjusted=True, cachename=None)
    Docstring:
    Get historical data for ticker between date1 and date2.  date1 and
    date2 are datetime instances or (year, month, day) sequences.
    
    See :func:`parse_yahoo_historical` for explanation of output formats
    and the *asobject* and *adjusted* kwargs.
    
    ...(more stuff)
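
If IPython is not at hand, roughly the same information can be pulled up from a plain Python session with the standard inspect module (or the built-in help()). The sketch below uses np.rec.fromarrays purely as a stand-in function, chosen only because it ships with NumPy; substitute whichever function you want to look up.

```python
import inspect

import numpy as np

# np.rec.fromarrays is only a stand-in; pass any function you want to inspect.
target = np.rec.fromarrays

# Roughly what IPython shows under "Definition:"
signature = inspect.signature(target)
print(target.__name__ + str(signature))

# Roughly what IPython shows under "Docstring:"
doc = inspect.getdoc(target)
print(doc)
```

Calling help(target) prints the same signature and docstring in a pager.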
    

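    In your example you are passing asobject=True, so what you get back is a NumPy recarray with the fields
    date, year, month, day, d, open, close, high, low, volume, adjusted_close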
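
As a rough sketch of what a hand-made dataset in that shape could look like, the snippet below builds a NumPy record array with the same eleven field names. The helper make_quotes and all prices and volumes are invented for illustration. The d field is filled with date.toordinal(), which matches the 731217.0 shown in the question for 2003-01-02, and adjusted_close is simply set equal to close, which is an assumption made to keep the example short.

```python
import datetime

import numpy as np

# Field names follow the list above; the dtypes are a guess at something sensible.
QUOTE_DTYPE = [
    ("date", object), ("year", np.int16), ("month", np.int8), ("day", np.int8),
    ("d", float), ("open", float), ("close", float), ("high", float),
    ("low", float), ("volume", float), ("adjusted_close", float),
]


def make_quotes(dates, open_, close, high, low, volume):
    """Hypothetical helper: pack per-day columns into one record array."""
    rows = [
        (dt, dt.year, dt.month, dt.day, float(dt.toordinal()), o, c, h, lo, v, c)
        for dt, o, c, h, lo, v in zip(dates, open_, close, high, low, volume)
    ]
    # view(np.recarray) allows attribute access such as q.open and q.close
    return np.array(rows, dtype=QUOTE_DTYPE).view(np.recarray)


dates = [datetime.date(2003, 1, 2), datetime.date(2003, 1, 3), datetime.date(2003, 1, 6)]

# One record array per series, all numbers made up
quotes = [
    make_quotes(dates, [10.0, 10.5, 10.2], [10.4, 10.3, 10.8],
                [10.6, 10.7, 10.9], [9.9, 10.1, 10.0], [1000, 1200, 900]),
    make_quotes(dates, [20.0, 19.5, 19.8], [19.6, 19.9, 20.4],
                [20.2, 20.0, 20.5], [19.4, 19.3, 19.7], [500, 650, 700]),
]

# The same extraction step as in the question now works on the custom data
open_prices = np.array([q.open for q in quotes]).astype(float)
close_prices = np.array([q.close for q in quotes]).astype(float)
print(open_prices.shape)  # (2, 3): two series, three days each
```

Every series has to cover the same dates so that the stacked arrays are rectangular.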

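    Looks like I have IPython. How do I open it?
    I open it from the terminal by typing ipython. What operating system are you using?
    I have never used it on Windows, but there are instructions for that.
    Thanks again. FYI, I asked another question related to this post.
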
    In [54]: finance.parse_yahoo_historical?
    Type:       function
    String Form:<function parse_yahoo_historical at 0x10f996ed8>
    File:       /Users/dvelkov/src/matplotlib/lib/matplotlib/finance.py
    Definition: finance.parse_yahoo_historical(fh, adjusted=True, asobject=False)
    Docstring:
    Parse the historical data in file handle fh from yahoo finance.
    
    *adjusted*
      If True (default) replace open, close, high, and low prices with
      their adjusted values. The adjustment is by a scale factor, S =
      adjusted_close/close. Adjusted prices are actual prices
      multiplied by S.
    
      Volume is not adjusted as it is already backward split adjusted
      by Yahoo. If you want to compute dollars traded, multiply volume
      by the adjusted close, regardless of whether you choose adjusted
      = True|False.
    
    
    *asobject*
      If False (default for compatibility with earlier versions)
      return a list of tuples containing
    
        d, open, close, high, low, volume
    
      If None (preferred alternative to False), return
      a 2-D ndarray corresponding to the list of tuples.
    
      Otherwise return a numpy recarray with
    
        date, year, month, day, d, open, close, high, low,
        volume, adjusted_close
    
      where d is a floating poing representation of date,
      as returned by date2num, and date is a python standard
      library datetime.date instance.
    
      The name of this kwarg is a historical artifact.  Formerly,
      True returned a cbook Bunch
      holding 1-D ndarrays.  The behavior of a numpy recarray is
      very similar to the Bunch.
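
The price adjustment described in the docstring can be reproduced with a few lines of NumPy. This is only a sketch of the stated formula S = adjusted_close/close applied to made-up numbers, not matplotlib's own implementation; all variable names are illustrative.

```python
import numpy as np

# Made-up raw values for three days
open_ = np.array([49.0, 51.0, 52.0])
close = np.array([50.0, 52.0, 51.0])
high = np.array([51.0, 53.0, 53.0])
low = np.array([48.0, 50.0, 50.0])
volume = np.array([1000.0, 2000.0, 1500.0])
adjusted_close = np.array([25.0, 26.0, 25.5])

# Scale factor S from the docstring
scale = adjusted_close / close

# Adjusted prices are the actual prices multiplied by S
adj_open = open_ * scale
adj_high = high * scale
adj_low = low * scale
adj_close = close * scale  # equals adjusted_close by construction

# Volume is left alone; dollars traded uses the adjusted close
dollars_traded = volume * adjusted_close

print(scale)
print(adj_open)
print(dollars_traded)
```

With these numbers S is 0.5 on every day, so each adjusted price is half of the raw price.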