
Reformatting tab-delimited data in Python

I have some tab-delimited data in Excel that needs to be reformatted to work in Tableau. Here is what it looks like:

State     2001   2002   2003   2004   2005   2006   2007
Alabama   5.6    5.71   5.88   6.08   6.46   7.07   7.57
Alaska    10.54  10.46  10.5   10.99  11.72  12.84  13.28
Arizona   7.27   7.21   7.34   7.45   7.79   8.24   8.54
Arkansas  6.05   5.61   5.57   5.67   6.3    6.99   6.96

Here is what I need it to look like:

State     Cost   Date
Alabama   5.6    12/31/2001
Alabama   5.71   12/31/2002
Alabama   5.88   12/31/2003
Alabama   6.08   12/31/2004
Alabama   6.46   12/31/2005
Alabama   7.07   12/31/2006
Alabama   7.57   12/31/2007
Alaska    10.54  12/31/2001
Alaska    10.46  12/31/2002
Alaska    10.5   12/31/2003
Alaska    10.99  12/31/2004
Alaska    11.72  12/31/2005
Alaska    12.84  12/31/2006
Alaska    13.28  12/31/2007
Arizona   7.27   12/31/2001
Arizona   7.21   12/31/2002
Arizona   7.34   12/31/2003
Arizona   7.45   12/31/2004
Arizona   7.79   12/31/2005
Arizona   8.24   12/31/2006
Arizona   8.54   12/31/2007
Arkansas  6.05   12/31/2001
Arkansas  5.61   12/31/2002
Arkansas  5.57   12/31/2003
Arkansas  5.67   12/31/2004
Arkansas  6.3    12/31/2005
Arkansas  6.99   12/31/2006
Arkansas  6.96   12/31/2007

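What is the best way to do this in Python? I am familiar with Numpy and Pandas, so those are options, but what I really want is for Python to spit out the reformatted data so that I can easily paste it into Excel.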

In Pandas, I would do it this way:

Assume you have the following DataFrame (read from Excel):
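A minimal sketch of building such a frame from the sample in the question, assuming the data is available as tab-delimited text. The inline string and `io.StringIO` are stand-ins for the real file, which you would pass to `pd.read_csv` by path instead:

```python
import io
import pandas as pd

# Sample rows from the question, written with explicit tabs.
# This string is a stand-in for the real tab-delimited file.
raw = "\n".join([
    "State\t2001\t2002\t2003\t2004\t2005\t2006\t2007",
    "Alabama\t5.6\t5.71\t5.88\t6.08\t6.46\t7.07\t7.57",
    "Alaska\t10.54\t10.46\t10.5\t10.99\t11.72\t12.84\t13.28",
    "Arizona\t7.27\t7.21\t7.34\t7.45\t7.79\t8.24\t8.54",
    "Arkansas\t6.05\t5.61\t5.57\t5.67\t6.3\t6.99\t6.96",
])

# sep="\t" splits on tabs only; the header row becomes the column labels.
df = pd.read_csv(io.StringIO(raw), sep="\t")
print(df)
```

Read this way, the year column labels are strings (`'2001'`, `'2002'`, ...), which is the form the date conversion in the solution expects.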

Solution:

In [102]: d = pd.melt(df, 'State', var_name='Date', value_name='Cost')

In [103]: d.assign(Date=pd.to_datetime(d['Date'])+pd.offsets.YearEnd())
Out[103]:
       State       Date   Cost
0    Alabama 2001-12-31   5.60
1     Alaska 2001-12-31  10.54
2    Arizona 2001-12-31   7.27
3   Arkansas 2001-12-31   6.05
4    Alabama 2002-12-31   5.71
5     Alaska 2002-12-31  10.46
6    Arizona 2002-12-31   7.21
7   Arkansas 2002-12-31   5.61
8    Alabama 2003-12-31   5.88
9     Alaska 2003-12-31  10.50
..       ...        ...    ...
18   Arizona 2005-12-31   7.79
19  Arkansas 2005-12-31   6.30
20   Alabama 2006-12-31   7.07
21    Alaska 2006-12-31  12.84
22   Arizona 2006-12-31   8.24
23  Arkansas 2006-12-31   6.99
24   Alabama 2007-12-31   7.57
25    Alaska 2007-12-31  13.28
26   Arizona 2007-12-31   8.54
27  Arkansas 2007-12-31   6.96

[28 rows x 3 columns]
You can also easily save it as an Excel file:

d.assign(Date=pd.to_datetime(d['Date'])+pd.offsets.YearEnd()) \
 .to_excel(r'/path/to/output.xlsx', index=False)
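Since the question asks for text that can be pasted back into Excel, here is a hedged sketch of one way to get it: sort by state, put the columns in the requested order, and let `to_csv` emit tab-delimited text with `MM/DD/YYYY` dates. The two-state frame is a cut-down stand-in for the real data, and the sorting and column order are my reading of the desired output rather than part of the answer above.

```python
import pandas as pd

# Cut-down stand-in for the real frame (two states, two years).
df = pd.DataFrame({
    "State": ["Alabama", "Alaska"],
    "2001": [5.6, 10.54],
    "2002": [5.71, 10.46],
})

# Wide to long: one row per (State, year) pair.
d = pd.melt(df, "State", var_name="Date", value_name="Cost")

# Parse the year labels and move each date to December 31 of that year.
d["Date"] = pd.to_datetime(d["Date"], format="%Y") + pd.offsets.YearEnd()

# Group rows by state and order the columns as State, Cost, Date.
d = d.sort_values(["State", "Date"])[["State", "Cost", "Date"]]

# With no path given, to_csv returns the text; date_format controls the dates.
text = d.to_csv(sep="\t", index=False, date_format="%m/%d/%Y")
print(text)
```

The printed text can be copied straight into a spreadsheet, because Excel splits pasted text on tabs.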

You asked how to do this directly in Python, so it would look something like this:

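#!/usr/bin/env python
inputname = 'Excel.txt'
outputname = 'Tableau.txt'

# Output rows, starting with the header line of the target layout.
container = ['State\tCost\tDate']
with open(inputname, 'r') as text:
  for line in text:
    # Split on tabs only, so state names containing spaces stay intact.
    l = line.rstrip('\n').split('\t')
    name = l[0]
    if name == 'State':
      dates = l  # header row: dates[1:] holds the years
    else:
      # range(1, len(l)) covers every year column, including the last one.
      for e in range(1, len(l)):
        date = dates[e]
        cost = l[e]
        # list.append takes a single item, so build one tab-joined row.
        container.append('\t'.join([name, cost, '12/31/' + date]))

with open(outputname, 'w') as out:
  stringified = '\n'.join(container)
  out.write(stringified)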

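Thank you very much! My only problem is that, for some reason, all the dates come back as 1970-12-31. Otherwise, it is perfect.

@RagnarLothbrok, glad I could help :) If you think it has answered your question, please consider accepting the answer.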
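The thread does not say what caused the 1970-12-31 dates. One plausible cause, offered here as an assumption, is that the year column labels were integers rather than strings, as can happen when the header row comes from an Excel sheet. `pd.to_datetime` reads bare integers as nanoseconds since 1970-01-01, so 2001 becomes a moment in 1970. A minimal sketch of that case and a workaround, using a hypothetical one-row frame:

```python
import pandas as pd

# Hypothetical frame whose year columns are integers, not strings.
df = pd.DataFrame({"State": ["Alabama"], 2001: [5.6], 2002: [5.71]})
d = pd.melt(df, "State", var_name="Date", value_name="Cost")

# Bare integers are taken as nanoseconds since the epoch,
# so this would place every row in 1970:
#   pd.to_datetime(d["Date"]) + pd.offsets.YearEnd()

# Casting to str and giving an explicit format parses each value as a year.
fixed = pd.to_datetime(d["Date"].astype(str), format="%Y") + pd.offsets.YearEnd()
print(fixed)
```

If the labels are already strings, the cast is harmless, so it is safe to apply either way.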