Python: How do I convert a set of CSV files into another set of CSV files (with different file names and format)?


python string bash shell text


I have a set of files containing end-of-day stock data in the following format:

Filename: NYSE_20120116.txt

<ticker>,<date>,<open>,<high>,<low>,<close>,<vol>
A,20120116,36.15,36.36,35.59,36.19,3327400
AA,20120116,10.73,10.78,10.53,10.64,20457600


How can I create one file per symbol? For example, for company A:

Filename: A.txt

<ticker>,<date>,<open>,<high>,<low>,<close>,<vol>
A,20120116,36.15,36.36,35.59,36.19,3327400
A,20120117,39.76,40.39,39.7,39.99,4157900


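Do you want to split the first file at the record level and then route each line to a different file based on the value of its first field?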

 # To skip the first (header) line, see later
 while IFS= read -r line; do
     # Command substitution: $( ... ) is equivalent to the older backticks
     # (backticks are not quote signs). The ticker is the first field.
     symbol=$(echo "$line" | cut -f1 -d,)

     # If the file is not there, create it with a header
     # if [ ! -r "$symbol.txt" ]; then
     #    head -n 1 endday.txt > "$symbol.txt"
     # fi
     echo "$line" >> "$symbol.txt"
 done < endday.txt

This is not very efficient; Perl or Python would do better.
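
As a rough illustration of the Python route mentioned above, here is a minimal sketch that reads one end-of-day file and appends each row to a per-ticker file, writing the header only when the output file is new. The function name `split_by_ticker` and the `out_dir` parameter are assumptions made for this example; they do not come from the original question.

```python
import csv
from pathlib import Path


def split_by_ticker(src, out_dir="."):
    """Hypothetical helper: append each data row of src to <out_dir>/<ticker>.csv."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group rows by ticker first so each output file is opened only once.
    rows_by_ticker = {}
    with open(src, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)  # the first line is the header
        for row in reader:
            if row:  # skip blank lines
                rows_by_ticker.setdefault(row[0], []).append(row)

    for ticker, rows in rows_by_ticker.items():
        target = out_dir / f"{ticker}.csv"
        is_new = not target.exists()
        with open(target, "a", newline="") as out:
            writer = csv.writer(out, lineterminator="\n")
            if is_new and header is not None:
                writer.writerow(header)  # header only for a freshly created file
            writer.writerows(rows)
    return sorted(rows_by_ticker)
```

Calling `split_by_ticker("NYSE_20120116.txt", "by_ticker")` would create `by_ticker/A.csv`, `by_ticker/AA.csv`, and so on. Writing to a separate directory keeps the outputs from being picked up as inputs on a later run.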

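If you have multiple files in a directory (note that you must remove them yourself afterwards, otherwise they will be processed again and again…), you can do this: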

 for file in *.txt; do
    # If no .txt file exists, the pattern stays unexpanded: skip it
    [ -e "$file" ] || continue
    echo "Now processing $file..."
    # A quick and dirty way of ignoring line number 1: start at line 2.
    tail -n +2 "$file" | while IFS= read -r line; do
       # Command substitution: $( ... ) is equivalent to the older backticks
       # (backticks are not quote signs). The ticker is the first field.
       symbol=$(echo "$line" | cut -f1 -d,)

       # If the output file is not there, create it with a header
       # if [ ! -r "$symbol.csv" ]; then
       #    head -n 1 "$file" > "$symbol.csv"
       # fi
       # Output file is named .csv so as not to create new .txt files
       # which this script might find
       echo "$line" >> "$symbol.csv"
    done
    # Change the name from .txt to .txt.ok, so it won't be found again
    mv "$file" "$file.ok"
    # or better, move it elsewhere to avoid clogging this directory
    # mv "$file" /var/data/files/already-processed
 done

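Is it possible to add an outer loop to this script so that I don't have to run it for each source file? Thanks again.

I get the following message: "Script started, output file is typescript", and then a small typescript file is created in the folder. I am using the bash shell on Mac OS X Lion.

I fixed that (I had to type ./script). Now the error I get is "./script: line 5: $symbol.txt: ambiguous redirect"…

Of course, the new files (A.txt, etc.) must be created with the first line already in place. Otherwise, you can do this just before echo "$line": "if [ ! -r $symbol.txt ]; then head -n 1 $file > $symbol.txt; fi". This creates the new file $symbol.txt with the first line of $file…

…that only happens if you run the script in a directory without any text files (I should add an existence test). In that case, "for file in *.txt" does not expand, and $file is assigned the value "*.txt" instead of 20120730.txt, 20120731.txt, and so on in sequence. So tail tries to find *.txt and cannot. Nothing is generated.

Is it possible to remove the ticker column from the result files by changing the script?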
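
The two open questions in the comments (looping over every source file, and dropping the ticker column) can be sketched together in Python. This is only an illustration under assumptions: the function name `split_all`, the `drop_ticker` flag, the `by_ticker` output directory, and the `NYSE_*.txt` glob pattern are all invented for the example. Input files are read in sorted name order, so rows come out in ascending date order when the date is part of the file name.

```python
import csv
import glob
import os
from collections import defaultdict


def split_all(pattern="NYSE_*.txt", out_dir="by_ticker", drop_ticker=False):
    """Hypothetical helper: merge all files matching pattern into per-ticker files."""
    rows_by_ticker = defaultdict(list)
    header = None
    # glob.glob returns an empty list when nothing matches, so a directory
    # without input files simply produces no output instead of an error.
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as f:
            reader = csv.reader(f)
            first = next(reader, None)  # header line of this file
            header = header or first
            for row in reader:
                if row:  # skip blank lines
                    rows_by_ticker[row[0]].append(row)

    os.makedirs(out_dir, exist_ok=True)
    start = 1 if drop_ticker else 0  # first column index to write
    for ticker, rows in rows_by_ticker.items():
        target = os.path.join(out_dir, ticker + ".csv")
        # "w" mode rewrites the file, so re-running does not duplicate rows.
        with open(target, "w", newline="") as out:
            writer = csv.writer(out, lineterminator="\n")
            if header:
                writer.writerow(header[start:])
            writer.writerows(row[start:] for row in rows)
    return sorted(rows_by_ticker)
```

With `drop_ticker=True`, each output file starts at the `<date>` column, since the ticker is already encoded in the file name. Unlike the shell loop, this sketch does not rename or move the input files; it overwrites the per-ticker outputs on each run instead.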