Passing a Python variable (string) to a bash command via an echo pipe


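I'm having trouble passing a string held in a Python variable as input to a sequence alignment program (muscle) on the command line (bash). muscle can read from stdin on the command line, for example: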

~# echo -e ">1\nATTTCTCT\n>2\nATTTCTCC" | muscle

MUSCLE v3.8.31 by Robert C. Edgar

http://www.drive5.com/muscle
This software is donated to the public domain.
Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.

- 2 seqs, max length 8, avg  length 8
00:00:00     22 MB(2%)  Iter   1  100.00%  K-mer dist pass 1
00:00:00     22 MB(2%)  Iter   1  100.00%  K-mer dist pass 2
00:00:00     23 MB(2%)  Iter   1  100.00%  Align node       
00:00:00     23 MB(2%)  Iter   1  100.00%  Root alignment
>1
ATTTCTCT
>2
ATTTCTCC

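This FASTA alignment (the last 4 lines) is what I'm after. You can redirect muscle's output (echo -e ">1\nATTTCTCT\n>2\nATTTCTCC" | muscle > out.file) to get just the FASTA alignment, which is what I need for downstream processing. But to get there I have to pass the FASTA sequence string to muscle, which I think is best done via echo in bash.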

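So the script takes two multi-FASTA files and pairs up the FASTA sequences according to a list of IDs for each file. This part works (although I realise it may not be the most efficient approach, as I'm new to Python). I then need to align each pair in muscle before calculating distances/differences.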

Here is what I have so far:

#! /env/python
from pairwise_distances import K2Pdistance
import pairwise_distances
import subprocess
from subprocess import call
import os     

fasta1=['']
for line in open('test1.fasta'):
    if not line.startswith('>'):
         fasta1.append(line.strip())

fasta2=['']
for line in open('test2.fasta'):
    if not line.startswith('>'):
         fasta2.append(line.strip())

for l1, l2 in zip(open('test1.list'), open ('test2.list')):
 try:
   a=fasta1[int(l1)]
 except IndexError,e:  
   a="GGG"

 try:
   b=fasta2[int(l2)]
 except (IndexError):
   b="CCC"

 temp_align=str(">1"+'\n'+a+'\n'+">2"+'\n'+b) 

 first=subprocess.check_output(['echo','-e',temp_align])
 print first
 subprocess.call(['bash','muscle'], stdin=first.stdout)
 print second

 #new=K2Pdistance(outfast1,outfast2) 
 #subprocess.Popen(['bash','muscle'], stdin=subprocess.check_output(['echo','-e',temp_align], stdout=subprocess.PIPE).std.out)

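The temp_align variable is what I want to pass to muscle. On each iteration over the ID lists it combines the corresponding FASTA sequence from each multi-FASTA file, and it is formatted like a FASTA file.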
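As a rough illustration of the FASTA-formatted string described here, the sketch below builds such a string from (identifier, sequence) pairs and parses FASTA text back into pairs, which could be useful once the aligned output has been captured. The helper names to_fasta and parse_fasta are hypothetical and not part of the original script, and the code assumes Python 3.

```python
def to_fasta(records):
    """Join (identifier, sequence) pairs into a FASTA-formatted string."""
    return "".join(">{}\n{}\n".format(name, seq) for name, seq in records)


def parse_fasta(text):
    """Parse FASTA text into a list of (identifier, sequence) pairs."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header line starts a new record.
            records.append([line[1:], ""])
        elif records:
            # Sequence lines may wrap, so append to the current record.
            records[-1][1] += line
    return [(name, seq) for name, seq in records]


# Same two sequences as in the command-line example.
temp_align = to_fasta([("1", "ATTTCTCT"), ("2", "ATTTCTCC")])
```

Ending the string with a newline mirrors what a FASTA file on disk would normally look like.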

The problem is that I can echo the FASTA string, but I can't seem to "pipe" it to muscle via stdin. The main error I get is:

AttributeError: 'str' object has no attribute 'stdout'

~#python Beta3.py 
>1
ATCGACTACT
>2
ATCGCGCTACT

Traceback (most recent call last):
  File "Beta3.py", line 38, in <module>
    subprocess.call(['bash','muscle'], stdin=first.stdout)
AttributeError: 'str' object has no attribute 'stdout'

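Edit 2: The earlier post is related, but my question is about chaining two bash commands, starting from a Python string variable, and then ideally capturing the second command's stdout back into a Python variable. I don't quite see how to translate the answers on that post into a solution for my specific problem. I don't think I fully understand what that poster was trying to do.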

It seems you want to communicate with the muscle process, so you need a pipe. Use this:

(out, err) = subprocess.Popen(['muscle'], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate(temp_align)
print out
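For readers on Python 3, the same idea can be sketched with subprocess.run, which accepts the text to send on stdin through its input argument. This is a minimal sketch rather than a definitive implementation: the helper name pipe_through is hypothetical, and a small child Python process that copies stdin to stdout stands in for muscle so the example runs without muscle installed.

```python
import subprocess
import sys


def pipe_through(command, text):
    """Run command, send text to its stdin, and return its stdout as a string."""
    result = subprocess.run(
        command,
        input=text,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # exchange str rather than bytes
        check=True,               # raise CalledProcessError on a non-zero exit
    )
    return result.stdout


# Stand-in for ['muscle'] (an assumption made for this sketch):
# a child Python process that copies its stdin to its stdout.
ECHO_STDIN = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]

temp_align = ">1\nATTTCTCT\n>2\nATTTCTCC\n"
aligned = pipe_through(ECHO_STDIN, temp_align)
```

With muscle available on the PATH, the corresponding call would presumably be pipe_through(['muscle'], temp_align). Since redirecting stdout in the question yields only the alignment, the progress lines appear to go to stderr, which is captured separately here.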

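first is a string. To send it to a command, see:

Why subprocess.call(['bash','muscle'], stdin=first.stdout) rather than just subprocess.call(['muscle'], stdin=first.stdout)?

It makes no difference (the string error is still there), but I guess 'bash' isn't needed here.

The result of subprocess.check_output is a string (the stdout of the executed subprocess). You could use subprocess.call and provide a file-like object as the stdin argument, which would simplify things.

This question has been closed as a duplicate, so I can't post a full answer, but Biopython provides a wrapper for muscle that performs multiple sequence alignment; see the tutorial.

Ah, I understand now. Thank you very much for your help.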
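The two points in the comments about check_output and the stdin argument can be illustrated with a short Python 3 sketch. It is written under stated assumptions: a child Python process stands in for muscle, and a temporary file is used as the file object passed to stdin, which is one possible reading of the suggestion rather than the commenter's exact method.

```python
import subprocess
import sys
import tempfile

# Stand-in for ['muscle'] (an assumption made for this sketch):
# a child Python process that copies its stdin to its stdout.
ECHO_STDIN = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]

temp_align = ">1\nATTTCTCT\n>2\nATTTCTCC\n"

# check_output returns the captured output itself (bytes in Python 3),
# not a process object, so the result has no .stdout attribute.
first = subprocess.check_output([sys.executable, "-c", "print('hello')"])
has_stdout = hasattr(first, "stdout")

# A real file can serve as stdin: write the string, rewind, pass the file object.
with tempfile.TemporaryFile(mode="w+") as handle:
    handle.write(temp_align)
    handle.flush()
    handle.seek(0)
    aligned = subprocess.check_output(
        ECHO_STDIN, stdin=handle, universal_newlines=True
    )
```

The stdin argument needs an object backed by a real file descriptor, which is why a temporary file is used here instead of an in-memory buffer.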