Processing data line by line in Python


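I am new to Python. I have searched through several articles but could not find the right syntax for reading a file and doing awk-style line processing in Python. I need your help to solve this.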

This is how my bash script for build and deployment works. In bash I read a configuration file that looks like this:

backup             /apps/backup
oracle             /opt/qosmon/qostool/oracle    oracle-client-12.1.0.1.0

The part of the bash script that reads the file looks like this:

while read line
  do
    case "$line" in */package*) continue ;; esac
    host_file_array+=("$line")
  done < ${HOST_FILE}
 for ((i=0 ; i < ${#host_file_array[*]}; i++))
  do
    # echo "${host_file_array[i]}"
     host_file_line="${host_file_array[i]}"

     if [[ "$host_file_line" != "#"* ]];
     then
       COMPONENT_NAME=$(echo $host_file_line  | awk '{print $1;}' )
       DIRECTORY=$(echo $host_file_line  | awk '{print $2;}' )
       VERSION=$(echo $host_file_line  | awk '{print $3;}' )

       if [[ ("${COMPONENT_NAME}" == *"oracle"*)  ]];
       then
         print_parameters "Status ${DIRECTORY}/${COMPONENT_NAME}"
         /bin/bash ${DIRECTORY}/${COMPONENT_NAME}/current/script/manage-oracle.sh  ${FORMAT_STRING} start
       fi
     etc .........

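I tried using split in Python, but because the lines of the configuration file do not all have the same number of columns, I get an index out of range error. What is the best way to handle this?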

I think dictionaries could be a good fit here. You can build them as follows:

>>> result = []
>>> keys = ["COMPONENT_NAME", "DIRECTORY", "VERSION"]
>>> with open(hosts_file) as f:
...     for line in f:
...         result.append(dict(zip(keys, line.strip().split())))
...     
>>> result
[{'DIRECTORY': '/apps/backup', 'COMPONENT_NAME': 'backup'},
 {'DIRECTORY': '/opt/qosmon/qostool/oracle', 'VERSION': 'oracle-client-12.1.0.1.0', 'COMPONENT_NAME': 'oracle'}]
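As you can see, this creates a list of dictionaries. When you access one of these dictionaries, keep in mind that some of them may not contain the 'VERSION' key. There are several ways to handle that: you can wrap the lookup in try/except KeyError, or use dict.get() to fetch the value with a default.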

For example:

>>> for r in result:
...     print(r.get('VERSION', "No version"))
...
No version
oracle-client-12.1.0.1.0
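
The try/except KeyError route mentioned above could look roughly like the sketch below. The `result` list is hard-coded here to match the output shown earlier, and the helper name `version_of` is made up for illustration.

```python
# Records shaped like the `result` list shown earlier in this answer.
result = [
    {"COMPONENT_NAME": "backup", "DIRECTORY": "/apps/backup"},
    {
        "COMPONENT_NAME": "oracle",
        "DIRECTORY": "/opt/qosmon/qostool/oracle",
        "VERSION": "oracle-client-12.1.0.1.0",
    },
]


def version_of(record):
    """Return the record's version, or a placeholder if the key is absent."""
    try:
        return record["VERSION"]
    except KeyError:
        # Lines with only two columns never got a 'VERSION' key.
        return "No version"


versions = [version_of(r) for r in result]
print(versions)
```

dict.get() is shorter when all you need is a default value. try/except is probably the better choice when a missing version should trigger different handling, such as logging or skipping the component.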

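I am more concerned about the following lines that I wrote in bash:

 for ((i=0 ; i < ${#host_file_array[*]}; i++))
  do
    # echo "${host_file_array[i]}"
     host_file_line="${host_file_array[i]}"
     if [[ "$host_file_line" != "#"* ]];
     then
       COMPONENT_NAME=$(echo $host_file_line  | awk '{print $1;}' )
       DIRECTORY=$(echo $host_file_line  | awk '{print $2;}' )
       VERSION=$(echo $host_file_line  | awk '{print $3;}' )
       if [[ ("${COMPONENT_NAME}" == *"oracle"*)  ]];
       then
         VERSION=$(echo $host_file_line  | awk '{print $3;}' )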
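
One possible way to carry that loop over to Python is sketched below. It is not a definitive port. The function names `parse_hosts` and `oracle_start_command`, the inline sample lines (including the added comment line), and the value passed for FORMAT_STRING are all placeholders chosen for illustration. Skipping blank lines and filling missing columns with None are also assumptions that the bash script does not make.

```python
def parse_hosts(lines):
    """Yield (component, directory, version) for each usable line."""
    for line in lines:
        line = line.strip()
        # Mirrors: case "$line" in */package*) continue ;; esac
        if "/package" in line:
            continue
        # Mirrors: [[ "$host_file_line" != "#"* ]]
        # Skipping blank lines is an addition, not part of the bash script.
        if not line or line.startswith("#"):
            continue
        fields = line.split()  # whitespace split, like awk's default fields
        # Pad short lines so unpacking never raises an IndexError.
        fields += [None] * (3 - len(fields))
        component, directory, version = fields[:3]
        yield component, directory, version


def oracle_start_command(component, directory, format_string):
    """Build the argument list for the manage-oracle.sh call."""
    script = f"{directory}/{component}/current/script/manage-oracle.sh"
    # The unquoted ${FORMAT_STRING} in bash is word-split, so split it here.
    return ["/bin/bash", script, *format_string.split(), "start"]


# Sample lines copied from the configuration file in the question,
# plus one illustrative comment line.
sample = [
    "# illustrative comment line",
    "backup             /apps/backup",
    "oracle             /opt/qosmon/qostool/oracle    oracle-client-12.1.0.1.0",
]

records = list(parse_hosts(sample))
commands = [
    oracle_start_command(component, directory, "placeholder-format")
    for component, directory, version in records
    if "oracle" in component  # mirrors == *"oracle"*
]
print(records)
print(commands)
```

To read the real file, pass an open file object instead of the list, for example `with open(path) as f: records = list(parse_hosts(f))`. Each command list can then be handed to `subprocess.run(cmd, check=True)` to run the script.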