
Passing double quotes to Weka machine learning


I am using Weka's CLI, namely the Primer, and I have tried many different combinations for passing several arguments, without success. When I pass something like this:

weka_options=("weka.classifiers.functions.SMOreg -C 1.0 -N 0")

the program runs without problems, but passing something like this:

weka_options=("weka.classifiers.functions.SMOreg -C 1.0 -N 0 -I \"weka.classifiers.functions.supportVector.RegSMOImproved -L 1.0e-3 -W 1 -P 1.0e-12 -T 0.001 -V\" -K \"weka.classifiers.functions.supportVector.NormalizedPolyKernel -C 250007 -E 8.0\"")

with or without the escape characters, or even with single quotes, causes an error in the bash script:

bash ./weka.sh "$sub_working_dir" $train_percentage "$weka_options" $files_string > $predictions
where weka.sh contains:

java -Xmx1024m -classpath ".:$WEKAPATH" $weka_options -t "$train_set" -T "$test_set" -c 53 -p 53
Here is what I get:

---Registering Weka Editors---
Trying to add database driver (JDBC): jdbc.idbDriver - Error, not in CLASSPATH?

Weka exception: Can't open file No suitable converter found for '0.001'!.
Can anyone point out the problem?

Updated question: here is the code:

# Usage:
# 
# ./aca2_explore.sh working-dir datasets/*
# e.g.
# ./aca2_explore.sh "aca2-explore-working-dir/" datasets/*
#
# Place this script in the same folder as aca2.sh and the folder containing the datasets. 
# 
#
# Please note that:
#   - All the notes contained in aca2.sh apply
#   - This script will erase the contents of working-dir

# to properly sort negative floating numbers, independently of local language options

export LC_ALL=C

# parameters parsing

output_directory=$1

first_file_index=2
files=${@:$first_file_index}

# global constants

datasets=$(($# - 1))
output_row=$(($datasets + 3))
output_columns_range="2-7"
learned_model_mae_column=4
results_learned_model_mae_column=4

# parameters

working_dir="$output_directory"
if [ -d "$working_dir" ];
then
    rm -r "$working_dir"
fi
mkdir "$working_dir"

sub_working_dir="$working_dir""aca2-explore-sub-working-dir/"
path_to_results_file="$sub_working_dir""results.csv"

train_percentage=25

logfile="$working_dir""aca2_explore_log.csv"
echo "" > "$logfile"

reduced_log_header="Options,average_test_set_speedup,null_model_mae,learned_model_mae,learned_model_rmse,mae_ratio,R^2"

reduced_logfile="$working_dir""aca2_explore_reduced_log.csv"
echo "$reduced_log_header" > "$reduced_logfile"

sorted_reduced_logfile="$working_dir""aca2_explore_sorted_reduced_log.csv"

weka_options_list=(
"weka.classifiers.functions.LinearRegression -S 0 -R 1.0E-8"
"weka.classifiers.functions.MultilayerPerceptron -L 0.3 -M 0.2 -N 100 -V 0 -S 0 -E 20 -H a"
"weka.classifiers.meta.AdditiveRegression -S 1.0 -I 10 -W weka.classifiers.trees.DecisionStump"
"weka.classifiers.meta.Bagging -P 100 -S 1 -num-slots 1 -I 10 -W weka.classifiers.trees.REPTree -- -M 2 -V 0.001 -N 3 -S 1 -L -1 -I 0.0"
"weka.classifiers.meta.CVParameterSelection -X 10 -S 1 -W weka.classifiers.rules.ZeroR"
"weka.classifiers.meta.MultiScheme -X 0 -S 1 -B \"weka.classifiers.rules.ZeroR \""
"weka.classifiers.meta.RandomCommittee -S 1 -num-slots 1 -I 10 -W weka.classifiers.trees.RandomTree -- -K 0 -M 1.0 -V 0.001 -S 1"
"weka.classifiers.meta.RandomizableFilteredClassifier -S 1 -F \"weka.filters.unsupervised.attribute.RandomProjection -N 10 -R 42 -D Sparse1\" -W weka.classifiers.lazy.IBk -- -K 1 -W 0 -A \"weka.core.neighboursearch.LinearNNSearch -A \"weka.core.EuclideanDistance -R first-last\"\""
"weka.classifiers.meta.RandomSubSpace -P 0.5 -S 1 -num-slots 1 -I 10 -W weka.classifiers.trees.REPTree -- -M 2 -V 0.001 -N 3 -S 1 -L -1 -I 0.0"
"weka.classifiers.meta.RegressionByDiscretization -B 10 -K weka.estimators.UnivariateEqualFrequencyHistogramEstimator -W weka.classifiers.trees.J48 -- -C 0.25 -M 2"
"weka.classifiers.meta.Stacking -X 10 -M \"weka.classifiers.rules.ZeroR \" -S 1 -num-slots 1 -B \"weka.classifiers.rules.ZeroR \""
"weka.classifiers.meta.Vote -S 1 -B \"weka.classifiers.rules.ZeroR \" -R AVG"
"weka.classifiers.rules.DecisionTable -X 1 -S \"weka.attributeSelection.BestFirst -D 1 -N 5\""
"weka.classifiers.rules.M5Rules -M 4.0"
"weka.classifiers.rules.ZeroR"
"weka.classifiers.trees.DecisionStump"
"weka.classifiers.trees.M5P -M 4.0"
"weka.classifiers.trees.RandomForest -I 100 -K 0 -S 1 -num-slots 1"
"weka.classifiers.trees.RandomTree -K 0 -M 1.0 -V 0.001 -S 1"
"weka.classifiers.trees.REPTree -M 2 -V 0.001 -N 3 -S 1 -L -1 -I 0.0")

files_string=""

for file in ${files[@]}
do
    files_string="$files_string""$file"" "
done

#echo $files_string

for weka_options in "${weka_options_list[@]}"
do
        echo "$weka_options"
        echo "$weka_options" >> "$logfile"
        bash ./aca2.sh "$sub_working_dir" $train_percentage "$weka_options" $files_string
        cat "$path_to_results_file" >> "$logfile"

        result_columns=$(tail -n +"$output_row" "$path_to_results_file"  | head -1 | cut -d, -f"$output_columns_range")
        echo "$weka_options"",""$result_columns" >> "$reduced_logfile"

        echo "" >> "$logfile"
done

tail -n +2 "$reduced_logfile" > "$sorted_reduced_logfile"
sort --field-separator=',' --key="$results_learned_model_mae_column" "$sorted_reduced_logfile" -o "$sorted_reduced_logfile"".tmp"
echo "$reduced_log_header" > "$sorted_reduced_logfile"
cat "$sorted_reduced_logfile"".tmp" >> "$sorted_reduced_logfile"
rm "$sorted_reduced_logfile"".tmp"
where the file aca2.sh is:

#!/bin/bash

# Run this script as ./script.sh working-directory train-set-filter-percentage "weka_options" datasets/*
#
# e.g.
# Place this script in a folder together with a directory containing your datasets. Call then the script as 
# ./aca2.sh "aca2-working-dir/" 25 "weka.classifiers.functions.LinearRegression -S 0 -R 1.0E-8" datasets_folder/*
#
# NOTE: the script will erase the content of working-directory
#       for correct behaviour $WEKAHOME environment variable must be set to the folder containing weka.jar, otherwise modify the call to the weka classifier below
#
# To define the error measures used in this script, I made use of some of the notions found in this article:
# http://scott.fortmann-roe.com/docs/MeasuringError.html


# parameters parsing

output_directory=$1
train_set_percentage=$2

if [ $train_set_percentage -lt 1 ] || [ $train_set_percentage -gt 100 ];
then
    echo "Invalid train set percentage: "$train_set_percentage
    exit 1
fi


weka_options=$3

first_file_index=4
files=${@:$first_file_index}

# global constants

predictions_characters_range_value="23-28"
predictions_characters_range_error="34-39"

tmp_dir="$output_directory"
if [ -d "$tmp_dir" ];
then
    rm -r "$tmp_dir"
fi
mkdir "$tmp_dir"

results_header="testfile,average_test_set_speedup,null_model_mae,learned_model_mae,learned_model_rmse,mae_ratio,R^2"
results_file=$tmp_dir"results.csv"

echo "$results_header" > "$results_file"

arff_header="% ARFF conversion of CSV dataset

@RELATION program

@ATTRIBUTE ...

@DATA"

# global constants

datasets_per_program=5
entries_per_dataset=128

train_set_instances_to_select=$((datasets_per_program*entries_per_dataset*train_set_percentage/100))

all_prediction="$tmp_dir""all_predictions.txt"

count=0
prediction_efficiency_ideal_avg=0

arff_header_file="$tmp_dir""arff_header.txt"
echo "$arff_header" > "$arff_header_file"



count=0

for filename in ${files[@]}
do
    echo "Test set: $filename"

    echo "$filename" >> "$all_prediction"

    cur_dir="$tmp_dir$filename.dir/"
    mkdir -p $cur_dir

    testfile=$filename

    train_set="$cur_dir""train_set.arff"

    echo "$arff_header" > $train_set

    selected_train_subset="$cur_dir""selected_train_subset.csv"

    for trainfile in ${files[@]}
    do
        if [ "$trainfile" != "$testfile" ]; then
#           filter train set to feed only top 25% for model generation
            sort --field-separator=',' --key=53 "$trainfile" -o "$selected_train_subset"
            head -$train_set_instances_to_select "$selected_train_subset" >> $train_set
        fi
    done

    test_set="$cur_dir""test_set.arff"

    #echo "$arff_header" > $test_set
    cp "$testfile" "$test_set"


    # This file will contain the full configuration space dataset relative to the test program
    complete_test_set="$cur_dir""complete_test_set.csv"
    cp "$test_set" "$complete_test_set"

    sort --field-separator=',' --key=53 "$test_set" -o "$test_set"
    head -8 "$test_set" > "$test_set"".tmp"
    mv "$test_set"".tmp" "$test_set"

    cur_prediction="$cur_dir""cur_prediction.tmp"

    # generate basis for predicted test set file by copying the actual test set, removing speedups
    predicted_test_set="$cur_dir""predicted_test_set.csv"
    cp "$test_set" "$predicted_test_set"
    cut -d, -f53 --complement "$predicted_test_set" > "$predicted_test_set"".tmp"
    mv "$predicted_test_set"".tmp" "$predicted_test_set"

    cat "$arff_header_file" "$test_set" > "$test_set"".tmp"
    mv "$test_set"".tmp" "$test_set"    

    java -Xmx1024m -classpath ".:$WEKAHOME/weka.jar:$WEKAJARS/*" $weka_options -t "$train_set" -T "$test_set" -c 53 -p 53 | tail -n +6 | head -8  > "$cur_prediction"

    predictions_file="$cur_dir""predictions.csv"
    cut -c"$predictions_characters_range_value" "$cur_prediction" | tr -d " " > "$predictions_file"

    paste -d',' "$actual_speedups" "$predictions_file" > "$predictions_file"".tmp"
    mv "$predictions_file"".tmp" "$predictions_file"
done

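You almost have this right. You were trying to do the right thing (or just accidentally got close).

You cannot use a single string to hold arbitrarily quoted arguments (that does not work).

You need to use an array instead. But you need an array with a separate element for each argument, not an array holding just one long argument.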

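weka_options=(weka.classifiers.functions.SMOreg -C 1.0 -N 0)
weka_options=(weka.classifiers.functions.SMOreg -C 1.0 -N 0 -I "weka.classifiers.functions.supportVector.RegSMOImproved -L 1.0e-3 -W 1 -P 1.0e-12 -T 0.001 -V" -K "weka.classifiers.functions.supportVector.NormalizedPolyKernel -C 250007 -E 8.0")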

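(I am assuming that the string
weka.classifiers.functions.supportVector.RegSMOImproved -L 1.0e-3 -W 1 -P 1.0e-12 -T 0.001 -V
is the argument to the -I flag, and that the string
weka.classifiers.functions.supportVector.NormalizedPolyKernel -C 250007 -E 8.0
is the argument to the -K flag. If that is not the case, then those quotes probably need to be removed as well.)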

Then, when you use the array, you need to write "${weka_options[@]}" so that the elements of the array are expanded as individually quoted words:

java -Xmx1024m -classpath ".:$WEKAPATH" "${weka_options[@]}" -t "$train_set" -T "$test_set" -c 53 -p 53
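
To see the difference between the two approaches without running Weka, here is a minimal sketch. The show_args function is a hypothetical stand-in for the java call; it only prints each argument it receives on its own line. The option values are shortened from the ones in the question.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of Weka): print every argument on its own
# line, wrapped in <>, so the word boundaries are visible.
show_args() {
    printf '<%s>\n' "$@"
}

# One string with embedded, escaped quotes (the failing approach).
opts_string="weka.classifiers.functions.SMOreg -I \"weka.classifiers.functions.supportVector.RegSMOImproved -T 0.001 -V\""

# An array with one element per argument (the approach from the answer).
opts_array=(weka.classifiers.functions.SMOreg -I "weka.classifiers.functions.supportVector.RegSMOImproved -T 0.001 -V")

echo "unquoted string expansion:"
show_args $opts_string          # split on whitespace; quotes stay literal

echo "quoted array expansion:"
show_args "${opts_array[@]}"    # one word per array element
```

In the first case the embedded quotes are ordinary characters, so the shell splits the string on every space and -T and 0.001 arrive as separate words. That seems consistent with the error in the question, where Weka complains about '0.001' as if it were the file given to its own -T option. In the second case the whole nested option string reaches the program as a single argument after -I.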

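Thanks @Etan for the answer. What if my weka_options holds more than one option that I want to pass?

Aren't those more than one option already? Is the entire string one long option to the java module? If the entire string really needs to be passed to java as one long argument, then what you had is probably what you want, and you probably need to quote it as "$weka_options" when you use it in the script.

Sorry for confusing you. What I mean is, what if I want to explore 7-10 different machine learning options, as in this case:
weka_options_list=("weka.classifiers.functions.LinearRegression -S 0 -R 1.0E-8" "weka.classifiers.functions.MultilayerPerceptron -L 0.3 -M 0.2 -N 100 -V 0 -S 0 -E 20 -H a" "weka.classifiers.meta.AdditiveRegression -S 1.0 -I 10 -W weka.classifiers.trees.DecisionStump" "weka.classifiers.meta.Bagging -P 100 -S 1 -num-slots 1 -I 10 -W weka.classifiers.trees.REPTree -- -M 2 -V 0.001 -N 3 -S 1 -L -1 -I 0.0" weka.classifiers.meta.CVParameterSelection -X 10 -S 1 -W)

By the way, I tested the answer you mentioned, but it does not work: it only takes the first string, up to the first argument, and passes that to java.

If you wrote the java command out by hand (or the original command), what would it look like?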
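
One possible way to explore several configurations while keeping one array element per argument is sketched below. It is not taken from the thread. It assumes each configuration gets its own array (the cfg_* names are made up for illustration), uses bash indirect expansion to select them by name, and replaces the real java call with a hypothetical run_weka function that only prints what it receives. The train.arff and test.arff file names are placeholders.

```shell
#!/usr/bin/env bash

# Each configuration is its own array, one element per argument.
# The option strings are copied from weka_options_list in the question.
cfg_linreg=(weka.classifiers.functions.LinearRegression -S 0 -R 1.0E-8)
cfg_vote=(weka.classifiers.meta.Vote -S 1 -B "weka.classifiers.rules.ZeroR " -R AVG)
cfg_table=(weka.classifiers.rules.DecisionTable -X 1 -S "weka.attributeSelection.BestFirst -D 1 -N 5")

# Names of the configuration arrays to explore (hypothetical names).
config_names=(cfg_linreg cfg_vote cfg_table)

# Hypothetical stand-in for the java call: prints each argument in brackets.
run_weka() {
    printf 'java'
    printf ' [%s]' "$@"
    printf '\n'
}

for name in "${config_names[@]}"; do
    ref="${name}[@]"            # indirect reference to the whole array
    weka_options=("${!ref}")    # copy it, preserving element boundaries
    run_weka "${weka_options[@]}" -t train.arff -T test.arff -c 53 -p 53
done
```

An array cannot be handed to a child script as a single positional parameter, which is why this sketch calls a function in the same shell instead of invoking bash ./aca2.sh with one options argument.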