Passing double quotes through Weka machine learning
I am using Weka's CLI, namely Primer, and I have tried many different combinations of passing several parameters, without success. When I pass something like this:
weka_options=("weka.classifiers.functions.SMOreg -C 1.0 -N 0")
the program runs without a problem, but passing something like this:
weka_options=("weka.classifiers.functions.SMOreg -C 1.0 -N 0 -I \"weka.classifiers.functions.supportVector.RegSMOImproved -L 1.0e-3 -W 1 -P 1.0e-12 -T 0.001 -V\" -K \"weka.classifiers.functions.supportVector.NormalizedPolyKernel -C 250007 -E 8.0\"")
with or without escape characters, or even with single quotes, causes an error in the bash script:
bash ./weka.sh "$sub_working_dir" $train_percentage "$weka_options" $files_string > $predictions
where weka.sh contains:
java -Xmx1024m -classpath ".:$WEKAPATH" $weka_options -t "$train_set" -T "$test_set" -c 53 -p 53
Here is what I get:
---Registering Weka Editors---
Trying to add database driver (JDBC): jdbc.idbDriver - Error, not in CLASSPATH?
Weka exception: Can't open file No suitable converter found for '0.001'!.
Can anyone point out the problem?
Updated question: here is the code:
# Usage:
#
# ./aca2_explore.sh working-dir datasets/*
# e.g.
# ./aca2_explore.sh "aca2-explore-working-dir/" datasets/*
#
# Place this script in the same folder as aca2.sh and the folder containing the datasets.
#
#
# Please note that:
# - All the notes contained in aca2.sh apply
# - This script will erase the contents of working-dir
# to properly sort negative floating numbers, independently of local language options
export LC_ALL=C
# parameters parsing
output_directory=$1
first_file_index=2
files=${@:$first_file_index}
# global constants
datasets=$(($# - 1))
output_row=$(($datasets + 3))
output_columns_range="2-7"
learned_model_mae_column=4
results_learned_model_mae_column=4
# parameters
working_dir="$output_directory"
if [ -d "$working_dir" ];
then
    rm -r "$working_dir"
fi
mkdir "$working_dir"
sub_working_dir="$working_dir""aca2-explore-sub-working-dir/"
path_to_results_file="$sub_working_dir""results.csv"
train_percentage=25
logfile="$working_dir""aca2_explore_log.csv"
echo "" > "$logfile"
reduced_log_header="Options,average_test_set_speedup,null_model_mae,learned_model_mae,learned_model_rmse,mae_ratio,R^2"
reduced_logfile="$working_dir""aca2_explore_reduced_log.csv"
echo "$reduced_log_header" > "$reduced_logfile"
sorted_reduced_logfile="$working_dir""aca2_explore_sorted_reduced_log.csv"
weka_options_list=(
"weka.classifiers.functions.LinearRegression -S 0 -R 1.0E-8"
"weka.classifiers.functions.MultilayerPerceptron -L 0.3 -M 0.2 -N 100 -V 0 -S 0 -E 20 -H a"
"weka.classifiers.meta.AdditiveRegression -S 1.0 -I 10 -W weka.classifiers.trees.DecisionStump"
"weka.classifiers.meta.Bagging -P 100 -S 1 -num-slots 1 -I 10 -W weka.classifiers.trees.REPTree -- -M 2 -V 0.001 -N 3 -S 1 -L -1 -I 0.0"
"weka.classifiers.meta.CVParameterSelection -X 10 -S 1 -W weka.classifiers.rules.ZeroR"
"weka.classifiers.meta.MultiScheme -X 0 -S 1 -B \"weka.classifiers.rules.ZeroR \""
"weka.classifiers.meta.RandomCommittee -S 1 -num-slots 1 -I 10 -W weka.classifiers.trees.RandomTree -- -K 0 -M 1.0 -V 0.001 -S 1"
"weka.classifiers.meta.RandomizableFilteredClassifier -S 1 -F \"weka.filters.unsupervised.attribute.RandomProjection -N 10 -R 42 -D Sparse1\" -W weka.classifiers.lazy.IBk -- -K 1 -W 0 -A \"weka.core.neighboursearch.LinearNNSearch -A \"weka.core.EuclideanDistance -R first-last\"\""
"weka.classifiers.meta.RandomSubSpace -P 0.5 -S 1 -num-slots 1 -I 10 -W weka.classifiers.trees.REPTree -- -M 2 -V 0.001 -N 3 -S 1 -L -1 -I 0.0"
"weka.classifiers.meta.RegressionByDiscretization -B 10 -K weka.estimators.UnivariateEqualFrequencyHistogramEstimator -W weka.classifiers.trees.J48 -- -C 0.25 -M 2"
"weka.classifiers.meta.Stacking -X 10 -M \"weka.classifiers.rules.ZeroR \" -S 1 -num-slots 1 -B \"weka.classifiers.rules.ZeroR \""
"weka.classifiers.meta.Vote -S 1 -B \"weka.classifiers.rules.ZeroR \" -R AVG"
"weka.classifiers.rules.DecisionTable -X 1 -S \"weka.attributeSelection.BestFirst -D 1 -N 5\""
"weka.classifiers.rules.M5Rules -M 4.0"
"weka.classifiers.rules.ZeroR"
"weka.classifiers.trees.DecisionStump"
"weka.classifiers.trees.M5P -M 4.0"
"weka.classifiers.trees.RandomForest -I 100 -K 0 -S 1 -num-slots 1"
"weka.classifiers.trees.RandomTree -K 0 -M 1.0 -V 0.001 -S 1"
"weka.classifiers.trees.REPTree -M 2 -V 0.001 -N 3 -S 1 -L -1 -I 0.0")
files_string=""
for file in ${files[@]}
do
    files_string="$files_string""$file"" "
done
#echo $files_string
for weka_options in "${weka_options_list[@]}"
do
    echo "$weka_options"
    echo "$weka_options" >> "$logfile"
    bash ./aca2.sh "$sub_working_dir" $train_percentage "$weka_options" $files_string
    cat "$path_to_results_file" >> "$logfile"
    result_columns=$(tail -n +"$output_row" "$path_to_results_file" | head -1 | cut -d, -f"$output_columns_range")
    echo "$weka_options"",""$result_columns" >> "$reduced_logfile"
    echo "" >> "$logfile"
done
tail -n +2 "$reduced_logfile" > "$sorted_reduced_logfile"
sort --field-separator=',' --key="$results_learned_model_mae_column" "$sorted_reduced_logfile" -o "$sorted_reduced_logfile"".tmp"
echo "$reduced_log_header" > "$sorted_reduced_logfile"
cat "$sorted_reduced_logfile"".tmp" >> "$sorted_reduced_logfile"
rm "$sorted_reduced_logfile"".tmp"
where the file aca2.sh is:
#!/bin/bash
# Run this script as ./script.sh working-directory train-set-filter-percentage "weka_options" datasets/*
#
# e.g.
# Place this script in a folder together with a directory containing your datasets. Call then the script as
# ./aca2.sh "aca2-working-dir/" 25 "weka.classifiers.functions.LinearRegression -S 0 -R 1.0E-8" datasets_folder/*
#
# NOTE: the script will erase the content of working-directory
# for correct behaviour $WEKAHOME environment variable must be set to the folder containing weka.jar, otherwise modify the call to the weka classifier below
#
# To define the error measures used in this script, I made use of some of the notions found in this article:
# http://scott.fortmann-roe.com/docs/MeasuringError.html
# parameters parsing
output_directory=$1
train_set_percentage=$2
if [ $train_set_percentage -lt 1 ] || [ $train_set_percentage -gt 100 ];
then
    echo "Invalid train set percentage: "$train_set_percentage
    exit 1
fi
weka_options=$3
first_file_index=4
files=${@:$first_file_index}
# global constants
predictions_characters_range_value="23-28"
predictions_characters_range_error="34-39"
tmp_dir="$output_directory"
if [ -d "$tmp_dir" ];
then
    rm -r "$tmp_dir"
fi
mkdir "$tmp_dir"
results_header="testfile,average_test_set_speedup,null_model_mae,learned_model_mae,learned_model_rmse,mae_ratio,R^2"
results_file=$tmp_dir"results.csv"
echo "$results_header" > "$results_file"
arff_header="% ARFF conversion of CSV dataset
@RELATION program
@ATTRIBUTE ...
@DATA"
# global constants
datasets_per_program=5
entries_per_dataset=128
train_set_instances_to_select=$((datasets_per_program*entries_per_dataset*train_set_percentage/100))
all_prediction="$tmp_dir""all_predictions.txt"
count=0
prediction_efficiency_ideal_avg=0
arff_header_file="$tmp_dir""arff_header.txt"
echo "$arff_header" > "$arff_header_file"
count=0
for filename in ${files[@]}
do
    echo "Test set: $filename"
    echo "$filename" >> "$all_prediction"
    cur_dir="$tmp_dir$filename.dir/"
    mkdir -p $cur_dir
    testfile=$filename
    train_set="$cur_dir""train_set.arff"
    echo "$arff_header" > $train_set
    selected_train_subset="$cur_dir""selected_train_subset.csv"
    for trainfile in ${files[@]}
    do
        if [ "$trainfile" != "$testfile" ]; then
            # filter train set to feed only top 25% for model generation
            sort --field-separator=',' --key=53 "$trainfile" -o "$selected_train_subset"
            head -$train_set_instances_to_select "$selected_train_subset" >> $train_set
        fi
    done
    test_set="$cur_dir""test_set.arff"
    #echo "$arff_header" > $test_set
    cp "$testfile" "$test_set"
    # This file will contain the full configuration space dataset relative to the test program
    complete_test_set="$cur_dir""complete_test_set.csv"
    cp "$test_set" "$complete_test_set"
    sort --field-separator=',' --key=53 "$test_set" -o "$test_set"
    head -8 "$test_set" > "$test_set"".tmp"
    mv "$test_set"".tmp" "$test_set"
    cur_prediction="$cur_dir""cur_prediction.tmp"
    # generate basis for predicted test set file by copying the actual test set, removing speedups
    predicted_test_set="$cur_dir""predicted_test_set.csv"
    cp "$test_set" "$predicted_test_set"
    cut -d, -f53 --complement "$predicted_test_set" > "$predicted_test_set"".tmp"
    mv "$predicted_test_set"".tmp" "$predicted_test_set"
    cat "$arff_header_file" "$test_set" > "$test_set"".tmp"
    mv "$test_set"".tmp" "$test_set"
    java -Xmx1024m -classpath ".:$WEKAHOME/weka.jar:$WEKAJARS/*" $weka_options -t "$train_set" -T "$test_set" -c 53 -p 53 | tail -n +6 | head -8 > "$cur_prediction"
    predictions_file="$cur_dir""predictions.csv"
    cut -c"$predictions_characters_range_value" "$cur_prediction" | tr -d " " > "$predictions_file"
    paste -d',' "$actual_speedups" "$predictions_file" > "$predictions_file"".tmp"
    mv "$predictions_file"".tmp" "$predictions_file"
done
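You almost have this right. You were trying to do the right thing (or just got close by accident). You cannot use a string to hold arbitrarily quoted arguments (that approach is wrong): the quote characters inside the string are ordinary characters by the time the variable is expanded, so they no longer group words. You need to use an array instead. But you need an array with a separate element for each argument, not an array holding a single element: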
weka_options=(weka.classifiers.functions.SMOreg -C 1.0 -N 0)
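or
weka_options=(weka.classifiers.functions.SMOreg -C 1.0 -N 0 -I "weka.classifiers.functions.supportVector.RegSMOImproved -L 1.0e-3 -W 1 -P 1.0e-12 -T 0.001 -V" -K "weka.classifiers.functions.supportVector.NormalizedPolyKernel -C 250007 -E 8.0")
(I'm assuming that the string weka.classifiers.functions.supportVector.RegSMOImproved -L 1.0e-3 -W 1 -P 1.0e-12 -T 0.001 -V is the argument to the -I flag, and that the string weka.classifiers.functions.supportVector.NormalizedPolyKernel -C 250007 -E 8.0 is the argument to the -K flag. If that is not the case, those quotes may need to be removed as well.)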
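To see what the quotes change when the array is built, here is a minimal sketch. The option strings are shortened for illustration, and the variable names `quoted` and `unquoted` are made up for this example. With the quotes, the whole sub-option string is one array element; without them, every word becomes its own element.

```shell
#!/usr/bin/env bash
# Shortened, illustrative option lists (not the full Weka command line).

# The -I argument is quoted, so it is stored as ONE array element.
quoted=(weka.classifiers.functions.SMOreg -I "weka.classifiers.functions.supportVector.RegSMOImproved -L 1.0e-3 -T 0.001")

# Without quotes, each word of the -I argument is a separate element.
unquoted=(weka.classifiers.functions.SMOreg -I weka.classifiers.functions.supportVector.RegSMOImproved -L 1.0e-3 -T 0.001)

echo "quoted:   ${#quoted[@]} elements"
echo "unquoted: ${#unquoted[@]} elements"

# Print each element in brackets so embedded spaces are visible.
printf '[%s]\n' "${quoted[@]}"
```

Which of the two forms is right depends on whether Weka expects the sub-options as a single argument to -I, which is the assumption made above.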
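Then, when you use the array, you need to write "${weka_options[@]}" so that each element of the array is expanded as a separate, individually quoted word: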
java -Xmx1024m -classpath ".:$WEKAPATH" "${weka_options[@]}" -t "$train_set" -T "$test_set" -c 53 -p 53
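The difference between the three ways of expanding the options can be checked without running Weka at all. The sketch below uses a hypothetical helper, `count_args`, as a stand-in for the java command; it only reports how many arguments it received. The variable names `as_string` and `as_array` are also just for this example.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the java command: prints its argument count.
count_args() { echo "$#"; }

# String form, as in the question. The \" are plain characters in the value.
as_string="weka.classifiers.functions.SMOreg -C 1.0 -N 0 -I \"weka.classifiers.functions.supportVector.RegSMOImproved -L 1.0e-3 -W 1 -P 1.0e-12 -T 0.001 -V\" -K \"weka.classifiers.functions.supportVector.NormalizedPolyKernel -C 250007 -E 8.0\""

# Array form, one element per intended argument.
as_array=(weka.classifiers.functions.SMOreg -C 1.0 -N 0
  -I "weka.classifiers.functions.supportVector.RegSMOImproved -L 1.0e-3 -W 1 -P 1.0e-12 -T 0.001 -V"
  -K "weka.classifiers.functions.supportVector.NormalizedPolyKernel -C 250007 -E 8.0")

count_args $as_string          # unquoted string: split at every space
count_args "$as_string"        # quoted string: a single argument
count_args "${as_array[@]}"    # quoted array: one argument per element
```

This should print 22, 1 and 9. In the first case the embedded quotes do not group anything, so -T and 0.001 arrive as two ordinary top-level arguments. That would be consistent with Weka treating 0.001 as the test file and reporting "No suitable converter found for '0.001'", although I have not verified this against Weka itself.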
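Thanks @Etan for the answer. What if my weka_options holds more than one option that I want to pass?
Aren't those already more than one option? Or is the whole string one long option to the java module? If the whole string really needs to be passed to java as one long argument, then what you have is probably what you want, and you likely just need to quote "$weka_options" when you use it in the script.
Sorry for the confusion. I mean, what if I want to explore 7-10 different machine learning options, as in this case: weka_options_list=("weka.classifiers.functions.LinearRegression -S 0 -R 1.0E-8" "weka.classifiers.functions.MultilayerPerceptron -L 0.3 -M 0.2 -N 100 -V 0 -S 0 -E 20 -H a" "weka.classifiers.meta.AdditiveRegression -S 1.0 -I 10 -W weka.classifiers.trees.DecisionStump" "weka.classifiers.meta.Bagging -P 100 -S 1 -num-slots 1 -I 10 -W weka.classifiers.trees.REPTree -- -M 2 -V 0.001 -N 3 -S 1 -L -1 -I 0.0" weka.classifiers.meta.CVParameterSelection -X 10 -S 1 -W)
By the way, I tested the answer you mentioned, but it doesn't work: it only takes the first string before the first argument and passes it to java.
What would the java command look like if you wrote it (or the original command) out by hand?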
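For the case raised in the comments, where several classifier configurations are explored in a loop, one possible arrangement is to keep one array per configuration and pass the elements to the worker as separate arguments. The sketch below is only an illustration under stated assumptions: `run_one` is a hypothetical stand-in for aca2.sh, the `cfg_*` and `config_names` names and the dataset file names are made up, and the convention of passing the number of Weka arguments before the arguments themselves is my own, not something the original scripts do.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for aca2.sh. Expected arguments:
#   working-dir  train-percentage  N  <N Weka arguments>  <dataset files...>
run_one() {
  local working_dir=$1 train_percentage=$2 n_opts=$3
  shift 3
  local weka_args=("${@:1:n_opts}")   # the next N arguments are for Weka
  shift "$n_opts"
  local files=("$@")                  # everything left is a dataset file
  # The real script would run java with "${weka_args[@]}" here;
  # this sketch only reports how the arguments were divided.
  echo "weka args: ${#weka_args[@]}, files: ${#files[@]}"
}

# One array per configuration (illustrative names).
cfg_linreg=(weka.classifiers.functions.LinearRegression -S 0 -R 1.0E-8)
cfg_smoreg=(weka.classifiers.functions.SMOreg -C 1.0 -N 0
  -I "weka.classifiers.functions.supportVector.RegSMOImproved -L 1.0e-3 -W 1 -P 1.0e-12 -T 0.001 -V"
  -K "weka.classifiers.functions.supportVector.NormalizedPolyKernel -C 250007 -E 8.0")
config_names=(cfg_linreg cfg_smoreg)

files=("datasets/a.csv" "datasets/b.csv")   # placeholder file names

for name in "${config_names[@]}"; do
  ref="${name}[@]"        # indirect reference to every element of that array
  opts=("${!ref}")
  run_one "work/" 25 "${#opts[@]}" "${opts[@]}" "${files[@]}"
done
```

If the worker stays a separate script, the call would presumably become `bash ./aca2.sh ...` with the same argument layout, and aca2.sh would need the same `shift` logic at its top. The count prefix is there because both the Weka arguments and the file list vary in length, so the worker needs some way to tell where one ends and the other begins.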