Regex: storing a text file in a structured array


I have a well-structured input text file:

START_PARAMETERS
C:\Users\admin\Desktop\Bladed_wind_generator\_wind
C:\Users\admin\Desktop\Bladed_wind_generator\reference_v_4_2.$PJ
END_PARAMETERS
---------------------------------------------------------------------------
START_DLC1-2
4 6 8 10 12 14 16 18 20 22 24 26 28 29
6
8192
600
END_DLC1-2
---------------------------------------------------------------------------
START_DLC6-1
44.8
30
8192
600
END_DLC6-1
---------------------------------------------------------------------------
START_DLC6-4
3 31 33 35
6
8192
600
END_DLC6-4
---------------------------------------------------------------------------
START_DLC7-2
2 4 6 8 10 12 14 16 18 20 22 24 
6
8192
600
END_DLC7-2
---------------------------------------------------------------------------

This is how I currently read it:

clc,clear all,close all

f = fopen('inp.txt','rt');  % Read Input File
C = textscan(f, '%s', 'Delimiter', '\r\n');
C = C{1}; % Store Input File in a Cell
fclose(f);

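Then I use regexp to locate each occurrence of a START_DLC/END_DLC block, computing startIndx and endIndx.

The goal is to store the text between each START/END pair in a structured cell (to be called store_DLCs), with one entry per block, from DLC1-2 through DLC7-2.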

Would you mind giving me some hints on how to proceed?

Thank you all in advance.

Best regards,

Francesco

Your code is fine so far. One thing, though: I would slightly change your calculation of startIndx and endIndx to:

startIndx = find(~cellfun(@isempty, regexp(C, 'START_DLC', 'match')));
endIndx = find(~cellfun(@isempty, regexp(C, 'END_DLC', 'match')));

This way you get the actual indices (transposed here for readability):

startIndx =

     6    13    20    27


endIndx =

    11    18    25    32

I would also add an assertion to check the integrity of the input:

assert(all(size(startIndx) == size(endIndx)))
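The size check above only confirms that there are as many START markers as END markers. A slightly stricter variant is sketched below; the shortened cell C is a hypothetical stand-in for the file contents, and the extra checks are an optional addition rather than part of the original answer.

```matlab
% Hypothetical stand-in for the cell C read from inp.txt (shortened sample).
C = {'START_DLC1-2'; '4 6 8'; '6'; 'END_DLC1-2'; '-----'; ...
     'START_DLC6-1'; '44.8'; '30'; 'END_DLC6-1'};

startIndx = find(~cellfun(@isempty, regexp(C, 'START_DLC', 'match')));
endIndx   = find(~cellfun(@isempty, regexp(C, 'END_DLC', 'match')));

% Same number of START and END markers.
assert(numel(startIndx) == numel(endIndx), 'Unbalanced START/END markers');

% Every START marker comes before its END marker.
assert(all(startIndx < endIndx), 'An END marker precedes its START marker');

% The block names after START_ and END_ agree pairwise.
startNames = strrep(C(startIndx), 'START_', '');
endNames   = strrep(C(endIndx), 'END_', '');
assert(isequal(startNames, endNames), 'START/END block names do not match');
```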

Now, with all the indices computed above, you can extract the data into cells:

extract_dlc = @(n)({C{startIndx(n):endIndx(n) - 1}});
store_DLCs = arrayfun(extract_dlc, 1:numel(startIndx), 'UniformOutput', false)
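If the anonymous function is hard to read, the same extraction can be written as an explicit loop. This is only an equivalent sketch: the shortened C and the hard-coded indices are hypothetical stand-ins for the values computed from the real file.

```matlab
% Hypothetical stand-ins for C and the indices computed above (shortened sample).
C = {'START_DLC1-2'; '4 6 8'; '6'; 'END_DLC1-2'; '-----'; ...
     'START_DLC6-1'; '44.8'; '30'; 'END_DLC6-1'};
startIndx = [1; 6];
endIndx   = [4; 9];

store_DLCs = cell(1, numel(startIndx));
for n = 1:numel(startIndx)
    % Take the lines from the START marker up to, but excluding, the END
    % marker, and reshape them into a row cell to match the {C{...}} form.
    store_DLCs{n} = reshape(C(startIndx(n):endIndx(n) - 1), 1, []);
end
```

At this point the first entry of each block is still the full marker line, for example 'START_DLC1-2'.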

To "fix" the name of each cell (i.e. its first entry), you can do the following:

fix_dlc_name = @(x){strrep(x{1}, 'START_', ''), x{2:end}};
store_DLCs = cellfun(fix_dlc_name, store_DLCs,  'UniformOutput', false);

Applied to the example input, this code produces a 1×4 cell array of cells:

store_DLCs =

    {'DLC1-2', '4 6 8 10 12 14 16 18 20 22 24 26 28 29', '6', '8192', '600'}  
    {'DLC6-1', '44.8', '30', '8192', '600'}   
    {'DLC6-4', '3 31 33 35', '6', '8192', '600'}   
    {'DLC7-2', '2 4 6 8 10 12 14 16 18 20 22 24', '6', '8192', '600'}
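Since the question asks for a "structured" container, one possible follow-up is to convert the nested cells into a struct array. The sketch below assumes a shortened store_DLCs, and the field names name and lines as well as the variable dlc are hypothetical choices, not part of the original answer.

```matlab
% Hypothetical stand-in for store_DLCs (two shortened blocks).
store_DLCs = {{'DLC1-2', '4 6 8', '6', '8192', '600'}, ...
              {'DLC6-1', '44.8', '30', '8192', '600'}};

% Split each block into its name (first entry) and its data lines (the rest).
names     = cellfun(@(x) x{1},     store_DLCs, 'UniformOutput', false);
dataLines = cellfun(@(x) x(2:end), store_DLCs, 'UniformOutput', false);

% struct() with cell-array values creates a struct array, one element per cell.
dlc = struct('name', names, 'lines', dataLines);

disp(dlc(2).name)       % block name of the second entry
disp(dlc(2).lines{1})   % first data line of that block, still a char array
```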

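Thank you for your valuable help; I need some time to get fully into regexp. Your suggestion solves my problem. I have one more, hopefully easy, question: I would like the numeric arrays (e.g. 44.8, 30, 8192, 600 for DLC6-1) to be interpreted as numbers, not characters. The DLCx-y names, on the other hand, can stay as strings. I would use str2num, but maybe there is a simpler way to achieve the same result. Thanks in advance and best regards.

You cannot know in advance which kind of line you are reading (string or number). I think you have to read every line as a string and then convert it to numeric values as needed.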
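A minimal sketch of that conversion, assuming the first entry of each block is the name and every remaining line holds only whitespace-separated numbers. The shortened store_DLCs and the names to_num and num_DLCs are hypothetical. sscanf is used here instead of str2num because str2num evaluates its input with eval, and str2double returns NaN for a line with several numbers such as '4 6 8'.

```matlab
% Hypothetical stand-in for store_DLCs (two shortened blocks).
store_DLCs = {{'DLC1-2', '4 6 8', '6', '8192', '600'}, ...
              {'DLC6-1', '44.8', '30', '8192', '600'}};

% Keep the name (first entry) as text and parse the remaining lines.
% sscanf with '%f' reads all whitespace-separated numbers on a line and
% returns a column vector, so it is transposed into a row here.
to_num = @(x) [x(1), cellfun(@(s) sscanf(s, '%f').', x(2:end), ...
                             'UniformOutput', false)];
num_DLCs = cellfun(to_num, store_DLCs, 'UniformOutput', false);
```

Each block then keeps a char name in its first cell, while the other cells hold numeric row vectors (scalars for single-valued lines).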
