How to group all lines where the first line starts with a pattern in Java

Tags: java, sqlite

I have a document whose pages are stored in an SQLite database. Each page looks like this:

<ar>some words in arabic</ar> :Some more words in arabic and urdu mixed <ar>again arabic</ar>: some more mixed <ar>again arabic</ar>again urdu arabic mixed
<ar>some words in arabic</ar> :Some more words in arabic and urdu mixed <ar>again arabic</ar>: some more mixed <ar>again arabic</ar>again urdu arabic mixed
Few lines in arabic urdu
Again sample line <ar>some arabic</ar> again mix
Again mixed
<ar>some words in arabic</ar> :Some more words in arabic and urdu mixed <ar>again arabic</ar>: some more mixed <ar>again arabic</ar>again urdu arabic mixed
<ar>some words in arabic</ar> :Some more words in arabic and urdu mixed <ar>again arabic</ar>: some more mixed <ar>again arabic</ar>again urdu arabic mixed
<ar>some words in arabic</ar> :Some more words in arabic and urdu mixed <ar>again arabic</ar>: some more mixed <ar>again arabic</ar>again urdu arabic mixed
Few lines in arabic urdu
Again sample line <ar>some arabic</ar> again mix
Again mixed
Table structure for the new lines:

CREATE TABLE Words (
 Id Integer primary key autoincrement,
 PageNo integer,
 WordLines
)
I have to insert the selected lines into the WordLines column.

Edit: the function

String pageText = getPageText(pageNum);
// Each element should be one group: a line starting with <ar> plus the lines that follow it
String[] wordLines = getWordLines(pageText);
for (int i = 0; i < wordLines.length; i++) {
    insertIntoDB(wordLines[i], pageNum);
}

I can't work out how to implement the getWordLines function.

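Using Java, split each document into an array of strings, where each string is the text you want to store as one record. For example, once you have the document's text in a String named documentText, call documentText.split with the regular expression [\r\n]+(?=<ar>).

This splits the document at every point where one or more line-break characters are followed by <ar>. Because <ar> is matched with a lookahead, the tag is not consumed and stays at the start of the next string.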
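As a minimal sketch, getWordLines from the question could be built on that split. The class name WordLineSplitter and the sample page in main are made up for illustration, and the empty-input guard is an assumption about what the caller wants rather than something stated in the question.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; only getWordLines is named in the question.
class WordLineSplitter {

    // One or more line-break characters, but only where "<ar>" follows.
    // The lookahead (?=<ar>) is zero-width, so the tag itself is kept
    // at the start of the next group.
    private static final Pattern GROUP_BOUNDARY =
            Pattern.compile("[\\r\\n]+(?=<ar>)");

    static String[] getWordLines(String pageText) {
        // Assumption: an empty or missing page yields no groups at all.
        if (pageText == null || pageText.isEmpty()) {
            return new String[0];
        }
        return GROUP_BOUNDARY.split(pageText);
    }

    public static void main(String[] args) {
        // Made-up sample page shaped like the one in the question.
        String page = String.join("\n",
                "<ar>one</ar> :first",
                "<ar>two</ar> :second",
                "continuation a",
                "line with <ar>inline</ar> tag",
                "<ar>three</ar> :third");

        String[] groups = getWordLines(page);
        for (int i = 0; i < groups.length; i++) {
            System.out.println("--- group " + (i + 1) + " ---");
            System.out.println(groups[i]);
        }
    }
}
```

A line that merely contains an <ar> tag somewhere in the middle does not start a new group, because the lookahead only matches when the tag comes directly after the line break. Any text that precedes the first <ar> line ends up as a group of its own, and a trailing line break stays attached to the last group, so trim the strings before inserting them if that matters.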


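You can do this for every document in the table and, as you go, insert the strings from stringArray into a temporary table. Alternatively, you can keep them all in memory until you have inserted them all into the table.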

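Have you run into a specific difficulty? If this question is related to sqlite, describing the table structure would help. If not, please remove the tag to avoid confusion.

Thanks @Ahmad. I have added the structure.

OK. What trouble are you facing? It seems you could do this just by writing an algorithm that loops over all the lines and matches the condition you specified. What is stopping you?

The problem is that I am having trouble figuring out how to match and group the lines.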
String[] stringArray = documentText.split("[\r\n]+(?=<ar>)");