PHP: split the contents of a file by new line?
I have a file with the following content:
( (CODE <begin_A_defense_of_Michael_Moore>))
( (NP (NP (NP (DT A) (NN defense))
(PP (IN of)
(NP (NP (NNP Michael) (NNP Moore))
(CC and)
(" ")
(S-NOM-TTL (NP-SBJ (-NONE- *PRO*))
(VP (VBG Bowling)
(PP-PRP (IN for)
(NP (NNP Columbine))))))))
(" ")
(CODE -LRB-)
(PRN (NP (NN Op-Ed)))
(CODE -RRB-)
(PP (IN By)
(NP (NNP Eloquence)))))
( (FRAG (NP (NNP Wed))
(NP (NML (NNP Aug))
(JJ 13th)
(, ,)
(NN 2003))
(PP-TMP (IN at)
(NP (CD 09:00:09)
(FW AM) (FW EST)))))
( (S (NP-SBJ (DT This))
(VP (VBZ is)
(NP-PRD (NP (DT an) (JJ open) (NN letter))
(PP (IN to)
(NP (NP (NNP David) (NNP Hardy))
(, ,)
(NP (NP (NN author))
(PP (IN of)
(NP (NP-TTL (S-NOM-TTL (NP-SBJ (-NONE- *PRO*))
(VP (VB Bowling)
(PP-PRP (IN for)
(NP (NNP Columbine)))))
(: :)
(NP (NN Documentary) (CC or) (NN Fiction)))
(, ?)
(, ,)
(RRC (ADVP (RB probably))
(NP-PRD (NP (DT the)
(ADJP (RBS most) (JJ comprehensive)))
(PP (IN among)
(NP (NP (JJ many) (NNS rebuttals))
(PP (IN of)
(NP (DT the)
(ADJP (NNP Oscar) (HYPH -) (VBG winning))
(NN documentary))))))))))))))
(. .)))
( (S (NP-SBJ (NNS Critics))
(VP (VBP have)
(ADVP-TMP (RB now))
(VP (VBN gone)
(ADVP (ADVP (RB so) (RB far))
(SBAR (IN as)
(S (NP-SBJ (-NONE- *PRO*))
(VP (TO to)
(VP (VB call)
(PP-CLR (IN for)
(NP (NP (DT the) (NN revocation))
(PP (IN of)
(NP (DT the) (NN award))))))))))))
(. .)))
( (S (NP-SBJ (PRP$ Their) (NNS chances))
(VP (VBP are)
(ADJP-PRD (JJ small))
(, ,)
(ADVP (RB however))
(, ,)
(SBAR-PRP (IN as)
(S (NP-SBJ (PRP$ their) (NNS arguments))
(VP (VP (VBP rely)
(PP-CLR=1 (IN on)
(NP (NN polemic) (, ,) (NN exaggeration) (CC and) (NN misrepresentation))))
(: --)
(VP (PP (IN in)
(NP (JJ other) (NNS words)))
(, ,)
(PP-CLR=1 (IN on)
(NP (NP (DT the) (JJ same) (NNS techniques))
(SBAR (WHNP-2 (WP which))
(S (NP-SBJ (PRP they))
(VP (VBP accuse)
(NP (NNP Moore))
(PP-CLR (IN of)
(S-NOM (NP-SBJ (-NONE- *PRO*))
(VP (VBG using)
(NP (-NONE- *T*-2)))))))))))))))
(. .)))
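I need to get each individual parse separately. I think the best way is to split this file on blank lines (or is there another way?). Does anyone know how to do this? I'm using PHP.
This file is from the MASC corpus.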
Thanks. I actually ended up doing it this way:
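$lines = file("textfile.txt");   // read the file into an array of lines
$temp_str = '';
$parses = array();
foreach ($lines as $line) {
    $temp = trim($line);
    if (strlen($temp) > 0) {
        // non-blank line: append it to the parse being built
        $temp_str .= $temp;
    } elseif ($temp_str !== '') {
        // blank line: the current parse is complete
        $parses[] = $temp_str;
        $temp_str = '';
    }
}
// keep the last parse when the file does not end with a blank line
if ($temp_str !== '') {
    $parses[] = $temp_str;
}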
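Since the question also asks whether there is another way, here is a minimal sketch of a regex-based alternative. It assumes the parses are separated by one or more blank lines (possibly containing only spaces or tabs). The helper name `split_parses` and the sample string are made up for illustration and are not part of the original post.

```php
<?php
// Hypothetical helper: split a text into blocks separated by one or
// more blank lines, using preg_split.
function split_parses(string $text): array
{
    // \R matches any newline sequence (\n, \r\n or \r). A separator is a
    // newline followed by at least one line holding only spaces or tabs.
    $blocks = preg_split('/\R(?:[ \t]*\R)+/', trim($text));

    // Drop empty blocks (for example when the input itself is empty).
    $blocks = array_filter($blocks, function ($block) {
        return trim($block) !== '';
    });

    return array_values($blocks);
}

// Small made-up sample in the same bracketed format as the file.
$sample = "( (CODE <begin>))\n\n( (NP (DT A)\n    (NN defense)))\n\n\n( (S (. .)))\n";
$parses = split_parses($sample);
echo count($parses), " parses\n";
```

For the real file, something like `split_parses(file_get_contents("textfile.txt"))` should work the same way. Unlike the loop above, this keeps the line breaks inside each parse rather than joining the lines into one string.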
If this is a PHP question, why is it tagged java?