Java: selecting multiple lines from a text file
Tags: java, text, file-io

I'm working on a program that will create a word cloud from what the candidates said in a presidential debate. The text file is laid out so that one person's speech can span multiple lines, and I want to collect all of those lines so I can count how often each speaker uses each word. There is also a list of stop words that are not counted in the word cloud; examples are "is", "a", "the", and so on. So far I have been able to read in all the stop words and the whole transcript, and remove the stop words from the transcript. Now I want to split the transcript by what each candidate said, and because one person's speech spans multiple lines I'm having trouble figuring out how. Any help would be greatly appreciated.

The code so far:
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Scanner;

public class ResendizYonzon {
    public static void main(String[] args) throws FileNotFoundException
    {
        readTextFile("democratic-debate2015Oct13.txt");
    }

    public static String readTextFile(String text) throws FileNotFoundException {
        File f = new File(text);
        Scanner out = new Scanner(f);
        String word = "";
        File f1 = new File("stopwords.txt");
        Scanner out1 = new Scanner(f1);
        ArrayList<String> stopWords = new ArrayList<String>();
        ArrayList<String> words = new ArrayList<String>();
        while (out1.hasNext()) {
            stopWords.add(out1.next());
        }
        while (out.hasNext()) {
            words.add(out.next());
        }
        words.removeAll(stopWords);
        out.close();
        out1.close();
        return word;
    }
}
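Based on your description of the problem and your comments on the question, you want to join everything a speaker said into a single speech, so that when the user asks for one speaker, say Clinton, you can return just Clinton's part of the transcript.
This is easy to do. The colon (:) is the key to solving it. If you look at the input file, whenever a new speaker starts, the line begins with the speaker's name followed by a colon.
You need to do the following:
- Open the file.
- Read the file line by line.
- For each line, check whether it contains a colon (:).
- If the line contains a colon, split it using the colon as the separator.
- Assuming there is no colon anywhere else in the line, splitting the following line
COOPER: Just for the record, are you a progressive, or are you a moderate?
gives you these tokens:
Token[0] = COOPER
Token[1] = Just for the record, are you a progressive, or are you a moderate?
- Now that you have two tokens, check whether you have only just started reading the transcript file or already have a speaker.
- If you have just started reading the file, this is the first speaker, so record the name and initialize the variables.
- If there is already a speaker, then before reading the speech of the new (or returning) speaker, add the previous speaker to the map if not already there and update his/her speech.
If you keep repeating these steps, every time you reach a speaker line you store the previous speaker's updated speech in the hash map, until you reach the end of the file.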
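The colon split from the steps above can be tried in isolation. This is a minimal sketch, not part of the original answer: the class name `ColonSplitDemo` and the helper `splitSpeakerLine` are made up for illustration. It uses `String.split` with a limit of 2, so a second colon later in the sentence stays inside the speech instead of producing a third token.

```java
import java.util.Arrays;

// Hypothetical demo class; the names here are illustrative, not from the original answer.
class ColonSplitDemo {
    // Splits a transcript line at the first colon into speaker and text.
    // Returns a one-element array when the line contains no colon.
    static String[] splitSpeakerLine(String line) {
        String[] parts = line.split(":", 2); // limit 2: at most two tokens
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim();
        }
        return parts;
    }

    public static void main(String[] args) {
        String line = "COOPER: Just for the record, are you a progressive, or are you a moderate?";
        System.out.println(Arrays.toString(splitSpeakerLine(line)));
        // The second colon is kept in the text part because of the limit.
        System.out.println(Arrays.toString(splitSpeakerLine("COOPER: One question: why?")));
    }
}
```

With the limit in place, a line that ends right after the colon still yields two tokens (the second one empty), so indexing the second token does not fail.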
Here is sample code for the steps listed above, fully commented to help you follow it.
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.Map;

public class ResendizYonzon {
    // Map from speaker name to everything that speaker has said so far.
    public static Map<String, String> speakerSpeech = new HashMap<String, String>();

    public static void main(String[] args) {
        readTextFile("C:\\test_java\\transcript.txt");
    }

    public static void readTextFile(String text) {
        String line;
        // Open a buffered reader on the path passed as text; try-with-resources closes it.
        try (BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(text)))) {
            // StringBuilder collects the lines of the current turn (String is immutable).
            StringBuilder speech = new StringBuilder();
            // currentSpeaker remembers whose turn is being read, so that when a new speaker
            // appears we know who to credit with the text collected so far.
            String currentSpeaker = null;
            // readLine() returns null at end of file, which ends the loop.
            while ((line = br.readLine()) != null) {
                // A line containing a colon starts a new turn, based on the structure of the input file.
                if (line.contains(":")) {
                    // Split at the first colon only: chunks[0] is the speaker, chunks[1] the sentence.
                    // The limit of 2 keeps later colons in the sentence and guarantees chunks[1] exists.
                    String[] chunks = line.split(":", 2);
                    String speakerName = chunks[0].trim();
                    if (currentSpeaker != null) {
                        // The previous speaker's turn just ended, so store it before starting the new one.
                        storeSpeech(currentSpeaker, speech.toString());
                    }
                    currentSpeaker = speakerName;
                    // Start the new turn with the text after the colon.
                    speech = new StringBuilder(chunks[1]);
                } else {
                    // No colon: continuation of the current speaker's turn. Lines are joined without
                    // a separator; append a space first if words at line boundaries must stay apart.
                    speech.append(line);
                }
            }
            // The loop ends before the last turn is stored, so store it here.
            if (currentSpeaker != null) {
                storeSpeech(currentSpeaker, speech.toString());
            }
            System.out.println("No. of speakers: " + speakerSpeech.size());
        } catch (IOException ex) {
            System.err.println("Could not read " + text + ": " + ex.getMessage());
        }
        // All speakers with their speech, as one giant string.
        System.out.println(speakerSpeech.toString());
    }

    // Adds one turn to the map, joining it to the speaker's earlier turns with " >>> ".
    private static void storeSpeech(String speaker, String turn) {
        if (speakerSpeech.containsKey(speaker)) {
            speakerSpeech.put(speaker, speakerSpeech.get(speaker) + " >>> " + turn);
        } else {
            speakerSpeech.put(speaker, turn);
        }
    }
}
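Once each speaker's text sits in one string, the counting step the question asks about could look roughly like this. It is a sketch under assumptions: the class `SpeakerWordCount`, the method `countWords`, and the tokenizing rule (lower-case, split on anything that is not a letter or apostrophe) are illustrative choices, and the tiny map and stop-word set in `main` stand in for the real transcript and `stopwords.txt`.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; names and tokenizing rule are assumptions for illustration.
class SpeakerWordCount {
    // Counts how often each non-stop word occurs in one speaker's speech.
    static Map<String, Integer> countWords(String speech, Set<String> stopWords) {
        Map<String, Integer> counts = new HashMap<>();
        // Lower-case first, then split on runs of characters that are not a-z or apostrophe.
        for (String token : speech.toLowerCase(Locale.ROOT).split("[^a-z']+")) {
            if (token.isEmpty() || stopWords.contains(token)) {
                continue; // skip empty tokens and stop words
            }
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Stand-in for the map built from the transcript.
        Map<String, String> speakerSpeech = new HashMap<>();
        speakerSpeech.put("CLINTON", "I'm a progressive. But I'm a progressive who likes to get things done.");
        // Stand-in for the contents of the stop-word file.
        Set<String> stopWords = new HashSet<>(Arrays.asList("a", "but", "to", "who"));

        String speech = speakerSpeech.getOrDefault("CLINTON", "");
        System.out.println(countWords(speech, stopWords));
    }
}
```

Looking up a `Set` is cheaper than calling `removeAll` on a list of every word, and filtering while counting means the stop words never need to be removed from the transcript in a separate pass.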
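Comments:
What problem are you facing? Is there an error?
No error, I'm just trying to work out how to collect everything a given candidate said. For example, if the user wants Clinton, I only want everything Clinton said.
Hmm, sorry, but I can't believe you don't know how to solve this. How do you know which text is Clinton's? Hint: when the speaker changes, the speaker's name is at the start of the line. First identify the speaker names. Then every piece of text that follows a name, up to the next name or the end of the file, belongs to that speaker.
I know that. If everything were on one line it would be easy. But one person's speech spans multiple lines, and I'm having trouble figuring out how to collect all of it.
Maybe it would be better to read the text line by line. Use a reader with a line-reading method (for example BufferedReader.readLine(), as in the answer) instead of reading token by token with Scanner. Then split the line at whitespace and look at the first word. If it is all upper case and ends with a colon, it is a speaker name. The following words belong to that speaker. Read the next line. If its first word is not a speaker name, add the words to the last speaker, and so on.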
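The first-word check suggested in the last comment could be sketched as follows. Everything named here (`SpeakerGrouping`, `isSpeakerTag`, `groupBySpeaker`) is hypothetical, and the pattern for a speaker tag (upper-case letters, optionally with apostrophes or hyphens, followed by a colon) is an assumption about the transcript format. Unlike a plain "contains a colon" test, it does not mistake a colon in the middle of a sentence for a new speaker.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the comment's heuristic; names and pattern are assumptions.
class SpeakerGrouping {
    // True when the token looks like "CLINTON:" or "O'MALLEY:".
    static boolean isSpeakerTag(String token) {
        return token.matches("[A-Z][A-Z'-]*:");
    }

    // Groups transcript lines by speaker, joining a speaker's lines with single spaces.
    static Map<String, String> groupBySpeaker(List<String> lines) {
        Map<String, String> speeches = new LinkedHashMap<>();
        String current = null;
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String firstWord = trimmed.split("\\s+")[0];
            String text = trimmed;
            if (isSpeakerTag(firstWord)) {
                // Drop the trailing colon to get the name, and the tag to get the text.
                current = firstWord.substring(0, firstWord.length() - 1);
                text = trimmed.substring(firstWord.length()).trim();
            }
            if (current == null || text.isEmpty()) {
                continue; // text before the first speaker, or a tag with nothing after it
            }
            speeches.merge(current, text, (old, added) -> old + " " + added);
        }
        return speeches;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "COOPER: Secretary Clinton?",
                "CLINTON: I'm a progressive.",
                "But I like to get things done.",
                "COOPER: Senator Sanders.");
        System.out.println(groupBySpeaker(lines));
    }
}
```

For a real file, the list of lines could come from `Files.readAllLines(Paths.get(path))`; the path and the file's encoding are left to the caller.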
Output printed by the answer's code for the sample transcript:
{COOPER= Just for the record, are you a progressive, or are you a moderate? >>> Secretary... >>> ...thank you... >>> ...Senator... >>> Senator Sanders. A Gallup poll says half the country would not put a socialist in the White House. You call yourself a democratic socialist. How can any kind of socialist win a general election in the United States?, SANDERS= Well, we're gonna win because first, we're gonna explain what democratic socialism is.And what democratic socialism is about is saying that it is immoral and wrong that the top one-tenth of 1 percent in this country own almost 90 percent - almost - own almost as much wealth as the bottom 90 percent. That it is wrong, today, in a rigged economy, that 57 percent of all new income is going to the top 1 percent.That when you look around the world, you see every other major country providing health care to all people as a right, except the United States. You see every other major country saying to moms that, when you have a baby, we're not gonna separate you from your newborn baby, because we are going to have - we are gonna have medical and family paid leave, like every other country on Earth.Those are some of the principles that I believe in, and I think we should look to countries like Denmark, like Sweden and Norway, and learn from what they have accomplished for their working people.(APPLAUSE), CLINTON= No. I think that, like most people that I know, I have a range of views, but they are rooted in my values and my experience. 
And I don't take a back seat to anyone when it comes to progressive experience and progressive commitment.You know, when I left law school, my first job was with the Children's Defense Fund, and for all the years since, I have been focused on how we're going to un-stack the deck, and how we're gonna make it possible for more people to have the experience I had.You know, to be able to come from a grandfather who was a factory worker, a father who was a small business person, and now asking the people of America to elect me president. >>> I'm a progressive. But I'm a progressive who likes to get things done. And I know...(APPLAUSE)...how to find common ground, and I know how to stand my ground, and I have proved that in every position that I've had, even dealing with Republicans who never had a good word to say about me, honestly. But we found ways to work together on everything from... >>> ...reforming foster care and adoption to the Children's Health Insurance Program, which insures... >>> ...8 million kids. So I have a long history of getting things done, rooted in the same values... >>> ...I've always had.}