Java: selecting multiple lines from a text file

java text file-io

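I'm working on a program that builds a word cloud from what the candidates said in a presidential debate. The text file is laid out so that one person's turn can span multiple lines, and I want to collect all of those lines so that I can count how often each word is said. There is also a list of stop words that should not count toward the word cloud. Some examples of stop words are "is", "a", "the", and so on. So far I have been able to read in all the stop words and the whole debate transcript, and to remove the stop words from the transcript. Now I want to split the transcript by what each candidate said, and because one person's turn spans multiple lines I'm having trouble figuring out how. Any help would be greatly appreciated.

My code so far: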
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Scanner;

public class ResendizYonzon {
   public static void main(String[] args) throws FileNotFoundException
   {
       readTextFile("democratic-debate2015Oct13.txt");
   }

   public static String readTextFile(String text) throws FileNotFoundException {
       File f = new File(text);
       Scanner out = new Scanner(f);
       String word = "";
       File f1 = new File("stopwords.txt");
       Scanner out1 = new Scanner(f1);
       ArrayList<String> stopWords = new ArrayList<String>();
       ArrayList<String> words = new ArrayList<String>();
       while (out1.hasNext()) {
           stopWords.add(out1.next());
       }
       while (out.hasNext()) {
           words.add(out.next());
       }
       words.removeAll(stopWords);
       out.close();
       out1.close();
       return word;
   }
}

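Based on your description of the problem and your comments on the question, you want to concatenate everything a given speaker said into a single speech, so that when the user asks for one speaker's speech, say Clinton's, you can hand back just Clinton's part.
This is fairly easy to do. The colon (:) is the key to solving it. If you look at the input file, whenever a new speaker starts, the line begins with the speaker's name followed by a colon.
You need to do the following:

Open the file.
Read the file line by line.
For each line, check whether it contains a colon (:).
If the line contains a colon, split the line using the colon as the delimiter.
Assuming there are no colons anywhere else, splitting the following line
COOPER: Just for the record, are you a progressive, or are you a moderate?

gives you the following tokens (again assuming no other colons):
Token[0] = COOPER
Token[1] = Just for the record, are you a progressive, or are you a moderate?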
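The split step on its own can be sketched as follows. This is only an illustration: the class name `SpeakerLineDemo` and the helper `splitSpeakerLine` are made-up names, and passing a limit of 2 to `String.split` is an assumption added here so that any later colon in the line stays inside the speech text instead of producing extra tokens.

```java
import java.util.Arrays;

class SpeakerLineDemo {
    // Hypothetical helper: splits "NAME: text" at the first colon only.
    // Returns {name, text}, or null when the line contains no colon.
    static String[] splitSpeakerLine(String line) {
        if (!line.contains(":")) {
            return null;
        }
        // A limit of 2 always yields exactly two tokens when a colon is present.
        String[] tokens = line.split(":", 2);
        return new String[] { tokens[0].trim(), tokens[1].trim() };
    }

    public static void main(String[] args) {
        String line = "COOPER: Just for the record, are you a progressive, or are you a moderate?";
        System.out.println(Arrays.toString(splitSpeakerLine(line)));
    }
}
```

A line without a colon returns null here, which corresponds to the "continuation of the current speaker" case in the steps.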

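Now that you have the two tokens, check whether you have only just started reading the transcript file or already have a current speaker.
If you have just started reading the file, this is your first speaker, so record the name and initialize the variables.
If there is already a current speaker, then before reading the speech of the new (or returning) speaker, add the previous speaker to the map (if not already present) and update his/her speech.

If you keep following these steps, then each time you encounter a speaker line, the previous speaker's speech is updated and stored in the hash map, until you reach the end of the file.
Below is sample code for the above, fully commented to help you understand it.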
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.Map;

public class ResendizYonzon {
   // public static Map from speaker name to everything that speaker said
   public static Map<String, String> speakerSpeech = new HashMap<String, String>();

   public static void main(String[] args)
   {
       readTextFile("C:\\test_java\\transcript.txt");
   }

   public static void readTextFile(String text) {
       String line;

       // open a buffered reader on the path passed as text;
       // try-with-resources closes it even if reading fails
       try (BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(text)))) {

           // StringBuilder used to append speech and lines (String is immutable)
           StringBuilder speech = new StringBuilder();

           // currentSpeaker is used for history: when a new speaker is found, we need to know
           // who the previous one was so we can save all the speech read so far
           String currentSpeaker = null;

           // loop over the file line by line; readLine() returns null at end of file
           while ((line = br.readLine()) != null) {

               // a line containing : is treated as a speaker line, based on the structure of your
               // input file (a colon anywhere else in the transcript would be mistaken for one)
               if (line.contains(":")) {
                   // split at the first colon only, which gives exactly 2 values (speaker and sentence)
                   String[] chunks = line.split(":", 2);

                   // the speaker name (e.g. CLINTON) is to the left of the colon;
                   // trim leading and trailing whitespace, if any
                   String speakerName = chunks[0].trim();

                   if (currentSpeaker == null) {
                       // just started reading the transcript: this is the first speaker ever
                       currentSpeaker = speakerName;

                       // add the remainder of the line after the colon to the speech
                       speech.append(chunks[1]);
                   } else {
                       // a new turn starts, so store what the previous speaker said
                       saveSpeech(currentSpeaker, speech.toString());

                       // then remember the new speaker and start a fresh speech with the
                       // text after the colon
                       currentSpeaker = speakerName;
                       speech = new StringBuilder(chunks[1]);
                   }
               } else {
                   // no colon, so this line continues the current speaker's speech.
                   // NOTE: lines are joined with no separator, as in the sample output;
                   // append a space first if you need word boundaries preserved
                   speech.append(line);
               }
           }

           // the loop ends without storing the last speaker's speech, so add it manually
           if (currentSpeaker != null) {
               saveSpeech(currentSpeaker, speech.toString());
           }

           System.out.println("No. of speakers: " + speakerSpeech.size());
       } catch (IOException ex) {
           // report the error instead of silently swallowing it
           System.err.println("Could not read " + text + ": " + ex.getMessage());
       }

       // all speakers, each with their speech as one giant string
       System.out.println(speakerSpeech.toString());
   }

   // stores a turn under its speaker; a speaker already in the map gets the new turn
   // concatenated to the previous ones, separated by " >>> "
   private static void saveSpeech(String speaker, String speech) {
       if (speakerSpeech.containsKey(speaker)) {
           String previousSpeech = speakerSpeech.get(speaker);
           speakerSpeech.put(speaker, previousSpeech + " >>> " + speech);
       } else {
           speakerSpeech.put(speaker, speech);
       }
   }
}
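Once each speaker's text sits in the map, counting words for the word cloud is one more pass over a single value. The following is a sketch under assumptions rather than part of the original answer: `WordFrequency` and `countWords` are hypothetical names, words are lowercased and split on anything that is not a letter or an apostrophe, and the " >>> " separators between turns drop out because they contain no letters.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class WordFrequency {
    // Hypothetical helper: counts how often each non-stop word occurs in a speech.
    static Map<String, Integer> countWords(String speech, Set<String> stopWords) {
        // TreeMap keeps the words in alphabetical order for readable output.
        Map<String, Integer> counts = new TreeMap<>();
        // Split on runs of characters that are neither letters nor apostrophes.
        for (String word : speech.toLowerCase(Locale.ROOT).split("[^a-z']+")) {
            if (word.isEmpty() || stopWords.contains(word)) {
                continue;
            }
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed stop words for the demo; the real list would come from stopwords.txt.
        Set<String> stopWords = new HashSet<>(Arrays.asList("is", "a", "the", "i'm", "who", "to"));
        String speech = "I'm a progressive. >>> But I'm a progressive who likes to get things done.";
        System.out.println(countWords(speech, stopWords));
    }
}
```

To get one candidate's counts, you would pass something like `speakerSpeech.get("CLINTON")` as the first argument. The stop words must be lowercase for the lookup to match, since the speech is lowercased before splitting.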

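What problem are you facing? Is there an error?
No error, I'm just trying to figure out how to collect everything a given candidate said. For example, if the user wants Clinton, I need only what Clinton said.
Hmm, sorry, but I can't believe you don't know how to solve this. How do you know which text is Clinton's? Hint: when the speaker changes, the speaker's name is at the start of the line. First identify the speaker names. Then all text following a name, up to the next name or the end of the file, belongs to that speaker.
I know that. If everything were on one line it would be easy. But one person's speech spans multiple lines, and I'm having trouble figuring out how to collect all of that text.
Maybe reading the text line by line would be better, rather than reading it token by token with the Scanner. Then split each line on whitespace and look at the first word. If it is all uppercase and ends with a colon, it is a speaker name, and the following words belong to that speaker. Read the next line. If its first word is not a speaker name, add the words to the last speaker, and so on.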
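The check suggested in the last comment (first word all uppercase and ending with a colon) could look roughly like this. It is a sketch only: `SpeakerLabel` and `speakerOf` are hypothetical names, and allowing an apostrophe inside the name is an assumption made here to cover surnames written like O'MALLEY. Speaker labels made of several words would need a different pattern.

```java
class SpeakerLabel {
    // Hypothetical helper: returns the speaker name if the line starts with an
    // all-uppercase word ending in a colon, otherwise null.
    static String speakerOf(String line) {
        // Only the first whitespace-separated word matters.
        String firstWord = line.trim().split("\\s+", 2)[0];
        if (firstWord.matches("[A-Z][A-Z']*:")) {
            // Drop the trailing colon.
            return firstWord.substring(0, firstWord.length() - 1);
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(speakerOf("CLINTON: No. I think that, like most people that I know, I have a range of views"));
        System.out.println(speakerOf("And I don't take a back seat to anyone"));
    }
}
```

Compared with testing whether the line merely contains a colon, this check leaves ordinary sentences with a colon in them attached to the current speaker.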

Output of the answer's sample code:

{COOPER=  Just for the record, are you a progressive, or are you a moderate? >>>   Secretary... >>>   ...thank you... >>>   ...Senator... >>>   Senator Sanders.  A Gallup poll says half the country would not put a socialist in the White House.  You call yourself a democratic socialist.  How can any kind of socialist win a general election in the United States?, SANDERS=  Well, we're gonna win because first, we're gonna explain what democratic socialism is.And what democratic socialism is about  is saying that it is immoral and wrong that the top one-tenth of 1 percent in this country own almost 90 percent - almost - own almost as much wealth as the bottom 90 percent.  That it is wrong, today, in a rigged economy, that 57 percent of all new income is going to the top 1 percent.That when you look around the world, you see every other major country providing health care to all people as a right, except the United States.  You see every other major country saying to moms that, when you have a baby, we're not gonna separate you from your newborn baby, because we are going to have - we are gonna have medical and family paid leave, like every other country on Earth.Those are some of the principles that I believe in, and I think we should look to countries like Denmark, like Sweden and Norway, and learn from what they have accomplished for their working people.(APPLAUSE), CLINTON=  No.  I think that, like most people that I know, I have a range of views, but they are rooted in my values and my experience. 
And I don't take a back seat to anyone when it comes to progressive experience and progressive commitment.You know, when I left law school, my first job was with the Children's Defense Fund, and for all the years since, I have been focused on how we're going to un-stack the deck, and how we're gonna make it possible for more people to have the experience I had.You know, to be able to come from a grandfather who was a factory worker, a father who was a small business person, and now asking the people of America to elect me president. >>>   I'm a progressive.  But I'm a progressive who likes to get things done.  And I know...(APPLAUSE)...how to find common ground, and I know how to stand my ground, and I have proved that in every position that I've had, even dealing with Republicans who never had a good word to say about me, honestly. But we found ways to work together on everything from... >>>   ...reforming foster care and adoption to the Children's Health Insurance Program, which insures... >>>   ...8 million kids.  So I have a long history of getting things done, rooted in the same values... >>>   ...I've always had.}