Hadoop: skipping records with a specific value in MapReduce


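I have a dataset containing many records. Take the first two fields of each record to be Field1 and Field2. If either field1 or field2 has the value AA, I have to skip that record during the map phase.

Please help me write this program.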

In the mapper class, you can write the if condition as

if((field1!='AA')||(field2!='AA')){

 //your code here
}

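This if condition is intended to skip records whose field value is "AA", so that you can process the remaining records and write the results to the context.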

It should be &&, and it is better to compare the strings with String's equals method.
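To see why the comment asks for &&, the two conditions can be compared side by side. This is only a minimal sketch: the class and method names are hypothetical, and the posted condition is rewritten with equals because != on strings compares references in Java. With ||, a record is skipped only when both fields are "AA"; with &&, it is skipped when either one is.

```java
public class ConditionCheck {
    // The condition as posted (with ||), rewritten with equals.
    // A return value of true means "process the record".
    public static boolean processWithOr(String field1, String field2) {
        return !"AA".equals(field1) || !"AA".equals(field2);
    }

    // The corrected condition: both fields must differ from "AA".
    public static boolean processWithAnd(String field1, String field2) {
        return !"AA".equals(field1) && !"AA".equals(field2);
    }

    public static void main(String[] args) {
        // Hypothetical sample records, each holding (field1, field2).
        String[][] records = { {"AA", "x"}, {"x", "AA"}, {"AA", "AA"}, {"x", "y"} };
        for (String[] r : records) {
            System.out.println(r[0] + "," + r[1]
                + " -> || processes: " + processWithOr(r[0], r[1])
                + ", && processes: " + processWithAnd(r[0], r[1]));
        }
    }
}
```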

map () {
  //your existing code to extract field1 and field2
  if (field1.equals("AA") || field2.equals("AA")) {
    return; // map stops here; you can also increment a counter to count how many such records exist in your dataset
  }
  // add the rest of your existing code here
  context.write(...);
}
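The skip check can also be pulled into a small helper that is testable without a cluster. The sketch below rests on assumptions the question does not state: records are comma-separated text lines, Field1 and Field2 are the first two columns, and lines with fewer than two fields are skipped too. The RecordFilter class and its methods are hypothetical names, not part of Hadoop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RecordFilter {
    // Assumption: fields are comma-separated; the question does not give the delimiter.
    private static final String DELIMITER = ",";
    private static final String SKIP_VALUE = "AA";

    /** Returns true when the first or second field equals "AA". */
    public static boolean shouldSkip(String line) {
        String[] fields = line.split(DELIMITER, -1);
        if (fields.length < 2) {
            return true; // Assumption: malformed records are skipped as well.
        }
        // Calling equals on the constant avoids a NullPointerException.
        return SKIP_VALUE.equals(fields[0]) || SKIP_VALUE.equals(fields[1]);
    }

    /** Keeps only the records that a mapper would go on to process. */
    public static List<String> keep(List<String> lines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (!shouldSkip(line)) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("AA,x,1", "x,AA,2", "x,y,3", "AAA,y,4");
        System.out.println(keep(input));
    }
}
```

Inside a real mapper, the map method would call something like shouldSkip on the incoming line and return early when it is true, exactly as in the answer above.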