Java: Remove duplicate text values from all JSON arrays using Jackson


I have a JSON file that has several arrays of text containing duplicate values. For example:

{
    "mName": "Carl Sanchez",
    "mEmailID": "csanchez0@msn.com",
    "mPhoneNo": 7954041324,

    "tutorTypes": [
        " Freelancer/Professional Tutor",
        " Freelancer/Professional Tutor",
        " Coaching Institute Teacher ",
        " Corporate Professional ",
        " Freelancer/Professional Tutor",
        " Freelancer/Professional Tutor",
        " Freelancer/Professional Tutor",
        " Freelancer/Professional Tutor",
        " Freelancer/Professional Tutor",
        " Freelancer/Professional Tutor",
        " Freelancer/Professional Tutor",
        " Freelancer/Professional Tutor",
        " Freelancer/Professional Tutor"
    ],
    "disciplines": [
        " Japanese",
        " German ",
        " Japanese",
        " German ",
        " Japanese",
        " Hindi ",
        " Japanese",
        " French "
    ]
}
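I want to remove the duplicate text values from all arrays in the JSON source. In the example above, that means removing the repeated languages and tutor types from their arrays. The output should be the same JSON source with duplicates removed wherever they occur. I also do not want the code to be tied to specific JSON field names; it should apply to any array of text values. The desired output for the example above is: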

{
    "mName": "Carl Sanchez",
    "mEmailID": "csanchez0@msn.com",
    "mPhoneNo": 7954041324,

    "tutorTypes": [
        " Freelancer/Professional Tutor",
        " Coaching Institute Teacher ",
        " Corporate Professional "
    ],
    "disciplines": [
        " Japanese",
        " German ",
        " Hindi ",
        " French "
    ]
}
The JSON input source is a file, and I want to write the output to a file. I have tried to do this with the Jackson Databind API:

public static void removeDuplicateStringElementsFromAllArrays(String file) throws IOException {

        Writer fileWriter = new BufferedWriter(new FileWriter(new File("out.json")));

        JsonFactory f = new MappingJsonFactory();
        JsonParser jp = f.createJsonParser(new File(file));

        parse(jp, fileWriter);
    }

    private static void parse(JsonParser jp, Writer writer) throws IOException{
        JsonToken current;
        current = jp.nextToken();

        if(current != null){
            System.out.println(current.asString());
            writer.write(current.asString());
        }

        if(current == JsonToken.START_ARRAY){
            if(jp.nextTextValue() != null){
                JsonNode node = jp.readValueAsTree();
                // Trim the String values
                String[] values = ArraysUtil.trimArray("\"" , node.toString().split(","), "\"");
                // Ensure that there is no duplicate value
                values = new HashSet<String>(Arrays.asList(values)).toArray(new String[0]);
                // Finally, concatenate the values back and stash them to file
                String concatValue = String.join(",", values);

                // Write the concatenated values to file
                writer.write(concatValue);
            }
            else{
                parse(jp, writer);
            }
        }
        else{
            // Move on directly
            parse(jp, writer);
        }
    }
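I get several null values in the output. I have an idea why this happens: I suspect that by the time I call jp.nextTextValue() the parser has already advanced, so building a value tree from that point may cause it, but I have not been able to find a workaround. Does anyone know how I can accomplish this?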

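Edit:

Just one thing to add here: I am using the Jackson Databind API because it is built on top of the Streaming API, which is efficient when parsing large JSON sources, and that is my case. A solution that takes this into account would be appreciated.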
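Given the large-file constraint, one possible direction is to stay entirely in the streaming API and copy tokens from a JsonParser to a JsonGenerator, skipping any string that has already been seen in the array currently open. The following is a minimal sketch, assuming Jackson 2.x (com.fasterxml.jackson.core); the class name ArrayDeduplicator and the method dedupe are made up for illustration. As a side note, JsonToken.asString() returns null for tokens that have no fixed text (field names, strings, numbers), which is likely where the nulls in the attempt above come from.

```java
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonGenerator;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;

import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Jackson.
public class ArrayDeduplicator {

    /** Copies JSON from in to out, dropping repeated strings inside each array. */
    public static void dedupe(Reader in, Writer out) throws IOException {
        JsonFactory factory = new JsonFactory();
        try (JsonParser parser = factory.createParser(in);
             JsonGenerator generator = factory.createGenerator(out)) {
            // One "seen" set per array that is currently open; the top of the
            // stack always belongs to the innermost open array.
            Deque<Set<String>> openArrays = new ArrayDeque<>();
            JsonToken token;
            while ((token = parser.nextToken()) != null) {
                if (token == JsonToken.START_ARRAY) {
                    openArrays.push(new HashSet<>());
                } else if (token == JsonToken.END_ARRAY) {
                    openArrays.pop();
                } else if (token == JsonToken.VALUE_STRING
                        && parser.getParsingContext().inArray()
                        && !openArrays.peek().add(parser.getText())) {
                    // The string is a direct array element seen before: skip it.
                    continue;
                }
                // Everything else is written through unchanged.
                generator.copyCurrentEvent(parser);
            }
        }
    }

    // Usage: java ArrayDeduplicator in.json out.json
    public static void main(String[] args) throws IOException {
        try (Reader in = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8);
             Writer out = Files.newBufferedWriter(Paths.get(args[1]), StandardCharsets.UTF_8)) {
            dedupe(in, out);
        }
    }
}
```

Memory use grows only with the number of distinct strings in the arrays that are open at a given moment, not with the size of the file. Values are compared exactly, so " German " and "German" would count as different strings.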

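Here is an example using Json Simple. Note that it assumes the arrays exist at the root level and does not check for nested arrays inside each value. If you want to support that, you can add recursive logic.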

package test.json.jsonsimple;

import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

import org.json.simple.JSONArray;
import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;
import org.json.simple.parser.ParseException;

public class App 
{
    @SuppressWarnings("unchecked")
    public static void main( String[] args )
    {
        System.out.println( "Hello World!" );

        JSONParser parser = new JSONParser();

        try {
            JSONObject outmap = new JSONObject();
            Object obj = parser.parse(new FileReader("d:\\in.json"));
            JSONObject jsonObject = (JSONObject) obj;
            for(Object o : jsonObject.entrySet()){
                if(o instanceof Map.Entry){
                    Map.Entry<String, Object> entry = (Map.Entry<String, Object>) o;
                    if(entry !=null ){
                        if(entry.getValue() instanceof JSONArray){
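                            Set<String> uniqueValues = removeDuplicates(entry.getValue());
                            // Copy the set into a JSONArray so it is written as a JSON array;
                            // some json-simple versions fall back to toString() for a plain Set
                            JSONArray deduped = new JSONArray();
                            deduped.addAll(uniqueValues);
                            outmap.put(entry.getKey(), deduped);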
                        }else{
                            outmap.put(entry.getKey(), entry.getValue());
                        }
                    }
                }
            }

            FileWriter file = new FileWriter("d:\\out.json");
            file.write(outmap.toJSONString());
            file.flush();
            file.close();

        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (ParseException e) {
            e.printStackTrace();
        }

    }

    @SuppressWarnings("unchecked")
    private static Set<String> removeDuplicates(Object value) {
        Set<String> outset = new HashSet<String>();
        JSONArray inset = (JSONArray) value;

        if (inset != null) {
            Iterator<String> iterator = inset.iterator();
            while (iterator.hasNext()) {
                outset.add(iterator.next());
            } 
        }
        return outset;
    }
}
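The recursive logic mentioned above could look roughly like the following. This is a sketch under assumptions: it works on plain java.util.Map and java.util.List values (json-simple's JSONObject and JSONArray extend HashMap and ArrayList, so a parsed document can be passed in directly), and the class name NestedDeduplicator is made up for illustration. It keeps the first occurrence of each string and leaves non-text elements alone.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration.
public class NestedDeduplicator {

    /** Removes repeated String elements, in place, from every List reachable from node. */
    public static void dedupe(Object node) {
        if (node instanceof Map) {
            // Objects: descend into every value, whatever its key is called.
            for (Object child : ((Map<?, ?>) node).values()) {
                dedupe(child);
            }
        } else if (node instanceof List) {
            Set<String> seen = new HashSet<>();
            Iterator<?> it = ((List<?>) node).iterator();
            while (it.hasNext()) {
                Object element = it.next();
                if (element instanceof String) {
                    // add() returns false when the string was already present.
                    if (!seen.add((String) element)) {
                        it.remove();
                    }
                } else {
                    // Nested objects or arrays are handled recursively.
                    dedupe(element);
                }
            }
        }
    }
}
```

Calling NestedDeduplicator.dedupe(jsonObject) right after parsing, and then writing jsonObject itself, would replace the root-level loop in the example above.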

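Create a bean, Contact.java, and declare the array properties as Set so that duplicates are removed.

When Jackson reads the JSON into the bean, the Set does the work of removing duplicates. No extra code is needed.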

package com.tmp;

import java.util.Set;

public class Contact {

    String      mName;
    String      mEmailID;
    long        mPhoneNo;

    Set<String> tutorTypes; // to remove duplicates
    Set<String> disciplines; // to remove duplicates

    // setter and getter methods go here...
}

package com.tmp;

import java.io.File;
import java.io.IOException;

import com.fasterxml.jackson.databind.ObjectMapper;


/**
 * 
 * @author Ravi P
 */
class Tmp {

    public static void main( String[] args ) throws IOException {

        ObjectMapper mapper = new ObjectMapper();

        Contact contact = mapper.readValue( new File( "D:\\tmp\\file.json" ), Contact.class );

        mapper.writeValue( new File( "D:\\tmp\\file1.json" ), contact );

    }
}
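One caveat with the Set approach: as far as I know, Jackson backs a property declared as plain Set with a HashSet, so the values may come out in a different order than in the input. If the first-occurrence order shown in the question matters, declaring the fields as LinkedHashSet<String> should preserve it. The difference can be seen with java.util alone; the class name SetOrderDemo below is made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical demo class for illustration.
public class SetOrderDemo {

    /** Drops repeated values while keeping the first occurrence of each in place. */
    public static List<String> dedupeKeepingOrder(List<String> values) {
        // LinkedHashSet remembers insertion order; a plain HashSet does not.
        return new ArrayList<>(new LinkedHashSet<>(values));
    }

    public static void main(String[] args) {
        List<String> disciplines = Arrays.asList(
                " Japanese", " German ", " Japanese", " German ",
                " Japanese", " Hindi ", " Japanese", " French ");
        System.out.println(dedupeKeepingOrder(disciplines));
    }
}
```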