Java: reading data from an API
I have written a function that reads some data from an external API. While reading a file from disk, it calls that API once for every line. I want to optimize the code so it can cope with large files (35,000 records). Can you suggest an approach? Here is my code:
public void readCSVFile() {
    try {
        br = new BufferedReader(new FileReader(getFileName()));
        while ((line = br.readLine()) != null) {
            String[] splitLine = line.split(cvsSplitBy);
            String campaign = splitLine[0];
            String adGroup = splitLine[1];
            String url = splitLine[2];
            long searchCount = getSearchCount(url);
            StringBuilder sb = new StringBuilder();
            sb.append(campaign + ",");
            sb.append(adGroup + ",");
            sb.append(searchCount + ",");
            writeToFile(sb, getNewFileName());
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}

private long getSearchCount(String url) {
    long recordCount = 0;
    try {
        DefaultHttpClient httpClient = new DefaultHttpClient();
        HttpGet getRequest = new HttpGet(
                "api.com/querysearch?q="
                + url);
        getRequest.addHeader("accept", "application/json");
        HttpResponse response = httpClient.execute(getRequest);
        if (response.getStatusLine().getStatusCode() != 200) {
            throw new RuntimeException("Failed : HTTP error code : "
                    + response.getStatusLine().getStatusCode());
        }
        BufferedReader br = new BufferedReader(new InputStreamReader(
                (response.getEntity().getContent())));
        String output;
        while ((output = br.readLine()) != null) {
            try {
                JSONObject json = (JSONObject) new JSONParser()
                        .parse(output);
                JSONObject result = (JSONObject) json.get("result");
                recordCount = (long) result.get("count");
                System.out.println(url + "=" + recordCount);
            } catch (Exception e) {
                System.out.println(e.getMessage());
            }
        }
        httpClient.getConnectionManager().shutdown();
    } catch (Exception e) {
        e.getStackTrace();
    }
    return recordCount;
}
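Since remote calls are slower than local disk access, you will probably want to parallelize or batch the remote calls in some way. If you cannot make batched calls to the remote API, but it does allow multiple concurrent reads, you may want to use something like a thread pool to make the remote calls: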
public void readCSVFile() {
    // exception handling ignored for space
    br = new BufferedReader(new FileReader(getFileName()));
    List<Future<String>> futures = new ArrayList<Future<String>>();
    ExecutorService pool = Executors.newFixedThreadPool(5);
    while ((line = br.readLine()) != null) {
        final String[] splitLine = line.split(cvsSplitBy);
        // submit one lookup per line; up to 5 run concurrently
        futures.add(pool.submit(new Callable<String>() {
            public String call() {
                long searchCount = getSearchCount(splitLine[2]);
                return new StringBuilder()
                        .append(splitLine[0] + ",")
                        .append(splitLine[1] + ",")
                        .append(searchCount + ",")
                        .toString();
            }
        }));
    }
    // collect results in submission order, so output rows match input rows
    for (Future<String> fs : futures) {
        writeToFile(fs.get(), getNewFileName());
    }
    pool.shutdown();
}
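To try the thread-pool pattern without touching the network, here is a minimal self-contained sketch. It assumes the lookup can be passed in as a function; the class name `ParallelLookup`, the method `process`, and the stand-in lookup in `main` are hypothetical and not part of the original code. It also hard-codes a comma as the separator and omits the trailing comma that the original output has.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToLongFunction;

public class ParallelLookup {

    // Looks up a count for the URL in the third column of each CSV line,
    // running up to `threads` lookups at once, and returns the output rows
    // in the same order as the input lines.
    public static List<String> process(List<String> csvLines,
                                       ToLongFunction<String> lookup,
                                       int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String line : csvLines) {
                final String[] cols = line.split(",");
                futures.add(pool.submit(() ->
                        cols[0] + "," + cols[1] + "," + lookup.applyAsLong(cols[2])));
            }
            List<String> out = new ArrayList<>();
            for (Future<String> f : futures) {
                out.add(f.get()); // blocks until that row's lookup has finished
            }
            return out;
        } finally {
            pool.shutdown(); // always release the worker threads
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> lines = List.of("c1,g1,example.com/a", "c2,g2,example.com/bb");
        // Stand-in lookup (assumption): uses the URL length instead of a real API call.
        for (String row : process(lines, url -> url.length(), 2)) {
            System.out.println(row);
        }
    }
}
```

In real use the lambda passed as `lookup` would call something like `getSearchCount`. The pool size is worth tuning against whatever concurrency the remote API tolerates.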
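Ideally, though, if it is at all possible, you really want to read from the remote API in a single batch.

Your bottleneck is definitely the HTTP part. That is what I would optimize: if possible, avoid closing the connection after every call, or fetch the results in bulk.

Yes, that is the problem. The issue is that I have to call this API with a GET parameter that comes from the file.

Thanks for the suggestion. By the way, I can't do a single batch read, but multiple concurrent reads are allowed.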
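The connection-reuse idea from the comments could look roughly like the sketch below. It swaps in the JDK's own `java.net.http.HttpClient` (Java 11 or later) for the Apache `DefaultHttpClient` used in the question, and creates one client for the whole run instead of one per call. The class name `SearchCountClient`, its method names, and the `base` parameter are hypothetical. The question shows only `api.com/querysearch?q=` without a scheme, so the full base URL is left to the caller. The sketch also URL-encodes the query value, which the original code does not do.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class SearchCountClient {

    // One client shared by all requests, so connections can be reused
    // instead of being opened and shut down for every CSV line.
    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    // Builds the request URI; `base` is the endpoint up to and including "q="
    // (an assumption, supplied by the caller).
    public static URI buildUri(String base, String url) {
        return URI.create(base + URLEncoder.encode(url, StandardCharsets.UTF_8));
    }

    // Sends the GET request and returns the raw JSON body as a string.
    public static String fetchBody(String base, String url)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(buildUri(base, url))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response =
                CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Failed: HTTP error code " + response.statusCode());
        }
        return response.body();
    }
}
```

The returned body can be parsed the same way as in the question. Combined with a thread pool, each worker would call `fetchBody` on the shared client rather than building its own.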