Java: Reading data from BigQuery and storing it in Google Storage (special character problem)


Reference:

The code works, but when it saves the BigQuery response to Google Storage, all Japanese characters are corrupted.

PCollectionTuple QVCollections = rows.apply("FilterEmptyRows", ParDo.of(new FilterEmptyRowDoFn("TransactionId", "TransactionDateTime"))).apply("CreateQVFiles",ParDo.of(new TransactionToQVFilesDoFnJP())
        .withOutputTags(BobShare.QVHeaders, TupleTagList.of(BobShare.QVEvents).and(BobShare.QVPayments)));

QVCollections.get(BobShare.QVEvents).apply("WriteQVEvents", TextIO.write().to(storagePath + CSV_OUTPUT_FOLDER + "events_" + timeSuffix).withoutSharding().withHeader(CSV_HEADER_EVENTS).withSuffix(".csv"));
QVCollections.get(BobShare.QVPayments).apply("WriteQVPayments", TextIO.write().to(storagePath + CSV_OUTPUT_FOLDER + "payments_" + timeSuffix).withoutSharding().withHeader(CSV_HEADER_PAYMENTS).withSuffix(".csv"));
QVCollections.get(BobShare.QVHeaders).apply("WriteQVHeaders", TextIO.write().to(storagePath + CSV_OUTPUT_FOLDER + "header_" + timeSuffix).withoutSharding().withHeader(CSV_HEADER_TRANSACTION).withSuffix(".csv"));
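From what I have found, I need to use
.withCoder(StringUtf8Coder.of())

In addition, this is what we tried (but it only works locally, with DirectRunner):

This is how the data looks (corrupted):

Any suggestions? Anything at all :((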

You need to replace

BufferedReader br = Files.newBufferedReader(pathToFile, StandardCharsets.UTF_8)

with

BufferedReader br = Files.newBufferedReader(pathToFile,
        Charset.forName("UTF-8"))

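The first thing I would suggest is instrumenting the process to find out at what point the data gets converted into the wrong format. It could be at the start, in the middle, or at the end; it would help a lot if we could narrow that down. Is it at this point:
TextIO.write().to(storagePath + CSV_OUTPUT_FOLDER + "events_" + timeSuffix)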
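One way to do the instrumentation suggested above is to log the code points and UTF-8 bytes of a sample value at each stage (after the BigQuery read, inside the DoFn, just before the write) and compare them. The following is only a minimal sketch; the class name `EncodingProbe` and the method `describe` are hypothetical and not part of the original pipeline.

```java
import java.nio.charset.StandardCharsets;

public class EncodingProbe {
    /**
     * Returns the Unicode code points and the UTF-8 bytes of a string in hex.
     * Hypothetical helper for logging; not part of the original code.
     */
    static String describe(String s) {
        StringBuilder sb = new StringBuilder("codepoints:");
        s.codePoints().forEach(cp -> sb.append(String.format(" U+%04X", cp)));
        sb.append(" | utf8:");
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format(" %02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "a" followed by HIRAGANA LETTER A (U+3042)
        System.out.println(describe("a\u3042"));
        // prints: codepoints: U+0061 U+3042 | utf8: 61 E3 81 82
    }
}
```

If the code points are already wrong at an early stage, the problem lies in decoding the input; if they are correct right up to the write, the output side (or the tool used to view the file) is the more likely suspect.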
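To clarify, these are not actual Japanese characters but characters that were encoded or decoded incorrectly. Judging from the fact that Latin characters, digits, and punctuation are preserved, there is a mismatch between the input encoding and the output encoding. For example, this kind of thing can happen if the input is in something like a 1-byte ASCII encoding and the output is UTF-8, or if your file is UTF-8 but your text editor does not know that and tries to display it as ASCII.

I think something is still missing here: how is the text data generated, and what is the input to the pipeline? Could you share the expected output from DirectRunner alongside the corresponding broken output from DataflowRunner, so that it is easier to pin down the encoding problem (if it is an encoding problem)?

Can you explain the change? Looking at
StandardCharsets.UTF_8
, I see:
public static final Charset UTF_8 = Charset.forName("UTF-8")
So this change actually changes nothing at all.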
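Both points made in the comments above can be checked with the standard library alone. The sketch below is illustrative only: the class name `CharsetMismatchDemo`, the method `roundTrip`, and the sample string are assumptions, and ISO-8859-1 merely stands in for whichever wrong charset might be involved in the real pipeline.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {
    /** Encodes text as UTF-8, then decodes the bytes with the given (possibly wrong) charset. */
    static String roundTrip(String text, Charset decodeWith) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, decodeWith);
    }

    public static void main(String[] args) {
        // The two ways of obtaining the UTF-8 charset are equivalent.
        System.out.println(StandardCharsets.UTF_8.equals(Charset.forName("UTF-8"))); // prints: true

        // Hypothetical sample: an ASCII prefix followed by two kanji.
        String sample = "ID-42,\u6771\u4EAC";

        // Matching charsets: the text survives unchanged.
        System.out.println(roundTrip(sample, StandardCharsets.UTF_8).equals(sample)); // prints: true

        // Mismatched charsets: ASCII survives, the kanji turn into several junk characters.
        String broken = roundTrip(sample, StandardCharsets.ISO_8859_1);
        System.out.println(broken.startsWith("ID-42,")); // prints: true
        System.out.println(broken.equals(sample));       // prints: false
    }
}
```

This matches the symptom described in the question: Latin letters, digits, and punctuation are encoded identically in ASCII, ISO-8859-1, and UTF-8, so only the multi-byte characters are damaged.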
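private static void uploadBlob(String project, String bucket, String filename, String localfile) {
    String listFromCsv = readCsvFromLocalStorage(localfile);

    Storage storage = StorageOptions.newBuilder().setProjectId(project).build().getService();
    BlobId blobId = BlobId.of(bucket, filename);
    // UTF_8 is not defined in this excerpt; presumably a String constant such as "UTF-8",
    // since getBytes(String) is the overload that throws UnsupportedEncodingException.
    // Note: HTTP Content-Encoding normally names a compression such as gzip;
    // a charset is usually declared as part of the Content-Type instead.
    BlobInfo blobInfo = BlobInfo.newBuilder(blobId).setContentType("application/json").setContentEncoding(UTF_8).build();
    try {
        storage.create(blobInfo, listFromCsv.getBytes(UTF_8));
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }
}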


private static String readCsvFromLocalStorage(String fileName) {
    StringBuilder builder = new StringBuilder();
    Path pathToFile = Paths.get(fileName);

    try (BufferedReader br = Files.newBufferedReader(pathToFile,
            StandardCharsets.UTF_8)) {

        // read the first line from the text file
        String line = br.readLine();

        // loop until all lines are read
        while (line != null) {
            builder.append(line).append("\n");
            line = br.readLine();
        }

    } catch (IOException ioe) {
        ioe.printStackTrace();
    }

    return builder.toString();
}

private static void deleteLocalFile(String fileName) {
    try {
        if (new File(fileName).delete()) {
            System.out.println(fileName + " deleted.");
        } else {
            System.out.println(fileName + " could not be deleted.");
        }
    } catch (Exception e) {
        System.out.println(fileName + " could not be deleted.");
        e.printStackTrace();
    }
}
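
The read-then-encode steps in the code above can be exercised without any Google Cloud dependency, to confirm that a UTF-8 file survives a local round trip. The following is a minimal standard-library sketch; the class name `Utf8FileRoundTrip`, the helper names, and the sample CSV content are hypothetical, and the upload itself is deliberately left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8FileRoundTrip {
    /** Reads a whole file as text, decoding the bytes explicitly as UTF-8. */
    static String readUtf8(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Encodes text as UTF-8; the Charset overload of getBytes declares no checked exception. */
    static byte[] toUtf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("events_", ".csv");
        try {
            // Hypothetical sample row containing two kanji.
            String csv = "id,city\n1,\u6771\u4EAC\n";
            Files.write(tmp, toUtf8Bytes(csv));

            String readBack = readUtf8(tmp);
            System.out.println(readBack.equals(csv)); // prints: true
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Unlike readCsvFromLocalStorage, this sketch keeps the file's original line endings rather than rewriting each one as "\n". If a round trip like this succeeds locally, the corruption is more likely introduced elsewhere, for example where the strings are first produced or where the stored file is viewed.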