Hadoop working directory


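I am trying to save a file in the main class of my Hadoop application so that the mappers can read it later. The file holds the encryption key used to encrypt the data. My question is: if I write the file to the working directory, where does the data end up?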

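import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

// HadoopIndexMapper, HadoopIndexReducer and IntArrayWritable are the
// application's own classes and are not shown here.
public class HadoopIndexProject {

    // Generates a random secret key of the given size (in bits).
    private static SecretKey generateKey(int size, String algorithm) throws NoSuchAlgorithmException {
        KeyGenerator keyGen = KeyGenerator.getInstance(algorithm);
        keyGen.init(size);
        return keyGen.generateKey();
    }

    // Generates a random 16-byte IV. SecureRandom replaces java.util.Random,
    // which is not suitable for cryptographic material.
    private static IvParameterSpec generateIV() {
        byte[] b = new byte[16];
        new SecureRandom().nextBytes(b);
        return new IvParameterSpec(b);
    }

    // Writes the raw key bytes followed by the IV bytes to a file on the
    // local file system (java.io, not the Hadoop FileSystem API).
    public static void saveKey(SecretKey key, IvParameterSpec iv, String path) throws IOException {
        //FSDataOutputStream stream = fs.create(new Path(path));
        try (FileOutputStream stream = new FileOutputStream(path)) {
            stream.write(key.getEncoded());
            stream.write(iv.getIV());
        }
    }

    /**
     * @param args the command line arguments
     * @throws java.lang.Exception
     */
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        //FileSystem fs = FileSystem.getLocal(conf);
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: Index <in> <out>");
            System.exit(2);
        }
        try {
            // Create the key file only if it does not exist yet.
            if (!new File("key.dat").exists()) {
                SecretKey key = generateKey(128, "AES");
                IvParameterSpec iv = generateIV();
                saveKey(key, iv, "key.dat");
            }
        } catch (NoSuchAlgorithmException ex) {
            Logger.getLogger(HadoopIndexProject.class.getName()).log(Level.SEVERE, null, ex);
        }
        conf.set("mapred.textoutputformat.separator", ":");

        Job job = Job.getInstance(conf);
        job.setJobName("Index creator");
        job.setJarByClass(HadoopIndexProject.class);
        job.setMapperClass(HadoopIndexMapper.class);
        job.setReducerClass(HadoopIndexReducer.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntArrayWritable.class);

        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }

}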


HDFS has no concept of a working directory. All relative paths are resolved against /user/, so your file will be located at /user//key.dat.
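Note that this resolution only applies to paths opened through Hadoop's FileSystem API. The code in the question writes with java.io.FileOutputStream, which resolves a relative path against the working directory of the local JVM running the driver, so nothing reaches HDFS. The following is a minimal, JDK-only sketch for checking where a relative name points locally; the class and method names are hypothetical. For an HDFS path, the comparable check would be FileSystem.get(conf).makeQualified(new Path("key.dat")).

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, not part of the original question.
final class LocalPathDemo {

    // Resolves a relative path the way java.io and java.nio do:
    // against the current working directory of the local JVM.
    static Path resolveLocal(String relative) {
        return Paths.get(relative).toAbsolutePath();
    }

    public static void main(String[] args) {
        // Prints the local location that new FileOutputStream("key.dat")
        // would write to; this is a local path, not an HDFS path.
        System.out.println(resolveLocal("key.dat"));
    }
}
```

Running this from the directory where the job is launched shows where a relative key.dat would be created on the client machine.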

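But in YARN you have the concept of a distributed cache, so you can use job.addCacheFile to add additional files to your YARN application.

I wasn't using the FileSystem instance I had created. Now the directory you mentioned gets created and I can find my key. The problem now is how to get my mappers and reducers to read that key. Also, can you explain the concept of the distributed cache?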
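As a rough sketch of how the pieces could fit together: the distributed cache copies the files registered on the job to every node that runs its tasks, before the tasks start. Assuming the key file has been written to HDFS, the driver could register it with job.addCacheFile(new URI(".../key.dat#key.dat")), where the path before the fragment is wherever the file lives on HDFS. With the #key.dat fragment the file is expected to appear under that name in each task's working directory, so a mapper's setup(Context) method could open it with new FileInputStream("key.dat"); context.getCacheFiles() returns the registered URIs. The JDK-only helper below is hypothetical and not from the original post. It only shows how to read back the layout that saveKey writes, assuming a 16-byte AES-128 key followed by a 16-byte IV.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper, not part of the original post.
final class KeyFile {
    static final int KEY_BYTES = 16; // AES-128, as generated in the question
    static final int IV_BYTES = 16;  // 16-byte IV, as generated in the question

    final SecretKey key;
    final IvParameterSpec iv;

    private KeyFile(SecretKey key, IvParameterSpec iv) {
        this.key = key;
        this.iv = iv;
    }

    // Reads the layout written by saveKey: raw key bytes, then IV bytes.
    static KeyFile load(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        byte[] keyBytes = new byte[KEY_BYTES];
        byte[] ivBytes = new byte[IV_BYTES];
        data.readFully(keyBytes); // throws EOFException if the input is too short
        data.readFully(ivBytes);
        return new KeyFile(new SecretKeySpec(keyBytes, "AES"),
                           new IvParameterSpec(ivBytes));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the contents of key.dat; in a mapper this would be
        // a FileInputStream on the cached file instead.
        byte[] sample = new byte[KEY_BYTES + IV_BYTES];
        for (int i = 0; i < sample.length; i++) {
            sample[i] = (byte) i;
        }
        KeyFile loaded = load(new ByteArrayInputStream(sample));
        System.out.println(loaded.key.getAlgorithm() + " key, "
                + loaded.key.getEncoded().length + " bytes; IV "
                + loaded.iv.getIV().length + " bytes");
    }
}
```

Keeping the parsing separate from the Hadoop wiring means the same method can be called from both the mapper and the reducer, and tested without a cluster.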