Hadoop: Sqoop export using an update key

I have to export an HDFS file into MySQL. Suppose my HDFS file is:

1,abcd,23
2,efgh,24
3,ijkl,25
4,mnop,26
5,qrst,27

Suppose my MySQL table schema is:

+-----+-----+-------------+
| ID  | AGE |    NAME     |
+-----+-----+-------------+
|     |     |             |
+-----+-----+-------------+

When I insert with the following Sqoop command:

sqoop export \
--connect jdbc:mysql://localhost/DBNAME \
--username root \
--password root \
--export-dir /input/abc \
--table test \
--fields-terminated-by "," \
--columns "id,name,age"

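it works fine and the rows are inserted into the database.

However, when I need to update records that already exist, I have to use --update-key together with --columns.

Now, when I try to update the table with the following command: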

sqoop export \
--connect jdbc:mysql://localhost/DBNAME \
--username root \
--password root \
--export-dir /input/abc \
--table test \
--fields-terminated-by "," \
--columns "id,name,age" \
--update-key id

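The problem I am facing is that the data is not written to the columns in the order specified by --columns.

Am I doing something wrong?

Can we not update the database this way? Does the HDFS file have to follow the MySQL schema's column order for the update to work?

Is there any other way to achieve this?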
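If the file does turn out to need the table's column order, one possible workaround is to reorder the fields before uploading. The sketch below is only an illustration on a local copy of the sample rows: the file name abc.csv and the function name reorder_columns are hypothetical, and it assumes the table order is id, age, name as shown in the schema above.

```shell
# Work in a scratch directory so nothing outside it is touched.
workdir=$(mktemp -d)

# Sample rows in the layout from the question: id,name,age
cat > "$workdir/abc.csv" <<'EOF'
1,abcd,23
2,efgh,24
3,ijkl,25
4,mnop,26
5,qrst,27
EOF

# Hypothetical helper: swap the 2nd and 3rd fields so each row becomes id,age,name.
reorder_columns() {
  awk -F',' 'BEGIN { OFS = "," } { print $1, $3, $2 }' "$1"
}

reorder_columns "$workdir/abc.csv" > "$workdir/abc_reordered.csv"
cat "$workdir/abc_reordered.csv"
```

The reordered file could then be uploaded with hadoop fs -put and exported without relying on --columns to remap the fields.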

Just try using --update-key primary_key.

It worked for me. It updates every record that matches the primary key. It may not insert new data.

Use --update-mode updateonly/allowinsert wisely.

4b. Update data from HDFS into a table in a relational database

Create the emp table in the MySQL test database:

create table emp
(
id int not null primary key,
name varchar(50)
);

vi emp -> create a file with the following contents:

1,Thiru
2,Vikram
3,Brij
4,Sugesh

Move the file to HDFS:

hadoop fs -put emp <dir>

Update the emp file and move the updated file to HDFS. Contents of the updated file:

1,Thiru
2,Vikram
3,Sugesh
4,Brij
5,Sagar

Sqoop export for upsert (update if the key matches, otherwise insert):

sqoop export --connect <jdbc connection> \
--username sqoop \
--password sqoop \
--table emp \
--update-mode allowinsert \
--update-key id \
--export-dir <dir> \
--input-fields-terminated-by ',';

Note: --update-mode <mode> accepts two values. "updateonly" updates a record only when the update key matches an existing row.
If you want an upsert (UPDATE if the row exists, otherwise INSERT), use "allowinsert" mode.
Example:
--update-mode updateonly \ --> for updates
--update-mode allowinsert \ --> for upsert
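
To make the difference between the two modes concrete, here is a rough local simulation using the emp data from this answer. It does not call Sqoop or MySQL at all; it only mimics the described semantics with awk. The file names table.csv and update.csv and the function simulate_export are hypothetical.

```shell
workdir=$(mktemp -d)

# Rows already in the emp table (the first export).
printf '%s\n' '1,Thiru' '2,Vikram' '3,Brij' '4,Sugesh' > "$workdir/table.csv"
# Rows in the updated HDFS file.
printf '%s\n' '1,Thiru' '2,Vikram' '3,Sugesh' '4,Brij' '5,Sagar' > "$workdir/update.csv"

# simulate_export MODE TABLE_FILE UPDATE_FILE
# Treats field 1 (id) as the update key and prints the resulting table rows.
simulate_export() {
  awk -F',' -v mode="$1" '
    NR == FNR   { row[$1] = $0; order[++n] = $1; next }      # load existing rows
    ($1 in row) { row[$1] = $0; next }                       # key matches: update
    mode == "allowinsert" { row[$1] = $0; order[++n] = $1 }  # new key: insert only in allowinsert
    END { for (i = 1; i <= n; i++) print row[order[i]] }
  ' "$2" "$3"
}

echo "updateonly:"
simulate_export updateonly "$workdir/table.csv" "$workdir/update.csv"
echo "allowinsert:"
simulate_export allowinsert "$workdir/table.csv" "$workdir/update.csv"
```

In this simulation both modes rewrite rows 3 and 4, but only allowinsert adds the new row with id 5.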

You may want to try using --input-fields-terminated-by.

Currently you are using --fields-terminated-by, which is meant for import.
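
Applied to the command from the question, that suggestion might look like the sketch below. It is a dry run: the hypothetical helper sqoop_update_cmd only prints the command so it can be reviewed, and the connection string, credentials, table, and path are simply the ones from the question. Whether this change alone fixes the column mapping is not confirmed here.

```shell
# Hypothetical dry-run helper: prints the export command instead of running it.
# $1 = JDBC URL, $2 = table, $3 = HDFS export dir, $4 = update key column
sqoop_update_cmd() {
  printf '%s\n' "sqoop export --connect $1 --username root --password root --export-dir $3 --table $2 --input-fields-terminated-by ',' --columns id,name,age --update-key $4"
}

sqoop_update_cmd jdbc:mysql://localhost/DBNAME test /input/abc id
```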

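I actually tried this in several ways with Sqoop. The update key can only update rows that already exist in the table; it cannot insert new ones unless you also set the update mode to allowinsert, which not all databases support. If you do run an update with an update key, it updates the rows whose value in the update-key column matches.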

If you are still looking for an answer, I need more clarification. Are you getting any errors, or is it just not updating the columns the way you expect?