How can I reliably copy a Cassandra database between two servers? (Python 2.7, Cassandra, cqlsh)

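I have a test setup in which I want a copy of the main data.

I am using the Cassandra package from DataStax, version 3.0.9.

I am using CQLSH to dump the data and restore it in the test setup. To export, I use:

COPY <table> TO '<file>' WITH DELIMITER = '\t' AND NULL = 'NULL' AND QUOTE = '"' AND HEADER = True

To import, I use:

COPY <table> FROM '<file>' WITH DELIMITER = '\t' AND NULL = 'NULL' AND QUOTE = '"' AND HEADER = True

After COPY FROM, CQLSH reports that it successfully copied all rows in the file. But when I run count(*) on the table, a few rows are missing. There is no particular pattern to the missing rows. If I truncate the table and replay the command, a different set of rows goes missing, and the number of missing rows varies from run to run.

The table structure contains lists/sets of user-defined types, and the UDT contents may include 'null' values.

Is there any other reliable way to copy the data, apart from programmatically reading and writing individual rows between the two databases?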


Schema of the table (field names changed):

CREATE TYPE UDT1 (
    field1 text,
    field2 int,
    field3 text
);
CREATE TYPE UDT2 (
    field1 boolean,
    field2 float
);
CREATE TABLE cypher.table1 (
    id int PRIMARY KEY,
    list1 list<frozen<UDT1>>,
    data text,
    set1 set<frozen<UDT2>>
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

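Instead of exporting and importing the data, you can also try copying the data files themselves:

  • Take a snapshot of the data on the original cluster using "nodetool snapshot"
  • Create the schema on the test cluster
  • Load the snapshot from the original cluster into the test cluster:

    a. If every node in the test cluster holds all the data (a single node, or 3 nodes with rf=3), or if the data volume is small, copy the files from the original cluster into the keyspace/column_family directory and run nodetool refresh. Make sure the file names do not overlap.

    b. If the test cluster nodes do not each hold all the data, or the data volume is large, use sstableloader to stream the files from the snapshot into the test cluster.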
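The steps above can be sketched as a small Python 3 helper that only builds the command lines, so they can be reviewed before being passed to something like `subprocess.run`. This is a sketch under assumptions, not a tested procedure: the snapshot tag, host address, and directory in the example are hypothetical placeholders, and the flags (`nodetool snapshot -t`, `nodetool refresh <keyspace> <table>`, `sstableloader -d`) should be checked against the installed Cassandra version. Creating the schema on the test cluster is left out, since it is a plain CQL step.

```python
def snapshot_cmd(keyspace, tag):
    """Run on each node of the ORIGINAL cluster: take a named snapshot."""
    return ["nodetool", "snapshot", "-t", tag, keyspace]


def refresh_cmd(keyspace, table):
    """Option a: run on the TEST node after copying the SSTable files
    into that table's data directory."""
    return ["nodetool", "refresh", keyspace, table]


def sstableloader_cmd(target_hosts, sstable_dir):
    """Option b: stream SSTables into the test cluster.
    sstable_dir is expected to end in <keyspace>/<table>."""
    return ["sstableloader", "-d", ",".join(target_hosts), sstable_dir]


def copy_plan(keyspace, table, tag, target_hosts, sstable_dir,
              test_nodes_hold_all_data):
    """Return the commands in order: snapshot first, then refresh (a)
    or sstableloader (b), depending on the test cluster topology."""
    plan = [snapshot_cmd(keyspace, tag)]
    if test_nodes_hold_all_data:
        plan.append(refresh_cmd(keyspace, table))
    else:
        plan.append(sstableloader_cmd(target_hosts, sstable_dir))
    return plan


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for cmd in copy_plan("cypher", "table1", "copy_to_test",
                         ["10.0.0.5"], "/tmp/load/cypher/table1", False):
        print(" ".join(cmd))
```

With a test cluster whose topology differs from the original, the second branch (sstableloader) is the one that applies, because it streams each row to whichever nodes own it.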


  • I tested your schema with plain COPY TO and COPY FROM, without a delimiter option, and it works fine. I tried it several times and nothing went missing.

    cassandra@cqlsh:cypher> INSERT INTO table1 (id, data, list1, set1 ) VALUES ( 1, 'cypher', ['a',1,'b'], {true}) ;
    cassandra@cqlsh:cypher> SELECT * FROM table1 ; 
    
     id | data   | list1                                                                                                                             | set1
    ----+--------+-----------------------------------------------------------------------------------------------------------------------------------+--------------------------------
      1 | cypher | [{field1: 'a', field2: null, field3: null}, {field1: '1', field2: null, field3: null}, {field1: 'b', field2: null, field3: null}] | {{field1: True, field2: null}}
    
    cassandra@cqlsh:cypher> INSERT INTO table1 (id, data, list1, set1 ) VALUES ( 2, '2_cypher', ['amp','avd','ball'], {true, false}) ;
    cassandra@cqlsh:cypher> SELECT * FROM table1 ;
    
     id | data     | list1                                                                                                                                    | set1
    ----+----------+------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------
      1 |   cypher |        [{field1: 'a', field2: null, field3: null}, {field1: '1', field2: null, field3: null}, {field1: 'b', field2: null, field3: null}] |                                {{field1: True, field2: null}}
      2 | 2_cypher | [{field1: 'amp', field2: null, field3: null}, {field1: 'avd', field2: null, field3: null}, {field1: 'ball', field2: null, field3: null}] | {{field1: False, field2: null}, {field1: True, field2: null}}
    
    cassandra@cqlsh:cypher> COPY table1 TO 'table1.csv';
    Using 1 child processes
    
    Starting copy of cypher.table1 with columns [id, data, list1, set1].
    Processed: 2 rows; Rate:       0 rows/s; Avg. rate:       0 rows/s
    2 rows exported to 1 files in 4.358 seconds.
    cassandra@cqlsh:cypher> TRUNCATE table table1 ;
    cassandra@cqlsh:cypher> SELECT * FROM table1;
    
     id | data | list1 | set1
    ----+------+-------+------
    
    cassandra@cqlsh:cypher> COPY table1 FROM 'table1.csv';
    Using 1 child processes
    
    Starting copy of cypher.table1 with columns [id, data, list1, set1].
    Processed: 2 rows; Rate:       2 rows/s; Avg. rate:       3 rows/s
    2 rows imported from 1 files in 0.705 seconds (0 skipped).
    cassandra@cqlsh:cypher> SELECT * FROM table1  ;
    
     id | data     | list1                                                                                                                                    | set1
    ----+----------+------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------
      1 |   cypher |        [{field1: 'a', field2: null, field3: null}, {field1: '1', field2: null, field3: null}, {field1: 'b', field2: null, field3: null}] |                                {{field1: True, field2: null}}
      2 | 2_cypher | [{field1: 'amp', field2: null, field3: null}, {field1: 'avd', field2: null, field3: null}, {field1: 'ball', field2: null, field3: null}] | {{field1: False, field2: null}, {field1: True, field2: null}}
    
    (2 rows)
    cassandra@cqlsh:cypher>
    

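Can you provide the schema of the test setup?

Added the schema @AnowerPerves.

I have a different topology on the test cluster, so "nodetool refresh" does not work. I will try sstableloader and let you know.

Thank you for your effort. With a small table I did not find any problem. But when I use a large table (about 1 million entries), it starts dropping entries.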
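To find out which rows are dropped on a large table, rather than only comparing count(*), one option is to compare the primary keys in the exported file with the keys present in the table after COPY FROM. The following is a minimal Python 3 sketch under stated assumptions: the export is tab-delimited with a header row and an integer `id` column, as in the question, and `imported_ids` is any iterable of ids you have fetched yourself (for example from `SELECT id FROM table1`). The function names are made up for illustration.

```python
import csv


def ids_in_export(csv_path, id_column="id", delimiter="\t"):
    """Collect the integer primary keys found in a COPY TO export
    that was written with HEADER = True."""
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f, delimiter=delimiter)
        return {int(row[id_column]) for row in reader}


def missing_ids(exported_ids, imported_ids):
    """Return, in ascending order, the ids that are in the export
    but absent from the imported table."""
    return sorted(set(exported_ids) - set(imported_ids))
```

If the list of missing ids changes between runs for the same file, that would suggest the rows are lost during import rather than rejected because of their content.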