最後更新: 2017-06-28
介紹
It can also be used to replicate files and directories between multiple hosts.
It uses a light-weight, web-based protocol to send and receive data, which makes it easy to work with firewalls.
Replication is done in the background asynchronously, allowing data changes in offline mode.
HomePage: https://www.symmetricds.org
功能
* 支援 partial tables replicate (filtered synchronization)
* table names and column names be re-mapped
* data transformation
* connections between instances of SymmetricDS require password authentication.
* Data can be encrypted by configuring node addresses to use HTTPS
* Extendable - Scripts and Java code can be configured to handle events, transform data, and create customized behavior.
Requirements
JRE > 7
Pro vs Free Version
Pro version 有
- web interface
- additional connectors
- log mining
- clustering
- bulk loading
- data validation
- auditing
- monitoring
原理
Node
A node is responsible for synchronizing the data from a database or file system with other nodes in the network using HTTP.
Nodes are assigned to one of the node Groups that are configured together as a unit.
The node groups are linked together with Group Links to define either a push or pull communication.
A pull causes one node to connect with other nodes and request changes that are waiting,
while a push causes one node to connect with other nodes when it has changes to send.
Each node is connected to a database with a Java Database Connectivity (JDBC) driver using a connection URL, username, and password.
At startup, SymmetricDS looks for Node Properties Files and starts a node for each file it finds, which allows multiple nodes to run in the same instance and share resources. The property file for a node contains its external ID, node group, registration server URL, and database connection information. The external ID is the name for a node used to identify it from other nodes. One node is configured as the registration server where the master configuration is stored. When a node is started for the first time, it contacts the registration server using a registration process that sends its external ID and node group. In response, the node receives its configuration and a node password that must be sent as authentication during synchronization with other nodes.
Triggers are installed in the database to guarantee that data changes are captured.
(without any special driver software.)
each change is given a sequence number
(timestamps are not used.)
Data changes are grouped together and given a batch number to route through the system.
Channels
Channels define those logical groupings of data.
i.e.
a set of tables that hold customer data might be logically grouped together in a Customer channel.
Delay
The user configures a delay in milliseconds for when the changes are replicated to the target system.
Running as a Windows Service
To install the service, run the following command as Administrator:
bin\sym_service.bat install
bin\sym_service.bat uninstall
bin\sym_service.bat start
bin\sym_service.bat stop
Running as Standalone
# starts a standalone instance of SymmetricDS using the built-in Jetty web server
bin\sym
# Configure File
conf/symmetric-server.properties
# Enable synchronization over HTTP. http.enable=true http.port=31415 .... # Enable Java Management Extensions (JMX) web console. jmx.http.enable=true jmx.http.port=31416
conf/sym_service.conf
# Initial Java Heap Size (in MB) wrapper.java.initmemory=256 # Maximum Java Heap Size (in MB) wrapper.java.maxmemory=1024 # Log level (SEVERE, WARNING, INFO, CONFIG, FINE, FINER, FINEST) wrapper.logfile.loglevel=INFO # Size of log file when it will rotate to next file wrapper.logfile.maxsize=10m # Number of log files to rotate and keep wrapper.logfile.maxfiles=3
Tools
SymmetricDS service (sym_service)
command line launcher (sym)
administrative tools (symadmin)
utilities (dbexport, dbimport, dbfill)
Demo 環境
# samples
cd samples
# Create the sample tables
..\bin\dbimport --engine corp-000 --format XML --alter-case create_sample.xml
# Next, create the SymmetricDS tables (Tables are automatically created when the server starts for the first time.)
..\bin\symadmin --engine corp-000 create-sym-tables
# Load the SymmetricDS configuration and sample item
..\bin\dbimport --engine corp-000 insert_sample.sql
# All three nodes run in the same instance.
bin\sym
Initial Load
An initial load is the process of seeding tables at a target node with data from a source node.
Instead of capturing data, data is selected from the source table using a SQL statement and then it is streamed to the client.
Initial Load by command
symadmin --engine source-000 reload-node 001
console output
.... Successfully enabled initial load for node 001
log
2017-06-28 17:28:38,453 INFO [store-001] [H2DatabasePlatform] [dataloader-corp-000-1] Tables up to date. No alters found for ITEM_SELLING_PRICE 2017-06-28 17:28:38,468 INFO [store-001] [ConfigurationChangedDatabaseWriterFilter] [dataloader-corp-000-1] About to refresh the cache of node security because new configuration came through the data loader 2017-06-28 17:28:38,468 INFO [store-001] [PullService] [store-001-pull-default-7] Pull data received from corp:000:000 on channel thread default. 9 rows and 7 batches were processed
Initial Load by SQL
update sym_node_security set initial_load_enabled = 1 where node_id = '001';
* load 後會 initial_load_id 會加 1 及 update "initial_load_time"
verify
# verify the databases by logging in
bin\dbsql --engine corp-000
# 在 corp-000 加入 Data
insert into item (item_id, name) values (110000055, 'Soft Drink'); insert into item_selling_price (item_id, store_id, price) values (110000055, '001', 0.65); insert into item_selling_price (item_id, store_id, price) values (110000055, '002', 1.00);
# 查看 console output
[store-002] - PullService - Pull data received from corp:000:000 on channel thre ad default. 2 rows and 1 batches were processed [store-001] - PullService - Pull data received from corp:000:000 on channel thre ad default. 2 rows and 1 batches were processed
# Verify Outgoing Batches (on source)
# login "corp-000"
bin\dbsql --engine corp-000
# 改用 list 形式看 BATCH
list
# sql
select * from sym_outgoing_batch order by batch_id desc limit 1;
BATCH_ID : 534
NODE_ID : -1
CHANNEL_ID : config
STATUS : OK
LOAD_ID : -1
EXTRACT_JOB_FLAG : 0
LOAD_FLAG : 0
ERROR_FLAG : 0
COMMON_FLAG : 0
IGNORE_COUNT : 0
BYTE_COUNT : 0
EXTRACT_COUNT : 0
SENT_COUNT : 0
LOAD_COUNT : 0
DATA_EVENT_COUNT : 1
RELOAD_EVENT_COUNT : 0
INSERT_EVENT_COUNT : 0
UPDATE_EVENT_COUNT : 1
DELETE_EVENT_COUNT : 0
OTHER_EVENT_COUNT : 0
ROUTER_MILLIS : 0
NETWORK_MILLIS : 0
FILTER_MILLIS : 0
LOAD_MILLIS : 0
EXTRACT_MILLIS : 0
TRANSFORM_EXTRACT_MILLIS: 0
TRANSFORM_LOAD_MILLIS : 0
TOTAL_EXTRACT_MILLIS : 0
TOTAL_LOAD_MILLIS : 0
SQL_STATE : null
SQL_CODE : 0
SQL_MESSAGE : null
FAILED_DATA_ID : 0
FAILED_LINE_NUMBER : 0
LAST_UPDATE_HOSTNAME : S08R2
LAST_UPDATE_TIME : 2017-06-28 17:28:29.779
CREATE_TIME : 2017-06-28 17:28:29.779
CREATE_BY : null
SUMMARY : sym_node_security
(1 row, 0 ms)
# Verify Incoming Batches (on destination)
bin\dbsql --engine store-001
list
select * from sym_incoming_batch order by batch_id desc limit 1;
output
BATCH_ID : 535 NODE_ID : 000 CHANNEL_ID : heartbeat STATUS : OK ERROR_FLAG : 0 NETWORK_MILLIS : 0 FILTER_MILLIS : 0 DATABASE_MILLIS : 16 FAILED_ROW_NUMBER : 0 FAILED_LINE_NUMBER : 0 BYTE_COUNT : 766 STATEMENT_COUNT : 1 FALLBACK_INSERT_COUNT: 0 FALLBACK_UPDATE_COUNT: 0 IGNORE_COUNT : 0 IGNORE_ROW_COUNT : 0 MISSING_DELETE_COUNT : 0 SKIP_COUNT : 0 SQL_STATE : null SQL_CODE : 0 SQL_MESSAGE : null LAST_UPDATE_HOSTNAME : S08R2 LAST_UPDATE_TIME : 2017-06-28 17:41:25.255 CREATE_TIME : 2017-06-28 17:41:25.239 SUMMARY : sym_node_host (1 row, 15 ms)
cli - dbsql
a sql shell for database
-e,--engine <arg>
The name of a configured engine.
The name should correspond to a engine.name setting in one of the properties files in the engines directory.
log
--no-log-console No output will be sent to the console.
--no-log-file No output will be sent to the log file. (logs/symmetric.log)
-p,--properties <arg>
The properties file with settings for the SymmetricDS engine.
If not provided, defaults are used, then overridden with first symmetric.properties in classpath,
then overridden with symmetric.properties values in user.home directory.
--sql <arg>
Run this sql statement in the shell
cli - symadmin
list-engines
symadmin list-engines
Log output will be written to C:\symmetric-server-3.8.26/logs/symmetric.log Engines directory is C:\symmetric-server-3.8.26\engines The following engines and properties files are available: corp-000 -> C:\symmetric-server-3.8.26\engines\corp-000.properties store-001 -> C:\symmetric-server-3.8.26\engines\store-001.properties store-002 -> C:\symmetric-server-3.8.26\engines\store-002.properties 0 engines returned
open-registration
Usage
symadmin --engine <engine name> open-registration <node group> <external id>
ie.
symadmin --engine source-000 open-registration dest 001
A -> B
過程
1. create source properties file
2. create properties source file
3. test properties file (dbsql -e source-000 / dbsql -e dest-000)
4. create "sym_" tables (symadmin --engine source-000 create-sym-tables)
5. create Group, Group Links, Router, Trigger, Trigger Router record
6. start engines (bin\sym)
7. open registration (symadmin --engine source-000 open-registration DST 001)
8. Initial Load (symadmin --engine source-000 reload-node 001)
[1] Connect Source Database
# This is an arbitrary name that is used to access a specific engine using an HTTP URL # The engine name is also used for the domain name of registered JMX beans. engine.name=centralNode # The node group that this node is a member of. group.id=corp # The external id for this node has meaning to the user and provides integration into the system where it is deployed. # For example, it might be a retail store number or a region number. # each node has a unique sequence number for tracking synchronization events. external.id=000 # contacted for synchronization. # At startup and during each heartbeat, the node updates its entry in the database with this URL. sync.url=http://localhost:31415/sync/centralNode # where this node can connect for registration to receive its configuration. # leave this blank for the source node. # typically equal to the value of the sync.url of the registration server. registration.url= db.driver={JDBC driver class} db.url={JDBC url for your source database} db.user=root db.password=xxxxxx
[3] Connect Target Database
engine.name={an arbitrary name for the node} group.id={this must match the name of a group inserted in step 3, ex: store} external.id={id for the node, ex: 001 } sync.url=http://{hostname}:{port}/{webcontext}/sync/{engine.name} registration.url={this should match the sync.url in your source property file} db.driver={JDBC driver class} db.url={JDBC url for your target database} db.user={JDBC username} db.password={JDBC password}
Test properties File by show tables
# Source Side (postgresql)
# Output: schemaname|tablename|tableowner|tablespace|has indexes|hasrules|hastriggers|rowsecurity
dbsql -e source-000
SELECT * FROM pg_catalog.pg_tables;
# Dest. Side (mssql)
# Output: TABLE_CATALOG | TABLE_SCHEMA | TABLE_NAME
dbsql -e dest-001
SELECT * FROM information_schema.TABLES;
[2] Configure
* minimum configuration
[1] Group
[2] Group Links <-- complete registration
[3] Channels
[4] Router
[5] Trigger
[6] Trigger Router
Groups
insert into SYM_NODE_GROUP (node_group_id, description) values ('SRC', 'A corporate node'); insert into SYM_NODE_GROUP (node_group_id, description) values ('DST', 'A retail store node');
Group Links (PUT/GET)
Group links define at a high level how data moves throughout your synchronization scenario.
insert into SYM_NODE_GROUP_LINK (source_node_group_id, target_node_group_id, data_event_action) values ('SRC', 'DST', 'W'); # 雙向連接要加以下 setting # insert into SYM_NODE_GROUP_LINK # (source_node_group_id, target_node_group_id, data_event_action) # values # ('DST', 'SRC', 'P');
Wait for Pull [W]
Indicates nodes in the source node group will wait for a node in the target node group to connect via an HTTP GET
Push [P]
Indicates that nodes in the source node group will initiate communication over an HTTP PUT
Route-only [R]
data isn’t exchanged between nodes in the source and nodes in the target node groups
# 當 link 不是相向時, 要注意 P, W
2017-07-06 11:00:50,932 WARN [source-000] [PushService] [source-000-push-default-2] Cannot push to node '001' in the group 'dest'. The sync url is blank
Router
It define more specifically which captured data (Router Type) from a source node
should be sent to which specific nodes in a target node group, all within the context of the node group link.
# defines a router that will send data from the 'corp' group to the 'store' group
insert into SYM_ROUTER (router_id, source_node_group_id, target_node_group_id, create_time, last_update_time) values ('myRouter','SRC', 'DST', current_timestamp, current_timestamp);
Router Type
- default # sends all captured data to all nodes that belong to the target node group
- java # A router that executes a Java expression in order to select nodes to route to.
- bsh # A router that executes a Bean Shell script expression in order to select nodes to route to
- ....
Other Setting
- Sync on Update
- Sync on Insert
- Sync on Delete
- Target Table (target table name is different than the source)
Channels
define logical groupings(transaction data grouped into a transaction channel)
The default channel is called 'default'. (automatically created that all tables will fall into)
* Transactions will NOT be preserved across channels
so its important to setup channels to contain all tables that participate in a given transaction.
Trigger
SymmetricDS captures synchronization data using database triggers.
Each record is used by SymmetricDS when generating database triggers.
Database triggers are only generated when a trigger is associated with a ROUTER whose source_node_group_id matches the node group id of the current node.
insert into SYM_TRIGGER (trigger_id, source_table_name, channel_id, last_update_time, create_time) values ('mytable1', 'mytable1', 'default', current_timestamp, current_timestamp);
Trigger Router
maps the router('myRouter')to the trigger('mytable1')
insert into SYM_TRIGGER_ROUTER (trigger_id, router_id, initial_load_order, create_time, last_update_time) values ('mytable1', 'myRouter', 1, current_timestamp, current_timestamp);
One way replication
Table triggers (entries in sym_trigger) specify which tables need to be synchronized.
Trigger router entries match a trigger to a router, specifying where (in which direction) the data for those tables should be routed or sent.
So, in the sample, you have a trigger entry for a table 'item' which is connected to a router corp_2_store by a trigger/router entry.
That specifies that the item table will be sent (one way) from the corp node group to the store node group.
If you added a second trigger_router entry tying the item table trigger to the 'store_2_corp' router, the the item table would be sent bi-drectionaly from corp to store and store to corp.
Connection Setting
SymmetricDS will search the following locations for database properties:
1. conf/symmetric.properties
2. CLASSPATH:symmetric-override.properties
3. engines/*.properties
* If the properties are changed in conf/symmetric.properties they will take effect across all engines deployed to the server.
Connection setting
db.driver # The class name for the JDBC driver
db.url
db.user
db.password
driver & url
mySQL (com.mysql.jdbc.Driver)
jdbc:mysql://localhost/sampleroot?tinyInt1isBit=false&zeroDateTimeBehavior=convertToNull
PostgreSQL (org.postgresql.Driver)
jdbc:postgresql://localhost/sampleroot
mssql (net.sourceforge.jtds.jdbc.Driver)
jdbc:jtds:sqlserver://[sql-server:port]/[Database-Name];useCursors=true;bufferMaxMemory=10240;lobBuffer=5242880
H2 (org.h2.Driver)
jdbc:h2:sampleroot;AUTO_SERVER=TRUE;LOCK_TIMEOUT=60000
Other properties file setting
# This is how often the routing job will be run in milliseconds (Default: 10000) job.routing.period.time.ms=5000 # This is how often the push job will be run. (Default: 60000) job.push.period.time.ms=10000 # This is how often the pull job will be run. (Default: 60000) job.pull.period.time.ms=10000
By default, an initial load will not create the table on the target if it doesn’t already exist.
# Kick off initial load (create the table on the target if it is not present) initial.load.create.first=true # delete data on the target node initial.load.delete.first=true
* The initial.load.delete.first needs enabled for the source node that sends the initial load.
It's used when the initial load first starts to create batches.
You'll see batches that contain rows in sym_data with an event_type of S that contains the delete statement.
Node
A node is represented by four tables:
- NODE (basic node information)
- NODE_IDENTITY (identifies the current node)
- NODE_SECURITY (Contains a password need to authenticate with another node)
- NODE_HOST (informational data about the node(IP,OS,Arch,CPU,Java Version ...))
Stopping a Node
Installed nodes are started automatically when the SymmetricDS server is started. An individual node instance can be stopped while other nodes continue to run.
bin/jmx --bean org.jumpmind.symmetric.<engine name>:name=Node --method stop
Uninstalling a Node
Uninstalling a node will remove all SymmetricDS database artifacts and delete the engine’s property file.
bin/symadmin --engine <engine name> uninstall
Registration
When a node is registered, it downloads its SymmetricDS configuration as well as references to the nodes that it should sync with.
A node is considered unregistered if it does not have an NODE_IDENTITY row.
open registration
# A node is considered unregistered if it does not have an "NODE_IDENTITY" row.
# Nodes are only allowed to register if rows exist for the registering node and the registration_enabled(NODE_SECURITY) flag is set to 1.
# Node registration is stored in the NODE and NODE_SECURITY tables.
symadmin --engine <engine name> <node group> <external id>
node group link
Cannot register a client node unless a node group link exists so the registering node can receive configuration updates.
Please add a group link where the source group id is source and the target group id is dest
insert into SYM_NODE_GROUP_LINK \ (source_node_group_id, target_node_group_id, data_event_action) \ values ('source', 'dest', 'P');
Tables
sym_node
EXTERNAL_ID
A domain-specific identifier for context within the local system. For example, the retail store number.
SYNC_ENABLED
Indicates whether this node should be sent synchronization.
Disabled nodes are ignored by the triggers, so no entries are made in data_event for the node.
CREATED_AT_NODE_ID
The node_id of the node where this node was created.
This is typically filled automatically with the node_id found in node_identity where registration was opened for the node.
DEPLOYMENT_TYPE
An indicator as to the type of SymmetricDS software that is running.
Possible values are, but not limited to: engine, standalone, war, professional, mobile
sym_node_security
NODE_PASSWORD
The password used by the node to prove its identity during synchronization.
The master node only needs this row if it is going to initiate communicate with another node.
REGISTRATION_ENABLED
Indicates whether registration is open for this node. Re-registration may be forced for a node if this is set back to '1' in a parent database for the node_id that should be re-registred.
INITIAL_LOAD_ENABLED (Default: 0)
Indicates whether an initial load will be sent to this node.
sym_channel
Processing Order
processed in ascending order
Batch Algorithm
Batching is the grouping of data, by channel, to be transferred and committed at the client together.
Default
All changes that happen in a transaction are guaranteed to be batched together. Multiple transactions will be batched and committed together until there is no more data to be sent or the max_batch_size is reached.
Group Link Direction
override the default group link communication.("push" / "pull" )
Enabled
Indicates whether the channel is enabled or disabled. If a channel is disabled, data is still captured for changes that occur on the source system, but it will not be routed and sent to the target until the channel is re-enabled.
Jobs
Most work done by SymmetricDS is initiated by jobs.
Jobs are tasks that are started and scheduled by a job manager.
job.<name>.period.time.ms
TRIGGER_HIST
SymmetricDS installs database triggers to capture changes in the DATA table.
A record of the triggers that were installed and
what columns are being captured is stored in the TRIGGER_HIST table.
reason a trigger was rebuilt
N New trigger that has not been created before
S Schema changes in the table were detected
C Configuration changes in Trigger
T Trigger was missing
Disabling Synchronization
# 0: Disable the extraction of all channels
# with the exception of the config channel
dataextractor.enable=1
Offline Synchronization
Configuring a node as offline will still allow changes to be captured and batched for replication.
# offline push and pull jobs
INSERT INTO sym_parameter
(external_id, node_group_id, param_key, param_value, create_time, last_update_by, last_update_time) VALUES
('ALL', 'ALL', 'start.offline.pull.job', 'true', current_timestamp, 'userid', current_timestamp);
INSERT INTO sym_parameter
(external_id, node_group_id, param_key, param_value, create_time, last_update_by, last_update_time) VALUES
('ALL', 'ALL', 'start.offline.push.job', 'true', current_timestamp, 'userid', current_timestamp);
# node that should be offline
INSERT INTO sym_parameter
(external_id, node_group_id, param_key, param_value, create_time, last_update_by, last_update_time) VALUES
('001', 'STORE', 'node.offline', 'true', current_timestamp, 'userid', current_timestamp);
outgoing batches intended for the offline node are now written as files(csv) <= in outgoing folder
Other nodes are unaffected and will continue to synchronize normally
Instead the local file system will be used for replication.
It is up to the user transport batch (*.csv) files to and from the node based on incoming or outgoing changes.
outgoing folder -> incoming folder
Conflict Strategy
source_wins
target_wins
manual
conflict table
Defines how conflicts in row data should be handled during the load process.
Outgoing batches
Outgoing batches are delivered to the target node when the source node pushes or when the target node pulls.
# show outgoing synchronization failures by node:
select count(*), node_id from sym_outgoing_batch where error_flag=1 group by node_id;
# how the number of data rows that have not been delivered to target nodes:
select sum(data_event_count), node_id from sym_outgoing_batch where status != 'OK' group by node_id;
# To locate batches in error
select * from sym_outgoing_batch where error_flag=1;
# Causing SymmetricDS to skip the batch completely.
update sym_outgoing_batch set status='OK' where batch_id='XXXXXX'
Startup / RuntimeParameters
* Runtime parameters are read periodically from properties files or the database.
Startup Parameters
auto.config.database (Default: true)
If this is true, when symmetric starts up it will try to create the necessary tables.
auto.sync.config.at.startup
If this is true, then check if configuration should be pulled from registration server at startup.
If the config version in the database does not match the software version, it will pull config.
auto.sync.configuration
Capture and send SymmetricDS configuration changes to client nodes.
engine.name
external.id
registration.url
sync.url
group.id
db.driver
db.url
db.user
db.password
sync.table.prefix (Default: sym)
Runtime parameters
initial.load.create.first (Default: false)
job.purge.period.time.ms
# Don't muddy the waters with purge logging
job.purge.period.time.ms=7200000
The Purge Outgoing Job is responsible to purging outgoing data that has successfully been loaded at the target and is older than purge.retention.minutes.
This job purges the following tables:
- DATA
- DATA_EVENT
- OUTGOING_BATCH
- EXTRACT_REQUEST
Cheat List
select * from sym_router;
select * from sym_trigger;
select * from sym_trigger_router;
update sym_node_security set initial_load_enabled = 1 where node_id = '001';
select sum(data_event_count), node_id from sym_outgoing_batch where status != 'OK' group by node_id;
Doc
https://www.symmetricds.org/doc/3.8/html/tutorials.html
https://www.symmetricds.org/doc/3.8/html/user-guide.html
https://www.symmetricds.org/docs/how-to/connect-to-database
Other