Thoughts on implementing content replication in vanilla Apache Sling

CQ5’s replication mechanism is a very useful tool for pushing content from one CQ5 instance to another. However, since it doesn’t exist in vanilla Apache Sling, I’ve been thinking about how I would build it outside of CQ5. Here’s a braindump of some requirements:

1. It needs to be configurable, so that any Sling instance can push to any number of other Sling instances
2. It needs to be asynchronous
3. It needs to be resilient to servers or services not responding, and carry on once they come back up
4. Preferably it is loosely coupled, so that the mechanism can be extended to push to other servers/systems
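
To make requirements 2 and 3 a bit more concrete, here is a minimal plain-Java sketch of a retry queue. It is only an illustration: Target and RetryQueue are hypothetical names I made up, nothing here is a Sling or CQ5 API, and in a real Sling setup the job handling would most likely take care of queuing and retries for you. The class is not thread-safe; I'm assuming drain() runs on a single background timer thread.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical delivery target: returns normally on success, throws if the target is down. */
interface Target {
    void deliver(String path) throws Exception;
}

/** Keeps undelivered paths queued so that a later pass can retry them. */
final class RetryQueue {
    /** A queued path plus the number of failed delivery attempts so far. */
    private static final class Pending {
        final String path;
        int attempts;

        Pending(String path) {
            this.path = path;
        }
    }

    private final Deque<Pending> queue = new ArrayDeque<>();
    private final Target target;
    private final int maxAttempts;

    RetryQueue(Target target, int maxAttempts) {
        this.target = target;
        this.maxAttempts = maxAttempts;
    }

    /** Called by the publisher; returns immediately without contacting the target. */
    void enqueue(String path) {
        queue.addLast(new Pending(path));
    }

    /** One delivery pass over everything currently queued; returns how many items got through. */
    int drain() {
        int delivered = 0;
        int size = queue.size();
        for (int i = 0; i < size; i++) {
            Pending p = queue.pollFirst();
            try {
                target.deliver(p.path);
                delivered++;
            } catch (Exception e) {
                p.attempts++;
                if (p.attempts < maxAttempts) {
                    queue.addLast(p); // keep it for the next pass
                }
                // otherwise the item is dropped; a real system would record the failure
            }
        }
        return delivered;
    }

    /** Number of items still waiting for a successful delivery. */
    int pending() {
        return queue.size();
    }
}
```

The publisher only ever calls enqueue(), so it never blocks on a slow or dead target. Items that fail stay queued until the target comes back or maxAttempts is used up.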

Some approaches to do this:
1. A call to publish or unpublish something would use Sling eventing and the job processor: raise a replication event, then let the job processor find the appropriate replicator(s) to handle the data.
2. Use an OSGi configuration factory to store the configuration for each replicator:
   a. A name
   b. The replicator implementation to use (let’s call the interface Replicator)
   c. The set of JCR parent paths this replicator should read and publish (in CQ this wasn’t done; an agent id was specified instead)
   d. Other properties as appropriate, such as the retry period and the number of retries. There is plenty of room to add more here.
3. Each replicator service will have its own config for:
   a. The protocol, server, port and user information for pushing to the target system. This would be implementation specific. For example, an HTTP/REST-based replicator could take a URL, user id, params, etc. Alternatively, a replicator could push to a JMS queue, which in turn could feed another system such as a database or another Sling instance. The replicator service would implement a simple interface and handle all of this internally.
4. The job processor that handles the replicate/publish event would cycle through the available configurations and find the replicators that should deal with the payload. For simplicity’s sake, just check whether the payload path starts with the parent path.
5. The job processor can raise further events denoting the success or failure of the replicate action. Any other event handler, such as an auditor, can then pick these up.
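
The matching and dispatch steps above could be sketched roughly like this in plain Java. Treat it as a sketch under assumptions rather than a design: Replicator is the interface name proposed above, but its method signature, ReplicationAction, ReplicatorConfig and ReplicationDispatcher are all names I invented for illustration, and none of this uses the real Sling or OSGi APIs. The outcome strings stand in for the success/failure events. One small deviation from a bare startsWith: the match is done on whole path segments, so a parent path of /content/site does not accidentally cover /content/sitemap.

```java
import java.util.ArrayList;
import java.util.List;

/** Whether content is being pushed to or removed from the target. */
enum ReplicationAction { PUBLISH, UNPUBLISH }

/** Hypothetical replicator interface; implementations hide protocol, server, credentials, etc. */
interface Replicator {
    void replicate(String path, ReplicationAction action) throws Exception;
}

/** One configuration entry: a name, a replicator and the JCR parent paths it covers. */
final class ReplicatorConfig {
    final String name;
    final Replicator replicator;
    final List<String> parentPaths;

    ReplicatorConfig(String name, Replicator replicator, List<String> parentPaths) {
        this.name = name;
        this.replicator = replicator;
        this.parentPaths = new ArrayList<>(parentPaths);
    }

    /** True if the path equals one of the parent paths or lies below it. */
    boolean covers(String path) {
        for (String parent : parentPaths) {
            String prefix = parent.endsWith("/") ? parent : parent + "/";
            if (path.equals(parent) || path.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}

/** Stand-in for the job processor: finds the matching replicators and runs them. */
final class ReplicationDispatcher {
    private final List<ReplicatorConfig> configs;

    ReplicationDispatcher(List<ReplicatorConfig> configs) {
        this.configs = new ArrayList<>(configs);
    }

    /** Returns one "name:SUCCESS" or "name:FAILURE" entry per matching config. */
    List<String> dispatch(String path, ReplicationAction action) {
        List<String> outcomes = new ArrayList<>();
        for (ReplicatorConfig config : configs) {
            if (!config.covers(path)) {
                continue; // this replicator is not responsible for the payload
            }
            try {
                config.replicator.replicate(path, action);
                outcomes.add(config.name + ":SUCCESS");
            } catch (Exception e) {
                // one failing target must not stop the others
                outcomes.add(config.name + ":FAILURE");
            }
        }
        return outcomes;
    }
}
```

In a real implementation the configs would come from the OSGi configuration factory and the outcomes would be raised as events, so that auditors and other handlers can react to them without the dispatcher knowing they exist.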

Any takers for a sling contrib project?

– Sarwar Bhuiyan
