Having seen apps like Google Docs and libraries like ShareJS and EtherPad Lite, I am pretty excited about real-time collaboration, which seems to be implemented using a very complex technique known as Operational Transformation (OT).

My question is perhaps somewhat odd: why is OT necessary?

What I mean is, we have very low latency on the web in most settings - with tools like Google Docs, ShareJS and EtherPad, changes are almost instantly reflected on connected clients.

Why the incredibly complex solution of resolving conflicts and keeping things synchronized on the server-side?

Being familiar with the command pattern and undo/redo, it seems to me that a much simpler solution would be to implement every change to a document as a command with an equivalent undo command.

Let clients submit serialized commands when they make changes. Assign a serial number on the server-side to every received command. Distribute all commands applied to a document back to the clients, which also maintain a history of commands.

Each connected client receives back from the server all the commands applied to the document, now with serial numbers indicating the "correct" order, i.e. the order in which the commands were received by the server and applied to the master document held by the server.

If a client was at command number 100 and submits a new command that comes back from the server as number 102, the client knows that it missed a command. It then applies the undo command for the command it just submitted, applies command number 101, and then re-applies its own command as number 102, thus putting things back in order.

If it's behind by several commands, it simply rolls back as far as needed, then applies all missed commands, etc.
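
To make the idea concrete, here is a minimal sketch of the scheme in Python. Everything in it is hypothetical and only for illustration: `Insert`, `Server`, and `Client` are made-up names, the "network" is a direct method call, and the only command is a string insert whose undo removes the same characters again.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    """Hypothetical command: insert `text` at index `pos` of a string."""
    pos: int
    text: str

    def apply(self, doc: str) -> str:
        return doc[:self.pos] + self.text + doc[self.pos:]

    def undo(self, doc: str) -> str:
        # Remove exactly the characters that apply() inserted.
        return doc[:self.pos] + doc[self.pos + len(self.text):]


class Server:
    def __init__(self, doc: str = ""):
        self.doc = doc
        self.log = []  # log[i] holds the command with serial number i + 1

    def submit(self, cmd: Insert) -> int:
        # Apply commands in arrival order and hand out serial numbers.
        self.doc = cmd.apply(self.doc)
        self.log.append(cmd)
        return len(self.log)


class Client:
    def __init__(self, doc: str = ""):
        self.doc = doc
        self.confirmed = 0  # serial number of the last server-ordered command applied
        self.pending = []   # own commands applied locally, not yet replayed in server order

    def local_edit(self, cmd: Insert, server: Server) -> None:
        # Apply immediately for responsiveness, then submit to the server.
        self.doc = cmd.apply(self.doc)
        self.pending.append(cmd)
        server.submit(cmd)

    def sync(self, server: Server) -> None:
        # Roll back own unconfirmed commands, newest first...
        for cmd in reversed(self.pending):
            self.doc = cmd.undo(self.doc)
        self.pending.clear()
        # ...then replay everything missed, in the server's order.
        for cmd in server.log[self.confirmed:]:
            self.doc = cmd.apply(self.doc)
        self.confirmed = len(server.log)


server = Server("abc")
a, b = Client("abc"), Client("abc")
a.local_edit(Insert(0, "X"), server)  # a sees "Xabc"
b.local_edit(Insert(3, "Y"), server)  # b sees "abcY"
a.sync(server)
b.sync(server)
print(server.doc, a.doc, b.doc)  # XabYc XabYc XabYc
```

In this run all three copies end up identical, so the rollback-and-replay part does what I described. Note, though, that `b` typed "Y" at the end of "abc", and after replay it sits before the "c", because index 3 refers to a different place once "X" has been inserted ahead of it.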

That sounds much simpler to me.

In what way is Operational Transformation better than that?

mindplay.dk
  • Okay, I think I understand now - what I described above only works if all the operations are being made on a matrix. In the case of a list, such as a string, inserting and deleting means that the data after the insert/delete point moves, which means that a subsequent insert/delete operation at a point after the first insert/delete point would need to be adjusted to compensate for the previous change. Is that the only difference between OT and the command-pattern approach described above? – mindplay.dk Mar 22 '13 at 23:37
  • Indeed, the key to OT is the ability to "transform" the commands, so that they can be applied even when the document they are being applied to has changed. The goal is to allow everyone to keep performing operations happily without any notion of locking or concurrency control, and the system should cause all clients to end up with the same final state. – fabspro Feb 07 '14 at 05:09
  • Have a read of this https://neil.fraser.name/writing/sync/ – fabspro Feb 07 '14 at 05:10
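
The index adjustment discussed in the comments above can be sketched as a small transform function. This is only an illustration under simplifying assumptions, not how ShareJS or EtherPad implement it: it handles inserts only (no deletes), and `Insert`, `transform`, and the `other_first_on_tie` flag are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    """Hypothetical operation: insert `text` at index `pos` of a string."""
    pos: int
    text: str

    def apply(self, doc: str) -> str:
        return doc[:self.pos] + self.text + doc[self.pos:]


def transform(op: Insert, other: Insert, other_first_on_tie: bool) -> Insert:
    """Rewrite `op` so it can be applied after the concurrent `other`.

    Both operations were created against the same document state.
    `other_first_on_tie` is an assumed tie-break: when both insert at the
    same index, it decides whose text ends up on the left.
    """
    if other.pos < op.pos or (other.pos == op.pos and other_first_on_tie):
        # `other` added characters before op's position, so shift op right.
        return Insert(op.pos + len(other.text), op.text)
    return op


doc = "abc"
a = Insert(0, "X")  # one client types at the start
b = Insert(3, "Y")  # another client types at the end, concurrently

# Site 1 applies a first, then b transformed against a.
site1 = transform(b, a, other_first_on_tie=True).apply(a.apply(doc))
# Site 2 applies b first, then a transformed against b.
site2 = transform(a, b, other_first_on_tie=False).apply(b.apply(doc))

print(site1, site2)  # XabcY XabcY
```

Both orders give the same string, and "Y" stays at the end where it was typed. Without the transform, applying `b` unchanged after `a` would give "XabYc".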
