Custom akka distributed data type: should I extend ReplicatedDataSerialization?

Question

According to the docs, it is recommended to implement efficient serialization with Protobuf or similar for a custom data type. However, I also see that the built-in data types (e.g., GCounter) extend ReplicatedDataSerialization (see code), which according to the scaladoc is a

Marker trait for ReplicatedData serialized by akka.cluster.ddata.protobuf.ReplicatedDataSerializer.

I wonder whether I should implement my own serializer or simply use the one from Akka. What's the benefit of implementing my own? Since my custom data type (see code or below) is very similar to a PNCounter, I feel the Akka one would work well for my case.

import akka.cluster.ddata.{GCounter, Key, ReplicatedData, ReplicatedDataSerialization, SelfUniqueAddress}

/**
  * Denotes a fraction whose numerator and denominator only ever grow.
  * Using one custom ddata structure instead of 2 separate GCounters gets the best of both worlds:
  * it is as lightweight as a GCounter, and both values can be updated/read together, as with a PNCounterMap.
  * Implementation-wise, it borrows a lot from PNCounter.
  */
case class FractionGCounter(
  private val numerator: GCounter = GCounter(),
  private val denominator: GCounter = GCounter()
) extends ReplicatedData
    with ReplicatedDataSerialization {
  type T = FractionGCounter

  def value: (BigInt, BigInt) = (numerator.value, denominator.value)
  def incrementNumerator(n: Int)(implicit node: SelfUniqueAddress): FractionGCounter = copy(numerator = numerator :+ n)
  def incrementDenominator(n: Int)(implicit node: SelfUniqueAddress): FractionGCounter =
    copy(denominator = denominator :+ n)

  override def merge(that: FractionGCounter): FractionGCounter =
    copy(numerator = this.numerator.merge(that.numerator), denominator = this.denominator.merge(that.denominator))
}

final case class FractionGCounterKey(_id: String) extends Key[FractionGCounter](_id) with ReplicatedDataSerialization

score 2 · Accepted Answer · answered Feb 11 '20 at 08:22

You could definitely use the built-in ReplicatedDataSerializer to serialize the GCounters that are at the core of your custom CRDT.

However, as you can see when looking at that class, it explicitly enumerates the types it can serialize, meaning it won't be able to serialize your FractionGCounter objects.

You'll still need your own serializer that understands FractionGCounter objects (and which may use the built-in ReplicatedDataSerializer 'inside').
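As a rough illustration of a serializer that delegates 'inside', here is a minimal sketch. It is not taken from the Akka sources and makes several assumptions: the two GCounter fields are made public so the serializer can read them, the ReplicatedDataSerialization marker is dropped, the identifier 424242 and the manifest "FGC" are made-up values, and a simple length-prefixed byte layout stands in for a Protobuf message. Each GCounter is written with whatever serializer Akka has bound to it, together with that serializer's id and manifest so it can be looked up again on the way back.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets.UTF_8

import akka.actor.ExtendedActorSystem
import akka.cluster.ddata.{GCounter, ReplicatedData}
import akka.serialization.{SerializationExtension, SerializerWithStringManifest, Serializers}

// Assumed variant of the question's type: public fields, no marker trait.
final case class FractionGCounter(
  numerator: GCounter = GCounter.empty,
  denominator: GCounter = GCounter.empty
) extends ReplicatedData {
  type T = FractionGCounter
  override def merge(that: FractionGCounter): FractionGCounter =
    copy(numerator.merge(that.numerator), denominator.merge(that.denominator))
}

class FractionGCounterSerializer(system: ExtendedActorSystem) extends SerializerWithStringManifest {
  // Lazy: the Serialization extension may still be starting when this class is created.
  private lazy val serialization = SerializationExtension(system)
  private val FractionManifest = "FGC" // hypothetical manifest string

  // Hypothetical id; it must be unique in your system (0 to 40 are reserved by Akka).
  override val identifier: Int = 424242

  override def manifest(o: AnyRef): String = o match {
    case _: FractionGCounter => FractionManifest
    case other => throw new IllegalArgumentException(s"Cannot serialize ${other.getClass.getName}")
  }

  override def toBinary(o: AnyRef): Array[Byte] = o match {
    case f: FractionGCounter => encode(f.numerator) ++ encode(f.denominator)
    case other => throw new IllegalArgumentException(s"Cannot serialize ${other.getClass.getName}")
  }

  override def fromBinary(bytes: Array[Byte], manifest: String): AnyRef = manifest match {
    case FractionManifest =>
      val buf = ByteBuffer.wrap(bytes)
      val numerator = decode(buf)   // read in the same order as written
      val denominator = decode(buf)
      FractionGCounter(numerator, denominator)
    case other => throw new java.io.NotSerializableException(s"Unknown manifest: $other")
  }

  // Layout per counter: serializer id, manifest length + bytes, payload length + bytes.
  private def encode(counter: GCounter): Array[Byte] = {
    val serializer = serialization.findSerializerFor(counter)
    val manifestBytes = Serializers.manifestFor(serializer, counter).getBytes(UTF_8)
    val payload = serializer.toBinary(counter)
    ByteBuffer
      .allocate(12 + manifestBytes.length + payload.length)
      .putInt(serializer.identifier)
      .putInt(manifestBytes.length)
      .put(manifestBytes)
      .putInt(payload.length)
      .put(payload)
      .array()
  }

  private def decode(buf: ByteBuffer): GCounter = {
    val serializerId = buf.getInt()
    val manifestBytes = new Array[Byte](buf.getInt())
    buf.get(manifestBytes)
    val payload = new Array[Byte](buf.getInt())
    buf.get(payload)
    serialization
      .deserialize(payload, serializerId, new String(manifestBytes, UTF_8))
      .get
      .asInstanceOf[GCounter]
  }
}
```

To take effect, the serializer would still have to be registered under akka.actor.serializers and bound to the FractionGCounter class under akka.actor.serialization-bindings in your configuration. For production use, a Protobuf message wrapping the two counters is closer to what the docs recommend than this hand-rolled layout.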

score 1 · Answer 2 · answered Feb 11 '20 at 14:07

In addition to what Arnout said, one important aspect of serialization is schema migration. The Akka-internal serializers are bound to the life cycle of their respective Akka modules, not to that of your own data type. Hence I would definitely write my own.

Heiko Seeberger

score 0 · Answer 3 · answered Feb 10 '20 at 22:05

AFAIU they don't suggest implementing your own serialization mechanism, but rather using one of the existing solutions, which simply aren't part of Akka because they aren't Akka-specific. They can easily be incorporated, though, and you can probably find third-party libraries that integrate them into Akka.

There is no simple answer to the question of which one is best, as it depends heavily on the specific use case. Here is a discussion of the performance of several of the more popular options:
Performance comparison of Thrift, Protocol Buffers, JSON, EJB, other?

You can start by using Akka's built-in serialization and replace it later with something more suitable.