client side GUID generation without Table fragmentation cost

Question

I want to use client side generation GUID without the fragmentation cost it has in the DB side (I don't want to consider generating the primary key at the database side, as I want to leverage DDD and unit testing).

I am using SQL Server 2012 and Entity Framework 6.

For that, i have created a table in sql server with:

And the entity

public partial class Mac
{
    public Mac()
    {
        this.Id = Guid.NewGuid()
    }
    public System.Guid Id { get; set; }
    public string direccionMac { get; set; }
}

I understand this design will lead to table fragmentation. Therefore, as in here: https://stackoverflow.com/a/604842/310107 I added another column as clustered index named MyClusterKey (identity(1,1)).

So, Id, non clustered, and the primary key. MyClusterKey, clustered index, identity(1,1)

And the entity:

public partial class Mac
{
    public Mac()
    {
        this.Id = Guid.NewGuid();
    }

    public System.Guid Id { get; set; }
    public long MyClusterKey { get; set; }
    public string direccionMac { get; set; }
}

MyClusteredKey added as DatabaseGeneratedOption.Identity:

   public class MacMap : EntityTypeConfiguration<Mac>
    {
        public MacMap()
        {
            // Primary Key
            this.HasKey(t => t.Id);

            // Properties
            this.Property(t => t.MyClusterKey)
                .HasDatabaseGeneratedOption(DatabaseGeneratedOption.Identity);

            this.Property(t => t.direccionMac)
                .IsRequired()
                .IsFixedLength()
                .HasMaxLength(16);

            // Table & Column Mappings
            this.ToTable("Macs");
            this.Property(t => t.Id).HasColumnName("Id");
            this.Property(t => t.MyClusterKey).HasColumnName("MyClusterKey");
            this.Property(t => t.direccionMac).HasColumnName("direccionMac");
        }
    }

Another choice will be using option 1 (without the other clustered index), but with sequencial GuidComb as in https://github.com/nhibernate/nhibernate-core/blob/master/src/NHibernate/Id/GuidCombGenerator.cs

This will generate Guids in some order:

Questions:

Is option 1 correctly implemented?
What is the preferred option: 1, 2, 3 or neither and why?
Option 3 is generating guid sorted only within the last part. This won't generate table fragmentation (if using the id as clustered key)?
In option 2, the column MyClusterKey is used only to provide a clustered index and avoid table fragmentation ? Or it can have another use?

Is MyClusterKey used in the business layer ? It is only to prevent fragmentation and won't be used at all ? — nerlijma, Sep 05 '15 at 04:03
Just keep option 1. The very important thing to remember is *do not define guid PKs as clustered, but guid PKs are fine*. That's it. Check this on SO: http://stackoverflow.com/questions/1151625/int-vs-unique-identifier-for-id-field-in-database — Simon Mourier, Sep 05 '15 at 06:50
option 1 without the guid as clustered key is option 2, right? still didn't answer my question. Ís MyClusterKey used anywhere? — nerlijma, Sep 05 '15 at 13:47

diegogravisaco · Answer 1 · 2015-09-05T04:00:56.870

I'd go with option 2 as in option 3 you will understand the logic behind it but there is no point on the effort as there will be fragmentation anyway. Even so, performance degradation will be the same or even worst as more workprocess should be done in client side to organize it as in opt 2 as for sequencial logic does not apply to the pages in the scenario. Again, i'd rather go for second option but it also depends on how the table will be consumed or served and what you are trying to obtain out of it. Let me go deeper into this, if you are sharing a context on a threadlocal the sequential id should be the definitive approach as you will have control over the next key created (therefore opt3 will be your candidate) and you will avoid table fragmentation. However, in some scenarios, it eill be better NOT to control thread safety in EF and let the object context handle the way it goes to the db in any case. For that scenario, you will be doing two roundtrips to the db as equal to having just a regular identity key controlled to confirm what you are getting. But if you must control threading, opt 3 is the way to go although you will not get rid of table fragmentation (at least from some perspective from other sources) but you can reuse other variables (to achieve higher throughput behind a multiple context wcf). In any case, opt 2 is most suitable to several scenarios that will fulfill the balanced give or take of what you want to achieve of architecting your solution.

Is MyClusterKey used in the business layer ? It is only to prevent fragmentation and won't be used at all ? — nerlijma, Sep 05 '15 at 13:47
As long as you don't use the MyClusterKey (not even appearing in the code) the table fragmentation should not occur and you will be safe — diegogravisaco, Sep 05 '15 at 23:54

client side GUID generation without Table fragmentation cost

1 Answers1