2

Often I see databases where primary keys are of uniqueidentifier type.

When joining such tables together, did this data type impact on performance of query?

Why prefer uniqueidentifier over integer?

How even SQL represent uniqueidentifier in JOIN clause, is it like varchar with specific length or has some other representation?

Often I am confused with this data type, why use and how is it hard for performance...

Thanks!

veljasije
  • 5,950
  • 9
  • 43
  • 75
  • 2
    Take a look at this post :[INT vs Unique-Identifier for ID field in database](http://stackoverflow.com/questions/1151625/int-vs-unique-identifier-for-id-field-in-database) – Maryam Arshi Sep 19 '13 at 12:20

2 Answers2

3

A quite lengthy (but still worth to read) description from MSDN:

Is a 16-byte GUID.

It's indented to be globally unique, i. e. you may be pretty sure that, unlike integer, it's unique across databases (or across whatever else in the universe), not just across your table's records.

Integers are 4 bytes, so joins on GUID could be marginally slower. However, in real world scenarios, you'll have much more data than this, and, hence, it's usually not something you'd worry about.

Indexes could be created on uniqueidentifier as well.

Quassnoi
  • 381,935
  • 83
  • 584
  • 593
1

Having a unique identifier as a join condition should perform almsot as quickly as an integer assuming that the fields that you are joining on are indexed appropriately in both tables involved in the join. An integer will be generally quicker but in most real world scenarios this wont make any difference.

There is however another issue with using unique identifiers as a primary key in a table in that it creates page fragmentation as the underlying data pages have data inserted in a random order and a primary key over a clustered index (sql server default) will want to store them in lexicographical order which it will find difficult because new indexes can be created with a value anywhere in the data pages as opposed to an int which will usually be sequential and increasing.

My advice would be to use a surrogate primary key as an integer (big int if necessary) and have the unique identifier as another column in the table with an appropriate covering index.

Richard Newman
  • 590
  • 2
  • 6
  • 17
  • 2
    1. Join by GUIDs is slower 2. GUID column can be a primary key. But this key shouldn't be clustered in most cases. – Roman Sergeev Sep 19 '13 at 12:31
  • 1
    I agree with @RomanSergeev on both of his points, however if I have the need for a unique identifier I still tend towards the practice suggested here of a surrogate key, and then a unique identifier field with a unique non clustered index on. This will at initially take up more space, but assuming you have 2 tables that refer to this main table, each with the foreign key column indexed then you will save the additional space in the foreign keys and their indices. – GarethD Sep 19 '13 at 12:39