10

Is there any reason why I should not use an Integer as primary key for my tables?

Database is SQL-CE, two main tables of approx 50,000 entries per year, and a few minor tables. Only two connections will exist constantly open to the database. But updates will be triggered through multiple TCP socket connections, so it will be many cross threads that access and use the same database-connection. Although activity is very low, so simultanous updates are quite unlikely, but may occur maybe a couple of times per day max.

Will probably use LINQ2SQL for DAL, or typed datasets.

Not sure if this info is relevant, but that's why I'm asking, since I don't know :)

eriksv88
  • 3,260
  • 3
  • 25
  • 49
bretddog
  • 5,103
  • 8
  • 55
  • 108
  • You might also find some helpful input from these popular questions here: [How do you like your primary keys?](http://stackoverflow.com/questions/404040/how-do-you-like-your-primary-keys) [Is it a bad idea to use GUIDs as primary keys in MS SQL?](http://stackoverflow.com/questions/537145/is-it-a-bad-idea-to-use-guids-as-primary-keys-in-ms-sql) – DOK Dec 29 '10 at 20:14

5 Answers5

10

You should use an integer - it is smaller, meaning less memory, less IO (disk and network), less work to join on.

The database should handle the concurrency issues, regardless of the type of PK.

Oded
  • 463,167
  • 92
  • 837
  • 979
9

The advantage of using GUID primkey is that it should be unique in the world, such as whether to move data from one database to another. So you know that the row is unique.

But if we are talking about a small db, so I prefer integer.

Edit:

If you using SQL Server 2005++, can you also use NEWSEQUENTIALID(), this generates a GUID based on the row above.Allows the index problem with newid() is not there anymore.

eriksv88
  • 3,260
  • 3
  • 25
  • 49
  • It doesn;t completely solve the indexing problem just one part of using a clustered index. It will still create an index that will have performance problems (and muich bigger storage size for all indexes as they use the PK) compared to an int since it is so much wider, it is best to only use a GUID if you know you will need to move data to another database in the future or if using replication. – HLGEM Dec 29 '10 at 22:40
  • there are 16 bytes vs. 4 bytes then ... 1 million records will be slightly over 11MB – eriksv88 Dec 29 '10 at 23:11
  • Thanks! wasn't aware of the purpose of GUID being "that" unique, but that makes sense, and hence obviously nothing for me to worry about. It will not move out of my computer :) – bretddog Dec 30 '10 at 03:53
  • 4
    -1 for suggesting GUID against all evidence otherwise. What about non-clustered indexes that refer to the clustered index? – gbn Dec 30 '10 at 06:20
  • He does advice integer in 2nd paragraph. I accepted this answer as understanding the purpose of GUID makes me see why I don't need it. But all answers here were helpful and to the point. Thanks! – bretddog Dec 30 '10 at 19:11
  • It's not necessarily unique, just likely to be unique – Anthony Johnston May 15 '11 at 23:15
  • @gbn: I know you commonly advocate against using guids. However, we use them extensively in databases containing 100 million+ records. Guess what.. the db is extremely fast. The "evidence" I've seen against it just doesn't pan out while the flexibility GUIDs have provided has paid dividends in making my life easier. – NotMe Aug 22 '13 at 00:43
  • @ChrisLively: I do use GUIDs on occasion. Just rarely as the clustered index, and never at very high write volumes (millions of rows per minute). The overhead of page splits and increased width *generally* mean "don't use GUIDs". I'm not the only advocate either: see the other answers and comments... – gbn Aug 22 '13 at 09:41
5

Is there any reason why I should not use an Integer as primary key for my tables?

Nope, as long as each one is unique, integers are fine. Guids sounds like a good idea at first, but in reality they are much too large. Most of the time, it's using a sledgehammer to kill a fly, and the size of the Guid makes it much slower than using an integer.

kemiller2002
  • 107,653
  • 27
  • 187
  • 244
5

I see no reason not to use an auto-increment integer in this scenario. If you ever get to the point where an integer can't handle the volume of data then you're talking about an application scaled up to the point that a lot more work is involved anyway.

Keep in mind a few things:

  1. An integer is the native word size for the hardware. It's about as fast and simple and easy on the computer as a data type gets.
  2. When considering possibly using a GUID, know that they make for terrible primary keys. Relational databases in general (I can't speak for all, but MS SQL is a good example) don't index GUIDs well. There are hacks out there to try to make more index-friendly GUIDs, take them or leave them. But in general a GUID should be avoided as a PK for performance reasons.
David
  • 176,566
  • 33
  • 178
  • 245
  • 2
    It's funny, because I see Microsoft using GUID's for primary keys all the time. For instance, Microsoft CRM uses GUID's for it's primary keys. One advantage of GUIDs is that you don't have to worry about sequencing, and it's impossible to ever get a key violation (unless you're reusing a previous key somehow). – Erik Funkenbusch Dec 29 '10 at 20:16
  • 1
    @Mystere Man: Well, it's _unlikely_ to ever get a key violation :) But your point remains, and it's something I've never understood about some of Microsoft's tools. Their membership/role provider stuff for ASP .NET is a common example. Something like that may work fine for small projects where it's not a concern, but it won't scale. – David Dec 29 '10 at 20:19
4

Definitely use an integer, you do not want to use a GUID in a clustered index (PK) as it will cause the table to unnecessarily fragment.

Josh Weatherly
  • 1,628
  • 1
  • 13
  • 27