GUID is not Always GOOD !!!!!!! (a true RTFM story)

By patrickwellink
March 15, 2004
1

A collegue of mine asked me some questions about a clustered index. He was preparing for a SQL Exam and he had trouble understaning some concepts. He had trouble with the concepts not because they were too difficult for him, but they were completely opposite to what he experienced in everyday programming life !!!!


To make a long story short.


A lot of programmers have a GUID as a Primary key.


In the Books online that come with SQL Server there is the following statement :


PRIMARY KEY constraints create clustered indexes automatically if no clustered index already exists on the table and a nonclustered index is not specified when you create the PRIMARY KEY constraint


So if there is no other index on a table, the primary key will be clustered. (so far nothing wrong) But if you have a GUID as a PK then let’s have a look at what the books online say about the UniqueIdentifier………


The uniqueidentifier data type has several disadvantages:



  1. The values are long and obscure. This makes them difficult for users to type correctly, and more difficult for users to remember.

  2. The values are random and cannot accept any patterns that may make them more meaningful to users.

  3. There is no way to determine the sequence in which uniqueidentifier values were generated. They are not suited for existing applications that depend on incrementing key values serially.

  4. At 16 bytes, the uniqueidentifier data type is relatively large compared to other data types such as 4-byte integers. This means indexes built using uniqueidentifier keys may be relatively slower than implementing the indexes using an int key. Consider using the IDENTITY property when global uniqueness is not necessary, or when having a serially incrementing key is desirable.

Besides That. There is also another Disadvantage on having a GUID as the primary key. The data type is relatively large……(also from the books online)


Wide keys , The key values from the clustered index are used by all nonclustered indexes as lookup keys and therefore are stored in each nonclustered index leaf entry.


So every record refers to it’s position in the table with the GUID. Since a guid is 16 bytes and an ID is 4 the index with the GUID is much larger and will there therefore be slower…..


Hope this is usefull for someone……



Patrick



Comments: 1

  1. Ross says:

    Though this is an old thread, I am compelled to add to it for anyone that does a web search and reads through attempting to make a decision on GUID vs. INT.

    I’d like to summarize all of the above into a simple, concise logic fork as a guideline to GUID vs. INT (disregarding natural keys).

    1. For a non-replicated, non-clustered database, a clustered INT PK will most often be the better choice, giving performance and smaller data file sizes.

    2. For a replicated, clustered, or otherwise distributed database where synchronization, data merging, etc is needed, a NON-CLUSTERED GUID PK will grant reliable record uniqueness, providing data integrity and simple data merging and updating.

    That’s it. Apply this simple fork to each table and you will achieve an acceptable solution for your needs.

    One cautionary caveat, however: Look to the future. Maybe you requirements now don’t include taking a client system off-line to go remote, then come back and merge changes, but what about in a year or two when you want to release version 2? I can say from personal experience that designing and implementing an upgrade strategy to convert your complex INT-based database to be GUID-based to support the data merge is no trivial task.

Comments are closed.

  • Recent Posts
  • Recent Comments
  • Archives
  • Categories
  • Meta