405

Can I have multiple primary keys in a single table?

Angel Politis
  • 9,949
  • 12
  • 43
  • 62

12 Answers12

588

A Table can have a Composite Primary Key which is a primary key made from two or more columns. For example:

CREATE TABLE userdata (
  userid INT,
  userdataid INT,
  info char(200),
  primary key (userid, userdataid)
);

Update: Here is a link with a more detailed description of composite primary keys.

Sarfaraaz
  • 476
  • 5
  • 16
Adam Pierce
  • 30,917
  • 21
  • 66
  • 86
  • 3
    In this example, BOTH userid and userdataid are needed to identify/find a unique row. Not sure what the OP's intention was, but I came here looking to see if I could uniquely identify a row with one of a set of keys. For instance, I would like to identify a unique user with either a username OR a userid, without needing both. I guess RB's answer of unique indexes would do the trick there. – Burrito Dec 18 '17 at 16:30
  • 1
    @Benitok As mentioned in RB.'s [answer](https://stackoverflow.com/a/217948/206552), you can use Unique Indexes to do what you are looking for (a unique, indexed column independent of other unique, indexed columns on the same table). Be sure to consult your specific flavor of SQL's manual for details on the exact language syntax used. – 4AM Jan 25 '18 at 17:16
209

You can only have one primary key, but you can have multiple columns in your primary key.

You can also have Unique Indexes on your table, which will work a bit like a primary key in that they will enforce unique values, and will speed up querying of those values.

RB.
  • 33,692
  • 12
  • 79
  • 121
41

A table can have multiple candidate keys. Each candidate key is a column or set of columns that are UNIQUE, taken together, and also NOT NULL. Thus, specifying values for all the columns of any candidate key is enough to determine that there is one row that meets the criteria, or no rows at all.

Candidate keys are a fundamental concept in the relational data model.

It's common practice, if multiple keys are present in one table, to designate one of the candidate keys as the primary key. It's also common practice to cause any foreign keys to the table to reference the primary key, rather than any other candidate key.

I recommend these practices, but there is nothing in the relational model that requires selecting a primary key among the candidate keys.

Walter Mitty
  • 16,496
  • 2
  • 24
  • 53
  • 5
    Agreed. All keys are equal (none is 'primary') in the logical model. The choice of which key in the physical implementation gets the PRIMARY KEY designation is arbitray and vendor/product dependent. – onedaywhen Oct 22 '08 at 07:49
  • 3
    I would say that it's database designer dependent. – Walter Mitty Jun 20 '13 at 19:41
  • I've just come across a use case where this is required. I have a table which will be created/managed by Entity Framework - which as far as I can gather does not support non primary key unique composite constraints at present. However it does support composite Primary Keys. The data is also going to be linked to a remote database system that doesn't support composite keys at all. I've gone with creating a Composite PK in EF but also adding a non nullable GUID column that the other system can use to uniquely identify against. – Chris Nevill May 22 '14 at 10:15
  • 2
    Chris, I said that the relational model doesn't require primary keys. I didn't say anything about whether some tool might require them. But I take your point. – Walter Mitty May 22 '14 at 10:29
  • I think there is a requirement that the PK be minimal, i.e., uses the least number of columns to uniquely identify each record. – gary Mar 26 '18 at 22:17
  • This is areqirement for a candidate key. Thus, it's a requirement for a primary key as I've stated it. But there are databases that allow you to declare a PK that is not minimal. – Walter Mitty Mar 26 '18 at 23:20
  • SQL PK means UNIQUE NOT NULL & neither PK nor UNIQUE need to be minimal. – philipxy Dec 21 '18 at 01:40
15

This is the answer for both the main question and for @Kalmi's question of

What would be the point of having multiple auto-generating columns?

This code below has a composite primary key. One of its columns is auto-incremented. This will work only in MyISAM. InnoDB will generate an error "ERROR 1075 (42000): Incorrect table definition; there can be only one auto column and it must be defined as a key".

DROP TABLE IF EXISTS `test`.`animals`;
CREATE TABLE  `test`.`animals` (
  `grp` char(30) NOT NULL,
  `id` mediumint(9) NOT NULL AUTO_INCREMENT,
  `name` char(30) NOT NULL,
  PRIMARY KEY (`grp`,`id`)
) ENGINE=MyISAM;

INSERT INTO animals (grp,name) VALUES
    ('mammal','dog'),('mammal','cat'),
    ('bird','penguin'),('fish','lax'),('mammal','whale'),
    ('bird','ostrich');

SELECT * FROM animals ORDER BY grp,id;

Which returns:

+--------+----+---------+
| grp    | id | name    |
+--------+----+---------+
| fish   |  1 | lax     |
| mammal |  1 | dog     |
| mammal |  2 | cat     |
| mammal |  3 | whale   |
| bird   |  1 | penguin |
| bird   |  2 | ostrich |
+--------+----+---------+
Eye
  • 7,902
  • 7
  • 40
  • 68
  • 3
    This works if you specify the auto-incrementing column first in the primary key definition. (Maybe this changed, I just tested it in 5.6) – CTarczon May 04 '15 at 16:04
13

(Have been studying these, a lot)

Candidate keys - A minimal column combination required to uniquely identify a table row.
Compound keys - 2 or more columns.

  • Multiple Candidate keys can exist in a table.
    • Primary KEY - Only one of the candidate keys that is chosen by us
    • Alternate keys - All other candidate keys
      • Both Primary Key & Alternate keys can be Compound keys

Sources:
https://en.wikipedia.org/wiki/Superkey
https://en.wikipedia.org/wiki/Candidate_key
https://en.wikipedia.org/wiki/Primary_key
https://en.wikipedia.org/wiki/Compound_key

Manohar Reddy Poreddy
  • 16,412
  • 7
  • 111
  • 98
6

Primary Key is very unfortunate notation, because of the connotation of "Primary" and the subconscious association in consequence with the Logical Model. I thus avoid using it. Instead I refer to the Surrogate Key of the Physical Model and the Natural Key(s) of the Logical Model.

It is important that the Logical Model for every Entity have at least one set of "business attributes" which comprise a Key for the entity. Boyce, Codd, Date et al refer to these in the Relational Model as Candidate Keys. When we then build tables for these Entities their Candidate Keys become Natural Keys in those tables. It is only through those Natural Keys that users are able to uniquely identify rows in the tables; as surrogate keys should always be hidden from users. This is because Surrogate Keys have no business meaning.

However the Physical Model for our tables will in many instances be inefficient without a Surrogate Key. Recall that non-covered columns for a non-clustered index can only be found (in general) through a Key Lookup into the clustered index (ignore tables implemented as heaps for a moment). When our available Natural Key(s) are wide this (1) widens the width of our non-clustered leaf nodes, increasing storage requirements and read accesses for seeks and scans of that non-clustered index; and (2) reduces fan-out from our clustered index increasing index height and index size, again increasing reads and storage requirements for our clustered indexes; and (3) increases cache requirements for our clustered indexes. chasing other indexes and data out of cache.

This is where a small Surrogate Key, designated to the RDBMS as "the Primary Key" proves beneficial. When set as the clustering key, so as to be used for key lookups into the clustered index from non-clustered indexes and foreign key lookups from related tables, all these disadvantages disappear. Our clustered index fan-outs increase again to reduce clustered index height and size, reduce cache load for our clustered indexes, decrease reads when accessing data through any mechanism (whether index scan, index seek, non-clustered key lookup or foreign key lookup) and decrease storage requirements for both clustered and nonclustered indexes of our tables.

Note that these benefits only occur when the surrogate key is both small and the clustering key. If a GUID is used as the clustering key the situation will often be worse than if the smallest available Natural Key had been used. If the table is organized as a heap then the 8-byte (heap) RowID will be used for key lookups, which is better than a 16-byte GUID but less performant than a 4-byte integer.

If a GUID must be used due to business constraints than the search for a better clustering key is worthwhile. If for example a small site identifier and 4-byte "site-sequence-number" is feasible then that design might give better performance than a GUID as Surrogate Key.

If the consequences of a heap (hash join perhaps) make that the preferred storage then the costs of a wider clustering key need to be balanced into the trade-off analysis.

Consider this example::

ALTER TABLE Persons
ADD CONSTRAINT pk_PersonID PRIMARY KEY (P_Id,LastName)

where the tuple "(P_Id,LastName)" requires a uniqueness constraint, and may be a lengthy Unicode LastName plus a 4-byte integer, it would be desirable to (1) declaratively enforce this constraint as "ADD CONSTRAINT pk_PersonID UNIQUE NONCLUSTERED (P_Id,LastName)" and (2) separately declare a small Surrogate Key to be the "Primary Key" of a clustered index. It is worth noting that Anita possibly only wishes to add the LastName to this constraint in order to make that a covered field, which is unnecessary in a clustered index because ALL fields are covered by it.

The ability in SQL Server to designate a Primary Key as nonclustered is an unfortunate historical circumstance, due to a conflation of the meaning "preferred natural or candidate key" (from the Logical Model) with the meaning "lookup key in storage" from the Physical Model. My understanding is that originally SYBASE SQL Server always used a 4-byte RowID, whether into a heap or a clustered index, as the "lookup key in storage" from the Physical Model.

Pieter Geerkens
  • 11,399
  • 2
  • 25
  • 52
6

As noted by the others it is possible to have multi-column primary keys. It should be noted however that if you have some functional dependencies that are not introduced by a key, you should consider normalizing your relation.

Example:

Person(id, name, email, street, zip_code, area)

There can be a functional dependency between id -> name,email, street, zip_code and area But often a zip_code is associated with a area and thus there is an internal functional dependecy between zip_code -> area.

Thus one may consider splitting it into another table:

Person(id, name, email, street, zip_code)
Area(zip_code, name)

So that it is consistent with the third normal form.

Yet Another Geek
  • 4,041
  • 1
  • 26
  • 40
3

A primary key is the key that uniquely identifies a record and is used in all indexes. This is why you can't have more than one. It is also generally the key that is used in joining to child tables but this is not a requirement. The real purpose of a PK is to make sure that something allows you to uniquely identify a record so that data changes affect the correct record and so that indexes can be created.

However, you can put multiple fields in one primary key (a composite PK). This will make your joins slower (espcially if they are larger string type fields) and your indexes larger but it may remove the need to do joins in some of the child tables, so as far as performance and design, take it on a case by case basis. When you do this, each field itself is not unique, but the combination of them is. If one or more of the fields in a composite key should also be unique, then you need a unique index on it. It is likely though that if one field is unique, this is a better candidate for the PK.

Now at times, you have more than one candidate for the PK. In this case you choose one as the PK or use a surrogate key (I personally prefer surrogate keys for this instance). And (this is critical!) you add unique indexes to each of the candidate keys that were not chosen as the PK. If the data needs to be unique, it needs a unique index whether it is the PK or not. This is a data integrity issue. (Note this is also true anytime you use a surrogate key; people get into trouble with surrogate keys because they forget to create unique indexes on the candidate keys.)

There are occasionally times when you want more than one surrogate key (which are usually the PK if you have them). In this case what you want isn't more PK's, it is more fields with autogenerated keys. Most DBs don't allow this, but there are ways of getting around it. First consider if the second field could be calculated based on the first autogenerated key (Field1 * -1 for instance) or perhaps the need for a second autogenerated key really means you should create a related table. Related tables can be in a one-to-one relationship. You would enforce that by adding the PK from the parent table to the child table and then adding the new autogenerated field to the table and then whatever fields are appropriate for this table. Then choose one of the two keys as the PK and put a unique index on the other (the autogenerated field does not have to be a PK). And make sure to add the FK to the field that is in the parent table. In general if you have no additional fields for the child table, you need to examine why you think you need two autogenerated fields.

HLGEM
  • 88,902
  • 11
  • 105
  • 171
3

Some people use the term "primary key" to mean exactly an integer column that gets its values generated by some automatic mechanism. For example AUTO_INCREMENT in MySQL or IDENTITY in Microsoft SQL Server. Are you using primary key in this sense?

If so, the answer depends on the brand of database you're using. In MySQL, you can't do this, you get an error:

mysql> create table foo (
  id int primary key auto_increment, 
  id2 int auto_increment
);
ERROR 1075 (42000): Incorrect table definition; 
there can be only one auto column and it must be defined as a key

In some other brands of database, you are able to define more than one auto-generating column in a table.

Bill Karwin
  • 462,430
  • 80
  • 609
  • 762
  • 5
    What would be the point of having multiple auto-generating columns? – Tarnay Kálmán Aug 07 '09 at 16:34
  • I don't have a use case in mind, but if there ever were a need, some brands of database would support this and some would not. That's all I'm saying. – Bill Karwin Aug 07 '09 at 17:28
  • 1
    Here's a case : in an orders table, I have both an ID (auto incremented) and an external ID (hash-like strings), both should be unique so theoretically you could say that they are both "primary". of course this can be done with a secondary unique index but still it's a legit case (IMHO) – Nir Aug 24 '18 at 14:08
3

Having two primary keys at the same time, is not possible. But (assuming that you have not messed the case up with composite key), may be what you might need is to make one attribute unique.

CREATE t1(
c1 int NOT NULL,
c2 int NOT NULL UNIQUE,
...,
PRIMARY KEY (c1)
);

However note that in relational database a 'super key' is a subset of attributes which uniquely identify a tuple or row in a table. A 'key' is a 'super key' that has an additional property that removing any attribute from the key, makes that key no more a 'super key'(or simply a 'key' is a minimal super key). If there are more keys, all of them are candidate keys. We select one of the candidate keys as a primary key. That's why talking about multiple primary keys for a one relation or table is being a conflict.

  • Wikipedia does not have a definition for 'key'. Also the "removing any attribute from the key, makes that key no more a 'super key'" didn't mean anything for me, as when removing an attribute from super key can still be super key. – Manohar Reddy Poreddy Apr 08 '18 at 05:08
  • @ManoharReddyPoreddy Yes, in that case, your set of attributes is not a 'key' but a 'super key'. What I mean is if a set of attributes to be a 'key', the set should be minimal, or the set should have an additional property that removing any attribute from the set makes the resulting set not more a 'super key'. – Rusiru Adithya Samarasinghe Apr 10 '18 at 20:01
  • Looks like, your actual meaning of 'key' is Candidate_key (https://en.wikipedia.org/wiki/Candidate_key), may be it should be mentioned so. – Manohar Reddy Poreddy Apr 12 '18 at 06:58
  • @ManoharReddyPoreddy Yes I already have mentioned it in my answer. "If there are more keys, all of them are candidate keys". Anyway thanks for your review. – Rusiru Adithya Samarasinghe Apr 12 '18 at 14:31
  • 1. When you mention "If there are more keys, all of them are candidate keys", ... Do you mean otherwise/else, they are not candidate keys? ... 2. Where is the else-part? ... Are we same page at all? – Manohar Reddy Poreddy Apr 13 '18 at 02:38
  • @ManoharReddyPoreddy Otherwise means we have only one key, then we have no choice other than selecting it as the primary key(It has become a primary key even before we call it a candidate). Your hesitation point is true. It is also a candidate key. What I thought was the true meaning of the word 'candidate', comes into play when there are others to compete with. – Rusiru Adithya Samarasinghe Apr 15 '18 at 14:06
0

Good technical answers were given in better way than I can do. I am only can add to this topic:

If you want something that not allowed/acceptable it is good reason to take step back.

  1. Understand the core of why it's not acceptable.
  2. Dig more in documentation/journal articles/web and etc.
  3. Analyze/review current design and point major flaws.
  4. Consider and test every step during new design.
  5. Always look forward and try to create adaptive solution.

Hope it will helps someone.

Tom Lime
  • 984
  • 10
  • 12
-3

Yes, Its possible in SQL, but we can't set more than one primary keys in MsAccess. Then, I don't know about the other databases.

CREATE TABLE CHAPTER (
    BOOK_ISBN VARCHAR(50) NOT NULL,
    IDX INT NOT NULL,
    TITLE VARCHAR(100) NOT NULL,
    NUM_OF_PAGES INT,
    PRIMARY KEY (BOOK_ISBN, IDX)
);
stema
  • 80,307
  • 18
  • 92
  • 121