Communities

Writing
Writing
Codidact Meta
Codidact Meta
The Great Outdoors
The Great Outdoors
Photography & Video
Photography & Video
Scientific Speculation
Scientific Speculation
Cooking
Cooking
Electrical Engineering
Electrical Engineering
Judaism
Judaism
Languages & Linguistics
Languages & Linguistics
Software Development
Software Development
Mathematics
Mathematics
Christianity
Christianity
Code Golf
Code Golf
Music
Music
Physics
Physics
Linux Systems
Linux Systems
Power Users
Power Users
Tabletop RPGs
Tabletop RPGs
Community Proposals
Community Proposals
tag:snake search within a tag
answers:0 unanswered questions
user:xxxx search by author id
score:0.5 posts with 0.5+ score
"snake oil" exact phrase
votes:4 posts with 4+ votes
created:<1w created < 1 week ago
post_type:xxxx type of post
Search help
Notifications
Mark all as read See all your notifications »
Q&A

Welcome to Software Development on Codidact!

Will you help us build our independent community of developers helping developers? We're small and trying to grow. We welcome questions about all aspects of software development, from design to code to QA and more. Got questions? Got answers? Got code you'd like someone to review? Please join us.

Post History

83%
+8 −0
Q&A What are the pros and cons of a composite primary key versus a unique constraint?

Facts In relational theory, each row in a relation is uniquely identified by a primary key. That's why some purists say that every table should have a primary key. Foreign keys usually reference a ...

posted 4y ago by meriton‭  ·  edited 4y ago by meriton‭

Answer
#2: Post edited by user avatar meriton‭ · 2020-08-24T19:26:23Z (over 4 years ago)
Formatting and TL;DR
  • In relational theory, each row in a relation is uniquely identified by a primary key. That's why some purists say that every table should have a primary key.
  • Foreign keys usually reference a primary key. In particular, if you do not specify a column when creating a foreign key, the primary key of the referenced table is assumed. That's why primary keys should be immutable, and benefit from being short, so they're easy to compare and to store in indexes.
  • Primary keys must not contain NULL values.
  • There can only be a single primary key per table. Most databases create a clustered index for a primary key, which means that the table is stored in index order, making table accesses in index order sequential and therefore faster.
  • That is, you should only consider a primary key on columns that are immutable and not-null.
  • In that case, the only technical difference between a primary and a unique key is that the index is clustered. This only matters if the table contains additional columns not present in the constraint. Most m:n join tables do not have additional columns, so it doesn't matter whether the index is clustered or not. (And even if you do benefit some from a clustered index, you can often achieve the same effect by creating a non-clustered index with all relevant columns, so it's questionable whether we should let this drive our decision).
  • That leaves maintainability as our only concern. What makes the schema easier to understand for our fellow developers? What captures our intent better?
  • To me, a primary key constraint designates the primary way of referencing and accessing rows in that table. That is, if I had no need to reference that individual row anywhere else in my schema or application, for instance because it is just a "part" of another object (UML calls this *composition*), or because it represents a mere value rather than an entity with changing state (i.e. a *value type* rather than a *reference type*), I'd not bother with a primary key and use a plain unique constraint instead.
  • On the other hand, if I did want to reference that row, I'd probably prefer an *easy* way to reference it, so I could say "comment #5421 was flagged" rather than "Johns 3rd comment on the 4th answer to question #874 was flagged". That is, I'd create a new primary key column in addition to the unique foreign key pair.
  • Off hand, I can't think of a case where I'd use a composite primary key - though in part, this might be because my favorite object relational mapper makes composite primary keys quite painful to use.
  • # Facts
  • In relational theory, each row in a relation is uniquely identified by a primary key. That's why some purists say that every table should have a primary key.
  • Foreign keys usually reference a primary key. In particular, if you do not specify a column when creating a foreign key, the primary key of the referenced table is assumed. That's why primary keys should be immutable, and benefit from being short, so they're easy to compare and to store in indexes.
  • Primary keys must not contain NULL values.
  • There can only be a single primary key per table. Most databases create a clustered index for a primary key, which means that the table is stored in index order, making table accesses in index order sequential and therefore faster.
  • # Recommendation
  • That is, you should only consider a primary key on columns that are immutable and not-null.
  • In that case, the only technical difference between a primary and a unique key is that the index is clustered. This only matters if the table contains additional columns not present in the constraint. Most m:n join tables do not have additional columns, so it doesn't matter whether the index is clustered or not. (And even if you do benefit some from a clustered index, you can often achieve the same effect by creating a non-clustered index with all relevant columns, so it's questionable whether we should let this drive our decision).
  • That leaves maintainability as our only concern. What makes the schema easier to understand for our fellow developers? What captures our intent better?
  • To me, a primary key constraint designates the primary way of referencing and accessing rows in that table. That is, if I had no need to reference that individual row anywhere else in my schema or application, for instance because it is just a "part" of another object (UML calls this *composition*), or because it represents a mere value rather than an entity with changing state (i.e. a *value type* rather than a *reference type*), I'd not bother with a primary key and use a plain unique constraint instead.
  • On the other hand, if I did want to reference that row, I'd probably prefer an *easy* way to reference it, so I could say "comment #5421 was flagged" rather than "Johns 3rd comment on the 4th answer to question #874 was flagged". That is, I'd create a new primary key column in addition to the unique foreign key pair.
  • Off hand, I can't think of a case where I'd use a composite primary key - though in part, this might be because my favorite object relational mapper makes composite primary keys quite painful to use.
  • # TL;DR
  • Use constraints to constrain your data, use primary keys to identify rows to keep track of them even as their data changes.
#1: Initial revision by user avatar meriton‭ · 2020-08-24T19:14:14Z (over 4 years ago)
In relational theory, each row in a relation is uniquely identified by a primary key. That's why some purists say that every table should have a primary key. 

Foreign keys usually reference a primary key. In particular, if you do not specify a column when creating a foreign key, the primary key of the referenced table is assumed. That's why primary keys should be immutable, and benefit from being short, so they're easy to compare and to store in indexes.

Primary keys must not contain NULL values.

There can only be a single primary key per table. Most databases create a clustered index for a primary key, which means that the table is stored in index order, making table accesses in index order sequential and therefore faster. 

That is, you should only consider a primary key on columns that are immutable and not-null. 

In that case, the only technical difference between a primary and a unique key is that the index is clustered. This only matters if the table contains additional columns not present in the constraint. Most m:n join tables do not have additional columns, so it doesn't matter whether the index is clustered or not. (And even if you do benefit some from a clustered index, you can often achieve the same effect by creating a non-clustered index with all relevant columns, so it's questionable whether we should let this drive our decision). 

That leaves maintainability as our only concern. What makes the schema easier to understand for our fellow developers? What captures our intent better?

To me, a primary key constraint designates the primary way of referencing and accessing rows in that table. That is, if I had no need to reference that individual row anywhere else in my schema or application, for instance because it is just a "part" of another object (UML calls this *composition*), or because it represents a mere value rather than an entity with changing state (i.e. a *value type* rather than a *reference type*), I'd not bother with a primary key and use a plain unique constraint instead.

On the other hand, if I did want to reference that row, I'd probably prefer an *easy* way to reference it, so I could say "comment #5421 was flagged" rather than "Johns 3rd comment on the 4th answer to question #874 was flagged". That is, I'd create a new primary key column in addition to the unique foreign key pair.

Off hand, I can't think of a case where I'd use a composite primary key - though in part, this might be because my favorite object relational mapper makes composite primary keys quite painful to use.