Welcome to Software Development on Codidact!
Will you help us build our independent community of developers helping developers? We're small and trying to grow. We welcome questions about all aspects of software development, from design to code to QA and more. Got questions? Got answers? Got code you'd like someone to review? Please join us.
What are the pros and cons of a composite primary key versus a unique constraint?
Let's say we have two tables A and B and a join table C that has foreign keys to both A and B and the combination of those foreign keys is unique.
One could either do a unique constraint or a composite primary key on those columns.
What would the pros and cons of either solution to this be?
2 answers
One could either do a unique constraint on those foreign keys or a composite primary key on those columns.
For elegance, canonicity, a primary key is necessary. You would initially have a primary key on those columns.
However, for sortability reasons, you would eventually use just a primary key on a single column ad-hoc created to be the id/key, so no, in the end you don't have a primary key on those columns, but a unique constraint.
Facts
In relational theory, each row in a relation is uniquely identified by a primary key. That's why some purists say that every table should have a primary key.
Foreign keys usually reference a primary key. In particular, if you do not specify a column when creating a foreign key, the primary key of the referenced table is assumed. That's why primary keys should be immutable, and benefit from being short, so they're easy to compare and to store in indexes.
Primary keys must not contain NULL values.
There can only be a single primary key per table. Most databases create a clustered index for a primary key, which means that the table is stored in index order, making table accesses in index order sequential and therefore faster.
Recommendation
That is, you should only consider a primary key on columns that are immutable and not-null.
In that case, the only technical difference between a primary and a unique key is that the index is clustered. This only matters if the table contains additional columns not present in the constraint. Most m:n join tables do not have additional columns, so it doesn't matter whether the index is clustered or not. (And even if you do benefit some from a clustered index, you can often achieve the same effect by creating a non-clustered index with all relevant columns, so it's questionable whether we should let this drive our decision).
That leaves maintainability as our only concern. What makes the schema easier to understand for our fellow developers? What captures our intent better?
To me, a primary key constraint designates the primary way of referencing and accessing rows in that table. That is, if I had no need to reference that individual row anywhere else in my schema or application, for instance because it is just a "part" of another object (UML calls this composition), or because it represents a mere value rather than an entity with changing state (i.e. a value type rather than a reference type), I'd not bother with a primary key and use a plain unique constraint instead.
On the other hand, if I did want to reference that row, I'd probably prefer an easy way to reference it, so I could say "comment #5421 was flagged" rather than "Johns 3rd comment on the 4th answer to question #874 was flagged". That is, I'd create a new primary key column in addition to the unique foreign key pair.
Off hand, I can't think of a case where I'd use a composite primary key - though in part, this might be because my favorite object relational mapper makes composite primary keys quite painful to use.
TL;DR
Use constraints to constrain your data, use primary keys to identify rows to keep track of them even as their data changes.
0 comment threads