Welcome to Software Development on Codidact!


Post History

+3 −0 (71%)
Q&A: Would a MySQL database run more efficiently with smaller varchar lengths?


posted 4y ago by manassehkatz · edited 4y ago by manassehkatz

Answer
#2: Post edited by manassehkatz · 2020-10-22T01:29:17Z (about 4 years ago)

### YES or NO: It all depends on the storage engine

Fairly universally (though IIRC from looking at PostgreSQL a while back, PostgreSQL may not do this at all), there is a difference between CHAR/VARCHAR/BINARY/BLOB/TEXT etc. types based on declared size, where 1, 2, 3, or 4 bytes are used to store the actual length of the object:

* 1 - 255 = 1 byte
* 256 - 65,535 = 2 bytes
* 65,536 - 2^24-1 = 3 bytes
* 2^24 - 2^32-1 (4 GiB) = 4 bytes
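
The ranges above can be sketched as a small lookup. This is illustrative arithmetic based on the list, not the exact on-disk format of any particular engine:

```python
# Length-prefix rule from the list above: how many bytes an engine needs
# to record a string's actual length, given its declared maximum size.
# Illustrative only; real engines differ in the details.

def length_prefix_bytes(max_len: int) -> int:
    if max_len <= 255:
        return 1
    if max_len <= 65_535:
        return 2
    if max_len <= 2**24 - 1:
        return 3
    return 4

for n in (255, 256, 65_535, 65_536, 2**32 - 1):
    print(n, "->", length_prefix_bytes(n), "byte(s)")
```
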

In MySQL (some other databases handle this differently) there are even different names, at least for some of the data types:

https://dev.mysql.com/doc/refman/8.0/en/storage-requirements.html#data-types-storage-reqs-strings

* TINYTEXT
* TEXT
* MEDIUMTEXT
* LONGTEXT

But the data has to be stored *somewhere* too. That can be as part of the actual record, which is how databases typically used to work. That is still the case for MySQL when using MyISAM static tables:

https://dev.mysql.com/doc/refman/8.0/en/static-format.html

With MyISAM static tables, the declared field length (within certain limits) determines the row size, which determines the on-disk table size. Create a table where each record has an integer ID (typically 4 bytes) and four VARCHAR(255) columns, and each record will take just over 1 kilobyte on disk - for 1,000 records that's a megabyte (not counting overhead and indexes), and for 1,000,000 records that's a gigabyte. Create the same table with VARCHAR(25) columns and you've cut the on-disk size by roughly 90%. If the table has 100 records, that makes no real difference. If it has 1,000,000 records, it makes a big difference - possibly the difference between "entire table read into memory" and "lots of reads every time you access a different record".
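
Those numbers check out as back-of-the-envelope arithmetic. This sketch assumes the fixed-format layout described above, with one length byte per VARCHAR and a 4-byte integer ID; real tables carry additional per-row overhead:

```python
# Row-size arithmetic for a MyISAM static (fixed-format) table as
# described above: each VARCHAR(n) column reserves its full n bytes
# plus one length byte (for n <= 255), and the integer ID takes 4 bytes.

def static_row_size(varchar_len: int, n_cols: int = 4, id_bytes: int = 4) -> int:
    return id_bytes + n_cols * (varchar_len + 1)

wide = static_row_size(255)   # 4 + 4 * 256 = 1028 bytes: "just over 1 KB"
narrow = static_row_size(25)  # 4 + 4 * 26  = 108 bytes

print(f"VARCHAR(255) rows: {wide} B each, {wide * 1_000_000 / 1e9:.2f} GB per million rows")
print(f"VARCHAR(25) rows:  {narrow} B each - about {1 - narrow / wide:.0%} smaller")
```
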

While storing full VARCHAR capacity may seem like a total waste, the table manipulation is incredibly simple - no pointers, just read a fixed-size row and chop it up. This is, of course, no longer the default engine for MySQL, and InnoDB (and even other variants of MyISAM) have many other advantages, but for this specific MyISAM static format, the answer to the original question is a clear **yes**.

Other database engines work differently. For example, InnoDB (and, by inference from the documentation, also some of the non-static variants of MyISAM) stores VARCHAR strings using just the amount of space needed for the actual text, plus the byte count, plus some overhead (e.g., InnoDB uses 4-byte alignment, which means every string starts on a typical 32-bit word boundary).
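
The contrast can be sketched numerically. This deliberately ignores alignment and other per-record overhead, so it is not InnoDB's exact on-disk format, just the "actual text plus length prefix" idea:

```python
# Variable-length storage as described above: space used is the actual
# text plus the length prefix, regardless of the declared maximum.
# Alignment and other per-record overhead are ignored here.

def variable_storage(value: str, declared_max: int = 255) -> int:
    prefix = 1 if declared_max <= 255 else 2
    return len(value.encode("utf-8")) + prefix

# "snake" costs about the same whether the column is VARCHAR(25) or
# VARCHAR(255), while a static-format row reserves the full declared width.
print(variable_storage("snake"), variable_storage("snake", declared_max=65_535))
```
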

On balance, with *typical* current installations, the answer is either **no** or "so little difference that it just doesn't matter". But I am almost always in favor of optimizing the stated structure to match the actual usage. I often process a new batch of data using arbitrarily long fields, then analyze it to see what is really needed and adjust my production schema accordingly.
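
That analysis step can be as simple as scanning a sample of the incoming data for the longest actual values before declaring production column sizes. The field names, sample rows, and 50% headroom factor here are made up for illustration:

```python
# Scan a data sample and report the longest value seen per field, as a
# basis for right-sizing VARCHAR columns. Sample data is hypothetical.
import csv
import io

sample = io.StringIO(
    "name,email,city\n"
    "Ada,ada@example.com,London\n"
    "Grace,grace@example.com,Arlington\n"
)

max_len: dict = {}
for row in csv.DictReader(sample):
    for field, value in row.items():
        max_len[field] = max(max_len.get(field, 0), len(value))

for field, n in max_len.items():
    # Leave some headroom (here: 50%) over the longest observed value.
    print(f"{field}: longest seen {n} chars -> consider VARCHAR({n + n // 2})")
```
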

#1: Initial revision by manassehkatz · 2020-10-21T18:21:20Z (about 4 years ago)
### It all depends on the storage engine
