Comments on Would a MySQL database run more efficiently with smaller varchar lengths?

Would a MySQL database run more efficiently with smaller varchar lengths?

+4
−0

I have a database with quite a few VARCHAR fields. When the database was first built the lengths of the columns were set a bit larger than absolutely necessary.

Now, after having used the DB for a while and run a lot of data through it, I have a better idea of how long the fields need to be, and I am wondering whether reducing the VARCHAR lengths would make it run better.

If I set the lengths to, say, the current maximum length plus 10 characters, would that help the select and join times?

+3
−0

YES or NO: It all depends on the storage engine

Fairly universally (though, IIRC from looking at PostgreSQL a while back, PostgreSQL may not even do this), the CHAR/VARCHAR/BINARY/BLOB/TEXT etc. types differ based on declared size, with 1, 2, 3 or 4 bytes used to store the actual length of the object:

  • 1 - 255 = 1 byte
  • 256 - 65,535 = 2 bytes
  • 65,536 - 2^24-1 = 3 bytes
  • 2^24 - 2^32-1 (4 Gig.) = 4 bytes
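
The ranges above can be written as a small lookup. This is only an illustrative sketch in Python: `length_prefix_bytes` is a hypothetical helper, not part of MySQL or any library, and it mirrors the list above rather than any particular engine's on-disk format.

```python
def length_prefix_bytes(max_length: int) -> int:
    """Return how many bytes are needed to record a length up to max_length.

    Hypothetical helper for illustration; it mirrors the ranges listed above.
    """
    if max_length < 1:
        raise ValueError("max_length must be at least 1")
    for prefix_bytes in (1, 2, 3, 4):
        # An n-byte prefix can record lengths up to 2**(8*n) - 1.
        if max_length <= 2 ** (8 * prefix_bytes) - 1:
            return prefix_bytes
    raise ValueError("max_length exceeds what a 4-byte prefix can record")


# Print the prefix size on each side of every boundary in the list.
for declared in (25, 255, 256, 65_535, 65_536, 2**24, 2**32 - 1):
    print(declared, length_prefix_bytes(declared))
```

Note that shrinking a column from, say, 255 to 25 stays within the same 1-byte range, so the length prefix itself does not change.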

In MySQL (some other databases handle this differently) there are even different names, at least for some of the data types:

https://dev.mysql.com/doc/refman/8.0/en/storage-requirements.html#data-types-storage-reqs-strings

  • TINYTEXT
  • TEXT
  • MEDIUMTEXT
  • LONGTEXT

But the data has to be stored somewhere too. That can be as part of the actual record, which is how databases used to typically work. That is still the case for MySQL when using MyISAM Static Tables:

https://dev.mysql.com/doc/refman/8.0/en/static-format.html

With MyISAM Static Tables, the declared field length (within certain limits) determines the row size, which determines the on-disk table size. Create a table where each record has an integer ID (typically 4 bytes) and four VARCHAR(255) columns, and each record will take just over 1 kilobyte on disk: for 1,000 records that's a megabyte (not counting overhead and indexes), and for 1,000,000 records that's a gigabyte. Create the same table with VARCHAR(25) and you've cut the on-disk size by 90%. If the table has 100 records, that makes no real difference. If it has 1,000,000 records, it makes a big difference, possibly the difference between "entire table read into memory" and "lots of reads every time you access a different record".
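
The arithmetic in that example can be checked with a rough sketch. It assumes, as this answer does, that every VARCHAR occupies its full declared width, and it further assumes one byte per character and no per-row overhead or indexes; the function and constant names are made up for illustration.

```python
INT_ID_BYTES = 4  # size of the integer ID in the example above


def fixed_row_bytes(varchar_width: int, varchar_columns: int = 4) -> int:
    """Row size if each VARCHAR takes its full declared width (1 byte per char).

    Assumption for illustration only: no length bytes, padding or row overhead.
    """
    return INT_ID_BYTES + varchar_columns * varchar_width


wide = fixed_row_bytes(255)   # 4 + 4 * 255 = 1024 bytes
narrow = fixed_row_bytes(25)  # 4 + 4 * 25 = 104 bytes
saving = 1 - narrow / wide    # fraction of the on-disk size removed

for rows in (100, 1_000, 1_000_000):
    print(f"{rows:>9} rows: {wide * rows:>13,} bytes vs {narrow * rows:>11,} bytes")
print(f"saving: {saving:.1%}")  # saving: 89.8%
```

Under these assumptions the saving works out to roughly the 90% quoted above, and it only becomes meaningful in absolute terms once the row count is large.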

While storing the full VARCHAR capacity may seem like a total waste, the table manipulation is incredibly simple: no pointers, just read a fixed-size row and chop it up. MyISAM is, of course, no longer the default engine for MySQL, and InnoDB (and even other variants of MyISAM) have many other advantages, but for MyISAM Static specifically, the answer to the original question is a clear yes.

Other database engines work differently. For example, InnoDB (and, by inference from the documentation, also some of the non-Static variants of MyISAM) stores VARCHAR strings using just the space needed for the actual text, plus the byte count, plus some overhead (e.g., InnoDB uses a 4-byte alignment, which means every string starts on a typical 32-bit word boundary).

On balance, with typical current installations, the answer is either no or "so little difference that it just doesn't matter". But I am almost always in favor of optimizing the stated structure to match the actual usage. I often process a new batch of data using arbitrarily long fields and then analyze it to see what is really needed, and adjust my production schema accordingly.
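
The "load first, then measure" step can be sketched as follows. This is a minimal Python illustration, not the tooling actually used for this: the helper names and the sample rows are hypothetical, and the default headroom of 10 characters is taken from the question.

```python
from typing import Dict, Iterable, Mapping


def max_lengths(rows: Iterable[Mapping[str, str]]) -> Dict[str, int]:
    """Return the longest observed value length (in characters) per column."""
    longest: Dict[str, int] = {}
    for row in rows:
        for column, value in row.items():
            longest[column] = max(longest.get(column, 0), len(value))
    return longest


def suggested_widths(rows: Iterable[Mapping[str, str]], headroom: int = 10) -> Dict[str, int]:
    """Observed maximum plus some headroom, as a starting point for VARCHAR(n)."""
    return {column: n + headroom for column, n in max_lengths(rows).items()}


# Hypothetical sample batch, loaded with arbitrarily long fields.
sample = [
    {"name": "Ada", "city": "London"},
    {"name": "Grace Hopper", "city": "Arlington"},
]
print(max_lengths(sample))
print(suggested_widths(sample))
```

In a real database the same measurement would more likely be a query over the loaded table; the point is only that the production widths come from observed data rather than guesses.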

General comments
meriton‭ wrote over 3 years ago · edited over 3 years ago

Erm, are you sure that the declared length of a VARCHAR determines row size in the table file for MyISAM static tables? Because the documentation you link to says that "It [the static format] is used when the table contains no variable-length columns (VARCHAR, VARBINARY, BLOB, or TEXT)", which implies that VARCHAR data is never stored in static format to begin with.

manassehkatz‭ wrote over 3 years ago

About 99% sure. I think this is a semantic issue. I believe that "is used when" is not "this only functions when" but rather "this is the recommended use case for". Note later (a) it excludes BLOB and TEXT (but not VARCHAR, etc.) and (b) it states "CHAR and VARCHAR columns are space-padded to the specified column width, although the column type is not altered. BINARY and VARBINARY columns are padded with 0x00 bytes to the column width." which would only make sense if VARCHAR is allowed.