In MySQL, is there a limit to the number of keys in an IN() clause?
I have a PHP program that does a SELECT and then updates some of the values based on an algorithm.
Rather than updating one row at a time:
UPDATE example_table
SET COLUMN_A = 1
WHERE primary_key_column = 10;
I was thinking of doing many updates at once, like:
UPDATE example_table
SET COLUMN_A = 1
WHERE primary_key_column IN(1,2,3,4,5);
The SELECTs pull in 100,000 rows at a time and, to start with, every single one of them may need to be updated. Am I going to run into a limit on the number of values in the IN() clause?
2 answers
According to the documentation for the MySQL IN function:
The number of values in the IN() list is only limited by the max_allowed_packet value.
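The default value for it is 67108864 bytes (64 MiB) in MySQL 8.0; MySQL 5.7 defaults to 4194304 (4 MiB), so check what your server actually uses.
So you should be able to fit quite a large number of identifiers into one statement, but you should definitely try it out to see how it behaves under a normal server load.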
I am not sure how MySQL sees such queries though (I only have experience with such queries in MSSQL where it can complain that they are too complex).
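To see what limit applies on a given server, something like the following should work. It is only a sketch: the 128 MiB figure is an arbitrary illustration, and changing the global value needs administrative privileges and only affects connections opened afterwards.

```sql
-- Show the current limit, in bytes, for this server.
SHOW VARIABLES LIKE 'max_allowed_packet';

-- The same value through a system variable reference.
SELECT @@max_allowed_packet;

-- Optionally raise it for new connections (illustrative value: 128 MiB).
SET GLOBAL max_allowed_packet = 134217728;
```

As a rough size check, 100,000 integer keys of up to ten digits each, plus one comma per key, come to about 1.1 MB of statement text, which is well under either default.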
I'll readily admit I'm not too familiar with MySQL specifically, but personally, I would try to avoid listing all the primary key values in an ad-hoc query.
What I would rather do is run a separate query to select the rows to update, and then include a condition that the rows to be updated are those that exist in that set of rows.
Something not entirely dissimilar to
UPDATE example_table
SET column_A = 1
WHERE primary_key_column IN (
    SELECT primary_key_column
    FROM example_table
    WHERE <bunch of complex conditions>
)
Of course, in this particular case, you could just as well stick the WHERE clause from the subquery in the UPDATE statement itself, but that gets somewhat more complex if the query to find the rows to update is more complex than a plain SELECT from the same table that you're updating; say, it's actually selecting from a different table, with its own joins, maybe a union or two, and so on.
If you need to use the same set of key values multiple times and the query to find them is non-trivial, stick them into a temporary table that you SELECT from (or join against) instead, and put all of that into a stored procedure.
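A minimal sketch of the temporary-table idea, in MySQL syntax. The table name keys_to_update and the INT key type are assumptions made for illustration; the literal values stand in for whatever query or application code supplies the keys.

```sql
-- Hypothetical staging table holding the keys of the rows to update.
CREATE TEMPORARY TABLE keys_to_update (
    primary_key_column INT NOT NULL PRIMARY KEY
);

-- Fill it from the application (multi-row INSERT) or with INSERT ... SELECT.
INSERT INTO keys_to_update (primary_key_column)
VALUES (1), (2), (3), (4), (5);

-- Update every row whose key appears in the staging table.
UPDATE example_table AS t
JOIN keys_to_update AS k
  ON k.primary_key_column = t.primary_key_column
SET t.COLUMN_A = 1;

-- The table disappears when the session ends, but it can be dropped explicitly.
DROP TEMPORARY TABLE keys_to_update;
```

Joining against a separate table may also be the safer form in MySQL specifically, since as far as I know it rejects an UPDATE whose subquery selects directly from the table being updated (error 1093).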