LIMIT/OFFSET, fundamentally, _is_ slow for large datasets. With a clause like `LIMIT 10000 OFFSET 150000`, you're asking the query engine to find 160,000 rows, then throw away the first 150,000 of them. Hardly efficient. It works _okay_ for small page sizes and not-too-huge datasets, but when you're asking for lots of data from a large dataset, you're going to hit problems fast.

The most common way past this is keyset pagination: use a WHERE condition to remove the need for the OFFSET entirely. Your data does need a unique, indexed column for this to work, which I'm assuming here is called `id`.

When you query for the first 10k rows, use `ORDER BY id LIMIT 10000`, and save the greatest ID from those 10k rows. On the next query for the next 10k, use the same `ORDER BY id LIMIT 10000`, but add a `WHERE id > your_previous_max_id` to the query. This is faster because you're only fetching 10k rows, not 20k-then-throw-away-10k, and the query engine can use the unique index to seek straight to the start of the relevant data. Note that the `ORDER BY id` matters: without it, the database is free to return rows in any order, so "the greatest ID so far" wouldn't reliably mark your place.

So, your first query looks like this:

```sql
SELECT * FROM example_table
ORDER BY id
LIMIT 10000
```

then, assuming the greatest ID in that set is 10000, your second query:

```sql
SELECT * FROM example_table
WHERE id > 10000
ORDER BY id
LIMIT 10000
```

and so on.
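If it helps to see the whole loop from the application side, here's a minimal sketch using Python's built-in `sqlite3` module. It reuses the hypothetical `example_table` and `id` column from above; the `PAGE_SIZE` constant and `fetch_all_pages` function are names I've made up for illustration, and the same pattern works with any driver that supports parameterized queries.

```python
import sqlite3

PAGE_SIZE = 10000

def fetch_all_pages(conn):
    """Yield every row from example_table, one keyset page at a time."""
    last_id = 0  # assumes id values are positive integers
    while True:
        rows = conn.execute(
            "SELECT * FROM example_table WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, PAGE_SIZE),
        ).fetchall()
        if not rows:
            break  # no rows left past last_id, so we're done
        yield from rows
        # id is the first column of example_table here; because of the
        # ORDER BY, the last row of the page holds the greatest id seen
        last_id = rows[-1][0]

conn = sqlite3.connect("example.db")
for row in fetch_all_pages(conn):
    ...  # process each row
```

Each iteration only ever asks the database for one page's worth of rows, so the cost per query stays flat no matter how deep into the table you are.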