Write to same file from multiple threads
I want to write a text file from multiple threads. The file structure is line-oriented. This means writing of lines should be atomic. I am using Qt 5.15.2.
Is it enough to protect a shared QTextStream/QFile pair using a QMutex?
EDIT:
For now I have solved it using dmckee's solution (well, a bit more complex, but it boils down exactly to your idea). However, moving the file work into a single writer thread does not scale.
I have a lot of threads writing very large amounts of data. This means the writer thread receives a heavy load and (more importantly) queued events need to carry a lot of data (which means a lot of RAM consumption). This gets even more nasty in case the writer thread is too slow. It will pile up events consuming virtually all RAM in the system. This is why I would very much prefer a mutex-like solution. Alas, is that possible?
2 answers
Because you are using Qt in particular there is something to be said for not solving this problem yourself.
Instead, create a single object that owns the stream or file and offers a writeLine slot. In the simplest case the signature might be writeLine(const QString & line). Then your threads simply signal the owner with the data they want to write, and the Qt engine takes care of the locking for you (a signal emitted from another thread is delivered to the owner's thread as a queued call).
Writing to the file on the HD is your massive bottleneck no matter how many threads you throw around. The limit is the access speed of the physical drive, not processing power. And since it is such a bottleneck, you should have a thread solely focusing on this job, similar to what @dmckee suggested.
Now what you can do is to have the file writer thread work with large chunks of fixed sizes. Don't just write a few lines each time, write a large chunk. You can have other threads preparing the data in advance.
Suppose you have some logging function where you pass on one string at a time, in some icky inconvenient format like std::string or some Qt class. Instead of writing 5 strings one at a time, each some 10 to 100 bytes long, shove these into a raw byte buffer and let it build up to a certain size. Computers love multiples of 8, so maybe work with chunks of 256 or 512 bytes at a time. And yes, we are talking about raw C strings here; forget all about "overloading ofstream", "type generic logging" and other such time-consuming fluff.
As a positive little side-effect, these raw chunks will also be very cache-friendly, unlike a bunch of heap-allocated fragments from std::string/std::vector etc. But RAM access speed is a minor concern compared to HD access speed.
"This gets even more nasty in case the writer thread is too slow: It will pile up events consuming virtually all RAM in the system."
Yeah that's the thing with queues: if your real-time specification doesn't add up, so that you never end up with an empty queue, then no amount of queueing will save you. The problem could simply be that you are saving too much data too frequently.
Make sure to benchmark on an old SATA/SCSI HD and not on an SSD.