Parent
Handling high frequency requests with cancellations in an ASP.NET Core application
Issue
I recently discussed with a friend a performance issue that he and his colleagues have encountered in an ASP.NET Core 5 application (a checkout app with a microservices architecture).
The problematic flow is related to computing the prices of basket items. The computation may take longer than it takes the operator to scan the next item. Responses can then arrive out of order, which may lead the web application (a SPA) to display an outdated computed basket.
The computation itself is hard to optimize (complex discount computations; the algorithm is O(#items)), so it can take up to 1 second for large baskets.
Currently, they are exploring two things: reducing the computation time by replacing the RDBMS with Redis, and identifying and discarding the outdated computed baskets. The solution has become pretty complex and has some non-systematic issues.
My proposal
I am wondering if a simpler solution can be used in this case:
- debounce the calls from the UI - the SPA will not immediately call the API for basket computation. Instead, it will wait for a period of time (e.g. 500 ms) after the last item is scanned
- cancellation - this article shows how a canceled browser request can lead to a SQL query cancellation. This requires awaitable calls (the project already uses async/await almost everywhere) to accept and pass on CancellationTokens.
This should reduce the strain on the API (no useless computations, since many are discarded anyway) and cancel obsolete computations (if there are some left).
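A minimal sketch of the two client-side pieces (debounce plus cancellation), assuming the SPA can pass an `AbortSignal` to whatever call reaches the API. The names `DebouncedRequester`, `trigger` and `flush` are hypothetical and not taken from the project; the 500 ms default simply mirrors the example value above.

```typescript
// Hypothetical client-side helper (not from the project): debounces basket
// recomputation and aborts the request that a newer scan makes obsolete.
type Compute<T> = (signal: AbortSignal) => Promise<T>;

class DebouncedRequester<T> {
  private readonly compute: Compute<T>;
  private readonly onResult: (result: T) => void;
  private readonly delayMs: number;
  private timer: ReturnType<typeof setTimeout> | undefined;
  private inFlight: AbortController | undefined;

  constructor(compute: Compute<T>, onResult: (result: T) => void, delayMs: number = 500) {
    this.compute = compute;
    this.onResult = onResult;
    this.delayMs = delayMs; // assumed default, matching the 500 ms example
  }

  // Call on every scanned item: restarts the wait and cancels any request in flight.
  trigger(): void {
    this.clearTimer();
    this.inFlight?.abort();
    this.inFlight = undefined;
    this.timer = setTimeout(() => this.flush(), this.delayMs);
  }

  // Sends immediately, e.g. when the operator finishes scanning.
  flush(): void {
    this.clearTimer();
    this.inFlight?.abort();
    const controller = new AbortController();
    this.inFlight = controller;
    this.compute(controller.signal).then(
      (result) => {
        // Results of cancelled calls are dropped instead of being rendered.
        if (!controller.signal.aborted) this.onResult(result);
      },
      (error) => {
        if (!controller.signal.aborted) console.error("basket computation failed", error);
      },
    );
  }

  private clearTimer(): void {
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
  }
}
```

In a browser, `compute` could wrap `fetch(url, { signal })`, since `fetch` accepts an `AbortSignal` in its options. Whether the abort also stops work on the server depends on the API observing the request's cancellation token, which is the second half of the proposal.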
As I have no experience with such performance issues, I am wondering about the possible pitfalls of this proposal.
The only downsides I have identified so far are:
- SPA changes - the app must be changed to introduce debounce and cancellation (neither is supported now)
- API changes - cancellation tokens are not yet used. However, this does not require breaking API changes and could easily be added to the appropriate methods.
I do not know if the microservices architecture itself introduces risk in this case since there are multiple instances for each service.
Post
There are two issues here: a performance problem and a correctness problem. The approach you suggest seems like it will help mitigate the performance problem while doing nothing for the correctness problem.
Having some way of cancelling the computation if the result is not going to be needed seems desirable. Even better would be having the computation be incremental so later requests don't need to redo earlier work, but I'll assume that option is off the table for the purposes of this post.
Your approach seems a bit over-complicated and has some undesirable aspects. Debouncing adds latency, and it's not clear that it will help at all, because it's not clear that load on the servers is the source of the problem. In other words, debouncing could have no effect other than turning 1 second of latency into 1.5 seconds of latency.
As far as cancellation goes, a new request for the same logical basket should already be enough of an indication that any outstanding computations for that basket should be cancelled. It shouldn't be necessary for the front-end to have any notion of "cancelling" a request.
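To make that concrete, here is a minimal in-process sketch, written in TypeScript purely for illustration. `BasketComputations`, `begin` and `end` are hypothetical names, and the sketch assumes all requests for a basket reach the same process. In ASP.NET Core the analogous building block would be a `CancellationTokenSource` kept per basket, possibly linked with `HttpContext.RequestAborted`.

```typescript
// Hypothetical in-process registry: at most one live computation per logical basket.
class BasketComputations {
  private readonly inFlight = new Map<string, AbortController>();

  // Registers a new computation and cancels the one it supersedes, if any.
  begin(basketId: string): AbortSignal {
    this.inFlight.get(basketId)?.abort();
    const controller = new AbortController();
    this.inFlight.set(basketId, controller);
    return controller.signal;
  }

  // Clears the entry, but only if it still belongs to the finishing computation,
  // so a superseded computation cannot remove its successor's entry.
  end(basketId: string, signal: AbortSignal): void {
    if (this.inFlight.get(basketId)?.signal === signal) {
      this.inFlight.delete(basketId);
    }
  }
}

// Usage inside a request handler (sketch):
//   const signal = computations.begin(basketId);
//   try { /* compute, checking signal.aborted between steps */ }
//   finally { computations.end(basketId, signal); }
```

The front-end needs no cancel operation here: sending the next request for the same basket is what triggers the cancellation.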
This segues into the correctness issues though. Out-of-order responses or even out-of-order requests should not lead to any problems at all. That they do is a significant issue itself. Each distinct request should have a unique identifier^[The identifier can be randomly generated or, if you're hardcore, you could have the identifier be a secure hash of the message contents.] that is passed through end-to-end so that the SPA can correlate responses to requests. This also allows deduplicating (on the server and the client) duplicate requests that might be produced by retries. Another useful thing to include in requests is a local logical time i.e. a very simple case of a Lamport clock. This allows the server to know the ordering of messages so it can easily ignore obsolete messages. This isn't necessary if the server's endpoints are stateless. Both of these are standard techniques in distributed computing. The end-to-end unique identifiers allow turning at-least-once semantics into exactly-once semantics, e.g. as described in section 8 of the Raft paper.
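As an illustration of the client side of this, here is a small sketch that tags each request with a unique identifier and a local logical time, then uses them to ignore duplicate and out-of-order responses. The message shapes, field names and the `BasketClient` class are assumptions made for the example, and the identifier generation is a stand-in for a proper random identifier such as a UUID.

```typescript
// Hypothetical message shapes; the field names are assumptions for illustration.
interface BasketRequest {
  requestId: string; // unique per request, passed through end-to-end
  logicalTime: number; // local counter, incremented for every request
  items: string[];
}

interface BasketResponse {
  requestId: string; // echoed back by the server
  total: number;
}

class BasketClient {
  total: number | undefined;
  private clock = 0;
  private latestApplied = 0;
  private readonly pending = new Map<string, number>(); // requestId -> logicalTime

  createRequest(items: string[]): BasketRequest {
    this.clock += 1;
    // Stand-in for a proper random identifier such as a UUID.
    const requestId = `${this.clock}-${Math.random().toString(36).slice(2)}`;
    this.pending.set(requestId, this.clock);
    return { requestId, logicalTime: this.clock, items };
  }

  // Returns true if the response was applied, false if it was ignored.
  handleResponse(response: BasketResponse): boolean {
    const sentAt = this.pending.get(response.requestId);
    if (sentAt === undefined) return false; // unknown or duplicate response
    this.pending.delete(response.requestId);
    if (sentAt <= this.latestApplied) return false; // older than what is displayed
    this.latestApplied = sentAt;
    this.total = response.total;
    return true;
  }
}
```

With this in place, a late response to an earlier request is simply dropped, so the displayed basket never goes backwards. A real client would also expire `pending` entries for requests that never get a response.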
The above should fix the correctness issues. The approach I suggested does have a bit of statefulness to it (which is a downside), so the logical times would be useful. That said, they are technically not necessary, as the retry logic you'd need anyway would resolve it. (The issue is that out-of-order requests [without logical timestamps] could lead to earlier requests cancelling later requests, and thus to the client never getting a response to its query. However, that could happen just due to network glitches anyway, so retry logic is already required.)
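A sketch of how the logical time could guard the server-side cancellation, so that an earlier request arriving late cannot cancel a later one. `LatestRequestGate` and `admit` are hypothetical names, and the sketch assumes a single client (one checkout terminal) issues the logical times for a given basket. A request carrying the same logical time as the current one is treated as a retry and restarts the computation, which is a simplification chosen for this example.

```typescript
// Hypothetical request tag; assumes one client issues the logical times per basket.
interface TaggedRequest {
  basketId: string;
  logicalTime: number;
}

interface CurrentComputation {
  logicalTime: number;
  controller: AbortController;
}

class LatestRequestGate {
  private readonly current = new Map<string, CurrentComputation>();

  // Returns a signal for the new computation, or undefined if the request is obsolete.
  admit(request: TaggedRequest): AbortSignal | undefined {
    const existing = this.current.get(request.basketId);
    if (existing !== undefined && request.logicalTime < existing.logicalTime) {
      // A later request has already been seen: ignore this one and leave it running.
      return undefined;
    }
    // Newer request (or a retry with the same time): supersede the running computation.
    existing?.controller.abort();
    const controller = new AbortController();
    this.current.set(request.basketId, { logicalTime: request.logicalTime, controller });
    return controller.signal;
  }
}
```

Entries are never removed in this sketch; a real service would drop them when a basket is closed or after some period of inactivity.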
One potential downside to any approach to cancellation is that, depending on how things are configured, the server handling the later requests may be different from the server handling the earlier requests. In this case, we either can't or don't cancel the earlier request, or the server instances need to communicate amongst themselves, which may be costly. For this scenario (with the above correctness changes), I would probably only bother to opportunistically cancel obsolete computations locally. However, depending on how the system is set up, it may be easy/cheap to coordinate the cancellation.