qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-6.1? v2 5/7] job: Add job_cancel_requested()


From: Max Reitz
Subject: Re: [PATCH for-6.1? v2 5/7] job: Add job_cancel_requested()
Date: Wed, 4 Aug 2021 10:07:15 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0

On 03.08.21 16:25, Kevin Wolf wrote:
Am 26.07.2021 um 16:46 hat Max Reitz geschrieben:
Most callers of job_is_cancelled() actually want to know whether the job
is on its way to immediate termination.  For example, we refuse to pause
jobs that are cancelled; but this only makes sense for jobs that are
really actually cancelled.

A mirror job that is cancelled during READY with force=false should
absolutely be allowed to pause.  This "cancellation" (which is actually
a kind of completion) may take an indefinite amount of time, and so
should behave like any job during normal operation.  For example, with
on-target-error=stop, the job should stop on write errors.  (In
contrast, force-cancelled jobs should not get write errors, as they
should just terminate and not do further I/O.)

Therefore, redefine job_is_cancelled() to only return true for jobs that
are force-cancelled (which as of HEAD^ means any job that interprets the
cancellation request as a request for immediate termination), and add
job_cancel_request() as the general variant, which returns true for any
jobs which have been requested to be cancelled, whether it be
immediately or after an arbitrarily long completion phase.

Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
  include/qemu/job.h |  8 +++++++-
  block/mirror.c     | 10 ++++------
  job.c              |  7 ++++++-
  3 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/include/qemu/job.h b/include/qemu/job.h
index 8aa90f7395..032edf3c5f 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -436,9 +436,15 @@ const char *job_type_str(const Job *job);
  /** Returns true if the job should not be visible to the management layer. */
  bool job_is_internal(Job *job);
-/** Returns whether the job is scheduled for cancellation. */
+/** Returns whether the job is being cancelled. */
  bool job_is_cancelled(Job *job);
+/**
+ * Returns whether the job is scheduled for cancellation (at an
+ * indefinite point).
+ */
+bool job_cancel_requested(Job *job);
I don't think non-force blockdev-cancel for mirror should actually be
considered cancellation, so what is the question that this function
answers?

"Is this a cancelled job, or a mirror block job that is supposed to
complete soon, but only if it doesn't switch over the users to the
target on completion"?

Well, technically yes, but it was more intended as “Has the user ever invoked (block-)job-cancel on this job?”.

Is this ever a reasonable question to ask, except maybe inside the
mirror implementation itself?

I asked myself the same for v3, but found two places in job.c where I would like to keep it:

First, there’s an assertion in job_completed_txn_abort().  All jobs other than @job have been force-cancelled, and so job_is_cancelled() would be true for them.  As for @job itself, the function is mostly called when the job’s return value is not 0, but a soft-cancelled mirror does have a return value of 0 and so would not end up in that function. But job_cancel() invokes job_completed_txn_abort() if the job has been deferred to the main loop, which mostly correlates with the job having been completed (in which case the assertion is skipped), but not 100 % (there’s a small window between setting deferred_to_main_loop and the job changing to a completed state). So I’d prefer to keep the assertion as-is functionally, i.e. to only check job->cancelled.

Second, job_complete() refuses to let a job complete that has been cancelled.  This function is basically only invoked by the user (through qmp_block_job_complete()/qmp_job_complete(), or job_complete_sync(), which comes from qemu-img), so I believe that it should correspond to the external interface we have right now; i.e., if the user has invoked (block-)job-cancel at one point, job_complete() should generally return an error.

job_complete() is the only function outside of mirror that seems to use
it. But even there, it feels wrong to make a difference. Either we
accept redundant completion requests, or we don't. It doesn't really
matter how the job reconfigures the graph on completion. (Also, I feel
this should really have been part of the state machine, but I'm not sure
if we want to touch it now...)

Well, yes, I don’t think it makes a difference because I don’t think anyone will first tell the job via block-job-cancel to complete without pivoting, and then change their mind and call block-job-complete after all.  (Not least because that’s an error pre-series.)

Also, I’m not even sure whether completing after a soft cancel request works.  I don’t think any of our code accounts for such a case, so I’d rather avoid allowing it if there’s no need to allow it anyway.

Max




reply via email to

[Prev in Thread] Current Thread [Next in Thread]