mjob create -w could clarify where it saw a ServiceUnavailableError error #357

@chudley

When creating a new job via mjob create -w ..., if the job/storage shard returns a ServiceUnavailableError (or any other error that would hit this condition) after the job has been created, mjob stops watching and reports the error to the client. It's easy to conclude that the job itself has failed, and the user may be inclined to try again, even though the job may in fact still be running or may already have completed.

Here's some example output showing what this looks like. In this case the job did in fact run to completion:

$ mfind -t o /richard/stor/path/to/log/files | mjob create -w -m "grep something"
62171e93-bc55-6348-e35e-fc812b2ee1f0
mjob: ServiceUnavailableError: manta is unable to serve this request

I believe the UUID being reported here means we've at least successfully created the job. The following line reporting the ServiceUnavailableError may come from mjob's poll against Manta for completion (via the job method in lib/client.js), some time after creation.

We could probably make this clearer in a few different ways:

  1. Report that the UUID identifies a successfully created job. However, consumers might rely on the first response being just a UUID.
  2. If it is in fact the poll that has failed, we could retry a couple of times and/or expand the error message to explain that the failure happened after job creation.
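
To make the second option concrete, here is a minimal sketch of what retrying the poll and expanding the error message could look like. It is not mjob's actual code: `watchJob`, `wrapPollError`, the `JobWatchError` name, and the `retries`, `delayMs`, and `schedule` options are all hypothetical, and `fetchJob(jobId, cb)` simply stands in for whatever call mjob uses to poll the job. The check for a `done` state is likewise an assumption about what the poll returns.

```javascript
'use strict';

// Hypothetical helper (not part of node-manta): wrap a poll failure so the
// message makes clear that the job was created before the error occurred.
function wrapPollError(jobId, err) {
    const wrapped = new Error(
        'job ' + jobId + ' was created, but polling for its status failed (' +
        err.name + ': ' + err.message + '); the job may still be running ' +
        'or may have completed');
    wrapped.name = 'JobWatchError'; // hypothetical error name
    wrapped.cause = err;            // keep the original error for callers
    wrapped.jobId = jobId;
    return wrapped;
}

// Hypothetical helper: poll until the job is done, tolerating up to
// `retries` consecutive poll failures before giving up.
//
// fetchJob(jobId, cb) is a stand-in for the real poll call.
// opts.schedule(fn, ms) defaults to setTimeout and is injectable for tests.
function watchJob(jobId, fetchJob, opts, cb) {
    const retries = opts.retries === undefined ? 3 : opts.retries;
    const delayMs = opts.delayMs === undefined ? 1000 : opts.delayMs;
    const schedule = opts.schedule || setTimeout;
    let failures = 0;

    function poll() {
        fetchJob(jobId, function (err, job) {
            if (err) {
                failures += 1;
                if (failures > retries) {
                    cb(wrapPollError(jobId, err));
                    return;
                }
                schedule(poll, delayMs);
                return;
            }
            failures = 0; // a successful poll resets the failure count
            if (job.state === 'done') { // assumed terminal state
                cb(null, job);
                return;
            }
            schedule(poll, delayMs);
        });
    }

    poll();
}
```

With something along these lines, the UUID on the first line of output stays untouched for consumers that parse it, and only the error text changes.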
