Skip to main content

Week 6 meeting and activities

(July 02,2024)

Attendees

Discussion

  • Gave updates and demo on previous week's work.
  • Discussed on the endpoint requirements for the Jobs API.

Activities

  • Modified /jobs, /jobs/{id}, /jobs/all endpoints considering the following points:

    1. Currently, the structure of the jobs object returned by these endpoints is as follows:

      [
      {
      "id": <job id>,
      "name": <name of the job (upload name)>,
      "queueDate": <job queue date>,
      "uploadId": <upload id of the package>,
      "userId": <user id of the user>,
      "groupId": <group id of the user>,
      "eta": <time left>,
      "status": <job status (completed, killed, failed, etc.)>
      }
      ]

      This structure has the following problems, as outlined in #1301, #1800 and #1972.

      • The structure is very simple, and does not provide much information about the job.
      • For example, the agent triggered by the job is not present in the response.
      • There is no information about child jobs being executed, and what is the status of each child job.
      • The information about status is redundant and returns invalid response for some testcases.
      • (After queue was added) The parent job status is determined by the success/failure of all the jobs related to that specific upload, and not the child jobs.
      • This, along with the current structure, makes it impossible to determine what exactly failed.
    2. To handle all the things mentioned in point 1, I modified the structure as:

      [
      {
      "id": <job id>,
      "name": <name of the job (upload name)>,
      "queueDate": <job queue date>,
      "uploadId": <upload id of the package>,
      "userId": <user id of the user>,
      "groupId": <group id of the user>,
      "eta": <time left>,
      "status": <job status (completed, killed, failed, etc.)>,
      "jobQueue": [
      {
      "jobQueueId": <job queue id>,
      "jobQueueType": <job queue type (generally agent name)>,
      "startTime": <job queue start time>,
      "endTime": <job queue end time>,
      "status": <job queue completion statu>,
      "itemsProcessed": <number of items processed>,
      "log": <location of log file if exists>,
      "dependencies": <Array: list of dependent job queue ids>,
      "itemsPerSec": <number of items processed per second>,
      "canDoActions": <job can accept new actions like pause and cancel>,
      "isInProgress": <checks if the job queue is still in progress>,
      "isReady": <is the job ready>,
      "download": {
      "text": <text for download link>,
      "link": <link to download the report
      }
      }
      ]
      }
      ]

      This structure provides a lot more information about the job, and also provides information about the child jobs being executed, and what is the status of each child job. Attaching a screenshot of the response below.

      newJobs

    3. The same structure is used for the /jobs/{id} and /jobs/all endpoints.

    4. The status returned by the endpoints now depend solely on the status of the child jobs, and not all the jobs related to the upload.

    Uplink PR feat(api): New endpoints to get/delete/restore/update scancode copyright, email, author, url findings

  • Worked on documenting some improvements in the REST API implementation.