Week 6 meeting and activities
(July 02, 2024)
Attendees
- Divij Sharma
- Katharina Ettinger
- Shaheem Azmal M MD
Discussion
- Gave updates and demo on previous week's work.
- Discussed the endpoint requirements for the Jobs API.
Activities
- Modified the /jobs, /jobs/{id}, and /jobs/all endpoints, considering the following points:
- Currently, the structure of the jobs object returned by these endpoints is as follows:
[
{
"id": <job id>,
"name": <name of the job (upload name)>,
"queueDate": <job queue date>,
"uploadId": <upload id of the package>,
"userId": <user id of the user>,
"groupId": <group id of the user>,
"eta": <time left>,
"status": <job status (completed, killed, failed, etc.)>
}
]
This structure has the following problems, as outlined in #1301, #1800, and #1972:
- The structure is very simple and does not provide much information about the job. For example, the agent triggered by the job is not present in the response.
- There is no information about the child jobs being executed or the status of each child job.
- The information about status is redundant and returns an invalid response for some test cases.
- (After the queue was added) The parent job status is determined by the success or failure of all the jobs related to that specific upload, and not by the child jobs.
- This, along with the current structure, makes it impossible to determine what exactly failed.
- To address the problems listed above, I modified the structure as follows:
[
{
"id": <job id>,
"name": <name of the job (upload name)>,
"queueDate": <job queue date>,
"uploadId": <upload id of the package>,
"userId": <user id of the user>,
"groupId": <group id of the user>,
"eta": <time left>,
"status": <job status (completed, killed, failed, etc.)>,
"jobQueue": [
{
"jobQueueId": <job queue id>,
"jobQueueType": <job queue type (generally agent name)>,
"startTime": <job queue start time>,
"endTime": <job queue end time>,
"status": <job queue completion status>,
"itemsProcessed": <number of items processed>,
"log": <location of log file if exists>,
"dependencies": <Array: list of dependent job queue ids>,
"itemsPerSec": <number of items processed per second>,
"canDoActions": <job can accept new actions like pause and cancel>,
"isInProgress": <checks if the job queue is still in progress>,
"isReady": <is the job ready>,
"download": {
"text": <text for download link>,
"link": <link to download the report>
}
}
]
}
]
This structure provides much more information about the job, including the child jobs being executed and the status of each child job. Attaching a screenshot of the response below.
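With the jobQueue array in the response, a client can work out which child job failed. The following is only a minimal sketch, assuming the response has already been parsed into Python dictionaries. The status strings ("Failed", "Killed", "Completed"), the helper name failed_job_queues, and the sample values are assumptions for illustration, not taken from the API.

```python
from typing import Any

# Assumption: these status strings are illustrative; the real API may differ.
FAILED_STATUSES = {"Failed", "Killed"}


def failed_job_queues(job: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the jobQueue entries of one job whose status marks a failure."""
    return [
        entry
        for entry in job.get("jobQueue", [])
        if entry.get("status") in FAILED_STATUSES
    ]


# Hypothetical sample shaped like the structure above (all values are made up).
sample_job = {
    "id": 1,
    "name": "example-upload",
    "status": "Failed",
    "jobQueue": [
        {"jobQueueId": 10, "jobQueueType": "agent_a",
         "status": "Completed", "dependencies": []},
        {"jobQueueId": 11, "jobQueueType": "agent_b",
         "status": "Failed", "dependencies": [10]},
    ],
}

# Print the id and type of every failed child job queue entry.
for entry in failed_job_queues(sample_job):
    print(entry["jobQueueId"], entry["jobQueueType"])
```

With the old flat structure this lookup was not possible, since only the single top-level status was available.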
- The same structure is used for the /jobs/{id} and /jobs/all endpoints.
- The status returned by the endpoints now depends solely on the status of the child jobs, and not on all the jobs related to the upload.
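One way to derive a job's status from its child jobs alone is sketched below. The precedence rules (a failure outranks everything else), the status strings, and the function name derive_job_status are assumptions for illustration only, not the actual implementation behind the endpoints.

```python
def derive_job_status(child_statuses: list[str]) -> str:
    """Derive a parent job status from its own jobQueue entries only."""
    if not child_statuses:
        # Assumption: a job with no child entries is still waiting to be scheduled.
        return "Queued"
    # Assumption: any failed or killed child decides the parent status.
    for failure in ("Failed", "Killed"):
        if failure in child_statuses:
            return failure
    if all(status == "Completed" for status in child_statuses):
        return "Completed"
    # Otherwise at least one child job has not finished yet.
    return "Processing"


print(derive_job_status(["Completed", "Failed"]))
print(derive_job_status(["Completed", "Completed"]))
```

The key point is that the input is limited to the statuses of the job's own child entries, so other jobs on the same upload cannot affect the result.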
- Worked on documenting some improvements in the REST API implementation.