
WEEK 5

(July 02, 2025)

Attendees:

Engagements

  • This week I went ahead with applying the pre-written decluttering script in the pipeline.

  • I started by running the script on its own to see what output to expect, as this gives us insight into what our decluttered output should look like.

  • While experimenting with examples, I realized that the decluttering script and its features are not very effective in general.

  • I set out to add a few regex rules to the decluttering script to improve its effectiveness a little, but this approach is limited: copyright text does not follow a fixed set of rules, so rules that work on the text we tried may not apply to other text.

  • I also integrated the available pre-written script into the pipeline.
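
To illustrate the kind of regex rule mentioned above, here is a minimal sketch. The rules, the `_RULES` list, and the `declutter` function are hypothetical and written only for this report; they are not the rules or the API of the actual Safaa decluttering script, and as noted above they may not generalize beyond the example text.

```python
import re

# Hypothetical rules for illustration only; not taken from the Safaa script.
# Each entry is (compiled pattern, replacement) and rules are applied in order.
_RULES = [
    # Drop a leading comment marker such as "//", "#", "*" or "/*".
    (re.compile(r"^\s*(?:/\*+|\*+/?|//+|#+)\s*"), ""),
    # Drop a trailing "All rights reserved." phrase (case-insensitive).
    (re.compile(r"\s*all rights reserved\.?\s*$", re.IGNORECASE), ""),
    # Collapse any run of whitespace into a single space.
    (re.compile(r"\s+"), " "),
]


def declutter(text: str) -> str:
    """Apply each illustrative rule in order and return the cleaned text."""
    for pattern, replacement in _RULES:
        text = pattern.sub(replacement, text)
    return text.strip()


if __name__ == "__main__":
    sample = "// Copyright (c) 2020 Example Corp.  All rights reserved."
    print(declutter(sample))  # prints: Copyright (c) 2020 Example Corp.
```

Keeping the rules in a plain list makes it easy to add or remove a rule and re-run the comparison against the original script's output.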

Meeting Discussion:

  • In this week's meeting, I discussed the task with the mentors, along with what I found from applying the declutter script.
  • I showed them the output I got from an example text, and also ran the test live for them as confirmation.
  • I also showed them the output from the modified script I wrote with the additional regex rules, and we compared the two outputs against each other.
  • I was instructed to create a PR for the work I have done so far on the ongoing project and submit it to the Safaa main repository.

Subsequent Steps

  • I will get a few datasets from the mentors so that I have data for training, and will carry out some experiments for our model modification and improvement task.