
Week 1

(May 31, 2024 - June 6, 2024)

Meeting 1

(June 5, 2024)

Attendees

Discussions

  • Discussed using the unified diff format to represent the data fetched from the GitHub and GitLab APIs.
  • Also discussed how to extract line numbers from the content once it is in unified diff format.
  • Discussed potential risks to keep in mind with this approach:
    • The scanner results must provide the information required to look up the line number.
    • The existing scanner results should not be affected by this change.
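
To illustrate the line-number question, here is a minimal sketch of how line numbers can be recovered from unified diff hunk headers. The function name `added_line_numbers` and the sample diff are hypothetical and only for illustration; this is not the project's actual implementation.

```python
import re

# Hunk header, e.g. "@@ -10,4 +12,6 @@ optional section text".
# A missing count (as in "@@ -1 +1 @@") means a count of 1.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def added_line_numbers(diff_text):
    """Return (new_file_line_number, text) for every added line in a unified diff."""
    results = []
    new_line = 0             # line number in the new file for the next hunk line
    old_left = new_left = 0  # lines still expected in the current hunk
    for line in diff_text.splitlines():
        if old_left <= 0 and new_left <= 0:
            # Outside a hunk: skip file headers until the next hunk header.
            match = HUNK_RE.match(line)
            if match:
                old_left = int(match.group(2) or 1)
                new_line = int(match.group(3))
                new_left = int(match.group(4) or 1)
            continue
        if line.startswith("+"):
            results.append((new_line, line[1:]))
            new_line += 1
            new_left -= 1
        elif line.startswith("-"):
            old_left -= 1    # removed line: exists only in the old file
        elif line.startswith("\\"):
            continue         # "\ No newline at end of file" marker
        else:
            new_line += 1    # context line: present in both files
            old_left -= 1
            new_left -= 1
    return results


sample_diff = "\n".join([
    "--- a/foo.py",
    "+++ b/foo.py",
    "@@ -1,3 +1,4 @@",
    " import os",
    "+# Copyright 2024 Example",
    " ",
    " def main():",
    "@@ -10,2 +11,3 @@",
    " x = 1",
    "+y = 2",
    " z = 3",
])
print(added_line_numbers(sample_diff))
```

Tracking the remaining line counts from each hunk header keeps the `---` and `+++` file headers of a following file from being mistaken for removed or added lines.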

Updates

  • Came across this thread on Stack Overflow. Used the gawk command from it as a reference and wrote a Python script to convert the API content into unified diff format.
  • Created a new class, FormatResult, to handle all formatting of the results and diff content.
  • Also created a function to extract the line number from the formatted diff content.
  • Tested both scripts extensively; both cover the potential edge cases.
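
The script itself is not reproduced in these notes. As a rough sketch of the conversion step, the snippet below assumes the per-file entries look like those returned by the GitHub API (`filename`, `previous_filename`, `patch`) and the GitLab API (`old_path`, `new_path`, `diff`). The function name `to_unified_diff` is hypothetical and is not part of FormatResult.

```python
def to_unified_diff(entry):
    """Build unified diff text from one per-file entry of an API response.

    Assumes GitHub-style entries carry "filename"/"patch" and GitLab-style
    entries carry "old_path"/"new_path"/"diff".
    """
    if "filename" in entry:  # GitHub-style entry
        new_path = entry["filename"]
        old_path = entry.get("previous_filename", new_path)
        hunks = entry.get("patch", "")  # may be absent, e.g. for binary files
    else:  # GitLab-style entry
        old_path = entry["old_path"]
        new_path = entry["new_path"]
        hunks = entry.get("diff", "")
    if not hunks:
        return ""  # nothing to scan for this file
    body = hunks.rstrip("\n") + "\n"
    if body.startswith("--- "):
        return body  # file headers already present
    return f"--- a/{old_path}\n+++ b/{new_path}\n{body}"


github_entry = {"filename": "src/a.py", "patch": "@@ -1 +1,2 @@\n x\n+y"}
gitlab_entry = {"old_path": "b.txt", "new_path": "b.txt", "diff": "@@ -1 +1 @@\n-a\n+b\n"}
print(to_unified_diff(github_entry))
print(to_unified_diff(gitlab_entry))
```

Normalizing both sources into the same text format means the line-number extraction only has to understand one input shape.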

Planning for next week

  • Use the script on the diff content and try to find the line numbers for the copyright and keyword scanners.
  • Add the relevant byte info to the JSON output of the nomos scanner.
  • Figure out how to handle repo scans.