
Weekly Sync - 9 June 2026

Attendees

Discussion

  • PR status: Reviewed the two Nirjas PRs (tooling and full refactor). Both are on track, pending minor test and documentation refinements. Agreed to prioritize reproducible dataset artifacts and clear integration points.
  • Datasets & Minerva: Discussed dataset composition for Nirjas: license corpora, non-license comments as hard negatives, and augmentations. Decided to produce explicit train/val/test splits with near-duplicate filtering.
  • Minerva integration: Agreed on a migration plan that adds the new pipeline as an optional submodule/component in Minerva, with a transition period where both pipelines run and results are compared.
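
As a rough illustration of the split and near-duplicate decision above, here is a minimal sketch using only the Python standard library. It is not the agreed implementation: the function names, the word-shingle Jaccard measure, the 0.8 threshold, and the 80/10/10 ratio are all assumptions made for this example.

```python
import hashlib
import re


def shingles(text, k=3):
    """Return the set of k-word shingles of a lowercased comment."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two shingle sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def filter_near_duplicates(samples, threshold=0.8):
    """Greedily keep a sample only if it is not too similar to any kept one.

    Quadratic in the number of samples; a real pipeline would likely use
    MinHash/LSH instead (assumption, not something decided in the meeting).
    """
    kept, kept_shingles = [], []
    for text in samples:
        s = shingles(text)
        if all(jaccard(s, other) < threshold for other in kept_shingles):
            kept.append(text)
            kept_shingles.append(s)
    return kept


def assign_split(text, val_pct=10, test_pct=10):
    """Assign a sample to train/val/test from a stable content hash.

    Hashing the content keeps the assignment reproducible across runs.
    """
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + val_pct:
        return "val"
    return "train"


if __name__ == "__main__":
    comments = [
        "Licensed under the Apache License, Version 2.0",
        "licensed under the apache license version 2.0",
        "TODO: fix this loop later",
    ]
    for comment in filter_near_duplicates(comments):
        print(assign_split(comment), "|", comment)
```

Filtering before splitting keeps near-identical comments from landing in different splits, and the hash-based assignment means the same sample always falls in the same split.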

Action Items

  • Finalize both Nirjas PRs (tooling and refactor) so they are ready to merge.
  • Finalize Nirjas dataset schema, augmentation plan, and evaluation checklist.
  • Implement near-duplicate filtering and hard-negative sampling in the data pipeline.
  • Draft the Minerva integration proposal with stepwise migration and compatibility tests.
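
For the compatibility tests in the last item, the side-by-side comparison during the transition period could look roughly like the sketch below. `compare_pipelines` and the two callables passed to it are hypothetical names for this example; nothing here reflects Minerva's actual API.

```python
def compare_pipelines(inputs, legacy, candidate):
    """Run both pipelines on the same inputs and report where they differ.

    `legacy` and `candidate` are hypothetical callables standing in for the
    existing and the new pipeline; each takes one input and returns a result
    that supports equality comparison.
    """
    mismatches = []
    total = 0
    for item in inputs:
        total += 1
        old_result = legacy(item)
        new_result = candidate(item)
        if old_result != new_result:
            mismatches.append((item, old_result, new_result))
    agreement = 1.0 if total == 0 else (total - len(mismatches)) / total
    return {"total": total, "agreement": agreement, "mismatches": mismatches}


if __name__ == "__main__":
    # Toy stand-ins for the two pipelines (assumptions for illustration).
    report = compare_pipelines(
        ["MIT", " GPL "],
        legacy=lambda s: s.lower(),
        candidate=lambda s: s.strip().lower(),
    )
    print(report["agreement"], report["mismatches"])
```

Recording the individual mismatches, not only the agreement rate, makes it easier to judge whether a difference is a regression or an intended improvement.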