PR status: Reviewed the two Nirjas PRs (tooling and full refactor). Both on track pending minor test and documentation refinements. Agreed on prioritizing reproducible dataset artifacts and clear integration points.
Datasets & Minerva: Discussed dataset composition for Nirjas: license corpora, non-license comments as hard negatives, and augmentations. Decided to produce explicit train/val/test splits with near-duplicate filtering.
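Illustrative sketch (not discussed in the meeting): one possible way to combine near-duplicate filtering with reproducible splits. The function names, the word 3-gram Jaccard measure, the 0.8 threshold, and the 80/10/10 ratio are all assumptions chosen for illustration, not decisions recorded here. Filtering runs before splitting so that near-identical comments cannot land on both sides of a split.

```python
import hashlib
import re


def shingles(text, n=3):
    """Return the set of word n-grams of a lowercased, tokenized text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < n:
        return {" ".join(words)}
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def drop_near_duplicates(samples, threshold=0.8):
    """Greedy O(n^2) filter: keep a sample only if its similarity to every
    previously kept sample is below the (assumed) threshold."""
    kept, kept_shingles = [], []
    for text, label in samples:
        s = shingles(text)
        if all(jaccard(s, other) < threshold for other in kept_shingles):
            kept.append((text, label))
            kept_shingles.append(s)
    return kept


def assign_split(text, val_pct=10, test_pct=10):
    """Deterministic split from a hash of the text, so reruns reproduce
    the same partition without storing a random seed."""
    bucket = int(hashlib.sha256(text.encode("utf-8")).hexdigest(), 16) % 100
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + val_pct:
        return "val"
    return "train"


# Hypothetical samples: (comment text, label)
samples = [
    ("Licensed under the MIT License, see LICENSE file", "license"),
    ("licensed under the MIT license,  see LICENSE file.", "license"),
    ("TODO: refactor this loop later", "non_license"),
]
kept = drop_near_duplicates(samples)
splits = [(text, label, assign_split(text)) for text, label in kept]
```

The pairwise comparison is quadratic in the number of samples; for a large corpus, an approximate scheme such as MinHash would be the usual substitute.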
Minerva integration: Agreed on a migration plan that adds the new pipeline as an optional submodule/component in Minerva, with a transition period where both pipelines run and results are compared.
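Illustrative sketch (not discussed in the meeting): during the transition period, the side-by-side comparison could look roughly like the following. `compare_pipelines`, `ComparisonReport`, and the two stand-in pipeline callables are hypothetical names; nothing here reflects actual Minerva or Nirjas APIs.

```python
from dataclasses import dataclass, field


@dataclass
class ComparisonReport:
    total: int = 0
    agreed: int = 0
    disagreements: list = field(default_factory=list)

    @property
    def agreement_rate(self):
        return self.agreed / self.total if self.total else 1.0


def compare_pipelines(inputs, legacy, candidate):
    """Run both pipelines on every input and record where they differ."""
    report = ComparisonReport()
    for item in inputs:
        old, new = legacy(item), candidate(item)
        report.total += 1
        if old == new:
            report.agreed += 1
        else:
            report.disagreements.append((item, old, new))
    return report


# Stand-in pipelines (hypothetical): each maps a comment to a label.
def legacy_pipeline(text):
    return "license" if "license" in text.lower() else "other"


def new_pipeline(text):
    lowered = text.lower()
    return "license" if "license" in lowered or "copyright" in lowered else "other"


report = compare_pipelines(
    ["MIT License", "Copyright 2020", "fix bug"],
    legacy_pipeline,
    new_pipeline,
)
```

Keeping the disagreeing items, rather than only an agreement rate, makes it possible to review each difference before the old pipeline is retired.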