Week 1
Meeting
Date: May 26, 2026
Attendees:
Summary:
- Discussed the behaviour changes in the CPP reuser compared to the PHP reuser.
- Discussed the testing of the CPP reuser and the benchmarks.
- Discussed how the smart reuse suggestions should work.
- Shaheem suggested moving the Bulk reuse feature from Decider to the reuser.
Progress
Tested the CPP reuser PR #3609 and noted the following behaviour changes:
1. Reuse Skips Already-Cleared Files
The C++ reuser now skips files that already have clearing decisions, so reuse data is not re-applied to files that have already been cleared.
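The skip rule can be sketched as a simple filter. This is a minimal illustration and not the code from the PR: `TargetItem`, `selectItemsForReuse`, and keying the cleared set by upload tree item id are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for one file of the target upload.
struct TargetItem {
    std::int64_t uploadTreeId;  // position of the file in the upload tree
    std::int64_t pfileId;       // content identity of the file
};

// Returns the items that reuse may still touch: every item whose id is
// absent from clearedItemIds (assumed to hold items with a clearing decision).
std::vector<TargetItem> selectItemsForReuse(
    const std::vector<TargetItem>& items,
    const std::unordered_set<std::int64_t>& clearedItemIds)
{
    std::vector<TargetItem> pending;
    for (const TargetItem& item : items) {
        if (clearedItemIds.count(item.uploadTreeId) != 0) {
            continue;  // already cleared: leave the existing decision alone
        }
        pending.push_back(item);
    }
    assert(pending.size() <= items.size());  // a filter never adds items
    return pending;
}
```

Filtering before applying reuse keeps the existing decisions untouched and avoids doing work for files that need none.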
2. Decision Priority Handling
A new getDecisionTypePriority() function has been introduced to determine the effective clearing decision when the same pfile appears in multiple locations with different decision types. This ensures that the most appropriate decision is selected instead of simply using the latest one.
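To illustrate priority-based selection, here is a minimal sketch. The ranking, the `DecisionType` enum, and the helpers `sketchDecisionTypePriority` and `pickEffectiveDecision` are hypothetical; the actual ordering used by getDecisionTypePriority() in the PR may differ.

```cpp
#include <cassert>
#include <vector>

// Hypothetical decision types, used only for this example.
enum class DecisionType { ToBeDiscussed, Irrelevant, NonFunctional, DoNotUse, Identified };

// ASSUMED ranking: a larger number wins. Not taken from the PR.
int sketchDecisionTypePriority(DecisionType type) {
    switch (type) {
        case DecisionType::Identified:    return 5;
        case DecisionType::DoNotUse:      return 4;
        case DecisionType::NonFunctional: return 3;
        case DecisionType::Irrelevant:    return 2;
        case DecisionType::ToBeDiscussed: return 1;
    }
    return 0;  // unknown types rank lowest
}

struct Decision {
    DecisionType type;
    long timestamp;  // larger means more recent
};

// Picks the decision with the highest priority; the timestamp only breaks ties.
// Precondition: candidates is not empty.
Decision pickEffectiveDecision(const std::vector<Decision>& candidates) {
    assert(!candidates.empty());
    Decision best = candidates.front();
    for (const Decision& d : candidates) {
        const int p = sketchDecisionTypePriority(d.type);
        const int bestP = sketchDecisionTypePriority(best.type);
        if (p > bestP || (p == bestP && d.timestamp > best.timestamp)) {
            best = d;
        }
    }
    return best;
}
```

With this ranking, an older Identified decision is preferred over a newer ToBeDiscussed one, which is the difference from a purely "latest wins" rule.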
3. Heartbeat Handling
The reuserCopyrights() function did not invoke fo_scheduler_heart() for each processed event, whereas the PHP reuser calls $this->heartbeat(1) inside the processing loop. Added fo_scheduler_heart(1) in reuserCopyrights() after each event to match the PHP behaviour.
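The heartbeat pattern looks roughly like the sketch below. It is self-contained, so the scheduler call is injected as a callback instead of linking against libfossology; `CopyrightEvent` and `processEvents` are hypothetical names, and in the agent the callback would be fo_scheduler_heart.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for one copyright event to be reused.
struct CopyrightEvent {
    long id;
};

// Processes every event and reports one unit of progress after each,
// mirroring $this->heartbeat(1) inside the PHP reuser's loop.
std::size_t processEvents(const std::vector<CopyrightEvent>& events,
                          const std::function<void(int)>& heartbeat) {
    assert(static_cast<bool>(heartbeat));  // a callback must be supplied
    std::size_t processed = 0;
    for (const CopyrightEvent& event : events) {
        (void)event;   // the real agent would copy the event here
        ++processed;
        heartbeat(1);  // one heartbeat per processed event
    }
    return processed;
}
```

Reporting inside the loop, and not once at the end, keeps the processed-item count moving while a large upload is being handled.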
Testing
Refer to PR #3655 for full details.
The C++ reuser was tested against the PHP reuser using the following uploads:
- Source uploads: linux-6.7.tar.xz, linux-6.9.tar.xz
- Target upload: linux-6.8.tar.xz
Functional Verification
The following reuse modes were tested and behaved as expected, matching the PHP reuser:
- REUSE_MAIN
- REUSE_CONF
- REUSE_ENHANCED
- REUSE_COPYRIGHT
Additionally:
- Unit tests pass successfully.
- Debian packaging changes were verified and look good.
Performance Comparison
The same test scenario was executed three times under identical conditions, and the mean elapsed time was recorded:
| Reuser | Mean Elapsed Time (hh:mm:ss) |
|---|---|
| PHP Reuser | 00:04:05.325627 |
| C++ Reuser | 00:03:52.083240 |
Improvement: the C++ reuser was ~13.24 seconds (about 5.4%) faster than the PHP reuser.
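The difference can be reproduced from the two mean times in the table. The helper names below (`elapsedToSeconds`, `savedSeconds`, `savedPercent`) are made up for this sketch, which assumes the times are formatted as hh:mm:ss with fractional seconds.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Converts an "hh:mm:ss.ffffff" string into seconds.
double elapsedToSeconds(const std::string& text) {
    int hours = 0;
    int minutes = 0;
    double seconds = 0.0;
    const int matched = std::sscanf(text.c_str(), "%d:%d:%lf", &hours, &minutes, &seconds);
    assert(matched == 3);  // all three fields must be present
    (void)matched;
    return hours * 3600.0 + minutes * 60.0 + seconds;
}

// Seconds saved by the candidate relative to the baseline.
double savedSeconds(const std::string& baseline, const std::string& candidate) {
    return elapsedToSeconds(baseline) - elapsedToSeconds(candidate);
}

// The same saving expressed as a percentage of the baseline time.
double savedPercent(const std::string& baseline, const std::string& candidate) {
    return 100.0 * savedSeconds(baseline, candidate) / elapsedToSeconds(baseline);
}
```

Calling `savedSeconds("00:04:05.325627", "00:03:52.083240")` gives the absolute saving, and `savedPercent` with the same arguments puts it in relation to the PHP reuser's run time.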