atarashi.libs.license_clustering module

Copyright 2018 Aman Jain (amanjain5221@gmail.com)

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License version 2 as published by the Free Software Foundation. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.

atarashi.libs.license_clustering.cluster_licenses(licenseList, verbose=0)
Parameters: licenseList – Path to the processed license list
Returns: Array of clusters of license short names
atarashi.libs.license_clustering.refine_cluster(license_cluster, verbose=0)
Parameters: license_cluster – Initial license cluster, grouped by shared root license name
Returns: Refined array of license clusters, grouping licenses whose cosine similarity is >= MAX_ALLOWED_DISTANCE (0.97)
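
The similarity test described for refine_cluster can be illustrated with a minimal sketch. This is not the module's implementation: the helper names (cosine_similarity, similar_pairs), the whitespace tokenization, and the plain term-count vectors are all assumptions made for illustration. Only the 0.97 threshold comes from the documentation above.

```python
import math
from collections import Counter

# Threshold quoted in the docs for MAX_ALLOWED_DISTANCE.
SIMILARITY_THRESHOLD = 0.97


def cosine_similarity(text_a, text_b):
    """Cosine similarity of two texts using raw term counts (assumed vectorization)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def similar_pairs(texts, threshold=SIMILARITY_THRESHOLD):
    """Return pairs of names whose texts meet the similarity threshold.

    `texts` is a hypothetical mapping of license short name -> license text.
    """
    names = sorted(texts)
    return [
        (x, y)
        for i, x in enumerate(names)
        for y in names[i + 1:]
        if cosine_similarity(texts[x], texts[y]) >= threshold
    ]
```

Pairs produced this way have the same shape as the input that union_and_find expects: each pair names two licenses that should end up in the same cluster.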
atarashi.libs.license_clustering.union_and_find(arr)

Implements the union-find algorithm (a graph algorithm).

Parameters: arr – Array of pairs of licenses that should be in the same cluster
Returns: Nested array of license clusters
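
To show how pairs can be merged into clusters, here is a minimal union-find sketch. It is an illustration under stated assumptions, not the module's code: the function name union_and_find_sketch is hypothetical, and the input is assumed to be an iterable of two-element pairs of hashable license names.

```python
def union_and_find_sketch(pairs):
    """Group items into clusters given pairs that must share a cluster."""
    parent = {}  # maps each item to its parent; a root is its own parent

    def find(x):
        parent.setdefault(x, x)
        # Walk up to the root of x's tree.
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the path directly at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for a, b in pairs:
        union(a, b)

    # Collect items by their root to form the nested cluster list.
    clusters = {}
    for item in parent:
        clusters.setdefault(find(item), []).append(item)
    return list(clusters.values())
```

For example, the pairs ("GPL-2.0", "GPL-2.0+") and ("GPL-2.0+", "GPL-2.0-only") share a member, so all three names land in one cluster, while an unrelated pair such as ("MIT", "MIT-0") forms its own.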