atarashi.agents.cosineSimNgram module
Copyright 2018 Aman Jain (amanjain5221@gmail.com)
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License version 2 as published by the Free Software Foundation. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
class atarashi.agents.cosineSimNgram.NgramAgent(licenseList, ngramJson, algo=<NgramAlgo.bigramCosineSim: 3>)
Bases: atarashi.agents.atarashiAgent.AtarashiAgent
class NgramAlgo
Bases: enum.Enum
An enumeration of the similarity algorithms that can be passed as the algo argument of NgramAgent.
bigramCosineSim = 3
cosineSim = 1
diceSim = 2
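The member values listed above can be reproduced with a plain enum.Enum. The following is a minimal standalone sketch, not the module's actual source: in atarashi the enumeration is documented as part of NgramAgent, whereas here it is declared at top level purely for illustration.

```python
from enum import Enum


class NgramAlgo(Enum):
    """Standalone illustration of the documented members and values."""
    cosineSim = 1
    diceSim = 2
    bigramCosineSim = 3


# Members can be looked up by value or by name.
default_algo = NgramAlgo(3)          # the documented default for `algo`
by_name = NgramAlgo["diceSim"]
```

Looking a member up by value, as in NgramAlgo(3), is handy when the algorithm is chosen from a numeric setting; lookup by name suits command-line strings.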
_NgramAgent__Ngram_guess(processedData)
Parameters: processedData – Processed data from the input file
Returns: Possible licenses contained in the input file, based on matching unique N-grams from Ngram_keywords.json
_NgramAgent__bigram_tokenize(s)
Parameters: s – Input string to create tokens from
Returns: Array of bi-gram tokens
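To illustrate what bi-gram tokens look like, here is a minimal sketch. It assumes word-level bi-grams over whitespace-separated tokens, which this page does not confirm; the function name bigram_tokenize is hypothetical and is not the private method documented above.

```python
def bigram_tokenize(s):
    """Return adjacent word pairs of `s` (assumed word-level bi-grams)."""
    words = s.split()  # assumption: tokens are separated by whitespace
    # Pair each word with its successor; fewer than two words gives [].
    return [f"{first} {second}" for first, second in zip(words, words[1:])]


tokens = bigram_tokenize("permission is hereby granted")
```

With this definition, a string of n words yields n - 1 tokens, so a single-word input produces an empty list.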
scan(inputFile)
Parameters: inputFile – Path of the input file to be scanned
Returns: Array of JSON objects with the scan results for the file. Each object has the following fields:
shortname – Short name of the license
sim_type – Type of similarity from which the result is generated
sim_score – Similarity score for the algorithm named in sim_type
desc – Description/comments for the similarity measure
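As a rough picture of how a similarity score and a result entry of this shape could be produced, the sketch below computes cosine similarity over token counts and wraps it in a dictionary with the four documented fields. The helper names cosine_similarity and make_result, the sample tokens, and the "CosineSim" label are assumptions for illustration only; the agent's real scoring and output values may differ.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two token lists, using term counts."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty input has no direction to compare
    return dot / (norm_a * norm_b)


def make_result(shortname, sim_type, sim_score, desc=""):
    """Build one entry with the fields listed for scan() (hypothetical helper)."""
    return {
        "shortname": shortname,
        "sim_type": sim_type,
        "sim_score": sim_score,
        "desc": desc,
    }


score = cosine_similarity(["free", "software"], ["free", "license"])
result = make_result("GPL-2.0", "CosineSim", score)  # illustrative values
```

Counting terms with collections.Counter keeps the sketch dependency-free; a production agent would more likely work on weighted vectors.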
-
class