atarashi.libs.commentPreprocessor module

Copyright 2018 Aman Jain (amanjain5221@gmail.com)

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License version 2 as published by the Free Software Foundation. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.

class atarashi.libs.commentPreprocessor.CommentPreprocessor[source]

Bases: object

static extract(inputFile)[source]

Extract the comments from the given input file and return the path of a temporary file, stored by the OS, that contains them. Comments are read from the different supported file types.

Parameters: inputFile – Location of the input file from which comments need to be extracted
Returns: Path of the temporary file created by the OS
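
The contract described above (a file path goes in, the path of a temporary file holding the comments comes out) can be illustrated with a minimal sketch. This is not the Atarashi implementation: the function name extract_hash_comments is hypothetical, and it assumes that only whole-line "#" comments matter, whereas the real method handles several file types.

```python
import os
import tempfile


def extract_hash_comments(input_file):
    """Hypothetical stand-in: copy whole-line '#' comments into a temp file."""
    comments = []
    with open(input_file, encoding="utf-8") as source:
        for line in source:
            stripped = line.strip()
            # Simplifying assumption: only whole-line '#' comments are recognised.
            if stripped.startswith("#"):
                comments.append(stripped.lstrip("#").strip())

    # delete=False keeps the file on disk after the handle is closed,
    # so the caller receives a usable path and must remove it later.
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".txt", delete=False, encoding="utf-8"
    ) as target:
        target.write("\n".join(comments))
        return target.name
```

Because the temporary file is not deleted automatically, a caller of a function shaped like this would typically remove it with os.remove once the comments have been read.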
static preprocess(data)[source]
  • All whitespace should be treated as a single blank space
  • All upper case and lower case letters should be treated as lower case letters
  • “(c)” or “Copyright” should be considered equivalent and interchangeable
  • Any hyphen, dash, en dash, em dash, or other variation should be considered equivalent.
  • Remove the exceptional characters
Parameters: data – Contents of the input file as a string
Returns: The data, preprocessed according to the rules listed above
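
The normalization rules listed for preprocess can be sketched roughly as follows. This is an illustrative approximation, not the library's code: the name preprocess_sketch is hypothetical, the choice to map "(c)" onto "copyright" is one possible way of making the two interchangeable, and "exceptional characters" is assumed to mean anything other than ASCII letters, digits, hyphens, and whitespace.

```python
import re


def preprocess_sketch(data):
    """Hypothetical normalisation following the documented rules."""
    # Treat upper and lower case letters as lower case.
    text = data.lower()
    # Make "(c)" and "copyright" interchangeable by mapping to one token.
    text = text.replace("(c)", "copyright")
    # Map common dash variants (hyphen, en dash, em dash, minus) to "-".
    text = re.sub(r"[\u2010-\u2015\u2212]", "-", text)
    # Assumption: "exceptional characters" are everything except ASCII
    # letters, digits, hyphens and whitespace; replace them with a space.
    text = re.sub(r"[^a-z0-9\-\s]", " ", text)
    # Treat all whitespace runs as a single blank space.
    return re.sub(r"\s+", " ", text).strip()
```

For instance, under these assumptions the input "GNU  General\tPublic License" would come back as "gnu general public license".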