prototype.utility_funcs package

Submodules

prototype.utility_funcs.io_agent module

@file: io_agent.py Created on 11.12.2016 19:59 @project: GitHubRepositoryClassifier

@author: QueensGambit

The InputOutputAgent loads data (JSON data, README, ...) from a given repository, which
is defined by strUser and strName. If the needed data has already been requested before, it is loaded from a file. Otherwise, a new connection is created. By default, the connection is authorized with an API token.
class prototype.utility_funcs.io_agent.InputOutputAgent(strUser, strName)[source]

Bases: object

getReadme(strPathReadme)[source]

Gets the content of the README as a string. The README is loaded either from a file or from the web.

Parameters:strPathReadme – path where the README is loaded from and exported to
Returns:
loadJSONdata(strPathJSON)[source]

Loads the requested JSON data either from a file or, alternatively, from the web. Files are exported to the ‘./json/’ directory if they were requested.
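
The file-or-web behaviour described above can be illustrated with a minimal sketch. It is not the InputOutputAgent implementation: the function name load_json_cached, the redownload flag, and the fetch callable (a stand-in for the actual web request) are assumptions made for this example.

```python
import json
import os


def load_json_cached(path, fetch, redownload=False):
    """Return JSON data from `path` if it exists, else call `fetch()` and store the result.

    `fetch` is a hypothetical zero-argument callable standing in for the web request.
    """
    # Reuse the exported file unless a fresh download was explicitly requested.
    if not redownload and os.path.isfile(path):
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)

    # Otherwise request the data and export it for later calls.
    data = fetch()
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)
    return data
```

In this sketch, the redownload flag plays the role that setRedownload appears to play for the agent: when it is set, the cached file is ignored and overwritten.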

static setRedownload(bRedownload)[source]

Sets whether the README and the JSON file shall be downloaded again

Parameters:bRedownload – boolean (true or false)
Returns:
static setWithToken(bWithToken)[source]

Sets whether the GitHub token shall be used for the connection to GitHub

Parameters:bWithToken – boolean (true or false)
Returns:

prototype.utility_funcs.preprocessing_operations module

prototype.utility_funcs.preprocessing_operations.createVoabularyFeatures(lstRepos)[source]

Creates the vocabulary list from the given list of GithubRepo objects

Parameters:lstRepos – list of GithubRepo-Objects
Returns:vocabList - list of the feature names
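
As a rough illustration of how a list of feature names might be derived from several repositories, the sketch below builds a sorted vocabulary from per-repository word lists. The name build_vocabulary, the min_doc_count parameter, and the use of plain word lists in place of GithubRepo objects are assumptions for this example, not the module's actual behaviour.

```python
from collections import Counter


def build_vocabulary(lst_word_lists, min_doc_count=1):
    """Return a sorted list of words that occur in at least `min_doc_count` word lists.

    Each inner list is a hypothetical stand-in for the words of one repository.
    """
    doc_counts = Counter()
    for words in lst_word_lists:
        # Count each word once per repository, not once per occurrence.
        doc_counts.update(set(words))
    return sorted(w for w, n in doc_counts.items() if n >= min_doc_count)
```

Sorting the result keeps the feature order stable between runs, which matters if the list is stored and read back later.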
prototype.utility_funcs.preprocessing_operations.initInputParameters(strVocabPath, lstGithubRepo)[source]

Initializes the vocabulary set

Parameters:
  • strVocabPath – path where the vocab list is stored
  • lstGithubRepo – list of the githubRepository-objects
Returns:

prototype.utility_funcs.preprocessing_operations.readVocabFromFile(strVocabPath)[source]

reads the stored vocab list from a given file-path

Parameters:strVocabPath – path where the vocab is stored
Returns:

prototype.utility_funcs.reliableNormalizer module

@file: reliableNormalizer.py Created on 15.01.2017 15:53 @project: GitHubRepositoryClassifier

@author: Lukas

Do not use this module.

class prototype.utility_funcs.reliableNormalizer.ReliableNormalizer(use_log=True)[source]

Bases: object

fit(input_array)[source]
fit_transform(input_array)[source]
log(input_array)[source]
transform(input_array)[source]

prototype.utility_funcs.string_operation module

prototype.utility_funcs.string_operation.prepare_words(raw_text, bApplyStemmer=True, bCheckStopWords=False)[source]

Prepares the words for the comparison with the vocab list

Parameters:
  • raw_text – text with control characters, number,
  • bApplyStemmer – true if stemming shall be applied
  • bCheckStopWords – true if stopwords shall be removed
Returns:normed word list
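
The following sketch shows one way such a normalization step could look. It is an assumption-laden illustration, not the code behind prepare_words: the tiny STOP_WORDS set and the naive_stem suffix stripper are simplified stand-ins for a real stop-word list and stemmer, and all names are hypothetical.

```python
import re

# Tiny illustrative stop-word set (assumption; a real list would be much larger).
STOP_WORDS = {"a", "an", "and", "is", "of", "the", "to"}


def naive_stem(word):
    """Strip a few common suffixes; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def prepare_words_sketch(raw_text, apply_stemmer=True, check_stop_words=False):
    """Lowercase the text, keep alphabetic tokens only, then filter and stem."""
    # Control characters, digits and punctuation are dropped by the pattern.
    words = re.findall(r"[a-z]+", raw_text.lower())
    if check_stop_words:
        words = [w for w in words if w not in STOP_WORDS]
    if apply_stemmer:
        words = [naive_stem(w) for w in words]
    return words


print(prepare_words_sketch("Parsing 3 files\n\tand the tests!", check_stop_words=True))
# → ['pars', 'file', 'test']
```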

prototype.utility_funcs.string_operation.validate_txtfile(path)[source]

Checks whether the file type is txt or not

Parameters:path – path to file
Returns:

prototype.utility_funcs.string_operation.validate_url(url_in)[source]

Performs some simple string checks to validate the URL for further processing.

Parameters:url_in – The URL to perform the checks on
Returns:error: errorcode
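
The exact checks and error codes are not listed here, so the sketch below only illustrates the general idea of simple URL checks that return an error code. The function name validate_github_url, the specific checks (scheme, host, user/repository path), and the numeric codes are all assumptions for this example.

```python
from urllib.parse import urlparse

# Hypothetical error codes; the real function's codes are not documented here.
OK, ERR_SCHEME, ERR_HOST, ERR_PATH = 0, 1, 2, 3


def validate_github_url(url_in):
    """Run a few simple checks on a repository URL and return an error code."""
    parsed = urlparse(url_in.strip())
    if parsed.scheme not in ("http", "https"):
        return ERR_SCHEME
    if parsed.netloc.lower() not in ("github.com", "www.github.com"):
        return ERR_HOST
    # A repository URL needs at least a user and a repository name in its path.
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        return ERR_PATH
    return OK
```

Returning a code rather than raising an exception mirrors the "Returns: errorcode" convention above and lets the caller decide how to report the problem.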

Module contents