twitterpersona.preprocessing

Module Contents

Functions

generalPreprocessing(df) → pandas.DataFrame

Perform general preprocessing on df. Removes retweets/favourites and strips URLs, mentions, numbers, and stop words.

twitterpersona.preprocessing.generalPreprocessing(df: pandas.DataFrame) → pandas.DataFrame

Perform general preprocessing on df. Removes retweets/favourites and strips URLs, mentions, numbers, and stop words.

Parameters:
  • df (pd.DataFrame) – A dataframe holding the raw tweet data, including a text column.

  • output_path (str) – The path where the newly generated CSV should be written.

Returns:

df – The processed tweet dataframe.

Return type:

pd.DataFrame

Examples

generalPreprocessing(df)
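
The cleaning steps named in the description (dropping retweets, then stripping URLs, mentions, numbers, and stop words) can be sketched roughly as below. This is an illustration only, not the library's implementation: clean_text, is_retweet, the regular expressions, the "RT @" retweet convention, and the tiny stop-word set are all hypothetical stand-ins chosen for the example.

```python
import re

# Hypothetical stop-word set for illustration; the library's actual list is not shown here.
STOP_WORDS = {"a", "an", "the", "is", "to", "and", "of", "in", "at"}

# Assumed patterns; the real module may match these tokens differently.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NUMBER_RE = re.compile(r"\d+")


def clean_text(text: str) -> str:
    """Strip URLs, mentions, numbers, and stop words from one tweet."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NUMBER_RE.sub(" ", text)
    # Splitting on whitespace also collapses the gaps left by the substitutions.
    tokens = [t for t in text.split() if t.lower() not in STOP_WORDS]
    return " ".join(tokens)


def is_retweet(text: str) -> bool:
    """Assumed convention: a retweet's text starts with 'RT @'."""
    return text.startswith("RT @")


tweets = [
    "RT @someone: look at this",
    "Meeting @bob at 10 to review the draft https://example.com/doc",
]
# Drop retweets first, then clean what remains.
cleaned = [clean_text(t) for t in tweets if not is_retweet(t)]
print(cleaned)
```

With a dataframe, the same idea would presumably be applied to the text column (for instance via a boolean filter followed by Series.apply), but the column name and exact behaviour depend on the library itself.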