Python bad word filter

wordfilter 0.2.7

A small module meant for use in text generators that lets you filter strings for bad words.

Metadata

License: MIT License (MIT)

Author: Darius Kazemi

Requires: Python >=3

Project description

wordfilter

A small module meant for use in text generators. It lets you filter strings for bad words.

Getting Started

Install the Node.js module with: npm install wordfilter

Or, for Python, install the module with: pip install wordfilter

Documentation

This is a word filter adapted from code that I use in a lot of my Twitter bots. It is based on a list of words that I’ve hand-picked for exclusion from my bots: essentially, it’s a list of things that I would not say myself. Generally speaking, they are "words of oppression", aka racist/sexist/ableist things that I would not say.

The list is not all-inclusive, and I’m always adding words to it. If you’d like to file an issue or a pull request to add more words, please do so, but understand that this is primarily for use in my own projects, and I may not agree to add certain words. (For example, I have no problem with scatological words, so "shit" and "fuck" will never be on this list.)

Words are case insensitive.

Also note that due to the complexities of the English language, I am considering anything containing the substring of a bad word to be blacklisted. For example, even though "homogenous" is not a bad word, it contains the substring "homo" and it gets filtered. The reason for this is that new slang pops up all the time using compound words and I can’t possibly keep up with it. I’m willing to lose a few words like "homogenous" and "Pakistan" in order to avoid false negatives.
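The substring rule described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the module's actual code: the function name is_blacklisted and the two mild stand-in words are assumptions made for the example.

```python
def is_blacklisted(text, blocklist):
    """Return True if any blocklisted word occurs anywhere inside text."""
    lowered = text.lower()  # matching is case insensitive
    return any(word.lower() in lowered for word in blocklist)


# Hypothetical, deliberately mild stand-in words (not the module's real list).
BLOCKLIST = ["darn", "heck"]

print(is_blacklisted("What the HECK", BLOCKLIST))              # True
print(is_blacklisted("Checkmate in two moves", BLOCKLIST))     # True (false positive)
print(is_blacklisted("A perfectly fine sentence", BLOCKLIST))  # False
```

"Checkmate" is flagged only because it contains the substring "heck". This is the same trade-off as the "homogenous" example: some false positives are accepted in order to avoid false negatives.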

Contributing

In lieu of a formal styleguide, take care to maintain the existing coding style. Add unit tests for any new or changed functionality. Lint and test your code using Grunt.

License

Copyright (c) 2013 Darius Kazemi Licensed under the MIT license.


Detect bad words in Python using better-profanity:

https://artificialintelligencestechnology.com/

In daily life, we may come across bad words in conversation. In this tutorial, we will build a program that detects bad words in Python using the better-profanity module. This module is not part of the standard library, so it must be installed before we can import it into our code.

pip install better_profanity

After installing the module, we can use it in our code. Its profanity.censor() method filters the bad words in a text.

Example:

Look at the following example; the code is explained below.

Remove curse words in Python

    # Remove the bad words from text
    # importing package
    from better_profanity import profanity

    # text to be censored
    text = "What the hell you are doing?"

    # do censoring
    censored = profanity.censor(text)
    print(censored)

Code explanation:

  • First, import profanity from the better_profanity package.
  • Then specify the text to be censored and save it in a variable.
  • Finally, use the profanity.censor(text) method to filter the bad words.

Note that by default the module replaces curse words with the * character. We can specify a different replacement character as the second argument of censor(). The following example uses - for this.

    # Remove the bad words from text
    # importing package
    from better_profanity import profanity

    # text to be censored
    text = "What the hell you are doing?"

    # do censoring with a custom character
    censored = profanity.censor(text, '-')
    print(censored)

Check if the string contains any bad words:

In the example above, we removed the bad words from the text. What if we only want to know whether the given text contains any bad words, without removing them?

The .contains_profanity() method returns True if any word in the given string exists in the wordlist.

    from better_profanity import profanity

    # text to be checked
    text = "What the hell you are doing?"

    # check for bad words without censoring
    is_censored = profanity.contains_profanity(text)
    print(is_censored)

The code above returns True because the text contains the word ‘hell’, which is in the wordlist.
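To make the idea of a whole-word check concrete, here is a minimal pure-Python sketch. It is not how better-profanity is implemented: the function name contains_bad_word and the stand-in wordlist are assumptions for illustration only.

```python
import re


def contains_bad_word(text, wordlist):
    """Return True if any whole word in text is in wordlist (case insensitive)."""
    bad = {word.lower() for word in wordlist}
    # Split the lowercased text into runs of letters and digits.
    words = re.findall(r"[a-z0-9]+", text.lower())
    return any(word in bad for word in words)


WORDLIST = ["hell", "darn"]  # hypothetical stand-in list

print(contains_bad_word("What the hell you are doing?", WORDLIST))  # True
print(contains_bad_word("Hello there", WORDLIST))                   # False
```

Because whole words are compared rather than substrings, "Hello" is not flagged even though it starts with "hell".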

Censor swear words with a custom wordlist:

We can define our own list of bad words, and the module will then check against this list when filtering. The profanity.load_censor_words() method is used to load the custom list.

    from better_profanity import profanity

    # specify the list; 'good' is now considered a bad word
    custom_badwords = ['bad', 'hell', 'good']
    profanity.load_censor_words(custom_badwords)

    # text to be censored
    text = "You are good boy"

    # do censoring
    censored = profanity.censor(text)
    print(censored)

In the example above, we defined our own list of words and added the word ‘good’ to it. Now ‘good’ is treated as a bad word.
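In practice a custom list is often kept in a text file rather than in the code. The following sketch, which assumes a simple one-word-per-line file format and uses a hypothetical helper named load_wordlist, shows one way to read such a file in plain Python.

```python
import os
import tempfile


def load_wordlist(path):
    """Read one word per line, skipping blank lines and lowercasing."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip().lower() for line in handle if line.strip()}


# Demo only: write a hypothetical wordlist to a temporary file, then load it.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("bad\nhell\n\nGood\n")
    wordlist_path = tmp.name

custom_badwords = load_wordlist(wordlist_path)
os.remove(wordlist_path)  # clean up the temporary file

print(sorted(custom_badwords))  # ['bad', 'good', 'hell']
```

The resulting set could then be converted to a list and passed to profanity.load_censor_words().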


Censor bad words using Python

A guide to implementing a profanity text filter using Python.

Vasanth Jagadeesan

Enterprise solution architect

Censor bad words using Python - mobilelabs.in

If you are building a social media website or handling user-generated text, that content usually needs to pass a profanity filter.

"Profanity is a socially offensive use of language, which may also be called cursing, cussing, swearing, or expletives. Accordingly, profanity is language use that is sometimes deemed impolite, rude, indecent, or culturally offensive." (Wikipedia)

Using better-profanity package

It is a Python package used to censor bad words and custom-listed words in text.

It is inspired by the profanity package maintained by Ben Friedland, and it is much faster than the original.

Install better-profanity package

pip install better-profanity

How does this work?

The better-profanity package ships with a predefined set of bad words by default. It uses string comparison to match the given text against the predefined words.

We can load a custom wordlist using the load_censor_words() function.


Censor bad words

To censor bad words, use the censor() method of the profanity object from the better_profanity package. It filters the swear words from the text.

    from better_profanity import profanity

    text = 'You piec3 of sHIT.'
    censored = profanity.censor(text)
    print(censored)
    # Output: You **** of ****.
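The example above shows words being caught even when letters are swapped for digits or written in mixed case. One simple way to get that effect, sketched here as an assumption and not as the package's actual method, is to normalize each word before looking it up. The substitution table and the stand-in wordlist below are illustrative only.

```python
# Assumed substitution table; a real filter may cover many more variants,
# and some characters are ambiguous ("1" could stand for "i" or "l").
LEET = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s"}
)

WORDLIST = {"darn", "heck"}  # hypothetical stand-in list


def normalize(word):
    """Lowercase the word and undo common character substitutions."""
    return word.lower().translate(LEET)


def is_bad(word):
    """Return True if the normalized word is in the wordlist."""
    return normalize(word) in WORDLIST


print(normalize("h3cK"))  # heck
print(is_bad("D4rn"))     # True
print(is_bad("d4y"))      # False
```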

Censor words with word dividers

The better-profanity package masks words that are separated not just by spaces but also by dividers such as _, , and .

    from better_profanity import profanity

    if __name__ == "__main__":
        text = ". sh1t. hello_cat_fuck. 123"
        censored_text = profanity.censor(text)
        print(censored_text)
        # Output: ". ****. hello_cat_****. 123"
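A divider-aware censor can be sketched with a regular expression that treats every run of letters and digits as a word, so underscores and punctuation act as separators. This is an illustrative stand-in, not the package's implementation. The function name, the stand-in wordlist, and the choice to emit a fixed run of four characters (mirroring the outputs shown in the examples) are assumptions.

```python
import re


def censor(text, wordlist, censor_char="*"):
    """Replace each listed word with four censor characters."""
    bad = {word.lower() for word in wordlist}

    def mask(match):
        word = match.group(0)
        return censor_char * 4 if word.lower() in bad else word

    # [A-Za-z0-9]+ excludes "_", so underscores act as dividers too.
    return re.sub(r"[A-Za-z0-9]+", mask, text)


WORDLIST = ["darn", "heck"]  # hypothetical stand-in list

print(censor("...darn...hello_cat_HECK!!!", WORDLIST))  # ...****...hello_cat_****!!!
print(censor("Well, heck.", WORDLIST, "-"))             # Well, ----.
```

Using \w+ instead of [A-Za-z0-9]+ would treat hello_cat_HECK as a single word, because \w also matches the underscore.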

Censor words with custom character

The character passed as the second parameter of .censor() is used to replace the swear words.

    from better_profanity import profanity

    if __name__ == "__main__":
        text = "You p1ec3 of sHit."
        censored_text = profanity.censor(text, '-')
        print(censored_text)
        # Output: You ---- of ----.

Adding custom censor words

Function load_censor_words takes a List of strings as censored words. The provided list will replace the default wordlist.

    from better_profanity import profanity

    if __name__ == "__main__":
        custom_badwords = ['happy', 'jolly', 'merry']
        profanity.load_censor_words(custom_badwords)
        print(profanity.contains_profanity("Have a merry day! :)"))
        # Output: True

Conclusion

We have seen how to use a profanity filter with Python. If you like the post, please share it on social media and with your friends.


How To Do Profanity Filter With Pure Python vs. REST API


In the information age, when anyone can easily access the internet and social media, bad-word (profanity) filters are needed more than ever to keep virtual spaces safe for people to connect. This article explains how to build your own profanity filter in pure Python from scratch, compared with using the existing, mature Bad Words API created by APILayer.

APILayer, an Austrian tech company that runs a marketplace for reliable application programming interfaces (APIs), builds and maintains the Bad Words API that we will use to perform the profanity filtering.

APILayer makes cutting-edge APIs affordable for developers, startups, and enterprises. It provides a wide range of APIs, from data to machine learning, text processing, image processing, and more.


What Is A Profanity Filter Or Profanity Checker?

A profanity filter is a sort of software that searches user-generated content (UGC) and scrubs it to get rid of profanity from online forums, social networks, online stores, and other locations.

Moderators decide which words to censor, such as swear or curse words and words associated with hate speech or harassment. Although profanity filters have limited functionality and don’t assess the context of words, they are considered a good starting point for content moderation because they are easy to set up.

How To Do Profanity Check With Pure Python?

Why Python?

According to the TIOBE index, Python is the most popular programming language of 2022, and it is widely used for developing websites and applications, automating processes, and conducting data analysis. Because Python is a general-purpose language, it can be used to develop a wide range of programs and is not specialized for a single problem domain.

Using profanity library

We can perform profanity checking or filtering in pure Python using the profanity library.

profanity [1] is a Python library to check for (and clean) profanity in strings.

This library was created by Ben Friedland (@ben174 [2]).

Installation

You can easily install the profanity library with this pip command:
