remove_words

Call or Deploy remove_words?

✅ You can call this remove_words bigfunction directly from your Google Cloud Project (no install required).

  • The remove_words function is deployed in the bigfunctions GCP project, in 39 datasets covering all 39 BigQuery regions. Use the dataset in the same region as your own datasets; otherwise you may get a "function not found" error.
  • The function is public, so anyone can call it. Just copy and paste the examples below into your BigQuery console. It just works!
  • You may prefer to deploy the BigFunction in your own project if you want to build and manage your own catalog of functions. This is particularly useful if you want to create private functions (for example, functions that call your internal APIs). Discover the framework.

Public BigFunctions Datasets:

Region          Dataset
eu              bigfunctions.eu
us              bigfunctions.us
europe-west1    bigfunctions.europe_west1
asia-east1      bigfunctions.asia_east1
...             ...

Description

Signature

remove_words(string, words_to_remove)

Description

Removes every word listed in words_to_remove from string.

Examples

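-- Run the query for the dataset that matches your region; each returns the same result.
select bigfunctions.eu.remove_words('I can eat candies', ['can', 'eat']) as cleaned_string;
select bigfunctions.us.remove_words('I can eat candies', ['can', 'eat']) as cleaned_string;
select bigfunctions.europe_west1.remove_words('I can eat candies', ['can', 'eat']) as cleaned_string;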
+----------------+
| cleaned_string |
+----------------+
| I  candies     |
+----------------+
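
The sample output keeps both spaces that surrounded the removed words ("I  candies"). If single spacing matters downstream, a minimal sketch like the following could normalize it. It assumes the function leaves the surrounding whitespace untouched, as the output above suggests, and uses only the standard BigQuery functions REGEXP_REPLACE and TRIM.

```sql
-- Sketch: collapse the whitespace left behind by removed words.
-- Assumption: remove_words returns 'I  candies' (two spaces), as in the sample output.
select
  trim(
    regexp_replace(
      bigfunctions.eu.remove_words('I can eat candies', ['can', 'eat']),
      r'\s+',  -- any run of whitespace
      ' '      -- becomes a single space
    )
  ) as cleaned_string;
```

Under that assumption the result would be "I candies" with a single space.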

Need help using remove_words?

The community can help! Join the conversation on Slack.

For professional support, don't hesitate to chat with us.

Found a bug using remove_words?

If the function does not work as expected, please

  • report a bug so that it can be improved.
  • or open a discussion with the community on Slack.

For professional support, don't hesitate to chat with us.

Use cases

A common use case for the remove_words function is cleaning text data by removing stop words or unwanted terms.

Example: Product Review Analysis

Imagine you have a dataset of product reviews and you want to perform sentiment analysis. Common words like "a," "the," "and," "is," etc. (stop words) don't contribute much to the sentiment and can even skew the analysis. You can use remove_words to eliminate them:

SELECT bigfunctions.us.remove_words(review_text, ['a', 'the', 'and', 'is', 'this', 'it', 'to', 'in', 'of', 'for', 'on', 'with', 'at', 'by', 'that', 'from']) AS cleaned_review
FROM `your_project.your_dataset.product_reviews`;

This query will process each review_text and return a cleaned_review with the specified stop words removed. This cleaned text can then be used for more accurate sentiment analysis or other text processing tasks.
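
To try the idea without an existing table, the sketch below runs the same call against two inline rows. The sample reviews and the review_id column are made up for illustration. The lower() call is a defensive assumption, since this page does not say whether matching is case-sensitive.

```sql
-- Hypothetical sample rows standing in for your_project.your_dataset.product_reviews.
with product_reviews as (
  select 1 as review_id, 'This phone is great and the battery lasts for days' as review_text
  union all
  select 2, 'The screen cracked in a week'
)
select
  review_id,
  review_text,
  -- lower() is a defensive assumption: the docs do not state whether matching is case-sensitive.
  bigfunctions.us.remove_words(
    lower(review_text),
    ['a', 'the', 'and', 'is', 'this', 'it', 'to', 'in', 'of', 'for', 'on', 'with', 'at', 'by', 'that', 'from']
  ) as cleaned_review
from product_reviews;
```

Keeping review_text next to cleaned_review makes it easy to check by eye which words were actually removed before feeding the result into a later step.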

Other Use Cases:

  • Data Preprocessing for Machine Learning: Removing irrelevant or noisy words from text data before feeding it into a machine learning model can improve performance.
  • Spam Filtering: Identifying and removing common spam words from emails or messages.
  • Content Filtering: Blocking inappropriate or offensive language from user-generated content.
  • Keyword Extraction: Removing common words to identify the most important keywords in a piece of text.
  • Search Optimization: Cleaning search queries by removing unnecessary terms.

By customizing the words_to_remove array, you can tailor the remove_words function to various text cleaning and preprocessing tasks.
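
One way to customize the list without editing the query is to keep the words in a table and build the array at query time. The sketch below assumes two hypothetical inputs, blocked_words and messages, defined inline here so the query is self-contained. In practice they would be your own tables.

```sql
-- Hypothetical word list; in practice this could be a table you maintain.
with blocked_words as (
  select word from unnest(['free', 'winner', 'urgent']) as word
),
-- Hypothetical input rows.
messages as (
  select 'urgent: you are a winner of a free cruise' as message
)
select
  message,
  bigfunctions.us.remove_words(
    message,
    array(select word from blocked_words)  -- build words_to_remove from the table
  ) as cleaned_message
from messages;
```

How punctuation next to a word (such as the colon after "urgent") is treated is not described on this page, so it is worth checking the output on a few rows of your own data.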

Spread the word

BigFunctions is fully open-source. Help make it a success by spreading the word!