replace_special_characters
Call or Deploy replace_special_characters?
✅ You can call this replace_special_characters bigfunction directly from your Google Cloud Project (no install required).
- This replace_special_characters function is deployed in the bigfunctions GCP project in 39 datasets, one for each of the 39 BigQuery regions. Use the dataset in the same region as your own datasets (otherwise you may get a "function not found" error).
- The function is public, so anyone can call it. Just copy and paste the examples below into your BigQuery console. It just works!
- You may prefer to deploy the BigFunction in your own project if you want to build and manage your own catalog of functions. This is particularly useful if you want to create private functions (for example calling your internal APIs). Discover the framework
Public BigFunctions Datasets:
Region | Dataset |
---|---|
eu | bigfunctions.eu |
us | bigfunctions.us |
europe-west1 | bigfunctions.europe_west1 |
asia-east1 | bigfunctions.asia_east1 |
... | ... |
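As a minimal sketch of the region rule above, assume your data lives in a dataset located in europe-west1. The table and column names below are placeholders, not part of BigFunctions. The call would then go through the europe_west1 dataset:

```sql
-- Assumption: `your_project.your_dataset.user_reviews` is a hypothetical table
-- stored in the europe-west1 region, with a STRING column named review_text.
SELECT
  bigfunctions.europe_west1.replace_special_characters(review_text, '') AS cleaned_review
FROM `your_project.your_dataset.user_reviews`;
```

Note that the dataset name uses an underscore (europe_west1) where the region name uses a hyphen (europe-west1).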
Description
Signature
replace_special_characters(string, replacement)
Description
Replaces the most common special characters in string with replacement.
Examples
select bigfunctions.eu.replace_special_characters('%♥!Hello!*♥#', '')
select bigfunctions.us.replace_special_characters('%♥!Hello!*♥#', '')
select bigfunctions.europe_west1.replace_special_characters('%♥!Hello!*♥#', '')
+----------------+
| cleaned_string |
+----------------+
| Hello |
+----------------+
Need help using replace_special_characters?
The community can help! Join the conversation on Slack.
For professional support, don't hesitate to chat with us.
Found a bug using replace_special_characters?
If the function does not work as expected, please
- report a bug so that it can be improved,
- or open a discussion with the community on Slack.
For professional support, don't hesitate to chat with us.
Use cases
A use case for the replace_special_characters function is cleaning user-generated data before storing or processing it. Imagine you have a website where users can submit product reviews. These reviews might contain special characters like emoticons, punctuation marks beyond the standard set, or even unintended HTML entities. These characters can cause problems when:
- Storing data in a database: Some databases may not handle certain special characters correctly, leading to errors or data corruption.
- Displaying data: Special characters may not render correctly on different browsers or devices, leading to a poor user experience.
- Performing text analysis: Special characters can interfere with natural language processing tasks like sentiment analysis or topic modeling.
Using the replace_special_characters function, you could clean the user-submitted reviews before storing them in your database. For example:
SELECT bigfunctions.us.replace_special_characters(review_text, ' ') AS cleaned_review
FROM `your_project.your_dataset.user_reviews`;
This query would replace the most common special characters in the review_text column with spaces, resulting in a cleaner version of the review text that is more suitable for storage, display, and analysis. This helps keep the data consistent and spares downstream tasks from handling those characters themselves.
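Replacing characters with spaces can leave runs of consecutive spaces behind. The following is a minimal sketch, not part of the BigFunctions documentation, of storing a tidied copy of the reviews. The destination table name user_reviews_cleaned is hypothetical, and REGEXP_REPLACE and TRIM are standard BigQuery string functions.

```sql
-- Assumption: user_reviews_cleaned is a hypothetical destination table;
-- the source table and the review_text column come from the example above.
CREATE OR REPLACE TABLE `your_project.your_dataset.user_reviews_cleaned` AS
SELECT
  review_text,  -- keep the original text for reference
  TRIM(
    REGEXP_REPLACE(
      bigfunctions.us.replace_special_characters(review_text, ' '),
      r'\s+',   -- one or more whitespace characters
      ' '       -- collapsed into a single space
    )
  ) AS cleaned_review
FROM `your_project.your_dataset.user_reviews`;
```

Keeping the original column next to the cleaned one makes it easy to check what the function actually changed.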
Here's another example, focusing on creating URL-friendly strings (slugs):
SELECT bigfunctions.us.replace_special_characters('This is a product title with special characters!@#$%^&*()', '-') AS url_slug
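Depending on exactly which characters the function treats as special, this would output a string along the lines of This-is-a-product-title-with-special-characters------- which, once repeated hyphens are collapsed and the trailing hyphen is removed, could be used as a URL slug.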
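The hyphen clean-up step could be sketched as follows, using BigQuery's built-in REGEXP_REPLACE and TRIM. This is an illustration under assumptions rather than an official recipe: the pattern also folds whitespace into hyphens, in case the function leaves spaces untouched.

```sql
-- Sketch only: the title literal comes from the example above; which characters
-- get replaced depends on the function's definition of "special".
SELECT
  TRIM(
    REGEXP_REPLACE(
      bigfunctions.us.replace_special_characters(title, '-'),
      r'[\s-]+',  -- one or more whitespace characters or hyphens
      '-'         -- collapsed into a single hyphen
    ),
    '-'           -- strip leading and trailing hyphens
  ) AS url_slug
FROM UNNEST(['This is a product title with special characters!@#$%^&*()']) AS title;
```

Any character the function does not treat as special is left in the slug, so it is worth checking the result against the characters allowed in your URLs.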
In essence, the replace_special_characters BigQuery function assists in data sanitization and preparation for various uses by removing or replacing characters that could otherwise cause issues.
Spread the word
BigFunctions is fully open-source. Help make it a success by spreading the word!