replace_special_characters
replace_special_characters(string, replacement)
Description
Replaces the most common special characters in string with replacement.
Usage
Call or deploy replace_special_characters?
Call replace_special_characters directly
The easiest way to use bigfunctions
- The replace_special_characters function is deployed in 39 public datasets, covering all 39 BigQuery regions.
- It can be called by anyone: copy and paste the examples below into your BigQuery console.
- Use the public dataset in the same region as your own datasets; otherwise you may get a "function not found" error.
Public BigFunctions Datasets
Region | Dataset |
---|---|
eu | bigfunctions.eu |
us | bigfunctions.us |
europe-west1 | bigfunctions.europe_west1 |
asia-east1 | bigfunctions.asia_east1 |
... | ... |
Deploy replace_special_characters in your project
Why deploy?
- You may prefer to deploy replace_special_characters in your own project to build and manage your own catalog of functions.
- This is particularly useful if you want to create private functions (for example, functions that call your internal APIs).
- Get started by reading the framework page.
Deployment
The replace_special_characters function can be deployed with:
pip install bigfunctions
bigfun get replace_special_characters
bigfun deploy replace_special_characters
Examples
select bigfunctions.eu.replace_special_characters("%\u2665!Hello!*\u2665#", "") as cleaned_string
select bigfunctions.us.replace_special_characters("%\u2665!Hello!*\u2665#", "") as cleaned_string
select bigfunctions.europe_west1.replace_special_characters("%\u2665!Hello!*\u2665#", "") as cleaned_string
+----------------+
| cleaned_string |
+----------------+
| Hello |
+----------------+
Use cases
A common use case for the replace_special_characters function is cleaning user-generated data before storing or processing it. Imagine you have a website where users can submit product reviews. These reviews might contain special characters such as emoticons, unusual punctuation marks, or stray HTML entities. These characters can cause problems when:
- Storing data in a database: Some databases may not handle certain special characters correctly, leading to errors or data corruption.
- Displaying data: Special characters may not render correctly on different browsers or devices, leading to a poor user experience.
- Performing text analysis: Special characters can interfere with natural language processing tasks like sentiment analysis or topic modeling.
Using the replace_special_characters function, you could clean the user-submitted reviews before storing them in your database. For example:
SELECT bigfunctions.us.replace_special_characters(review_text, ' ') AS cleaned_review
FROM `your_project.your_dataset.user_reviews`;
This query would replace the special characters in the review_text column with spaces, producing a cleaner version of the review text that is better suited to storage, display, and analysis. This helps keep the data consistent and makes downstream tasks more reliable.
Here's another example, focusing on creating URL-friendly strings (slugs):
SELECT bigfunctions.us.replace_special_characters('This is a product title with special characters!@#$%^&*()', '-') AS url_slug
This would output This-is-a-product-title-with-special-characters-------, which, after collapsing the repeated hyphens and trimming the trailing one, could be used as a URL slug.
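The hyphen clean-up is not something replace_special_characters does itself. A minimal sketch of that extra step, assuming BigQuery's built-in REGEXP_REPLACE and TRIM functions and using the output string quoted above as a literal input, might look like this:

```sql
-- Sketch only: post-process the slug with BigQuery built-ins.
-- The input literal is the output quoted above, not a live function call.
SELECT
  TRIM(
    -- Collapse every run of one or more hyphens into a single hyphen.
    REGEXP_REPLACE(
      'This-is-a-product-title-with-special-characters-------',
      r'-+',
      '-'
    ),
    -- Then strip any hyphen left at the start or end of the string.
    '-'
  ) AS url_slug
-- url_slug: This-is-a-product-title-with-special-characters
```

In a real query, the literal would be replaced by the call to replace_special_characters shown earlier. Whether to also lowercase the result is a convention choice left out of this sketch.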
In essence, the replace_special_characters BigQuery function assists in data sanitization and preparation for various uses by removing or replacing characters that could otherwise cause issues.
Need help or found a bug?
Get help using replace_special_characters
The community can help! Join the conversation on Slack.
We also provide professional support.
Report a bug about replace_special_characters
If the function does not work as expected, please
- report a bug so that it can be improved,
- or open a discussion with the community on Slack.
We also provide professional support.