remove_accents
remove_accents(str)
Description
Remove accents from the string str, for example turning "Voilà !" into "Voila !".
Usage
Call or Deploy remove_accents?
Call remove_accents directly
The easiest way to use bigfunctions
- The remove_accents function is deployed in 39 public datasets, one for each of the 39 BigQuery regions.
- It can be called by anyone. Just copy and paste the examples below into your BigQuery console.
- You must use the dataset in the same region as your own datasets; otherwise you may get a "function not found" error.
Public BigFunctions Datasets
Region | Dataset |
---|---|
eu | bigfunctions.eu |
us | bigfunctions.us |
europe-west1 | bigfunctions.europe_west1 |
asia-east1 | bigfunctions.asia_east1 |
... | ... |
Deploy remove_accents in your project
Why deploy?
- You may prefer to deploy remove_accents in your own project to build and manage your own catalog of functions.
- This is particularly useful if you want to create private functions (for example, functions that call your internal APIs).
- Get started by reading the framework page.
Deployment
The remove_accents function can be deployed with:
pip install bigfunctions
bigfun get remove_accents
bigfun deploy remove_accents
Examples
select bigfunctions.eu.remove_accents("Voilà !") as cleaned_string
select bigfunctions.us.remove_accents("Voilà !") as cleaned_string
select bigfunctions.europe_west1.remove_accents("Voilà !") as cleaned_string
+----------------+
| cleaned_string |
+----------------+
| Voila !        |
+----------------+
Use cases
A common use case for the remove_accents function is standardizing text data for searching, indexing, or comparison. For example, if you have a database of customer names that contain accents, you can use this function so that a search finds a name whether or not the user types the accents in their query.
Scenario:
You have a table of customer names in BigQuery, some of which contain accents:
customer_name |
---|
José Pérez |
François Dupont |
Anna Müller |
You want to be able to search for "Jose Perez" and still find "José Pérez".
Query:
SELECT *
FROM your_table
WHERE bigfunctions.your_region.remove_accents(customer_name) = bigfunctions.your_region.remove_accents('Jose Perez');
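(Remember to replace your_region with the dataset name for the BigQuery region of your data, e.g., us, eu, or us_central1. Hyphens in region names become underscores in dataset names, as in the Public BigFunctions Datasets table above.)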
This query will remove accents from both the stored customer names and the search query, allowing you to find matches even if the accents are not typed precisely.
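The same accent-insensitive comparison can be sketched locally in Python, for instance to preview what a normalized search term looks like before running the query. This is a minimal sketch, not the implementation of remove_accents: the helper strip_accents is hypothetical, and it assumes that decomposing characters with Unicode NFD and dropping the combining marks approximates what the BigQuery function returns.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Hypothetical local stand-in for remove_accents (an assumption, not its real code)."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # combining() returns 0 for base characters and non-zero for combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


customer_names = ["José Pérez", "François Dupont", "Anna Müller"]
search_term = "Jose Perez"

# Normalize both sides, mirroring the WHERE clause of the query above.
matches = [
    name
    for name in customer_names
    if strip_accents(name) == strip_accents(search_term)
]
print(matches)  # ['José Pérez']
```

Letters that have no Unicode decomposition, such as ø or ß, pass through this sketch unchanged, so it is worth checking how the deployed function treats them before relying on either for matching.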
Other Use Cases:
- Data Cleaning: Removing accents can be a part of a broader data cleaning process to standardize text and remove inconsistencies.
- Natural Language Processing (NLP): Accents can sometimes interfere with NLP tasks like text classification or sentiment analysis, for example when the same word appears both with and without accents. Removing them may improve the accuracy of these models.
- Generating slugs or URL-friendly strings: Accents can be problematic in URLs. Removing them can create cleaner and more readable slugs.
- Matching data from different sources: If you're combining data from multiple sources that might have different conventions for accents, removing them can help standardize the data and improve matching accuracy.
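As an illustration of the slug use case above, here is a minimal Python sketch of how accent removal fits into slug generation. The helpers strip_accents and slugify are hypothetical names introduced for this example. The accent-removal step (Unicode NFD decomposition followed by dropping combining marks) is an assumption standing in for remove_accents, not its actual implementation.

```python
import re
import unicodedata


def strip_accents(text: str) -> str:
    """Hypothetical local stand-in for remove_accents (an assumption, not its real code)."""
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def slugify(text: str) -> str:
    """Build a URL-friendly slug: strip accents, lowercase, hyphenate."""
    ascii_like = strip_accents(text).lower()
    # Replace every run of characters other than a-z and 0-9 with one hyphen.
    hyphenated = re.sub(r"[^a-z0-9]+", "-", ascii_like)
    # Drop hyphens left over at either end.
    return hyphenated.strip("-")


print(slugify("François Dupont"))  # francois-dupont
print(slugify("Voilà !"))          # voila
```

Removing accents before the regular expression runs matters here: without it, "ç" and "à" would be treated as separators and replaced by hyphens instead of being kept as "c" and "a".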
Need help or Found a bug?
Get help using remove_accents
The community can help! Join the conversation on Slack.
We also provide professional support.
Report a bug about remove_accents
If the function does not work as expected, please
- report a bug so that it can be improved,
- or open a discussion with the community on Slack.
We also provide professional support.