min_max_scaler
min_max_scaler(arr)
Description
Performs min-max scaling on an array. It takes an array of numbers as input and returns an array of values scaled between 0 and 1.
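As a rough reference for what the scaling computes, here is a minimal Python sketch of the standard min-max formula, (x - min) / (max - min). It is an assumption that the BigQuery function uses exactly this formula, although it does match the documented example on this page ([1, 2, 3, 4, 5] gives [0, 0.25, 0.5, 0.75, 1]). The name `min_max_scale` and the handling of constant arrays are hypothetical choices for illustration only.

```python
def min_max_scale(values):
    """Scale numbers to the range [0, 1] using (x - min) / (max - min).

    Illustrative sketch only: the name and the error handling are
    assumptions, not the implementation of the BigQuery function.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # The range is zero, so the formula would divide by zero.
        # This page does not say how the deployed function treats this case.
        raise ValueError("all values are equal; the range is zero")
    return [(v - lo) / (hi - lo) for v in values]


print(min_max_scale([1, 2, 3, 4, 5]))  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

The smallest input always maps to 0 and the largest to 1, and the order of the elements is unchanged.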
Examples
Call or Deploy min_max_scaler?
Call min_max_scaler directly
The easiest way to use bigfunctions
- The min_max_scaler function is deployed in 39 public datasets, one for each of the 39 BigQuery regions.
- It can be called by anyone. Just copy and paste the examples below into your BigQuery console. It just works!
- Use the dataset in the same region as your own datasets; otherwise you may get a "function not found" error.
Public BigFunctions Datasets
Region | Dataset |
---|---|
eu | bigfunctions.eu |
us | bigfunctions.us |
europe-west1 | bigfunctions.europe_west1 |
asia-east1 | bigfunctions.asia_east1 |
... | ... |
Deploy min_max_scaler in your project
Why deploy?
- You may prefer to deploy min_max_scaler in your own project to build and manage your own catalog of functions.
- This is particularly useful if you want to create private functions (for example, functions that call your internal APIs).
- Get started by reading the framework page.
Deployment
The min_max_scaler function can be deployed with:
pip install bigfunctions
bigfun get min_max_scaler
bigfun deploy min_max_scaler
Example query, shown for three regions (use the dataset that matches your own region):
select bigfunctions.eu.min_max_scaler([1, 2, 3, 4, 5])
select bigfunctions.us.min_max_scaler([1, 2, 3, 4, 5])
select bigfunctions.europe_west1.min_max_scaler([1, 2, 3, 4, 5])
+-------------------------+
| scaled_array |
+-------------------------+
| [0, 0.25, 0.5, 0.75, 1] |
+-------------------------+
Need help or found a bug using min_max_scaler?
Get help using min_max_scaler
The community can help! Join the conversation on Slack.
We also provide professional support.
Report a bug about min_max_scaler
If the function does not work as expected, please
- report a bug so that it can be improved,
- or open a discussion with the community on Slack.
We also provide professional support.
Use cases
Let's say you have a table of product prices and you want to compare their relative affordability. The prices range from $10 to $1000, but you need them on a normalized scale between 0 and 1 for a machine learning model or visualization. Here's how min_max_scaler can be used:
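WITH ProductPrices AS (
  SELECT 'Product A' AS product, 10 AS price
  UNION ALL SELECT 'Product B' AS product, 50 AS price
  UNION ALL SELECT 'Product C' AS product, 200 AS price
  UNION ALL SELECT 'Product D' AS product, 1000 AS price
),
MinMaxScaledPrices AS (
  -- Aggregate once, with the same ordering for both arrays so positions line up
  SELECT
    ARRAY_AGG(product ORDER BY product) AS products,
    bigfunctions.us.min_max_scaler(ARRAY_AGG(price ORDER BY product)) AS scaled_prices
  FROM ProductPrices
)
SELECT
  products[OFFSET(i)] AS product,
  scaled_price
FROM MinMaxScaledPrices, UNNEST(scaled_prices) AS scaled_price WITH OFFSET AS i
ORDER BY i;
This query first collects the product names and the prices into two arrays using ARRAY_AGG, with the same ordering so that positions match. Then, min_max_scaler normalizes the prices within the array. Finally, UNNEST with WITH OFFSET expands the scaled array back into rows, and the offset is used to look up the matching product name, so you get each product and its scaled price on separate rows. This relies on min_max_scaler preserving element order, as it does in the [1, 2, 3, 4, 5] example above.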
This results in a table like this (values rounded to two decimal places):
product | scaled_price |
---|---|
Product A | 0 |
Product B | 0.04 |
Product C | 0.19 |
Product D | 1 |
Now "Product A", with the lowest price, has a scaled price of 0, and "Product D", with the highest price, has a scaled price of 1. The other products have scaled prices in between, reflecting their relative affordability.
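The scaled values in the table can be checked by hand with the standard min-max formula, (price - min) / (max - min). The short Python sketch below does that arithmetic for the four example prices; it assumes the BigQuery function applies this formula and is only meant as a sanity check, not as a replacement for the query.

```python
# Example prices from the use case above.
prices = {"Product A": 10, "Product B": 50, "Product C": 200, "Product D": 1000}

lo, hi = min(prices.values()), max(prices.values())  # 10 and 1000

# (price - min) / (max - min), e.g. Product B: (50 - 10) / 990
scaled = {name: (price - lo) / (hi - lo) for name, price in prices.items()}

for name, value in scaled.items():
    print(f"{name}: {value:.2f}")
```

Because Product D is five times more expensive than Product C, the three cheaper products are compressed near 0. A single large value dominates a min-max scale in this way.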
Another use case would be normalizing features in a machine learning preprocessing step directly within BigQuery before exporting the data for training. This can simplify your data pipeline.
Spread the word!
BigFunctions is fully open-source. Help make it a success by spreading the word!