quantize_into_bins
quantize_into_bins(value, bin_bounds)
Description
Get the bin_range to which value belongs, with bins defined by their bin_bounds.
Examples
Call or Deploy quantize_into_bins?
Call quantize_into_bins directly
The easiest way to use bigfunctions:
- The quantize_into_bins function is deployed in 39 public datasets, one for each of the 39 BigQuery regions.
- It can be called by anyone. Just copy and paste the examples below into your BigQuery console. It just works!
- (You need to use the dataset in the same region as your own datasets; otherwise you may get a "function not found" error.)
Public BigFunctions Datasets
Region | Dataset |
---|---|
eu | bigfunctions.eu |
us | bigfunctions.us |
europe-west1 | bigfunctions.europe_west1 |
asia-east1 | bigfunctions.asia_east1 |
... | ... |
Deploy quantize_into_bins in your project
Why deploy?
- You may prefer to deploy quantize_into_bins in your own project to build and manage your own catalog of functions.
- This is particularly useful if you want to create private functions (for example, functions that call your internal APIs).
- Get started by reading the framework page.
Deployment
The quantize_into_bins function can be deployed with:
pip install bigfunctions
bigfun get quantize_into_bins
bigfun deploy quantize_into_bins
select bigfunctions.eu.quantize_into_bins(-4, [0, 1, 5, 10])
select bigfunctions.us.quantize_into_bins(-4, [0, 1, 5, 10])
select bigfunctions.europe_west1.quantize_into_bins(-4, [0, 1, 5, 10])
+-----------+
| bin_range |
+-----------+
| ]-∞, 0[ |
+-----------+
select bigfunctions.eu.quantize_into_bins(3, [0, 1, 5, 10])
select bigfunctions.us.quantize_into_bins(3, [0, 1, 5, 10])
select bigfunctions.europe_west1.quantize_into_bins(3, [0, 1, 5, 10])
+-----------+
| bin_range |
+-----------+
| [1, 5[ |
+-----------+
select bigfunctions.eu.quantize_into_bins(9, [0, 1, 5, 10])
select bigfunctions.us.quantize_into_bins(9, [0, 1, 5, 10])
select bigfunctions.europe_west1.quantize_into_bins(9, [0, 1, 5, 10])
+-----------+
| bin_range |
+-----------+
| [5, 10] |
+-----------+
select bigfunctions.eu.quantize_into_bins(130, [0, 1, 5, 10])
select bigfunctions.us.quantize_into_bins(130, [0, 1, 5, 10])
select bigfunctions.europe_west1.quantize_into_bins(130, [0, 1, 5, 10])
+-----------+
| bin_range |
+-----------+
| ]10, +∞[ |
+-----------+
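To check bin boundaries locally without running a query, the behaviour shown in the four examples above can be approximated in plain Python. This is only a sketch inferred from those outputs, not the function's actual BigQuery implementation. It assumes that each bin includes its lower bound and excludes its upper bound, that the last bin is closed on both ends, and that bin_bounds holds at least two values, which it sorts in ascending order.

```python
from bisect import bisect_right


def quantize_into_bins(value, bin_bounds):
    """Local sketch of the bin labels shown in the examples (not the BigQuery code)."""
    bounds = sorted(bin_bounds)  # assumption: bounds are treated as ascending
    if len(bounds) < 2:
        raise ValueError("this sketch needs at least two bin bounds")
    if value < bounds[0]:
        return f"]-∞, {bounds[0]}["
    if value > bounds[-1]:
        return f"]{bounds[-1]}, +∞["
    # Index of the largest bound that is <= value.
    i = bisect_right(bounds, value) - 1
    # Assumption: a value equal to the last bound falls in the final, closed bin.
    i = min(i, len(bounds) - 2)
    closing = "]" if i == len(bounds) - 2 else "["
    return f"[{bounds[i]}, {bounds[i + 1]}{closing}"


if __name__ == "__main__":
    for v in (-4, 3, 9, 130):
        print(v, quantize_into_bins(v, [0, 1, 5, 10]))
```

With the bounds [0, 1, 5, 10], this returns the same four labels as the query results above. How the real function treats values that fall exactly on a bound is an assumption here and is worth confirming with a quick query.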
Need help or found a bug using quantize_into_bins?
Get help using quantize_into_bins
The community can help! Join the conversation on Slack.
We also provide professional support.
Report a bug about quantize_into_bins
If the function does not work as expected, please
- report a bug so that it can be improved,
- or open a discussion with the community on Slack.
We also provide professional support.
Use cases
You could use this function to categorize website session durations into bins for analysis. Let's say you have a table of website session data with a session_duration_seconds column. You want to group these sessions into duration categories like "Short (under 30s)", "Medium (30s to under 60s)", "Long (60-180s)", and "Very Long (over 180s)".
SELECT
user_id,
bigfunctions.us.quantize_into_bins(session_duration_seconds, [0, 30, 60, 180]) AS session_duration_category
FROM
`your_project.your_dataset.your_session_table`
This query adds a session_duration_category column to your results. For a session lasting 20 seconds, the category would be "[0, 30[", since each bin includes its lower bound and excludes its upper bound, except the last bin, which is closed on both ends. For 45 seconds it would be "[30, 60[", for 150 seconds it would be "[60, 180]", and for 200 seconds it would be "]180, +∞[". A negative duration would fall in "]-∞, 0[". You can then use this new category for aggregation and reporting, such as:
SELECT
session_duration_category,
COUNT(*) AS num_sessions,
AVG(pages_viewed) AS avg_pages_viewed
FROM (
SELECT
user_id,
bigfunctions.us.quantize_into_bins(session_duration_seconds, [0, 30, 60, 180]) AS session_duration_category,
pages_viewed
FROM
`your_project.your_dataset.your_session_table`
)
GROUP BY 1
ORDER BY 1
This would give you a summary table showing the number of sessions and average pages viewed for each session duration category. This allows you to analyze user behavior based on how long they spend on your website.
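The function returns range strings such as "[30, 60[" rather than names like "Short" or "Medium", so a report may need a relabeling step after the query results are fetched. The sketch below shows one way to do that client-side in Python. The LABELS mapping, the summarize helper, and the (bin_range, pages_viewed) row shape are all hypothetical names chosen for illustration, and the range strings assume the output format shown in the examples above.

```python
from collections import defaultdict

# Hypothetical mapping from range strings to report-friendly names.
LABELS = {
    "]-∞, 0[": "Invalid (negative)",
    "[0, 30[": "Short",
    "[30, 60[": "Medium",
    "[60, 180]": "Long",
    "]180, +∞[": "Very Long",
}


def summarize(rows):
    """Count sessions and average pages viewed per friendly label.

    rows: iterable of (bin_range, pages_viewed) tuples, for example the
    rows returned by the inner SELECT of the query above.
    """
    totals = defaultdict(lambda: [0, 0])  # label -> [session count, page total]
    for bin_range, pages_viewed in rows:
        # Fall back to the raw range string when no friendly name is defined.
        label = LABELS.get(bin_range, bin_range)
        totals[label][0] += 1
        totals[label][1] += pages_viewed
    return {label: (n, pages / n) for label, (n, pages) in totals.items()}


if __name__ == "__main__":
    sample = [("[0, 30[", 2), ("[0, 30[", 4), ("[60, 180]", 5)]
    print(summarize(sample))
```

Doing the aggregation in SQL, as in the query above, is usually preferable for large tables; the relabeling alone could then be applied to the much smaller summary result.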
Other use cases include:
- Customer Segmentation by Purchase Value: Categorize customers based on their total spending into different tiers (e.g., low, medium, high spenders).
- Lead Scoring: Assign leads to different score ranges based on factors like engagement and demographics.
- Performance Analysis: Group employees into performance categories based on metrics like sales or customer satisfaction scores.
- Data Visualization: Create histograms or other visualizations where data needs to be binned for clarity. The output of quantize_into_bins can be used directly for grouping in chart creation.
- Data Preprocessing for Machine Learning: Binning continuous variables can be a useful preprocessing step for certain machine learning models.
Remember to replace bigfunctions.us with the appropriate dataset for your BigQuery region.
Spread the word!
BigFunctions is fully open-source. Help make it a success by spreading the word!