frequent_values

frequent_values(values, frequency_threshold)

Description

Returns the frequent values in an array of values.

This function computes the frequency of each value in the values array and returns the values whose frequency is strictly above the given frequency_threshold.
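
To make the definition concrete, here is a minimal Python sketch of the semantics described above. It is not the BigQuery implementation. It assumes that frequency means the count of a value divided by the array length, that an empty array yields an empty result, and that results are returned in order of first appearance. The last two points are assumptions of this sketch and are not stated on this page.

```python
from collections import Counter


def frequent_values(values, frequency_threshold):
    """Return the distinct values whose relative frequency is strictly above the threshold."""
    if not values:
        # Assumption of this sketch: an empty input has no frequent values.
        return []
    counts = Counter(values)
    total = len(values)
    # Counter keeps first-insertion order, so the output order is that of
    # first appearance (an assumption, not documented behavior).
    return [value for value, count in counts.items()
            if count / total > frequency_threshold]


fruits = ['apple', 'apple', 'banana', 'banana', 'banana', 'cherry']
print(frequent_values(fruits, 0.4))  # banana: 3/6 = 0.5 > 0.4
print(frequent_values(fruits, 0.5))  # 0.5 is not strictly above 0.5
```

The second call shows why "strictly above" matters: a value whose frequency equals the threshold is not returned.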

Usage

Call or deploy frequent_values?
Call frequent_values directly

The easiest way to use bigfunctions

  • The frequent_values function is deployed in 39 public datasets, one for each of the 39 BigQuery regions.
  • It can be called by anyone. Just copy and paste the examples below into your BigQuery console. It just works!
  • Use the public dataset located in the same region as your own datasets; otherwise you may get a "function not found" error.

Public BigFunctions Datasets

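+--------------+---------------------------+
| Region       | Dataset                   |
+--------------+---------------------------+
| eu           | bigfunctions.eu           |
| us           | bigfunctions.us           |
| europe-west1 | bigfunctions.europe_west1 |
| asia-east1   | bigfunctions.asia_east1   |
| ...          | ...                       |
+--------------+---------------------------+
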
Deploy frequent_values in your project

Why deploy?

  • You may prefer to deploy frequent_values in your own project to build and manage your own catalog of functions.
  • This is particularly useful if you want to create private functions (for example calling your internal APIs).
  • Get started by reading the framework page

Deployment

frequent_values function can be deployed with:

pip install bigfunctions
bigfun get frequent_values
bigfun deploy frequent_values

Examples

Detect frequent strings in an array of strings with a frequency_threshold of 0.4. banana appears 3 times in an array of 6 elements, so its frequency is 3 / 6 = 0.5 > 0.4 and it is returned as a frequent string. apple (2 / 6 ≈ 0.33) and cherry (1 / 6 ≈ 0.17) are below the threshold and are not returned.

select bigfunctions.eu.frequent_values(['apple', 'apple', 'banana', 'banana', 'banana', 'cherry'], 0.4)
select bigfunctions.us.frequent_values(['apple', 'apple', 'banana', 'banana', 'banana', 'cherry'], 0.4)
select bigfunctions.europe_west1.frequent_values(['apple', 'apple', 'banana', 'banana', 'banana', 'cherry'], 0.4)
+-----------------+
| frequent_values |
+-----------------+
| banana          |
+-----------------+

Use cases

Let's say you have a BigQuery table storing customer product reviews. Each row represents a review and includes a column named keywords which is an array of strings representing keywords extracted from the review text.

You want to identify the most frequently occurring keywords across all reviews to understand trending topics or product features that customers frequently mention.

Here's how you can use the frequent_values function:

SELECT bigfunctions.us.frequent_values(ARRAY_CONCAT_AGG(keywords), 0.05) AS frequent_keywords
FROM `your_project.your_dataset.your_table`

This query does the following:

  1. ARRAY_CONCAT_AGG(keywords): Concatenates the keywords arrays of all reviews into a single array of all keywords. BigQuery's ARRAY_AGG does not accept ARRAY inputs, which is why ARRAY_CONCAT_AGG is used.
  2. bigfunctions.us.frequent_values(..., 0.05): Applies the frequent_values function to this combined array with a frequency_threshold of 0.05. Only keywords that account for strictly more than 5% of all keyword occurrences are returned. Note that this share is measured against the total number of keywords, not the number of reviews.
  3. AS frequent_keywords: Aliases the resulting array of frequent keywords as frequent_keywords.
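
The steps above can be pictured with a small Python sketch. This is an illustration only, not the BigQuery function. The review data, the helper name frequent_keywords, and the first-appearance ordering of the result are all hypothetical. Following the definition in the Description, a keyword's frequency is computed against the total number of keywords in the combined array.

```python
from collections import Counter
from itertools import chain

# Hypothetical reviews: each inner list is the keywords array of one review.
reviews = [
    ["battery", "screen"],
    ["battery", "price"],
    ["battery"],
    ["screen", "shipping"],
]


def frequent_keywords(keyword_arrays, frequency_threshold):
    """Flatten all keyword arrays, then keep keywords strictly above the threshold."""
    # Step 1: combine every review's keywords into a single flat list.
    all_keywords = list(chain.from_iterable(keyword_arrays))
    if not all_keywords:
        return []
    # Step 2: frequency = occurrences / total number of keywords.
    counts = Counter(all_keywords)
    total = len(all_keywords)
    return [keyword for keyword, count in counts.items()
            if count / total > frequency_threshold]


# 7 keywords in total: battery 3/7, screen 2/7, price 1/7, shipping 1/7.
print(frequent_keywords(reviews, 0.25))
```

In this sample, battery appears in 3 of the 4 reviews, but its frequency in the sketch is 3 / 7 because the denominator is the number of keywords. Choose the threshold with that in mind.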

This will give you an array of strings containing the keywords that occur most frequently in your customer reviews, allowing you to identify important themes and trends.

Other Use Cases:

  • Log analysis: Identify frequent error messages or user actions in log data.
  • E-commerce: Find the most frequently purchased items (a possible first step toward market basket analysis).
  • Social media analysis: Detect trending hashtags or topics.
  • Genomics: Identify frequently occurring gene mutations in a population.

Essentially, any time you need to find frequently occurring elements within a large dataset of arrays, the frequent_values function can be a useful tool.


Need help or Found a bug?
Get help using frequent_values

The community can help! Join the conversation on Slack.

We also provide professional support.

Report a bug about frequent_values

If the function does not work as expected, please

  • report a bug so that it can be improved.
  • or open the discussion with the community on Slack.