# benford_distance

**Signature**

```
benford_distance(values)
```

**Description**

Calculate the distance from Benford's Law for the given `values`.

As described on Wikipedia, *Benford's law is an observation that in many real-life sets of numerical data, the leading digit is likely to be small. In sets that obey the law, the number 1 appears as the leading significant digit about 30% of the time, while 9 appears as the leading significant digit less than 5% of the time.*
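For reference, the percentages quoted above follow from the standard statement of Benford's Law, which gives the probability of a leading digit $d$ as:

$$
P(d) = \log_{10}\left(1 + \frac{1}{d}\right), \qquad d \in \{1, \dots, 9\}
$$

This gives $P(1) \approx 0.301$ and $P(9) \approx 0.046$, matching the "about 30%" and "less than 5%" figures.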

This function computes the chi-square distance between the observed distribution of the leading digits of `values` and the expected distribution according to Benford's Law.

The smaller the `benford_distance`, the more closely the `values` follow Benford's Law.
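To make the computation concrete, here is a minimal Python sketch of a chi-square distance on leading digits. It is an illustration, not the function's actual implementation: the names `leading_digit` and `benford_distance_sketch` are hypothetical, and it assumes the distance is computed on proportions (not raw counts) with exact `log10(1 + 1/d)` probabilities. Under those assumptions it reproduces the ordering of the examples on this page, although the last printed decimal can differ, for instance if the function rounds the expected probabilities.

```python
import math
from collections import Counter

# Expected share of each leading digit d under Benford's Law: log10(1 + 1/d).
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """Return the first significant (non-zero) digit of a non-zero number."""
    # Scan the decimal string: the sign, leading zeros and '.' are skipped,
    # and the mantissa always comes before any exponent.
    return int(next(c for c in str(abs(x)) if c in "123456789"))


def benford_distance_sketch(values):
    """Chi-square distance between observed and Benford leading-digit shares.

    Assumption: zeros are ignored because they have no leading digit.
    """
    digits = [leading_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one non-zero value")
    counts = Counter(digits)
    n = len(digits)
    # Sum over all nine digits, including digits that never occur.
    return sum(
        (counts.get(d, 0) / n - p) ** 2 / p
        for d, p in BENFORD.items()
    )


if __name__ == "__main__":
    for sample in (
        [1, 2, 3, 4, 5, 6, 7, 8, 9],
        [1, 1, 1, 2, 2, 3, 4, 5, 6],
        [1, 1, 1, 1, 1, 1, 1, 1, 1],
        [9, 9, 9, 9, 9, 9, 9, 9, 9],
    ):
        print(sample, round(benford_distance_sketch(sample), 1))
```

Because the sum runs over all nine digits, a digit that never appears still contributes its expected share to the distance, which is why constant inputs score so poorly.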

Read "The Mysterious Benford’s Law and it’s Connection with Fraud Detection" by Vihasharma to see some applications of this function.

**Examples**

1. Uniformly distributed values do not follow Benford's Law

```
select bigfunctions.eu.benford_distance([1, 2, 3, 4, 5, 6, 7, 8, 9])
```

```
select bigfunctions.us.benford_distance([1, 2, 3, 4, 5, 6, 7, 8, 9])
```

```
select bigfunctions.europe_west1.benford_distance([1, 2, 3, 4, 5, 6, 7, 8, 9])
```

```
+------------------+
| benford_distance |
+------------------+
| 0.4              |
+------------------+
```

2. Values with more small leading digits follow Benford's Law more closely, so the distance is lower

```
select bigfunctions.eu.benford_distance([1, 1, 1, 2, 2, 3, 4, 5, 6])
```

```
select bigfunctions.us.benford_distance([1, 1, 1, 2, 2, 3, 4, 5, 6])
```

```
select bigfunctions.europe_west1.benford_distance([1, 1, 1, 2, 2, 3, 4, 5, 6])
```

```
+------------------+
| benford_distance |
+------------------+
| 0.2              |
+------------------+
```

3. Constant values follow Benford's Law less closely than uniformly distributed values, so the distance is higher

```
select bigfunctions.eu.benford_distance([1, 1, 1, 1, 1, 1, 1, 1, 1])
```

```
select bigfunctions.us.benford_distance([1, 1, 1, 1, 1, 1, 1, 1, 1])
```

```
select bigfunctions.europe_west1.benford_distance([1, 1, 1, 1, 1, 1, 1, 1, 1])
```

```
+------------------+
| benford_distance |
+------------------+
| 2.3              |
+------------------+
```

4. Constant values with a higher leading digit are worse still, so the distance is much higher

```
select bigfunctions.eu.benford_distance([9, 9, 9, 9, 9, 9, 9, 9, 9])
```

```
select bigfunctions.us.benford_distance([9, 9, 9, 9, 9, 9, 9, 9, 9])
```

```
select bigfunctions.europe_west1.benford_distance([9, 9, 9, 9, 9, 9, 9, 9, 9])
```

```
+------------------+
| benford_distance |
+------------------+
| 20.7             |
+------------------+
```