I’ve been reading Stephen Few’s excellent book, “Now You See It”. Its a big book (300+) pages, packed with practical examples, and one that I think any Entrance user will find useful. The Pareto Chart above is loosely based on one of his examples (p. 196).

In a Pareto Chart, bars are drawn with a line showing accumulated sums. The bars are ordered for decreasing height, so the changes in the sums decrease as well. This makes it possible to read off the largest contributors to the total. In our example, you can see that the first three categories account for 88.9% of total sales.

One way to accumulate sums is to use a MySQL variable, like this:

SET @sum=0;

SELECT @sum := @sum+sales FROM t;

To re-express values in a table column as percentages, use a pattern like this:

SET @total = (SELECT SUM(sales) FROM t);

SELECT (sales / @total) * 100 FROM t;

Putting the pieces together, adding PLOT, and filling in a few details we have the Pareto Chart shown above:

-- DROP TABLE IF EXISTS t;

-- DROP TABLE IF EXISTS t2;

```
```-- the sample data

CREATE TABLE t (what varchar(255), sales double);

INSERT INTO t VALUES

('bagels', 20), ('donuts', 10), ('coffee', 80), ('prune danish', '40'),

('scones', 2.5), ('toast', 5);

-- accumulate totals

SET @sum = 0.0;

CREATE TABLE t2

SELECT what, sales, @sum := sales+@sum AS accum

FROM t

ORDER BY sales DESC

;

-- calculate percentages

SET @total = (SELECT SUM(sales) FROM t2);

ALTER TABLE t2 ADD COLUMN accum_pct DOUBLE AFTER accum;

ALTER TABLE t2 ADD COLUMN sales_pct DOULBE AFTER accum;

UPDATE t2

SETt accum_pct = (accum/@total) * 100.0,

sales_pct = (sales/@total) * 100.0;

`-- make the plot`

PLOT

AXISLABELS, LIGHT GRAY BAR,

RED LINE, BLACK FILLED CIRCLE, DATALABELS

WITH

SCALE Y 0 100 25

FORMAT Y DECIMAL "#'%'"

NO SIDES

TITLE " "

TITLE Y "Percentage of Sales"

TITLE X "Item"

SELECT

what, sales_pct,

accum_pct, accum_pct, CONCAT(FORMAT(accum_pct, 1), '% ')

FROM t2

ORDER BY sales DESC

;

**Note**: You have to be careful about updating user defined variables in this way, as discussed in “High Performance SQL” by Schartz et al. and on Roland Bouman’s blog. This script works because we haven’t made assumptions about the evaluation order of the columns.