Understanding DB2 OLAP By Example
The OLAP features in DB2 are very cool however I don't see a lot of people using them. In addition, sometimes reading the docs on these features are overwhelming so hopefully these examples will make it easy to understand.
We'll look at the following OLAP features in particular. There are more available if you navigate to the IBM website for your DB2 version.
First we need to create a sample table and data:
Then lets insert some sample data:
Now lets look at the result of a common group by clause:
These are some pretty common group by scenarios but what if you needed a grouping by store and state, and also to display the total sales for each store, as well as for all stores. Here is where rollup comes into play.
You could also swap the rollup fields and rollup on state and then store:
Please note that if two rows tie for a ranking, the ranking number will not increment on the tied row.
To demonstrate, lets add a new state, CT that ties with IL for Amazing sales. Lets also add another state with lower sales than CT and IL.
We'll look at the following OLAP features in particular. There are more available if you navigate to the IBM website for your DB2 version.
First we need to create a sample table and data:
CREATE TABLE sales( item VARCHAR(20), state CHAR(2), store VARCHAR(20), amount DECIMAL);
Then lets insert some sample data:
INSERT INTO sales VALUES('Watch', 'IL', 'Buymore', 15); INSERT INTO sales VALUES('Watch', 'NY', 'Buymore', 15); INSERT INTO sales VALUES('Watch', 'NY', 'Buymore', 15); INSERT INTO sales VALUES('Watch', 'NY', 'Buymore', 15); INSERT INTO sales VALUES('Watch', 'MN', 'Buymore', 15); INSERT INTO sales VALUES('Watch', 'NY', 'Buymore', 15); INSERT INTO sales VALUES('Watch', 'NY', 'Buymore', 15); INSERT INTO sales VALUES('Phone', 'MN', 'Buymore', 150); INSERT INTO sales VALUES('Phone', 'IL', 'Buymore', 150); INSERT INTO sales VALUES('Phone', 'NY', 'Buymore', 150); INSERT INTO sales VALUES('Phone', 'NY', 'Buymore', 150); INSERT INTO sales VALUES('Phone', 'NY', 'Buymore', 150); INSERT INTO sales VALUES('Phone', 'MN', 'Buymore', 150); INSERT INTO sales VALUES('Watch', 'NY', 'Amazing', 15); INSERT INTO sales VALUES('Watch', 'IL', 'Amazing', 15); INSERT INTO sales VALUES('Watch', 'NY', 'Amazing', 15); INSERT INTO sales VALUES('Phone', 'MN', 'Amazing', 150); INSERT INTO sales VALUES('Phone', 'NY', 'Amazing', 150); INSERT INTO sales VALUES('Phone', 'MN', 'Amazing', 150);
Now lets look at the result of a common group by clause:
SELECT store, SUM(amount) FROM sales GROUP BY store
STORE TOTAL_SALES -------------------- --------------------------------- Buymore 1020 Amazing 495As we can see the group by clause took the sum of sales by store. Now lets look at the sum of sales by state using a group by store and state.
SELECT store, state, SUM(amount) AS total_sales FROM sales GROUP BY store, state
STORE ST TOTAL_SALES -------------------- -- --------------------------------- Amazing MN 300 Buymore NY 525 Buymore IL 165 Amazing IL 15 Amazing NY 180 Buymore MN 330
These are some pretty common group by scenarios but what if you needed a grouping by store and state, and also to display the total sales for each store, as well as for all stores. Here is where rollup comes into play.
Rollup
Rollup is similar to a group by clause except that it also outputs running totals for your rollup groups.SELECT store, state, SUM(amount) AS total_sales FROM sales GROUP BY ROLLUP(store, state)
STORE ST TOTAL_SALES -------------------- -- ---------------------- Amazing IL 15 Amazing MN 300 Amazing NY 180 Amazing 495 <-Total Amazing Buymore IL 165 Buymore MN 330 Buymore NY 525 Buymore 1020 <-Total Buymore 1515 <-Total
You could also swap the rollup fields and rollup on state and then store:
SELECT store, state, SUM(amount) AS total_sales FROM sales GROUP BY ROLLUP(state,store)
STORE ST TOTAL_SALES -------------------- -- --------------------------------- Amazing IL 15 Buymore IL 165 IL 180 Amazing MN 300 Buymore MN 330 MN 630 Amazing NY 180 Buymore NY 525 NY 705 1515
Cube
The cube feature is similar to the rollup feature except it aggregates the data into all combinations in your cube clause. For example this will list the total sales by all combinations of store and state:- Store per State Total
- Store Total
- State Total
SELECT store, state, SUM(amount) AS total_sales FROM sales GROUP BY CUBE(store, state)
STORE ST TOTAL_SALES ---------------- -- ----------------- Amazing CT 15 Amazing IL 15 Amazing MD 5 Amazing MN 300 Amazing NY 180 Amazing 515 <-Total Amazing Buymore IL 165 Buymore MN 315 Buymore NY 525 Buymore 1005 <-Total Buymore 1520 <-Grand Total MD 5 <-Total by states IL 180 MN 615 CT 15 NY 705
Grouping Set
You can also do aggregates based off grouping sets. For example this will list the total sales by store and item, as well as state and item.SELECT store, state, item, SUM(amount) AS total_sales FROM sales GROUP BY GROUPING SETS((store, item), (state, item));
STORE ST ITEM TOTAL_SALES -------------------- -- ----------- -------------------- Buymore Watch 105 Amazing Watch 60 Buymore Phone 900 Amazing Phone 450 Amazing Pen 5 MD Pen 5 NY Phone 600 CT Watch 15 IL Watch 30 MN Watch 15 MN Phone 600 IL Phone 150 NY Watch 105
Rank
Now lets say you want to rank/order the stores by state of most sales. Here is where the rank function comes into play:SELECT store, state, SUM(amount) AS total_sales, RANK() OVER (PARTITION BY store ORDER BY SUM(amount) DESC) AS rank FROM sales GROUP BY state, store;
STORE ST TOTAL_SALES RANK ------------ -- -------------- ----- Amazing MN 300 1 <-Highest sale state for Amazing Amazing NY 180 2 Amazing IL 15 3 Buymore NY 525 1 <-Highest sale state for Buymore Buymore MN 330 2 Buymore IL 165 3Rank basically gives an ordering to your results based on how you want that ordering to work. "Partition by store" means your ordering is specific to each store. "order by sum(amount) desc" means for each store, order the results by the highest sales in descending order.
Please note that if two rows tie for a ranking, the ranking number will not increment on the tied row.
Dense Rank
There is also a slightly different rank type; dense rank that differs from rank in how subsequent rows after a tie will increment.To demonstrate, lets add a new state, CT that ties with IL for Amazing sales. Lets also add another state with lower sales than CT and IL.
INSERT INTO sales VALUES('Watch', 'CT', 'Amazing', 15); INSERT INTO sales VALUES('Pen', 'MD', 'Amazing', 5); SELECT store, state, SUM(amount) AS total_sales, DENSE_RANK() OVER (PARTITION BY store ORDER BY SUM(amount) DESC) AS dense_rank, RANK() OVER (PARTITION BY store ORDER BY SUM(amount) DESC) AS rank FROM sales GROUP BY state, store;
STORE ST TOTAL_SALES DENSE_RANK RANK -------------------- -- ---------------- ----------------- -------------------- Amazing MN 300 1 1 Amazing NY 180 2 2 Amazing CT 15 3 3 Amazing IL 15 3 3 Amazing MD 5 4 5 Buymore NY 525 1 1 Buymore MN 315 2 2 Buymore IL 165 3 3In this example, after a tie ranking (ranking #3), rank skipped over ranking #4 and went straight to 5 while dense rank incremented one count to 4. Rank is nice if you need a total count of items in the rank.
Row Number
Row number is similar to rank however instead of ranking the row, just gives the row a number starting at 1 and incrementing by 1. In the following example, we assign a row number to each row and partition the row number by store. This basically means that each store will have a row number assigned starting at 1.SELECT store, state, SUM(amount) AS total_sales, ROW_NUMBER() OVER (PARTITION BY store) AS row_number FROM sales GROUP BY state, store;
STORE ST TOTAL_SALES ROW_NUMBER -------------------- -- ---------------- ----------------- Buymore NY 525 1 Buymore IL 165 2 Buymore MN 315 3 Amazing MN 300 1 Amazing CT 15 2 Amazing NY 180 3 Amazing IL 15 4 Amazing MD 5 5
Comments
Post a Comment