Data Henrik: engine

Friday, August 9, 2013

DB2 BLU: Some additional metadata, the synopsis table, and page map indexes

Today I continue my journey into the metadata of DB2 with BLU Acceleration which I recently started. I had created a table and looked at the catalog and found a synopsis table. The new column-organized tables for in-memory analytics are supposed to be simple to use, no indexes to create. No indexes? Let me check...

First on today's list is a look at SYCAT.INDEXES, the catalog view holding index information.

select indschema, indname, tabname, colnames, indextype from syscat.indexes where tabname like '%BLUDATA%'

INDSCHEMA     INDNAME               TABNAME                       COLNAMES                           INDEXTYPE
------------- --------------------- ----------------------------- ---------------------------------- ---------
SYSIBM        SQL130809032717072049 BLUDATA       +SQLNOTAPPLICABLE+SQLNOTAPPLICABLE CPMA
SYSIBM    SQL130809032717232599 SYN130809032716942015_BLUDATA +SQLNOTAPPLICABLE+SQLNOTAPPLICABLE CPMA

2 record(s) selected.
What is interesting to find, is that two indexes have been created. One is on the base table, one on the synopsis table. They are of a new index type CPMA which stands for Cde Page MAp (and CDE for Columnar Data Engine). I found this in the description of SYSCAT.INDEXES and by dumping the index data using db2dart (see below).

Next I wanted to see how the synopsis table changed by inserting data. The following small script generated my data file with 10000 rows.

#!/bin/bash

for i in {1..10000}
do
   echo "$i,0.8,$i,''" | tee -a bludata.csv
done

Then I was ready for loading the data. Here are some snippets of the output.

load from bludata.csv of del replace into hl.bludata
...
SQL3500W The utility is beginning the "ANALYZE" phase at time "08/09/2013
05:06:47.313324".
...
SQL3500W The utility is beginning the "LOAD" phase at time "08/09/2013
05:06:48.215545".
...
SQL3110N The utility has completed processing. "10000" rows were read from
the input file.
...
SQL3500W The utility is beginning the "BUILD" phase at time "08/09/2013
05:06:48.802651".

First the data is analyzed to optimize data layout and the column-specific compression. The comes the actual load phase followed by a build phase (for maintaining the internal index). With an empty table the synopsis table was empty, too. Now, with 10000 rows loaded, we can find some entries:

select * from sysibm.SYN130809032716942015_BLUDATA

IDMIN       IDMAX       QUOTAMIN QUOTAMAX SOLDMIN     SOLDMAX     TSNMIN               TSNMAX
----------- ----------- -------- -------- ----------- ----------- -------------------- --------------------
          1        2328     0.80     0.80           1        2328                    0                 1023
        754        2810     0.80     0.80         754        2810                 1024                 2047
       1296        3563     0.80     0.80        1296        3563                 2048                 3071
       1567        4187     0.80     0.80        1567        4187                 3072                 4095
       3895        6243     0.80     0.80        3895        6243                 4096                 5119
       4790        6996     0.80     0.80        4790        6996                 5120                 6143
       5061        7387     0.80     0.80        5061        7387                 6144                 7167
       5694        8283     0.80     0.80        5694        8283                 7168                 8191
       7750       10000     0.80     0.80        7750       10000                 8192                 9215
       8984        9767     0.80     0.80        8984        9767                 9216                 9999

10 record(s) selected.

For every 1024 rows there is an entry in the synopsis table. Not much surprising stuff can be found as my test data only has variations in the ID and SOLD columns.

My last action for today was to look at the index data as stored on disk. By invoking db2dart with the DI option it is possible to dump formatted index data. Both indexes only have 3 pages each and you can try using db2dart yourself to see the output. What is included in each report as part of the index metadata is the index type which is as follows:
Index Type = NonUnique Bi-directional Large RID CDE Page Map Split-High No exclusive lock optimzation RID-list Compression Variable Slot Directory Prefix Compression
Seeing it labeled as Page Map index verifies that I dumped the right index. That concludes my Friday "spare time", have a nice weekend. And if you have time to read, here is the link to my other DB2 BLU-related posts.

Friday, March 6, 2009

Does you car need different engines? (XML Storage)

Today I plan to talk about cars and XML storage, two seemingly different topics.

Cars have an engine, the engine is used to move the car with slow, middle, or high speeds, on flat roads, up and down hills or mountains, with light or heavy load. You don't need a different engine for different terrains. It's great because you can go where you need to go without too much pre-planning - flexibility you love. You don't need to select and put in a different engine for the next task ahead. There are more benefits because service technicians know how to deal with that single engine, car manufacturers can put their development budget into improving that single engine, and you know how the engine is behaving in different situations. It's not always optimal, but you are prepared to take the roads ahead because you don't need to change engines. You are prepared for unexpected detours ahead. Watch out if someone tries to sell you a car where you need to change engines!

BTW: With automatic transmissions you don't need to shift gears, with newer automatic transmissions there isn't even any gear shifting. Cars have evolved to turn on lights automatically, even to watch out for and assist in critical or complex situations (ESP, ABS, etc.). It takes stress from you and let's you reach your destinations safely.

Coming to the topic of XML storage, I want to emphasize that XML is about flexibility. DB2 pureXML only has a single XML type, a single XML storage engine. You don't need to choose a different XML storage option if your requirements change. You don't need to decide (and know) upfront which road to take, no issue dealing with an unexpected detour because your DB2 engine is suited to handle it. This is flexibility you love (or will learn to love)! Combine that with automatic storage, self-tuning memory management, and other autonomics and life is less stressful.

What a perspective for the weekend. And remember to watch out...

Pages

Friday, August 9, 2013

DB2 BLU: Some additional metadata, the synopsis table, and page map indexes

Friday, March 6, 2009

Does you car need different engines? (XML Storage)