Data Henrik: Information Center

Showing posts with label Information Center. Show all posts

Monday, May 30, 2016

New IBM Knowledge Center for DB2 and other products

New DB2 Knowledge Center

Maybe you have already seen this, but the Knowledge Center for DB2 and other IBM products just changed. When you go to your bookmarked link for the DB2 10.5 Knowledge Center you will notice a slightly different layout for the known content. However, there are more differences and I really like them.

The first thing I tried was the language picker. At the bottom right of each page you can now switch between supported languages. Something that is useful especially when your first language is not English and you want to check language-specific terms or clarify a feature description. On the welcome page for DB2 10.5 you can also switch between DB2 versions and going back to even DB2 9.5 is still supported.

Switching Languages in the IBM Knowledge Center

What I first missed due to layout changes was the navigation tree. It is visible after clicking the icon on the upper left and topics can be expanded much faster than in the old version of the Knowledge Center. What also is much faster is the search functionality. After clicking on "Search" on the upper right, a page with a search box is coming up. It is possible to select the DB2 version to be searched and enter the search term. Suggestions for possible keywords are made and when you hit "enter" the search results appear. Everything as expected. However, what I find very useful is the option to preview individual search results by expanding them within the result list (see screenshot below). That way you can stay with the result page without switching back and forth between documentation and search results.

Once you are on a regular documentation page, you can again switch between different DB2 versions and thus easily compare what has changed or check out syntax for a specific version. There are also new forward and backward buttons on top of each page to walk through a topic or section split over multiple pages - less navigation and clicks required to consume the content.

That's my update for today. If you feel nostalgic, check out my blog entry from 2009 about changes to what was called "Information Center" at that time. And in case you are cloud-based already, the new Knowledge Center for IBM dashDB is here.

Simplified search in the DB2 Knowledge Center

Tuesday, July 8, 2014

DB2 Quiz: Processes and CPU

In today's DB2 quiz the focus is on DB2 processes and CPU consumption. Which SQL statement did I run in DB2 for the following output? What function is used?

NAME                                CPU_USER             CPU_SYSTEM
----------------------------------- -------------------- --------------------
db2fmp (                                              13                    4
db2fmp (                                               9                   10
db2vend (PD Vendor Process - 1) 0                      5                    2
db2ckpwd 0                                             0                   13
db2ckpwd 0                                             0                   13
db2ckpwd 0                                             0                    9
db2sysc 0                                          18561                30282
db2wdog 0 [hloeser]                                    6                   44
db2acd 0                                            6997                11550

9 record(s) selected.

A statement similar to the one I used can be found in the DB2 Knowledge Center. The statement makes use of a special table function.

Tuesday, June 17, 2014

DB2 Screenshot Quiz: Where is this taken from?

I am using different DB2-related services, such as the new Knowledge Center for DB2, BLU for Cloud (DB2 with BLU Acceleration in the Cloud), IBM Bluemix, and of course a local DB2 installation. Where did I find the following graphic? It is part of one of the above mentioned services...

Let me know by comment or direct email.

Tuesday, May 20, 2014

SQL Quiz: Which command did I run?

I have a small DB2 test database which I wanted to clean up. I ran a command which has the output below. Which command was it?

Table/View                      Schema          Type Creation time
------------------------------- --------------- ----- --------------------------
FOO                             HLOESER         T     2014-03-19-12.14.19.598515
INUSE                           HLOESER         T     2014-05-08-11.49.18.419975

2 record(s) selected.

BTW: The command can be found in the IBM Knowledge Center which will be/is replacing and integrating the DB2 Information Center.

Tuesday, May 6, 2014

Tuning your DB2 CLP environment: Customize appearance and editor

Tuning World Bodensee via Wikipedia

Over the last weekend, the annual Tuning World Bodensee was guest at the Messe Friedrichshafen (exhibition center and fair grounds). More than 100,000 people interested in car tuning traveled to Friedrichshafen. "Tuning" can be trying to get more performance out of engine or to customize the car to the personal style. With DB2, you can customize the command line processor to your personal style and preferences. Let's have a look at the available tuning options.

All the recent versions of DB2 provide three environment variables to tune the editing experience in the interactive DB2 CLP: DB2_CLP_EDITOR, DB2_HIST_SIZE, and DB2_CLPPROMPT. The first variable, DB2_CLP_EDITOR, is used to specify an external editor to be used for editing SQL statements. On my Linux system, I did the following:

export DB2_CLP_EDITOR=gedit

Now you can edit previous statements using the EDIT command. "EDIT 1" would call the editor with the first statement in the command history, "E 1" would do the same. To know which statements are available, use the HISTORY command or its short version "H". The maximum number of available commands is determined by the variable DB2_HIST_SIZE. It accepts numbers from 1 to 500.

export DB2_HIST_SIZE=100

To reduce the number of statements listed with the HISTORY commands, you can limit it: "H 10" would return the last 10 statements in history, "H R 5" would return the last five in reverse order. Instead of the option "R" you could also use the full word "REVERSE", e.g., "H REVERSE" or "HISTORY REVERSE". Editing commands is fun, but actually executing them is probably why they were edited. To execute a specific statement from the history, you can utilize RUNCMD. The short version is just "R" and a valid parameter would be the number corresponding to a "historic" statement.

Both RUNCMD and EDIT, if not invoked with a number, will pick the newest statement in history. Both also accept negative numbers with "-1" being the most recent statement.

What is left is to "decorate" the command line processor in your personal style. DB2_CLPPROMPT is used to modify the command prompt. It accepts different tokens and most characters. Here is my version which prints the current database name followed by "=> ":

export DB2_CLPPROMPT="DB: %d => "

Here is a small sample session with the bew prompt:

DB: => connect to hltest

   Database Connection Information

Database server        = DB2/LINUXX8664 10.5.3
SQL authorization ID   = HLOESER
Local database alias   = HLTEST

DB: HLTEST => values 'Good Morning'

1
------------
Good Morning

1 record(s) selected.

DB: HLTEST => e -1
DB: HLTEST => values 'That''s it, good bye!'
Do you want to execute the above command ? (y/n) y

1
--------------------
That's it, good bye!

1 record(s) selected.

DB: HLTEST => h r
4    h r
3    values 'That''s it, good bye!'
2    values 'Good Morning'
1    connect to hltest
DB: HLTEST =>

Thursday, April 3, 2014

(SOLVED) DB2 Quiz: What function do I call?

Today, it is time for another DB2 quiz. I am going to present you the output of a DB2 function:

1
---
YES

1 record(s) selected.
Which built-in function did return this result? I invoked the function this way "values funcname", no parameters were involved. The function was added to DB2 in version 10.1.

Added on April 3rd:
As you can see from the comments, the quiz has been solved. I called the administrative function ADMIN_GET_INTRA_PARALLEL(). The function returns either YES or NO depending on whether the database application will run with intraparallel parallelism enabled or not. In DB2 10.1 several enhancements were made to exploiting parallelism. In addition to the database manager (dbm) switch INTRA_PARALLEL to enable or disable parallelism, it is also possible to call the new procedure ADMIN_SET_INTRA_PARALLEL() to configure parallelism for a specific connection.

Friday, January 24, 2014

Security and DB2 LUW

Did you recently start paying more attention to credit card bills? Thinking twice before speaking on the phone or sending a text message? Cutting short on communication with your partner...? Awareness for topics such as privacy and data security has increased dramatically over the past few months. In some industries such as banking, the supervisory authorities - in Germany it is BaFin - have tightened regulations over several years, requiring changes to how databases are set up and administrated, how data can be stored and accessed. As I recently declared 2014 as the year of database security, I thought collecting some related DB2 resources would be a good way to promote it. Here we go...

When you work with DB2 for Linux, UNIX, and Windows, and are researching a topic, then the DB2 Information Center is a good start. It has an entire section on security (look at the navigation section on the left). It explains the DB2 Security Model, various security-related concepts, and has links and background information on some IBM InfoSphere Guardium tools. Many security and auditing tools as well as the Data Encryption (formerly Encryption Expert) product are labeled Guardium.

Other places to visit are the DB2 Best Practices, IBM Redbooks, and IBM developerWorks (list of DB2 security articles). There is a IBM Data Server Security best practices paper and also a redbook "DB2 Security and Compliance Solutions for Linux, UNIX, and Windows". You can also learn about security functionality when attending one of the offered Information Management bootcamps or taking a DB2 class through Learning Services.

Last but least, before I start my weekend, I would like to point to the blog articles I have written on DB2 security topics.

Have a nice weekend and watch your transactions...

Thursday, December 12, 2013

The DB2 10.5 Year End Quiz

This is the last quiz for 2013 and before I take a break ("vacation"). When looking at the English language version of the DB2 10.5 Information Center, what do these four (4) pages have in common?

Where condition in SQL warehousing transform operator:
http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.dwe.sqw.doc/designing/data_flow/ropwhere.html

SEND procedure - Send E-Mail to an SMTP server:
http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.apdv.sqlpl.doc/doc/r0055177.html

UTL_SMTP module:
http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.apdv.sqlpl.doc/doc/r0055180.html

The SAMPLE database:
http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.apdv.samptop.doc/doc/r0001094.html

Any guesses? It is about the samples and the current season...

Enjoy the holidays and all the best for 2014!

Wednesday, December 4, 2013

DB2: Nothing to hide...

Today, while teaching a DB2 class, I was asked whether it is possible to hide all columns of a table. The rationale behind the question was that then all application developers would be forced to specify the columns during INSERT. Well, my first reaction was that at least one column needs to be visible. How does it look like?

Testing is faster than looking into the manual for IMPLICITLY HIDDEN columns, so here is a quick test:
db2 => create table hideme(id int implicitly hidden, s varchar(60) implicitly hidden)
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL20422N The statement failed because the table "HLOESER.HIDEME" would only
contain hidden columns. SQLSTATE=428GU
db2 => create table hideme(id int, s varchar(60) implicitly hidden)
DB20000I The SQL command completed successfully.
db2 => insert into hideme values(1)
DB20000I The SQL command completed successfully.
db2 => insert into hideme(id,s) values(2,'Hey')
DB20000I The SQL command completed successfully.
db2 => select * from hideme

ID
-----------
          1
          2

2 record(s) selected.

db2 => select id,s from hideme

ID          S
----------- ------------------------------------------------------------
          1 -
          2 Hey

2 record(s) selected.

At least one column must be visible as assumed.

Tuesday, November 26, 2013

MySQL-style LIMIT and OFFSET in DB2

An "ancient" but not yet well-known feature in DB2 is the support of MySQL/PostgreSQL-style LIMIT and OFFSET in SELECT statements and searched UPDATE/DELETE. Unfortunately, it is not really documented and I am working on getting some more documentation added (think about the "mostly harmless" as in the Hitchhiker's Guide to the Galaxy).

To demonstrate what is possible, I created a small table full of good stuff:
db2 => create table sweets(id int, desc varchar(60), quant int)
DB20000I The SQL command completed successfully.
db2 => insert into sweets values(1,'dark chocolate',4)
DB20000I The SQL command completed successfully.
db2 => insert into sweets values(2,'marzipan bar',1)
DB20000I The SQL command completed successfully.
db2 => insert into sweets values(3,'almond cookies',10)
DB20000I The SQL command completed successfully.
db2 => insert into sweets values(4,'granola bar',1)
DB20000I The SQL command completed successfully.
db2 => insert into sweets values(5,'nut chocolate',1)
DB20000I The SQL command completed successfully.
db2 => insert into sweets values(6,'chocolate-covered peanuts',1)
DB20000I The SQL command completed successfully.
db2 => select * from sweets limit 5
SQL0104N An unexpected token "limit" was found following "". Expected tokens
may include: "FETCH FIRST ROWS ONLY". SQLSTATE=42601

By default, DB2 does not know about LIMIT and OFFSET either. But setting the DB2_COMPATIBILITY_VECTOR to MYS (as in MySQL) or "04000" enables the syntactic alternative to the FETCH FIRST n ROWS ONLY. After restarting DB2 the SELECT works:

db2 => select * from sweets limit 5

ID          DESC                                                QUANT
----------- --------------------------------------------------- -----------
          1 dark chocolate                                                4
          2 marzipan bar                                                  1
          3 almond cookies                                               10
          4 granola bar                                                   1
          5 nut chocolate                                                 1

5 record(s) selected.

The following two examples show the use of the OFFSET, i.e., where to start in the result set. I combined it with ordering on the quantity.

db2 => select quant,desc from sweets order by num limit 3 offset 2

QUANT       DESC
----------- ------------------------------------------------------------
          1 nut chocolate
          1 chocolate-covered peanuts
          4 dark chocolate

3 record(s) selected.

db2 => select quant,desc from sweets order by num limit 2,3

QUANT       DESC
----------- ------------------------------------------------------------
          1 nut chocolate
          1 chocolate-covered peanuts
          4 dark chocolate

3 record(s) selected.

Note that in the second example a special abbreviated syntax is used. The first number is the offset where to start, followed by the number of rows to return.

As said, there is not much in the DB2 Information Center (yet). Serge has some more background in his blog about different ways of paging through result sets.

Monday, October 21, 2013

Heavy hitter: SYSIBM.SYSTABLES

I tried out the GET_DEPENDENCY procedure on one of the central catalog tables. SYSIBM.SYSTABLES is the catalog table behind SYSCAT.TABLES (and some more). As you can see from the output below, a total of 23 views are dependent on it.

db2 => call dbms_utility.get_dependency('TABLE','SYSIBM','SYSTABLES')

Return Status = 0

DEPENDENCIES ON SYSIBM.SYSTABLES
------------------------------------------------------------------
*TABLE SYSIBM.SYSTABLES()
*   VIEW SYSCAT .AUDITUSE()
*   VIEW SYSCAT .INDEXES()
*   VIEW SYSCAT .NICKNAMES()
*   VIEW SYSCAT .PERIODS()
*   VIEW SYSCAT .TABLES()
*   VIEW SYSIBM .CHECK_CONSTRAINTS()
*   VIEW SYSIBM .COLUMNS()
*   VIEW SYSIBM .SQLCOLPRIVILEGES()
*   VIEW SYSIBM .SQLCOLUMNS()
*   VIEW SYSIBM .SQLFOREIGNKEYS()
*   VIEW SYSIBM .SQLPRIMARYKEYS()
*   VIEW SYSIBM .SQLSPECIALCOLUMNS()
*   VIEW SYSIBM .SQLSTATISTICS()
*   VIEW SYSIBM .SQLTABLEPRIVILEGES()
*   VIEW SYSIBM .SQLTABLES()
*   VIEW SYSIBM .TABLES()
*   VIEW SYSIBM .TABLE_CONSTRAINTS()
*   VIEW SYSIBM .VIEWS()
*   VIEW SYSIBMADM.OBJECTOWNERS()
*   VIEW SYSIBMADM.PRIVILEGES()
*   VIEW SYSSTAT .COLDIST()
*   VIEW SYSSTAT .COLUMNS()
*   VIEW SYSSTAT .TABLES()
*   PACKAGE SYSIBMADM.DBMS_ALERT()
*   PACKAGE SYSIBMADM.DBMS_JOB()
*   PACKAGE SYSIBMADM.DBMS_UTILITY()
*   PACKAGE SYSIBMADM.UTL_DIR()

Do you know of another system table with more dependencies?

Monday, October 14, 2013

connect_proc and locales: Connecting the dots for combined fun

Last year in a blog article I had written about some fun I had with different locales and the dayname function in DB2. Last week I showed you links to customize the application environment. Why not combine the two for some added fun? Let me show you how to connect the dots. I am going to show you how to adapt the locale setting in DB2 based on user preferences.

First, we need a simple table to store our user preferences:
create table myschema.users (id int, uname varchar(30), locale char(5));

For our testing purposes, two user entries will do:
insert into myschema.users values(1,'HLOESER','de_DE');
insert into myschema.users values(2,'DB2FENC1','fr_FR');

What we will do is to set up our own procedure as connect_proc in DB2. In that procedure we access our small user table, read out the preference for the locale setting based on the value of the SESSION_USER. Then we set CURRENT LOCALE LC_TIME special register which controls the language and behavior for day name, timestamp format, rounding, truncation and some more.

create or replace procedure hl.my_connect()
reads sql data
language sql
BEGIN
   declare loc char(5);
   select locale into loc from myschema.users
        where uname=SESSION_USER;
   set current locale lc_time loc;
END@

After the procedure is created, some testing should be done:
db2 => call my_connect()

Return Status = 0
What also needs to be done is to allow everyone to execute our procedure:
db2 => grant execute on procedure hl.my_connect to public
DB20000I The SQL command completed successfully.
The final step is to update the database configuration and tell DB2 to use our procedure on every successful connection.
db2 => update db cfg using connect_proc hloeser.my_connect
DB20000I The UPDATE DATABASE CONFIGURATION command completed successfully.

To see the effect of our procedure in place we need to connect as user "hloeser" and as "db2fenc1".

Testing as "hloeser":
select dayname(current date) from dual

1
----------------------------------------------------------------------------------------------------
Monday

When we connect as "db2fenc1", the result is different:

values(dayname(current date))

1
----------------------------------------------------------------------------------------------------
lundi

It seems, replacing the connect_proc with our procedure worked. I close with one important advise: Leave one database connection open during testing. If your connect procedure is not working well, you need that connection to reset the database configuration (update db cfg using connect_proc '')...

Update: There was a question whether the connection is needed or why I recommend it: The answer is that connect_proc, once set, can only be updated when connected to the database. Unsetting connect_proc can be done when not connected, but the database needs to be inactive. Thus, having the connection saves time when developing.

Tuesday, October 8, 2013

I'm not you - Ways to customize a DB2 application environment

I am not you, she is not him. Users differ, applications differ. There are two interesting concepts in DB2 that help to adapt an application environment and application and that are not widely known. One is the connect_proc database configuration parameter, the other is conditional compilation of SQL. Let me give you some ideas of what it is and links to dig deeper into it.

Introduced in DB2 9.7 FP3, the connect procedure let's use configure a stored procedure that is invoked whenever someone (an application) connects to DB2. Thus, the session context can be tailored to the user or application by setting the locale, the path to resolve functions, optimization levels and more. Even though the procedure is not allowed to modify the database it could invoke another routine with an autonomous transaction. That way some information could be logged ("track", "audit") or another event be triggered. The DB2 Information Center has some examples on how to use connect_proc for session customization. Serge Rielau showed how to implement some logging.

Conditional compilation allows to tailor the compiled SQL to the actual environment. You could just use procedure or function code for little or for big endian - making it execute faster, implement your own routine in different ways depending on the DB2 version - use a more efficient implementation with newer DB2 versions, or plug in a lighter version of code - maybe without debug code. The key is to use SQL_CCFLAGS, flags for conditional SQL compilation. The variable can be set at the database or session level and then referenced within the actual SQL code. Special "_IF"/"_ELSEIF"/"_THEN"/"_END" directives do the trick of selecting the code you want. This is similar to shell scripting and programming languages.
In the DB2 system catalog you can find out which flags were set when a routine or trigger was compiled ("whom to blame..."). Of course the currently set value can be retrieved using the CURRENT SQL_CCFLAGS special registry.

Of course it is possible to combine the two. Develop your own library of connect-related actions, audit, and setup routines, then set SQL_CCFLAGS to implement just the mix you need for a specific application environment.

Thursday, September 5, 2013

Oracle DBLINK and DB2 Three-Part-Names for Federation

One of the new features that went mostly unnoticed for DB2 10.1 and DB2 10.5 is the support for so-called three-part-names (nice pun?) for Federation. It allows the access to remote tables without the need to create nicknames. Thus, applications and users can be more dynamic and agile in what they access (there are drawbacks to it, too). When the DB2_COMPATIBILITY_VECTOR is set accordingly, even the Oracle-like DBLINK syntax of "schema.table@database" is tolerated. So much for the theory, I tried to get it to work on my laptop...

I already had two different databases created, and the DB2 compatibility vector is set to "ORA". Thus, I could focus on Federation-related commands. The first thing to do is to enable my DB2 instance for Federation (FEDERATED is the configuration parameter).
db2 update dbm cfg using federated yes
DB20000I The UPDATE DATABASE MANAGER CONFIGURATION command completed
successfully.

The next commands are directly executed in my first, "local" database, after connecting to it:
db2 => create wrapper drda
DB20000I The SQL command completed successfully.

This creates the DRDA wrapper which is required to access DB2 sources. If you want to access Oracle databases from within DB2, you would need to create the NET8 or Oracle wrapper and some more steps which are described here. As I am going to access a local DB2 database, I don't need to catalog any nodes or map users. So what is left is to create a server, i.e., to make the database I want to access known to my database I am operating with.

db2 => create server myserver type db2/cs version 10.5 wrapper drda authorization hloeser password "secrets4you" options (dbname 'DBTWO')
DB20000I The SQL command completed successfully.

Thereafter, I can already access the remote table using either a three-part-name or the Oracle dblink syntax.

db2 => select * from myserver.henrik.worldtest

ID          S
----------- ---------------
          1 Hello World

1 record(s) selected.

db2 => select * from henrik.worldtest@myserver

ID          S
----------- ---------------
          1 Hello World

1 record(s) selected.

Both statements do not require a nickname to be created, thus saving administration time and allowing to migrate Oracle applications more easily. One thing to note is that I need to connect to my local database with the user option in order for Federation to pass on my credentials. That is "connect to dbone user henrik" will work, just using "connect to dbone" will bring you authentication errors or SQL0204N "... is an undefined name".

Have you tried out dblink syntax with DB2? Comments?

Friday, August 9, 2013

DB2 BLU: Some additional metadata, the synopsis table, and page map indexes

Today I continue my journey into the metadata of DB2 with BLU Acceleration which I recently started. I had created a table and looked at the catalog and found a synopsis table. The new column-organized tables for in-memory analytics are supposed to be simple to use, no indexes to create. No indexes? Let me check...

First on today's list is a look at SYCAT.INDEXES, the catalog view holding index information.

select indschema, indname, tabname, colnames, indextype from syscat.indexes where tabname like '%BLUDATA%'

INDSCHEMA     INDNAME               TABNAME                       COLNAMES                           INDEXTYPE
------------- --------------------- ----------------------------- ---------------------------------- ---------
SYSIBM        SQL130809032717072049 BLUDATA       +SQLNOTAPPLICABLE+SQLNOTAPPLICABLE CPMA
SYSIBM    SQL130809032717232599 SYN130809032716942015_BLUDATA +SQLNOTAPPLICABLE+SQLNOTAPPLICABLE CPMA

2 record(s) selected.
What is interesting to find, is that two indexes have been created. One is on the base table, one on the synopsis table. They are of a new index type CPMA which stands for Cde Page MAp (and CDE for Columnar Data Engine). I found this in the description of SYSCAT.INDEXES and by dumping the index data using db2dart (see below).

Next I wanted to see how the synopsis table changed by inserting data. The following small script generated my data file with 10000 rows.

#!/bin/bash

for i in {1..10000}
do
   echo "$i,0.8,$i,''" | tee -a bludata.csv
done

Then I was ready for loading the data. Here are some snippets of the output.

load from bludata.csv of del replace into hl.bludata
...
SQL3500W The utility is beginning the "ANALYZE" phase at time "08/09/2013
05:06:47.313324".
...
SQL3500W The utility is beginning the "LOAD" phase at time "08/09/2013
05:06:48.215545".
...
SQL3110N The utility has completed processing. "10000" rows were read from
the input file.
...
SQL3500W The utility is beginning the "BUILD" phase at time "08/09/2013
05:06:48.802651".

First the data is analyzed to optimize data layout and the column-specific compression. The comes the actual load phase followed by a build phase (for maintaining the internal index). With an empty table the synopsis table was empty, too. Now, with 10000 rows loaded, we can find some entries:

select * from sysibm.SYN130809032716942015_BLUDATA

IDMIN       IDMAX       QUOTAMIN QUOTAMAX SOLDMIN     SOLDMAX     TSNMIN               TSNMAX
----------- ----------- -------- -------- ----------- ----------- -------------------- --------------------
          1        2328     0.80     0.80           1        2328                    0                 1023
        754        2810     0.80     0.80         754        2810                 1024                 2047
       1296        3563     0.80     0.80        1296        3563                 2048                 3071
       1567        4187     0.80     0.80        1567        4187                 3072                 4095
       3895        6243     0.80     0.80        3895        6243                 4096                 5119
       4790        6996     0.80     0.80        4790        6996                 5120                 6143
       5061        7387     0.80     0.80        5061        7387                 6144                 7167
       5694        8283     0.80     0.80        5694        8283                 7168                 8191
       7750       10000     0.80     0.80        7750       10000                 8192                 9215
       8984        9767     0.80     0.80        8984        9767                 9216                 9999

10 record(s) selected.

For every 1024 rows there is an entry in the synopsis table. Not much surprising stuff can be found as my test data only has variations in the ID and SOLD columns.

My last action for today was to look at the index data as stored on disk. By invoking db2dart with the DI option it is possible to dump formatted index data. Both indexes only have 3 pages each and you can try using db2dart yourself to see the output. What is included in each report as part of the index metadata is the index type which is as follows:
Index Type = NonUnique Bi-directional Large RID CDE Page Map Split-High No exclusive lock optimzation RID-list Compression Variable Slot Directory Prefix Compression
Seeing it labeled as Page Map index verifies that I dumped the right index. That concludes my Friday "spare time", have a nice weekend. And if you have time to read, here is the link to my other DB2 BLU-related posts.

Monday, August 5, 2013

DB2 BLU: A look at the catalog tables and the synopsis table

DB2 10.5 with BLU Acceleration allows to create column-organized tables. They are the foundation for "actionable compression" and smart in-memory processing. I plan to cover some details in upcoming posts and already have written some articles about DB2 BLU earlier. Today I show you what happens in the DB2 catalog, the metadata, when you create a table.

I started by "db2set DB2_WORKLOAD=ANALYTICS". This is the magic knob to simplicity and performance for analytic queries. After creating a database, I connected and then I was ready to create a table.

create table hl.bludata(id int, quota decimal(5,2),sold int, comment varchar(200))
DB20000I The SQL command completed successfully.

The most important metadata for tables is accessible in the SYSCAT.TABLES catalog view. By sending the following query to DB2 a small subset of the columns is fetched for the table I just created:

select tabname,tabschema,property,compression,tableorg from syscat.tables where tabname like '%BLUDATA%'

TABNAME                        TABSCHEMA PROPERTY                         COMPRESSION TABLEORG
------------------------------ --------- -------------------------------- ----------- --------
BLUDATA                        HL                            Y                         C
SYN130805063137432548_BLUDATA SYSIBM                        YY                        C

2 record(s) selected.

But wait, there are two tables. The first has the name and schemaname as specified, the second is created in the schema SYSIBM and is created under the covers. It is the so-called synopsis table and its name starts with the prefix SYN followed by a timestamp and the name of the base table. Both tables are column-organized (TABLEORG=C). For both we also see a value for PROPERTY which is a character vector. The first "Y" is at position 20, indicating a column-organized table. The synopsis table also has a "Y" at position 21, officially marking it a synopsis table (see the descriptions for SYSCAT.TABLES).

The purpose of the synopsis table is to help with data organization of the base table and to aid DB2 with fast data skipping during query processing (I plan to talk about dyanmic list prefetching in a future post).

Because both tables are related to each other, another catalog table is involved, SYSCAT.TABDEP, where table or object dependencies are recorded.

select tabname,dtype,bname,bschema,btype from syscat.tabdep where tabname like '%BLUDATA%' or bname like '%BLUDATA%'

TABNAME                       DTYPE BNAME BSCHEMA BTYPE
----------------------------- ------ ------- ------- -----
SYN130805063137432548_BLUDATA 7    BLUDATA HL      T

1 record(s) selected.

Based on the metadata the table "SYN130805063137432548_BLUDATA" is a synopsis table (DTYPE=7) and depends on the table I created (BLUDATA). But how does the synopsis table look like? I am using the command "describe table" to see what columns have been defined:

describe table SYSIBM.SYN130805063137432548_BLUDATA

                                Data type                     Column
Column name                     schema    Data type name      Length     Scale Nulls
------------------------------- --------- ------------------- ---------- ----- ------
IDMIN                           SYSIBM    INTEGER                      4     0 Yes
IDMAX                           SYSIBM    INTEGER                      4     0 Yes
QUOTAMIN                        SYSIBM    DECIMAL                      5     2 Yes
QUOTAMAX                        SYSIBM    DECIMAL                      5     2 Yes
SOLDMIN                         SYSIBM    INTEGER                      4     0 Yes
SOLDMAX                         SYSIBM    INTEGER                      4     0 Yes
TSNMIN                          SYSIBM    BIGINT                       8     0 No
TSNMAX                          SYSIBM    BIGINT                       8     0 No

8 record(s) selected.

What can be seen is that for all non-character columns of my table we have a corresponding MIN and MAX column. In addition we have two BIGINT columns TSNMIN and TSNMAX. TSN stands for Tuple Sequence Number and helps to recompose a row based on individual column values. The synopsis table plays a pivotal role (pun intended) for faster query execution. But that is part of a future post...

Thursday, July 25, 2013

DB2 in virtualized environments

I had written about DB2 and virtualization in the past. You can use DB2 in many virtual environments and there are already kind of "dated" redbooks explaining details about DB2 with VMWare, PowerVM, and others. Today I want to point you to two interesting resources:

You can even virtualize a DB2 pureScale cluster on Linux: The "yes, it is supported" can be found in the Information Center when looking for KVM. All the technical details of setting such a cluster up are discussed in this developerWorks article: Virtualize the IBM DB2 pureScale Feature on Linux using Kernel-based Virtual Machine
An overview of all supported DB2 virtualization environments, e.g., which hypervisor, which guest operating systems, etc., is provided on this developerWorks page.

Wednesday, July 10, 2013

Extended row size support in DB2: Fitting more data into a single row than the page size would typically allow (and sometimes it makes sense, but that is another story - stay tuned for user experience...)

One of the new features of the new DB2 10.5 release is the support for so-called extended row size. As you might have guessed from the article's title, you can squeeze a lot of data into a single row now. The feature can be used for compatibility reasons with other vendors, to consolidate a table with "side tables" into a single one, or to possibly improve performance by moving tables to smaller page sizes. Let's take a look into the details of "extended row size".

Many of the DB2 size-related limits are documented in the overview of SQL and XML limits, however all the information on column sizes, row sizes, and what can be fitted into data pages of different size is part of the CREATE TABLE documentation. There, way down, we find that for 4k pages the row size is limited to 4005 bytes, for 8k to 8101 bytes, for 16k to 16293 bytes, and for 32k to 32677 bytes. So, how can we insert more data into a row?

A new database configuration parameter, extendend_row_sz, has been added. For new databases the default is ENABLE, for migrated databases DISABLE (for backward compatibility). I created a new database and the extended row size support is enabled. Let's give it a try:

db2 => connect to hl

   Database Connection Information

Database server        = DB2/LINUX 10.5.0
SQL authorization ID   = HLOESER
Local database alias   = HL

db2 => create table ers(id int unique not null, firstname varchar(3000), lastname varchar(3000))
DB20000I The SQL command completed successfully.
db2 => update db cfg using extended_row_sz disable
DB20000I The UPDATE DATABASE CONFIGURATION command completed successfully.
db2 => create table ers2(id int unique not null, firstname varchar(3000), lastname varchar(3000))
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL0286N A table space could not be found with a page size of at least "8192"
that authorization ID "HLOESER" is authorized to use. SQLSTATE=42727
db2 => update db cfg using extended_row_sz enable
DB20000I The UPDATE DATABASE CONFIGURATION command completed successfully.
db2 => create table ers2(id int unique not null, firstname varchar(3000), lastname varchar(3000))
DB20000I The SQL command completed successfully.
db2 => insert into ers values(1,'Henrik','Loeser')
DB20000I The SQL command completed successfully.

The database configuration parameter can be changed online as can be seen above. Creation of the table ERS2 is blocked with extended row sizes DISABLED, then it works after enabling it again. Inserting some simple test data into the table succeeds. To test this new feature further, I created a data file with 3 rows (ID is 2 to 4). The reason for this type of test is the size limitation of the DB2 CLP and trying to avoid a GUI. The first row has two long strings with about 3000 bytes each, the second row a short and a long string, the last row a long and a short string. Importing the data is no problem:

db2 => import from "data.txt" of del insert into ers
SQL3109N The utility is beginning to load data from file "data.txt".

SQL3110N The utility has completed processing. "3" rows were read from the
input file.

SQL3221W ...Begin COMMIT WORK. Input Record Count = "3".

SQL3222W ...COMMIT of any database changes was successful.

SQL3149N "3" rows were processed from the input file. "3" rows were
successfully inserted into the table. "0" rows were rejected.

Number of rows read         = 3
Number of rows skipped      = 0
Number of rows inserted     = 3
Number of rows updated      = 0
Number of rows rejected     = 0
Number of rows committed    = 3

So how can we find out more about how this feature works? The documentation says that some data might be stored outside the row as LOB if the data is too big for the row. I am going to use db2dart for that. First I obtain the TABLE ID and TABLESPACE ID of our test table ERS:

db2 => select tableid,tbspaceid from syscat.tables where TABNAME='ERS'

TABLEID TBSPACEID
------- ---------
      8         2

1 record(s) selected.

The next step is to invoke db2dart from the shell for the database "hl":
hloeser@rotach:~$ db2dart hl /DD

Please enter
Table ID or name, tablespace ID, first page, num of pages, and y/n for verbose:
(may suffix page number with 'p' for pool relative if working with a pool-relative tablespace)
8 2 0 80 y

         The requested DB2DART processing has completed successfully!
                    Complete DB2DART report found in: HL.RPT

The "/DD" option direct db2dart to dump data pages. It prompts us to enter the IDs for the table and tablespace (which we know), the first page to dump (zero), the number of pages to dump (80), and whether verbose mode is preferred (yes). Based on our input a report file "HL.RPT" is generated. After some general information, free space control records, and a table description record, the first actual data records can be seen:
            Slot 4:

               Offset Location = 2850 (xB22)
               Record Length = 34 (x22)

               Record Type = Table Data Record (FIXEDVAR)

               Record Flags = 0

               Fixed part length value = 14

                  Column 1:
            Fixed offset: 0
                  Type is Long Integer
                  Value = 1

                  Column 2:
            Fixed offset: 4
                  Type is Variable Length Character String
                  Length = 6 Offset = 14
                      48656E72 696B                          Henrik

                  Column 3:
            Fixed offset: 9
                  Type is Variable Length Character String
                  Length = 6 Offset = 20
                      4C6F6573 6572                          Loeser

         Slots Summary: Total=5, In-use=5, Deleted=0.
This is our first row (ID 1) that we inserted, everything is stored in a single record, as expected and as usual. Reading on in the file, the other rows appear. I won't print them here for size reasons, but here is how they are stored.

ID 2, two long strings: first string in the record, second string outside the record
ID 3, short string, long string: both strings within a data page, single record because they fit
ID 4, long string and short string: both strings within a data page, single record because they fit

The column value for row with ID2 is stored outside the record as LOB. The db2dart output looks like this:

                  Column 3:
            Fixed offset: 9
                  Type is Variable Length Character String
                  Length = 24 Offset = 3020

                  Var Descriptor Length = 24
                     lfd_check = 89
                     lfd_version = 0
                     lfd_life_lsn = 0000000000064942
                     lfd_size = 3024
                     lfd_first = 2
                     lfd_dir = 0

         Slots Summary: Total=1, In-use=1, Deleted=0.
What can also be seen is that only a single slot is in use on this data page because of the record size.

I hope you found these details interesting and it encourages you to use db2dart to look into storage structures. When I visit universities, I always tell students to dig deeper into the implementation of database systems...

Tuesday, June 18, 2013

developerWorks article on setting up DB2 for encrypted communication with SSL

DB2 allows to encrypt data both in storage and in transit. Securing stored data makes sure that even if someone can access the disk, the actual data can not be read. Encrypting data in transit prevents sniffing, i.e., someone trying to listen to the communication channels and spy on the data (Footnote: this is not entirely true). There exist different options in DB2 for securing data on disk and during communication. SSL, the Secure Socket Layer, is a commonly used technology to encrypt communication, e.g., for Web access with the HTTPS protocol.

A new article has been published on developerWorks titled "Secure Sockets Layer (SSL) support in DB2 for Linux, UNIX, and Windows" that provides step by step instructions on how to setup SSL for use with DB2. If you haven't looked into that topic before, this is a good starter.

Monday, June 17, 2013

DB2 10.5 (Kepler) and Information Center available now

The latest release of DB2, version 10.5 which was code-named "Kepler", is available now. I recommend to visit the DB2 10.5 Information Center and then to read the "What's new" and "What's changed" sections. Significant changes are not only in terms of features and functionality, but also in the packaging.

The DB2 database product editions now include both "regular" and an advanced Enterprise Server and Workgroup Server editions (AESE, AWSE, ESE, WSE as acronyms to remember) plus the Express, Express-C, and Developer editions. As you can see from the features by DB2 edition overview, the database partitioning features ("shared nothing") is now included in both advanced editions. The Infosphere Warehouse edition is gone and it is named DB2 now - the name of its foundation for all the past years.

Technical highlights of this new DB2 release include the BLU acceleration (column-organized tables) for building in-memory Data Marts, many features for DB2 pureScale (including rolling fixpack updates and online topology changes to the cluster).

Pages