Showing posts with label workload. Show all posts
Showing posts with label workload. Show all posts

Wednesday, March 18, 2020

My best practices for home office - Corona edition

Take some rest
If you follow my blog, you may already know that I work from home since more than 12 years. Except for some business travel, I tend to do the "things" considered work from a dedicated room in my house in Germany. Over the past years, I tried to find the balance between being productive (team, customers, employer, ...), take care of my family, socialize where possible and staying sane and healthy. Here are some of my best practices. They may or may not work for you, but at least give you some inspiration.

Friday, February 28, 2020

Swashbooking for crowd-sourced book reviews and fun

Books for review
Usually, I don't go to book clubs or write book reviews. But yesterday evening was different with my first swashbooking session (German: Buchstrudeln). It is fast-paced book skimming and crowd-sourced book review combined. And a lot of fun. So what is it and what really did we do? Read on...

Wednesday, July 1, 2015

DB2 Battle: Optimization Profile vs. Statement Concentrator (Part 3)

db2expln output - statement concentrator active
In February I had fiddled with optimization profiles and REOPT working against the statement concentrator (part 2 is here). Today I wanted to give an interesting (hopefully) update. The tool db2expln can be used to describe the access plan for static and dynamic SQL statements. It works in a different way than db2exfmt and visual explain (part of tools like Data Studio and IBM Data Server Manager). Thus, I was eager to see whether it could help to find out whether my optimization profile was applied.


Friday, June 12, 2015

DB2 pureScale for the Masses

DB2 pureScale Cluster (nanoCluster)
It is almost four years since I wrote about my exciting tests with a DB2 pureScale demo cluster (see picture). At that time the pureScale feature was still fairly new and was supported on a limited choice of hardware/software. Since then the requirements to put DB2 pureScale into production have been reduced dramatically and at the same time many useful features have been added. Let me give a quick overview of why customers choose DB2 with pureScale for both scaling out systems as well as a consolidation platform. Today, DB2 pureScale really is something for the masses...

The following is an unsorted and incomplete list of features and good-to-know items that come to mind when talking about DB2 pureScale:

All of the above boils down to cost savings and higher flexibility, an important driver behind opting for pureScale. Having said that I need to mention that earlier this week I decided against pureScale. I gave a nanoCluster, the same type as pictured above, away to make some room in my home office.

BTW: I have been asked whether a DB2 pureScale cluster can brew a good coffee. What would be your answer...?

Friday, February 27, 2015

DB2 Battle: Optimization Profile vs. Statement Concentrator (Part 2)

Today I wanted to try out using DB2 optimization profiles for a statement impacted by the statement concentrator. In part 1 I gave the background, showed how I created an optimization profile and that a simple explain statement didn't return what I expected. In this final part I am going to look at DB2 section actuals to hopefully proof that my optimization guideline works as I had hoped.

Because all my "explain plan for select ..." statements resulted in the same access plan, I decided to use session actuals to look at how statements are really executed within DB2. The actuals are kind of a live log of the real statement execution costs and the applied access plan. The first step towards session actuals is to have a workload and an activity event monitor and to switch the monitor on:


db2 "create workload betw applname('python') collect activity data with details,section"
db2 "create event monitor betwmon for activities write to table"
db2 "set event monitor betwmon state 1"

The above statements create a workload which collects section data. The application name (APPLNAME) is "python" because I use a Python script (see below) for parts of the testing.


Script:
import ibm_db
conn = ibm_db.connect("hltest","hloeser","secretpw")
ibm_db.exec_immediate(conn, 'set current optimization profile="HLOESER"."PROFILE_BETW"')
ibm_db.exec_immediate(conn, 'select id, s from betw where id between 2 and 20')



Wednesday, August 13, 2014

Using some Workload Management for free in non-Advanced Editions of DB2

One of the new features of DB2 10.5 is BLU Acceleration. In introduces a couple of default Workload Management objects that are intended to control heavy queries running against column-organized tables. The objects are automatically created with every database, independent of the product edition. They are only enabled when DB2_WORKLOAD has been set to ANALYTICS before creating the database, i.e., a database for in-memory analytics is set up. But what is available for the regular guy like myself? What can be used for free and as foundation for some monitoring and understanding the system workload? Let's take a look.


Typically I use a DB2 Developer Edition which includes all features including WLM. So I removed the db2de license and organized (being IBMer has some benefits!) a Workgroup Server Edition (db2wse) which I added to the system using db2licm. I also turned on hard license enforcement, so that any attempts of using an unlicensend feature are directly blocked. Here is what db2licm returned thereafter:

mymachine> db2licm -l

Product name:                     "DB2 Workgroup Server Edition"
License type:                     "Authorized User Single Install"
Expiry date:                      "Permanent"
Product identifier:               "db2wse"
Version information:              "10.5"
Max amount of memory (GB):        "128"
Enforcement policy:               "Hard Stop"
Number of licensed authorized users: "25"

With that in place I created a new database named WLMTEST and connected to it. My first test was to create a workload object which should not be possible given my DB2 edition:

DB: WLMTEST => create workload freeride applname('xploit')
DB21034E  The command was processed as an SQL statement because it was not a
valid Command Line Processor command.  During SQL processing it returned:
SQL8029N  A valid license key was not found for the requested functionality.
Reference numbers: "".


Ok, this look right. I don't have a license to use DB2 WLM (Workload Manager). My next query was intended to check what service classes are present in my system.

DB: WLMTEST => select varchar(serviceclassname,30), varchar(parentserviceclassname,30), enabled from syscat.serviceclasses

1                              2                              ENABLED
------------------------------ ------------------------------ -------
SYSDEFAULTSUBCLASS             SYSDEFAULTSYSTEMCLASS          Y     
SYSDEFAULTSUBCLASS             SYSDEFAULTMAINTENANCECLASS     Y     
SYSDEFAULTSUBCLASS             SYSDEFAULTUSERCLASS            Y     
SYSDEFAULTMANAGEDSUBCLASS      SYSDEFAULTUSERCLASS            Y     
SYSDEFAULTSYSTEMCLASS          -                              Y     
SYSDEFAULTMAINTENANCECLASS     -                              Y     
SYSDEFAULTUSERCLASS            -                              Y     

  7 record(s) selected.


The DB2 Knowledge Center has an overview of related default WLM objects and which parts can be modified with DBADM or WLMADM authority. Having the names of the system objects I tried my luck altering a work class set to reduce the cost barrier for the managed heavy queries (SYSMANAGEDQUERIES):

DB: WLMTEST => alter work class set sysdefaultuserwcs alter work class SYSMANAGEDQUERIES for timeroncost from 1000   
DB20000I  The SQL command completed successfully.


The threshold SYSDEFAULTCONCURRENT defines how many of those queries can run concurrently in the system. Why not change that threshold definition?

DB: WLMTEST => alter threshold SYSDEFAULTCONCURRENT when sqlrowsreturned > 20 stop execution
DB21034E  The command was processed as an SQL statement because it was not a
valid Command Line Processor command.  During SQL processing it returned:
SQL4721N  The threshold "SYSDEFAULTCONCURRENT" cannot be created or altered 
(reason code = "7").  SQLSTATE=5U037


Well, it seems that you cannot modify the entire threshold to your liking. However, following the documentation on what can be done, I successfully reduced the number of parallel activities.

DB: WLMTEST => alter threshold SYSDEFAULTCONCURRENT when CONCURRENTDBCOORDACTIVITIES > 3 stop execution
DB20000I  The SQL command completed successfully.


To test the impact of my changes, I opened 4 different shells, connected to DB2 in each window, and more or less simultaneously executed the following query:

select * from syscat.tables,syscat.columns

I have to restate that I tried to execute it in all four windows. It only ran in three of them. Why? Because the threshold kicked in for this heavy query and stopped the execution for the 4th session ("concurrentdbcoordactivities> 3 stop execution"). So some basic workload management seems to work even without a license.

Can I change the threshold to force the application off, i.e., to not allow running the query?

DB: WLMTEST => alter threshold SYSDEFAULTCONCURRENT when CONCURRENTDBCOORDACTIVITIES > 2 force application
DB21034E  The command was processed as an SQL statement because it was not a
valid Command Line Processor command.  During SQL processing it returned:
SQL4721N  The threshold "SYSDEFAULTCONCURRENT" cannot be created or altered 
(reason code = "13").  SQLSTATE=5U037

No, changing the entire definition of the threshold is not possible, but at least parts of it can be modified. You can then use the adapted default WLM objects to better understand what work is running on your system, e.g., testing what would fall into the category of "heavy queries". As a last step, I used a monitoring function to return the CPU time spent by service subclass. Most was in the managed subclass into which my queries from above were mapped:

DB: WLMTEST => SELECT varchar(service_superclass_name,30) as service_superclass, varchar(service_subclass_name,30) as service_subclass, sum(total_cpu_time) as total_cpu, sum(app_rqsts_completed_total) as total_rqsts FROM TABLE(MON_GET_SERVICE_SUBCLASS('','',-2)) AS t GROUP BY service_superclass_name, service_subclass_name ORDER BY total_cpu desc

SERVICE_SUPERCLASS             SERVICE_SUBCLASS               TOTAL_CPU            TOTAL_RQSTS        
------------------------------ ------------------------------ -------------------- --------------------
SYSDEFAULTUSERCLASS            SYSDEFAULTSUBCLASS                          1207794                  552
SYSDEFAULTUSERCLASS            SYSDEFAULTMANAGEDSUBCLASS                    852547                    0
SYSDEFAULTMAINTENANCECLASS     SYSDEFAULTSUBCLASS                           466436                 1374
SYSDEFAULTSYSTEMCLASS          SYSDEFAULTSUBCLASS                                0                    0

  4 record(s) selected.


With that I leave more testing to you. Happy monitoring!

BTW: The same tests can also be done on the SQL DB service on IBM Bluemix. That service is a DB2 Enterprise Server Edition.

Monday, July 28, 2014

What's Going On? - DB2 Workload Management: Identification of Activities

In an earlier blog post I had written about why Workload Management is needed. It's not just something for the database system or on the operating system level, it is really useful and done in "real life". But what is managed in the system, how are you able to identify activies in a DB2 database system? I am going to explain that today.

Before I dig deeper into the identification, first we need to clarify what is meant with "activity". It could be almost anything going on in the database system that is related to a single database and could be both user- and system-related tasks. The important distinction is that is on the database level, not for a DB2 instance. Identification of activities deals with three questions: WHO is doing WHAT on my database and WHERE is that data located?

The WHO can be answered by looking at the connection properties such as:
  • Who is the user and which group does the user belong to? 
  • Is the user operating in a special role?
  • From where is the user connecting, does the machine have a name, is it from a specific application?
In DB2 these properties can be checked and used for identification of an activity by defining a workload (CREATE WORKLOAD). The workload object deals with the WHO. The WHAT and WHERE aspect of identification are handled by work classes. They are defined as part of a so-called work class set (CREATE WORK CLASS SET). Each work class can be used to identify an activity based on the type of work it is performing on the database and its related costs (the WHAT part). The type of work could be any combination of LOADing data, defining, altering or dropping objects (DDL), or read or write operations as part of select, insert, update, and delete statements (DML). The cost is what has been estimated by the DB2 compiler/optimizer and is the expected overall execution cost (timeron cost) or the cardinality (how many rows are we expecting in the result). Note that the actual cost of an activity is dealt with as part of controlling and managing the activities which I plan to describe in a later article.

By specifying a "data tag" for a work class, it can be related to storage groups or tablespaces and their priority. This is how activity can be identified by from WHERE the data is processed.

Because multiple work classes in a work class set could identify the same activity, the individual work classes can be ordered/positioned within the set. That way a work class with several properties could pick a very specific activity whereas other activities would be mapped to more general work classes.

Using the concepts of WORKLOAD and WORK CLASS SET it is possible to identify an activity. They help to understand what is going on in the DB2 database system. It is the prerequisite for actively controlling and managing the activities in the system by assigning resources.

Tuesday, June 24, 2014

Why we need and have workload management

Wikipedia
While working in your office a rare visitor from another location stops by. Time for a break to connect on the latest gossip, but not too long. On the way back to your office your boss asks you to call someone from the client team to clarify some technical issues and you have to squeeze it in between two important customer calls. And you just received a text message that your wife cannot pick up the kids and you need to leave on time this afternoon to do it. Workload Management (WLM) in real life. Everybody seems to be doing WLM, some better, some not so well. And there are many unwritten rules.

In a database system like DB2 there is also a built-in Workload Management. If you are using BLU Acceleration, it is activated by default and some rules have been defined, else it is switched off. Why turn it on and use it? Same reasons as in real life:
  • A "fair" allocation of time and resources between different work items/applications is needed ("work / life balancing"?).
  • Response time for critical tasks or some type of work is important and needs to be protected against less important tasks ("your mother-in-law visits, take care of her").
  • Implementation of rules to control and regulate the system behavior ("kids in bed by 8pm means time for you to watch soccer").
  • Deal with rogue queries that threaten regular operations ("kids bring over the entire neighborhood").
  • The system (sometimes) is overloaded and you have to set priorities ("no party this weekend").
All this can be done with the DB2 Workload Manager. It allows to identify different types of activities (work), manage them based on rules that govern available resources and set controls, and to monitor the system behavior. The database workload manager can be integrated with the operating system (OS) workload manager on AIX and Linux. This is especially useful when more than a single database is active and resources need to be controlled on a higher level ("sync your family calendar with the grandparents").

Does Workload Management help? Yes, it does. However, similar to family life it is possible that because of resource shortage not all planned tasks can be performed. Maybe time for an upgrade ("hire some help, do not get more kids... :)").

I plan to discuss DB2 WLM details in future articles, workload permitting...