![]() |
Take some rest |
Henrik's thoughts on life in IT, data and information management, cloud computing, cognitive computing, covering IBM Db2, IBM Cloud, Watson, Amazon Web Services, Microsoft Azure and more.
Showing posts with label workload. Show all posts
Showing posts with label workload. Show all posts
Wednesday, March 18, 2020
My best practices for home office - Corona edition
Friday, February 28, 2020
Swashbooking for crowd-sourced book reviews and fun
![]() |
Books for review |
Wednesday, July 1, 2015
DB2 Battle: Optimization Profile vs. Statement Concentrator (Part 3)
![]() |
db2expln output - statement concentrator active |
Labels:
administration,
best practices,
DB2,
dba,
IT,
monitoring,
performance,
process model,
sql,
version 10.5,
workload
Friday, June 12, 2015
DB2 pureScale for the Masses
DB2 pureScale Cluster (nanoCluster) |
The following is an unsorted and incomplete list of features and good-to-know items that come to mind when talking about DB2 pureScale:
- License: One of my favorite pages in the DB2 Knowledge Center is the "Functionality in DB2 product editions and DB2 offerings". It shows that pureScale is part of or can be purchased as add-on (Business Application Continuity/BAC Offering) for almost all DB2 editions.
- Application Transparency: How about writing a pureScale-enabled application without knowing about it? There are no special needs to be considered when writing an application to be run on a pureScale cluster.
- Cluster Size and Scale-Out: You can start small and increase the cluster size, even online, depending on your needs.
- Availability: DB2 supports continuous availability of the pureScale cluster by its system design and by features like rolling fixpack updates. Depending on requirements the cluster can span data centers and it is then called Geographically Dispersed pureScale Cluster (GDPC or "stretch cluster"). Basically, if one data center becomes unavailable, DB2 databases continue to be available and are managed from the hardware in the second data center.
Two pureScale clusters can be linked by, e.g., the built-in HADR feature where transaction logs are shipped to and applied to the standby-cluster, increasing availability even further.
For increased availability network adapters on each machine and network switches can be redundant. - Hardware/OS: DB2 runs on the POWER and Intel platform on AIX, Red Hat, and SuSE operatings. Infiniband adapters can be used for highest network performance, but even the GDPC/stretch cluster version of DB2 pureScale only requires a regular TCP/IP network ("vanilla ethernet"). If you don't want to run pureScale on dedicated hardware, no problem, virtual machines (VMware and KVM) and even VM mobilitity are supported, too.
- Smart Workload Management and Processing: Depending on requirements and available resources, applications or specific workloads can be tied to a single node or to a group of computers in the pureScale cluster. This is great for consolidation. Workload balancing distributes the load among the nodes within a group. Over the past years several performance enhancements have also been added to DB2 that cater to consolidation scenarios.
BTW: I have been asked whether a DB2 pureScale cluster can brew a good coffee. What would be your answer...?
Labels:
administration,
consolidation,
DB2,
fixpack,
IT,
knowledge center,
performance,
pureScale,
version 10.5,
workload
Friday, February 27, 2015
DB2 Battle: Optimization Profile vs. Statement Concentrator (Part 2)
Today I wanted to try out using DB2 optimization profiles for a statement impacted by the statement concentrator. In part 1 I gave the background, showed how I created an optimization profile and that a simple explain statement didn't return what I expected. In this final part I am going to look at DB2 section actuals to hopefully proof that my optimization guideline works as I had hoped.
Because all my "explain plan for select ..." statements resulted in the same access plan, I decided to use session actuals to look at how statements are really executed within DB2. The actuals are kind of a live log of the real statement execution costs and the applied access plan. The first step towards session actuals is to have a workload and an activity event monitor and to switch the monitor on:
db2 "create workload betw applname('python') collect activity data with details,section"
db2 "create event monitor betwmon for activities write to table"
db2 "set event monitor betwmon state 1"
The above statements create a workload which collects section data. The application name (APPLNAME) is "python" because I use a Python script (see below) for parts of the testing.
Script:
import ibm_db
conn = ibm_db.connect("hltest","hloeser","secretpw")
ibm_db.exec_immediate(conn, 'set current optimization profile="HLOESER"."PROFILE_BETW"')
ibm_db.exec_immediate(conn, 'select id, s from betw where id between 2 and 20')
Because all my "explain plan for select ..." statements resulted in the same access plan, I decided to use session actuals to look at how statements are really executed within DB2. The actuals are kind of a live log of the real statement execution costs and the applied access plan. The first step towards session actuals is to have a workload and an activity event monitor and to switch the monitor on:
db2 "create workload betw applname('python') collect activity data with details,section"
db2 "create event monitor betwmon for activities write to table"
db2 "set event monitor betwmon state 1"
The above statements create a workload which collects section data. The application name (APPLNAME) is "python" because I use a Python script (see below) for parts of the testing.
Script:
import ibm_db
conn = ibm_db.connect("hltest","hloeser","secretpw")
ibm_db.exec_immediate(conn, 'set current optimization profile="HLOESER"."PROFILE_BETW"')
ibm_db.exec_immediate(conn, 'select id, s from betw where id between 2 and 20')
Labels:
administration,
best practices,
DB2,
dba,
IT,
monitoring,
performance,
process model,
sql,
version 10.5,
workload
Wednesday, August 13, 2014
Using some Workload Management for free in non-Advanced Editions of DB2
One of the new features of DB2 10.5 is BLU Acceleration. In introduces a couple of default Workload Management objects that are intended to control heavy queries running against column-organized tables. The objects are automatically created with every database, independent of the product edition. They are only enabled when DB2_WORKLOAD has been set to ANALYTICS before creating the database, i.e., a database for in-memory analytics is set up. But what is available for the regular guy like myself? What can be used for free and as foundation for some monitoring and understanding the system workload? Let's take a look.
Typically I use a DB2 Developer Edition which includes all features including WLM. So I removed the db2de license and organized (being IBMer has some benefits!) a Workgroup Server Edition (db2wse) which I added to the system using db2licm. I also turned on hard license enforcement, so that any attempts of using an unlicensend feature are directly blocked. Here is what db2licm returned thereafter:
mymachine> db2licm -l
Product name: "DB2 Workgroup Server Edition"
License type: "Authorized User Single Install"
Expiry date: "Permanent"
Product identifier: "db2wse"
Version information: "10.5"
Max amount of memory (GB): "128"
Enforcement policy: "Hard Stop"
Number of licensed authorized users: "25"
With that in place I created a new database named WLMTEST and connected to it. My first test was to create a workload object which should not be possible given my DB2 edition:
DB: WLMTEST => create workload freeride applname('xploit')
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL8029N A valid license key was not found for the requested functionality.
Reference numbers: "".
Ok, this look right. I don't have a license to use DB2 WLM (Workload Manager). My next query was intended to check what service classes are present in my system.
DB: WLMTEST => select varchar(serviceclassname,30), varchar(parentserviceclassname,30), enabled from syscat.serviceclasses
1 2 ENABLED
------------------------------ ------------------------------ -------
SYSDEFAULTSUBCLASS SYSDEFAULTSYSTEMCLASS Y
SYSDEFAULTSUBCLASS SYSDEFAULTMAINTENANCECLASS Y
SYSDEFAULTSUBCLASS SYSDEFAULTUSERCLASS Y
SYSDEFAULTMANAGEDSUBCLASS SYSDEFAULTUSERCLASS Y
SYSDEFAULTSYSTEMCLASS - Y
SYSDEFAULTMAINTENANCECLASS - Y
SYSDEFAULTUSERCLASS - Y
7 record(s) selected.
The DB2 Knowledge Center has an overview of related default WLM objects and which parts can be modified with DBADM or WLMADM authority. Having the names of the system objects I tried my luck altering a work class set to reduce the cost barrier for the managed heavy queries (SYSMANAGEDQUERIES):
DB: WLMTEST => alter work class set sysdefaultuserwcs alter work class SYSMANAGEDQUERIES for timeroncost from 1000
DB20000I The SQL command completed successfully.
The threshold SYSDEFAULTCONCURRENT defines how many of those queries can run concurrently in the system. Why not change that threshold definition?
DB: WLMTEST => alter threshold SYSDEFAULTCONCURRENT when sqlrowsreturned > 20 stop execution
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL4721N The threshold "SYSDEFAULTCONCURRENT" cannot be created or altered
(reason code = "7"). SQLSTATE=5U037
Well, it seems that you cannot modify the entire threshold to your liking. However, following the documentation on what can be done, I successfully reduced the number of parallel activities.
DB: WLMTEST => alter threshold SYSDEFAULTCONCURRENT when CONCURRENTDBCOORDACTIVITIES > 3 stop execution
DB20000I The SQL command completed successfully.
To test the impact of my changes, I opened 4 different shells, connected to DB2 in each window, and more or less simultaneously executed the following query:
select * from syscat.tables,syscat.columns
I have to restate that I tried to execute it in all four windows. It only ran in three of them. Why? Because the threshold kicked in for this heavy query and stopped the execution for the 4th session ("concurrentdbcoordactivities> 3 stop execution"). So some basic workload management seems to work even without a license.
Can I change the threshold to force the application off, i.e., to not allow running the query?
DB: WLMTEST => alter threshold SYSDEFAULTCONCURRENT when CONCURRENTDBCOORDACTIVITIES > 2 force application
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL4721N The threshold "SYSDEFAULTCONCURRENT" cannot be created or altered
(reason code = "13"). SQLSTATE=5U037
No, changing the entire definition of the threshold is not possible, but at least parts of it can be modified. You can then use the adapted default WLM objects to better understand what work is running on your system, e.g., testing what would fall into the category of "heavy queries". As a last step, I used a monitoring function to return the CPU time spent by service subclass. Most was in the managed subclass into which my queries from above were mapped:
DB: WLMTEST => SELECT varchar(service_superclass_name,30) as service_superclass, varchar(service_subclass_name,30) as service_subclass, sum(total_cpu_time) as total_cpu, sum(app_rqsts_completed_total) as total_rqsts FROM TABLE(MON_GET_SERVICE_SUBCLASS('','',-2)) AS t GROUP BY service_superclass_name, service_subclass_name ORDER BY total_cpu desc
SERVICE_SUPERCLASS SERVICE_SUBCLASS TOTAL_CPU TOTAL_RQSTS
------------------------------ ------------------------------ -------------------- --------------------
SYSDEFAULTUSERCLASS SYSDEFAULTSUBCLASS 1207794 552
SYSDEFAULTUSERCLASS SYSDEFAULTMANAGEDSUBCLASS 852547 0
SYSDEFAULTMAINTENANCECLASS SYSDEFAULTSUBCLASS 466436 1374
SYSDEFAULTSYSTEMCLASS SYSDEFAULTSUBCLASS 0 0
4 record(s) selected.
With that I leave more testing to you. Happy monitoring!
BTW: The same tests can also be done on the SQL DB service on IBM Bluemix. That service is a DB2 Enterprise Server Edition.
Typically I use a DB2 Developer Edition which includes all features including WLM. So I removed the db2de license and organized (being IBMer has some benefits!) a Workgroup Server Edition (db2wse) which I added to the system using db2licm. I also turned on hard license enforcement, so that any attempts of using an unlicensend feature are directly blocked. Here is what db2licm returned thereafter:
mymachine> db2licm -l
Product name: "DB2 Workgroup Server Edition"
License type: "Authorized User Single Install"
Expiry date: "Permanent"
Product identifier: "db2wse"
Version information: "10.5"
Max amount of memory (GB): "128"
Enforcement policy: "Hard Stop"
Number of licensed authorized users: "25"
With that in place I created a new database named WLMTEST and connected to it. My first test was to create a workload object which should not be possible given my DB2 edition:
DB: WLMTEST => create workload freeride applname('xploit')
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL8029N A valid license key was not found for the requested functionality.
Reference numbers: "".
Ok, this look right. I don't have a license to use DB2 WLM (Workload Manager). My next query was intended to check what service classes are present in my system.
DB: WLMTEST => select varchar(serviceclassname,30), varchar(parentserviceclassname,30), enabled from syscat.serviceclasses
1 2 ENABLED
------------------------------ ------------------------------ -------
SYSDEFAULTSUBCLASS SYSDEFAULTSYSTEMCLASS Y
SYSDEFAULTSUBCLASS SYSDEFAULTMAINTENANCECLASS Y
SYSDEFAULTSUBCLASS SYSDEFAULTUSERCLASS Y
SYSDEFAULTMANAGEDSUBCLASS SYSDEFAULTUSERCLASS Y
SYSDEFAULTSYSTEMCLASS - Y
SYSDEFAULTMAINTENANCECLASS - Y
SYSDEFAULTUSERCLASS - Y
7 record(s) selected.
The DB2 Knowledge Center has an overview of related default WLM objects and which parts can be modified with DBADM or WLMADM authority. Having the names of the system objects I tried my luck altering a work class set to reduce the cost barrier for the managed heavy queries (SYSMANAGEDQUERIES):
DB: WLMTEST => alter work class set sysdefaultuserwcs alter work class SYSMANAGEDQUERIES for timeroncost from 1000
DB20000I The SQL command completed successfully.
The threshold SYSDEFAULTCONCURRENT defines how many of those queries can run concurrently in the system. Why not change that threshold definition?
DB: WLMTEST => alter threshold SYSDEFAULTCONCURRENT when sqlrowsreturned > 20 stop execution
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL4721N The threshold "SYSDEFAULTCONCURRENT" cannot be created or altered
(reason code = "7"). SQLSTATE=5U037
Well, it seems that you cannot modify the entire threshold to your liking. However, following the documentation on what can be done, I successfully reduced the number of parallel activities.
DB: WLMTEST => alter threshold SYSDEFAULTCONCURRENT when CONCURRENTDBCOORDACTIVITIES > 3 stop execution
DB20000I The SQL command completed successfully.
To test the impact of my changes, I opened 4 different shells, connected to DB2 in each window, and more or less simultaneously executed the following query:
select * from syscat.tables,syscat.columns
I have to restate that I tried to execute it in all four windows. It only ran in three of them. Why? Because the threshold kicked in for this heavy query and stopped the execution for the 4th session ("concurrentdbcoordactivities> 3 stop execution"). So some basic workload management seems to work even without a license.
Can I change the threshold to force the application off, i.e., to not allow running the query?
DB: WLMTEST => alter threshold SYSDEFAULTCONCURRENT when CONCURRENTDBCOORDACTIVITIES > 2 force application
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL4721N The threshold "SYSDEFAULTCONCURRENT" cannot be created or altered
(reason code = "13"). SQLSTATE=5U037
No, changing the entire definition of the threshold is not possible, but at least parts of it can be modified. You can then use the adapted default WLM objects to better understand what work is running on your system, e.g., testing what would fall into the category of "heavy queries". As a last step, I used a monitoring function to return the CPU time spent by service subclass. Most was in the managed subclass into which my queries from above were mapped:
DB: WLMTEST => SELECT varchar(service_superclass_name,30) as service_superclass, varchar(service_subclass_name,30) as service_subclass, sum(total_cpu_time) as total_cpu, sum(app_rqsts_completed_total) as total_rqsts FROM TABLE(MON_GET_SERVICE_SUBCLASS('','',-2)) AS t GROUP BY service_superclass_name, service_subclass_name ORDER BY total_cpu desc
SERVICE_SUPERCLASS SERVICE_SUBCLASS TOTAL_CPU TOTAL_RQSTS
------------------------------ ------------------------------ -------------------- --------------------
SYSDEFAULTUSERCLASS SYSDEFAULTSUBCLASS 1207794 552
SYSDEFAULTUSERCLASS SYSDEFAULTMANAGEDSUBCLASS 852547 0
SYSDEFAULTMAINTENANCECLASS SYSDEFAULTSUBCLASS 466436 1374
SYSDEFAULTSYSTEMCLASS SYSDEFAULTSUBCLASS 0 0
4 record(s) selected.
With that I leave more testing to you. Happy monitoring!
BTW: The same tests can also be done on the SQL DB service on IBM Bluemix. That service is a DB2 Enterprise Server Edition.
Labels:
administration,
blu,
bluemix,
DB2,
ibmcloud,
IT,
monitoring,
performance,
version 10.5,
workload
Monday, July 28, 2014
What's Going On? - DB2 Workload Management: Identification of Activities
In an earlier blog post I had written about why Workload Management is needed. It's not just something for the database system or on the operating system level, it is really useful and done in "real life". But what is managed in the system, how are you able to identify activies in a DB2 database system? I am going to explain that today.
Before I dig deeper into the identification, first we need to clarify what is meant with "activity". It could be almost anything going on in the database system that is related to a single database and could be both user- and system-related tasks. The important distinction is that is on the database level, not for a DB2 instance. Identification of activities deals with three questions: WHO is doing WHAT on my database and WHERE is that data located?
The WHO can be answered by looking at the connection properties such as:
By specifying a "data tag" for a work class, it can be related to storage groups or tablespaces and their priority. This is how activity can be identified by from WHERE the data is processed.
Because multiple work classes in a work class set could identify the same activity, the individual work classes can be ordered/positioned within the set. That way a work class with several properties could pick a very specific activity whereas other activities would be mapped to more general work classes.
Using the concepts of WORKLOAD and WORK CLASS SET it is possible to identify an activity. They help to understand what is going on in the DB2 database system. It is the prerequisite for actively controlling and managing the activities in the system by assigning resources.
Before I dig deeper into the identification, first we need to clarify what is meant with "activity". It could be almost anything going on in the database system that is related to a single database and could be both user- and system-related tasks. The important distinction is that is on the database level, not for a DB2 instance. Identification of activities deals with three questions: WHO is doing WHAT on my database and WHERE is that data located?
The WHO can be answered by looking at the connection properties such as:
- Who is the user and which group does the user belong to?
- Is the user operating in a special role?
- From where is the user connecting, does the machine have a name, is it from a specific application?
By specifying a "data tag" for a work class, it can be related to storage groups or tablespaces and their priority. This is how activity can be identified by from WHERE the data is processed.
Because multiple work classes in a work class set could identify the same activity, the individual work classes can be ordered/positioned within the set. That way a work class with several properties could pick a very specific activity whereas other activities would be mapped to more general work classes.
Using the concepts of WORKLOAD and WORK CLASS SET it is possible to identify an activity. They help to understand what is going on in the DB2 database system. It is the prerequisite for actively controlling and managing the activities in the system by assigning resources.
Labels:
administration,
best practices,
data in action,
DB2,
IT,
performance,
version 10,
version 10.5,
workload
Tuesday, June 24, 2014
Why we need and have workload management
Wikipedia |
In a database system like DB2 there is also a built-in Workload Management. If you are using BLU Acceleration, it is activated by default and some rules have been defined, else it is switched off. Why turn it on and use it? Same reasons as in real life:
- A "fair" allocation of time and resources between different work items/applications is needed ("work / life balancing"?).
- Response time for critical tasks or some type of work is important and needs to be protected against less important tasks ("your mother-in-law visits, take care of her").
- Implementation of rules to control and regulate the system behavior ("kids in bed by 8pm means time for you to watch soccer").
- Deal with rogue queries that threaten regular operations ("kids bring over the entire neighborhood").
- The system (sometimes) is overloaded and you have to set priorities ("no party this weekend").
Does Workload Management help? Yes, it does. However, similar to family life it is possible that because of resource shortage not all planned tasks can be performed. Maybe time for an upgrade ("hire some help, do not get more kids... :)").
I plan to discuss DB2 WLM details in future articles, workload permitting...
Labels:
administration,
best practices,
data in action,
DB2,
IT,
knowledge center,
Life,
version 10.5,
workload
Subscribe to:
Posts (Atom)