Showing posts with label data in action.

Thursday, February 6, 2025

Db2: Loading and unloading of data on remote clients with external tables

Unload Db2 data to remote client
Today, I am going to take a look at another variation of external tables. They can be used to load data from a remote Db2 client or unload data to that client - directly from or into a file. I tested it with the Db2 CLP and with a Python script and it was quite easy.
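To give a flavor, here is a minimal sketch using the ibm_db Python driver; the connection string, table names, and file path are placeholders, and the key option is REMOTESOURCE YES, which makes the file reside on the client:

```python
import ibm_db

# Placeholder connection string for a remote Db2 server
dsn = ("DATABASE=testdb;HOSTNAME=db2.example.com;PORT=50000;"
       "PROTOCOL=TCPIP;UID=db2user;PWD=secret;")
conn = ibm_db.connect(dsn, "", "")

# Unload: write the query result to a file on the client machine
ibm_db.exec_immediate(conn, """
    CREATE EXTERNAL TABLE '/tmp/sales.csv'
    USING (DELIMITER ',' REMOTESOURCE YES)
    AS SELECT * FROM sales""")

# Load: read that client-side file back into a compatible table
ibm_db.exec_immediate(conn, """
    INSERT INTO sales_copy
    SELECT * FROM EXTERNAL '/tmp/sales.csv'
    LIKE sales USING (DELIMITER ',' REMOTESOURCE YES)""")

ibm_db.close(conn)
```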

Friday, January 17, 2025

Once again: Db2 External Tables

Db2 log files and data on IBM COS
Today, I once again share some notes about Db2 External Tables. Recently, I set up Db2 12.1 Community Edition and had some time to "play" with external tables, multi-tenancy, and catalog tables. As this blog serves as my external memory :), here is what I need to remember...

Wednesday, August 28, 2024

A look at local external tables in Db2

Surprising Db2 results or not?
My recent post about Db2 backup to S3 or Object Storage triggered a question. The answer to it: I first wrote about external tables in 2017, about seven years ago. Since then, some features have been added or changed for external tables and remote storage support, most recently in Db2 11.5.7 (AFAIK). With my new Db2 playground in place, I thought it was a good time to test external tables again.

Tuesday, July 16, 2024

About BIRD, SQL, IBM granite models, and your business reporting

BIRD-SQL benchmark
Some years ago, when composing SQL queries, I was hoping that those queries would just "fly", performing flawlessly and quickly. Now, I stumbled upon something SQL-related that seems to fly: BIRD-bench. It measures the capability of Large Language Models (LLMs) to generate SQL queries from text input. That is at the core of SQL: you describe the result set you need.

Tuesday, February 20, 2024

Spreadsheets: How to excel with Db2 data

Generated chart in Excel file
Recently, I had to produce a spreadsheet from database data. One naive way is to export the data to a CSV file, then import the data from that file into the spreadsheet. Another option is to quickly script a small Python program that fetches the data and directly generates a Microsoft Excel file. As a bonus, I even added a line chart (as shown).
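For illustration, a minimal sketch with the openpyxl package, using made-up sample rows where the real script fetches from the database:

```python
from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference

# Made-up sample data; the real script fetches these rows from Db2
rows = [("day", "visits"), ("Mon", 10), ("Tue", 14), ("Wed", 9)]

wb = Workbook()
ws = wb.active
for row in rows:
    ws.append(row)          # write header and data rows

# Build a line chart over the "visits" column
chart = LineChart()
chart.title = "Visits per day"
data = Reference(ws, min_col=2, min_row=1, max_row=len(rows))
cats = Reference(ws, min_col=1, min_row=2, max_row=len(rows))
chart.add_data(data, titles_from_data=True)
chart.set_categories(cats)
ws.add_chart(chart, "D2")   # place the chart next to the data

wb.save("report.xlsx")
```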

Wednesday, November 22, 2023

Unicode string length, code points, and Db2

Byte length of (Unicode) strings
After my recent blog post "🎃 Unicode characters and Db2 🕸️ 🏚️", I had some follow-up discussions. One was about how to determine the UTF-8 byte length of strings in a non-Unicode Db2 database. Some proposed solutions required exporting the data to analyze it externally or implementing functions or procedures. I insisted that there is an SQL-based solution. Here is my proposal.
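The actual solution in the post is SQL-based; purely to illustrate what is being computed, here is the client-side equivalent in Python:

```python
# Character count vs. UTF-8 byte length for some sample strings
for s in ["hello", "Grüße", "🎃"]:
    print(s, len(s), len(s.encode("utf-8")))
# hello: 5 chars / 5 bytes, Grüße: 5 chars / 7 bytes, 🎃: 1 char / 4 bytes
```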

Friday, November 25, 2022

Finally together: Db2 and Zeppelin

United: Db2 and Zeppelin
If you have been following my blog, you may have noticed that I wrote about Db2 and about Zeppelins in the past - but not together. Today, I am going to discuss how I configured a JDBC interpreter in an Apache Zeppelin notebook to connect to a Db2 on Cloud database. So, finally, within a single blog post, I can talk about both of them. Let's get started.
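For reference, a hedged sketch of the interpreter settings involved - the host, port, and credentials are placeholders, and the Db2 JDBC driver is added as a dependency (artifact):

```
# JDBC interpreter properties (placeholder values)
default.driver    com.ibm.db2.jcc.DB2Driver
default.url       jdbc:db2://dbhost.example.cloud:31198/bludb:sslConnection=true;
default.user      myuser
default.password  mypassword

# Dependency (artifact) so Zeppelin can load the Db2 JDBC driver
com.ibm.db2:jcc:11.5.8.0
```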

Tuesday, September 6, 2022

New IBM Cloud security features you should know

Custom role for operating Code Engine
Cybersecurity is always in a state of change. There are new security features and new attack methods - or was it the other way around? Over the past months, several new security features were added to IBM Cloud. In this blog post, I want to quickly describe them and point you to the IBM Cloud blog posts where I discuss them in greater detail.

Wednesday, November 24, 2021

Rate-limit Kafka event generation with kcat and bash

Traffic for event streams
Recently, I worked with IBM Cloud Event Streams, a message bus built with Apache Kafka. I was looking for a simple command-line tool to test my Event Streams instance and to stream access logs into it. That's when I ran into kcat (formerly known as kafkacat). It is a generic command-line Kafka producer and consumer and easy to install - just use a Docker image. Everything worked well: I could even read a file of historic Apache access logs and send them over, line by line. But I still faced the issue of controlling how much to send - how to throttle it. I solved it using a bash script.
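To sketch the idea - here in Python rather than the post's bash script, with placeholder broker, topic, and rate - pipe the log into kcat's producer mode and sleep between lines:

```python
import subprocess
import time

RATE_PER_SECOND = 5  # assumed target rate, tune as needed

# Start kcat in producer mode (-P); broker and topic are placeholders
kcat = subprocess.Popen(
    ["kcat", "-P", "-b", "broker.example.com:9093", "-t", "weblogs"],
    stdin=subprocess.PIPE, text=True)

# Feed the historic access log one line (= one message) at a time
with open("access.log") as logfile:
    for line in logfile:
        kcat.stdin.write(line)
        kcat.stdin.flush()
        time.sleep(1.0 / RATE_PER_SECOND)  # throttle event generation

kcat.stdin.close()
kcat.wait()
```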

Thursday, November 18, 2021

On serverless data scraping, cloud object storage, MinIO and rclone

Building a data lake the serverless way
This fall, I started another side project. It involves mobility data and its analysis and visualization. As a first step, and still ongoing, I am building up a data lake (and maybe integrating it into a data fabric). For that, I regularly download data from various sources, then upload it to a Cloud Object Storage/S3 bucket. Once in cloud storage, I process it further, often with tools like the MinIO client and rclone. Let me give you a quick overview and links to some more background reading...
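As an example of the upload step, a minimal sketch with boto3 against an S3-compatible endpoint; the endpoint URL, credentials, bucket, and object names are placeholders:

```python
import boto3

# S3 client pointed at an IBM COS endpoint (all values are placeholders)
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.eu-de.cloud-object-storage.appdomain.cloud",
    aws_access_key_id="MY_ACCESS_KEY",
    aws_secret_access_key="MY_SECRET_KEY",
)

# Upload a freshly downloaded data file into the data lake bucket
s3.upload_file("mobility-2021-11.csv", "my-data-lake",
               "raw/mobility-2021-11.csv")
```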

Monday, May 18, 2020

Some advanced SQL to analyze COVID-19 data

Learn to write SQL
All of us are impacted by COVID-19. Looking at daily case numbers, the basic reproduction number, or mortality rates has become routine. But what is behind those numbers? How are they computed, and how does SQL help? In this post, I want to discuss how some SQL analytics clauses help to dig through the data. When I teach database systems, I always try to show how the material applies to life. Here, SQL may not directly help to cure the disease, but it is essential to compute and understand the statistics.
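As a taste of such clauses, here is a hedged sketch - hypothetical table and column names - that computes a 7-day moving average with a window function:

```python
import ibm_db

# Placeholder connection; table newcases(country, report_date, cases)
conn = ibm_db.connect("DATABASE=covid;HOSTNAME=localhost;PORT=50000;"
                      "PROTOCOL=TCPIP;UID=db2user;PWD=secret;", "", "")

# 7-day moving average of new cases per country via an OLAP window
stmt = ibm_db.exec_immediate(conn, """
    SELECT country, report_date,
           AVG(cases) OVER (PARTITION BY country ORDER BY report_date
                            ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS avg7
    FROM newcases
    ORDER BY country, report_date""")

row = ibm_db.fetch_assoc(stmt)
while row:
    print(row["COUNTRY"], row["REPORT_DATE"], row["AVG7"])
    row = ibm_db.fetch_assoc(stmt)
```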

Monday, January 27, 2020

25th meeting of German Db2 User Group

Celebrating the 25th DeDUG meeting
Last Friday, the 25th meeting of the German Db2 User Group was held, this time hosted by Fiducia & GAD IT AG in Karlsruhe. As usual, it was both a great networking and informative event. I had the opportunity to catch up with many Db2 users - customers, partners, and IBMers. Moreover, similar to a couple of previous meetings, I gave the after-lunch talk, this time featuring Db2 on Cloud security topics.

Tuesday, May 7, 2019

Cloud-based FIPS 140-2 Level 4 crypto service

Locks, keys, and data security
Yesterday, I got my hands on a FIPS 140-2 Level 4 cloud-based crypto service. If you are asking "what's that and what can I do with it?", you should continue reading. It is a dedicated HSM (Hardware Security Module) to manage encryption keys, offered as IBM Hyper Protect Crypto Services (HPCS) on IBM Cloud. Here is what I learned while using the service.

Wednesday, April 24, 2019

Updated tutorial: Database-driven chatbot

If you want to build a chatbot that gets its content from a database, there is good news. The existing tutorial "Build a database-driven Slackbot" was just updated to adapt to the latest features of IBM Watson Assistant. First, define a skill that reaches out to a database service like Db2. Thereafter, use the built-in integrations to easily tie the assistant in with Slack or Facebook Messenger, embed the chatbot into your own application, or use the WordPress plugin.

Architecture of database-driven chatbot

Monday, October 22, 2018

Automated reports with IBM Cloud Functions, Db2 and Slack

GitHub Traffic Analytics
One of my (many) favorite IBM Cloud solution tutorials is about combining serverless and Cloud Foundry for data retrieval and analytics. I blogged about it and described how an automated IBM Cloud Functions action retrieves GitHub statistics and stores them in Db2. Using an embedded Cognos dashboard and regular JavaScript / HTML tables, the solution offers GitHub Traffic Analytics. I extended that solution with automatic weekly reports that are posted to Slack.
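To sketch the Slack part: an IBM Cloud Functions action in Python can post the report to a Slack incoming webhook (the webhook URL below is a placeholder; the real action builds the report from Db2 data first):

```python
import requests

def main(params):
    # The report text would be assembled from Db2 query results
    report = params.get("report", "Weekly GitHub traffic report ...")
    # Placeholder Slack incoming webhook URL
    resp = requests.post(
        "https://hooks.slack.com/services/T000/B000/XXXX",
        json={"text": report},
    )
    return {"status": resp.status_code}
```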

Tuesday, June 19, 2018

DeDUG-Treffen in Ehningen / Db2 User Group meeting near Stuttgart

Db2 User Group Meeting
Remember the last German Db2 User Group (DeDUG) meetings at IBM in Munich (April) and at Fiducia GAD in Karlsruhe (January)? Or not? Anyway, get ready for an educative meeting with great networking. The next Db2 meeting is coming up this June 29th. IBM is going to host the user group at its German headquarters in Ehningen.

This time, you have the opportunity to learn more about the SQL explain facility, query optimization, SQL recursion, and several other topics. As at the previous event, we are going to have lightning talks again. They are short, 3-8 minute talks in which YOU can present a solution, introduce a technical problem for which you are looking for a solution, or tell us your best database-related joke or anecdote.

You can register through several channels.

See you at the DeDUG meeting at IBM Germany in Ehningen. If you have feedback, suggestions, or questions about this post, please reach out to me on Twitter (@data_henrik) or LinkedIn.

Monday, April 23, 2018

Use Db2 and IBM Cloud to analyze GitHub traffic data (tutorial)

Architecture: GitHub Traffic Analytics
In a new solution tutorial, I show you how to automatically retrieve and store GitHub traffic data the serverless way with IBM Cloud Functions and Db2. The data can then be analyzed via a Web app deployed to Cloud Foundry on IBM Cloud. The app is secured with App ID using OpenID Connect. The new service Dynamic Dashboard Embedded provides visualization of the views and clones of GitHub repositories.
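For a taste of the retrieval step, a hedged sketch against the GitHub traffic API; owner, repo, and token are placeholders, and the endpoint requires push access to the repository:

```python
import requests

OWNER, REPO = "myorg", "myrepo"       # placeholder repository
TOKEN = "my-github-token"             # placeholder token with push access

resp = requests.get(
    f"https://api.github.com/repos/{OWNER}/{REPO}/traffic/views",
    headers={"Authorization": f"token {TOKEN}"},
)
# Daily view counts for the last 14 days
for day in resp.json().get("views", []):
    print(day["timestamp"], day["count"], day["uniques"])
```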

Monday, February 19, 2018

New tutorial: Db2-driven Slackbot

Slackbot Architecture
Ever wanted to build a Slackbot, a chatbot integrated into Slack, on your own? I am going to show you how easy it is to integrate Slack or Facebook Messenger with the IBM Watson Conversation service. As a bonus, the bot is going to access a Db2 database to store and retrieve data. The solution is based on IBM Cloud Functions and entirely serverless.

Tuesday, February 6, 2018

Chatbots: Some tricks with slots in IBM Watson Conversation

As you might remember, I have been using the IBM Watson Conversation service and Db2. My goal was to write a database-driven Slackbot, a Slack app that serves as a chat interface to data stored in Db2. I will write more about the entire Slackbot soon, but today I wanted to share some chatbot tricks I learned: how to gather input data, perform checks, and clean up the processing environment.

Slots

With my chatbot interface to Db2, I want to both query the database and insert new records. Thus, I need to collect input data of various kinds. The Conversation service has a neat feature named input slots that simplifies that process. Within a dialog node (a logical step within the chat flow), I can specify a list of items the Conversation service should check for. I can tell it in which variable to save that input and what question to ask if the data has not been provided yet. Optional slots, i.e., optional data, can also be enabled.