Connection issues encountered with CTCE devices #369

Closed
jeff-snyder opened this issue Feb 17, 2021 · 28 comments
Labels
BUG The issue describes likely incorrect product functionality that likely needs corrected.

Comments

@jeff-snyder

jeff-snyder commented Feb 17, 2021

Hi,

I'd like to report the following issues when using the CTCE driver to emulate CTCs. Please note:

  • PC operating system - Windows 10 Professional
  • Hercules version 4.3.0.10296-SDL-ga4db8213 (4.3.0.10296)

In the following, the Hercules instance referred to as JS05 is running OS/390 2.10, the two "peer" instances (JWS1 and JS02) are running VM/SP5.

  1. Port numbers 1000 and 1001 were rejected as invalid ports in the configuration.

The attached "ports" configuration and log files show the following.

With this configuration (Note: this is the complete configuration for this issue):

# Configuration file to show port issue
DEFSYM	IMAGE	05
DEFSYM	TITLE	JS05
# CTC links
530 CTCE $(IMAGE)000 600=192.168.1.32 01000   # link to JWS1
531 CTCE $(IMAGE)001 601=192.168.1.32 01001   # link to JWS1

The following errors are generated:

HHC05059E 0:0530 CTCE: Invalid port number: 01000
HHC00007I Previous message from function 'CTCE_Init' at ctcadpt.c(2094)
HHC01463E 0:0530 device initialization failed
HHC00007I Previous message from function 'attach_device' at config.c(1330)
HHC05059E 0:0531 CTCE: Invalid port number: 01001
HHC00007I Previous message from function 'CTCE_Init' at ctcadpt.c(2094)
HHC01463E 0:0531 device initialization failed
HHC00007I Previous message from function 'attach_device' at config.c(1330)

Please don't get sidetracked by the leading zero in the port number, this does not cause any impact.

While I know ports below 1024 require the user to be authorized, they are still valid port numbers and should be allowed. If the user tries to use a reserved port while unauthorized, an appropriate error could be generated at that time. Since I have to run authorized to support my networking requirements, those ports should be available to me.

  2. Connections got mixed up when using the same device address on multiple instances of Hercules running on the same PC.

The complete configurations are in the attached zip file, but the significant portions are:

JS05 (the hub):

# CTC links
DEFSYM	IMAGE	05
530 CTCE $(IMAGE)000 600=192.168.1.32 01600
531 CTCE $(IMAGE)001 601=192.168.1.32 01601
532 CTCE $(IMAGE)002 600=192.168.1.32 02000
533 CTCE $(IMAGE)003 601=192.168.1.32 02001

JWS1 (peer 1):

DEFSYM	IMAGE	01
# CTC links
600 CTCE $(IMAGE)600 530=192.168.1.32 05001 ATTNDELAY 200   # VTAM link to JS05 (SNA to OS/390 2.10)
601 CTCE $(IMAGE)601 531=192.168.1.32 05002 ATTNDELAY 200   # VTAM link to JS05

JS02 (peer 2):

DEFSYM	IMAGE	02
# SNA CTC links
600 CTCE $(IMAGE)000 532=192.168.1.32 05002 ATTNDELAY 200   # VTAM link to JS05
601 CTCE $(IMAGE)001 533=192.168.1.32 05003 ATTNDELAY 200   # VTAM link to JS05

As you can see, both JWS1 and JS02 have their CTCs connected to addresses 600-601. When bringing up the JS05 and JWS1 instances of Hercules by themselves, no problems were encountered. The OSes were IPLed and communication established over the link. A devlist of the affected devices shows good connections:

21:35:11 HHC01603I devlist 530-53f
21:35:11 HHC02279I 0:0530 3088 CTCE 05000/62728 <=> 0:0600=192.168.1.32:1600/62747 IO[29] open 
21:35:11 HHC02279I 0:0531 3088 CTCE 05001/62733 <=> 0:0601=192.168.1.32:1601/62748 IO[3] open 
21:35:11 HHC02279I 0:0532 3088 CTCE 05002/63601 !=! 0?0600=192.168.1.32:2000/* IO[7] 
21:35:11 HHC02279I 0:0533 3088 CTCE 05003/63599 !=! 0?0601=192.168.1.32:2001/* IO[3] 

As soon as the Hercules instance for JS02 was brought up, the CTC connections between JS05 and JWS1 were "renewed", causing them to fail:

21:37:10 HHC05070I 0:0530 CTCE: Renewing inbound connection : 5000 <- 0:0600=192.168.1.32:64003 (bufsize=62552,16)
21:37:10 HHC05054I 0:0530 CTCE: Renewed outbound connection :64004 -> 0:0600=192.168.1.32:1600
21:37:10 HHC05070I 0:0531 CTCE: Renewing inbound connection : 5001 <- 0:0601=192.168.1.32:64005 (bufsize=62552,16)
21:37:10 HHC05054I 0:0531 CTCE: Renewed outbound connection :64006 -> 0:0601=192.168.1.32:1601
21:37:10 HHC05054I 0:0533 CTCE: Started outbound connection :63985 -> 0:0601=192.168.1.32:2001
21:37:10 HHC05070I 0:0531 CTCE: Renewing inbound connection : 5001 <- 0:0601=192.168.1.32:64010 (bufsize=62552,16)
21:37:10 HHC05054I 0:0532 CTCE: Started outbound connection :63986 -> 0:0600=192.168.1.32:2000
21:37:10 HHC05070I 0:0530 CTCE: Renewing inbound connection : 5000 <- 0:0600=192.168.1.32:64011 (bufsize=62552,16)

The resulting devlist shows the device entries for JS05 devices 530 and 531 have been overwritten with new remote port numbers:

21:37:31 HHC01603I devlist 530-53f
21:37:31 HHC02279I 0:0530 3088 CTCE 05000/64004 <=> 0:0600=192.168.1.32:2000/64011 IO[32] open busy 
21:37:31 HHC02279I 0:0531 3088 CTCE 05001/64006 <=> 0:0601=192.168.1.32:2001/64010 IO[3] open 
21:37:31 HHC02279I 0:0532 3088 CTCE 05002/63986 !=> 0?0600=192.168.1.32:2000/* IO[7] open 
21:37:31 HHC02279I 0:0533 3088 CTCE 05003/63985 !=> 0?0601=192.168.1.32:2001/* IO[3] open 

JS02's OS was never IPLed.
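
The crossover can also be spotted mechanically by comparing the remote port shown in each `devlist` line with the one coded in the configuration. The following is only a rough sketch: the regular expression is inferred from the `HHC02279I` lines quoted above, not from any documented format, and `find_crossovers` and `configured_rports` are hypothetical names.

```python
import re

# Pattern inferred from the HHC02279I devlist lines in this report (an assumption).
DEVLIST = re.compile(
    r"HHC02279I \d:(?P<dev>[0-9A-Fa-f]{4}) 3088 CTCE "
    r"(?P<lport>\d+)/\d+ \S+ "
    r"\d[:?](?P<rdev>[0-9A-Fa-f]{4})=(?P<raddr>[^:]+):(?P<rport>\d+)/"
)

def find_crossovers(devlist_lines, configured_rports):
    """Return (devnum, configured rport, displayed rport) for every mismatch."""
    crossed = []
    for line in devlist_lines:
        m = DEVLIST.search(line)
        if m is None:
            continue  # not a CTCE devlist line
        dev = m.group("dev").upper()
        shown = int(m.group("rport"))
        expected = configured_rports.get(dev)
        if expected is not None and shown != expected:
            crossed.append((dev, expected, shown))
    return crossed

sample = [
    "21:37:31 HHC02279I 0:0530 3088 CTCE 05000/64004 <=> 0:0600=192.168.1.32:2000/64011 IO[32] open busy ",
    "21:37:31 HHC02279I 0:0532 3088 CTCE 05002/63986 !=> 0?0600=192.168.1.32:2000/* IO[7] open ",
]
print(find_crossovers(sample, {"0530": 1600, "0532": 2000}))
# prints [('0530', 1600, 2000)]
```

Device 0530 is reported because it was configured for remote port 1600 but now displays 2000, which is the port configured for device 0532.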

  3. Some kind of hang was encountered while trying to shut down after (2), leading to hung CPs and a dump.

Soon after gathering the evidence for item (2) above, I attempted to "quit" Hercules instance JS02.

JS02:

21:40:28 HHC01603I quit

This appeared to shut down properly, but had adverse effects on JS05 and JWS1:

JS05:

21:40:29 HHC05076I 0:0530 CTCE: Connection closed; 0 MB received in 0 packets from 0:0600=192.168.1.32:2000/64011; shutdown=0
21:40:29 HHC05086I 0:0530 CTCE: Recovery is about to issue Hercules command: DEVINIT 0:0530
21:40:29 HHC05076I 0:0530 CTCE: Connection closed; 0 MB received in 74 packets from 0:0600=192.168.1.32:2000/64011; shutdown=0
21:40:29 HHC05086I 0:0530 CTCE: Recovery is about to issue Hercules command: DEVINIT 0:0530
21:40:29 HHC05081I 0:0530 CTCE: Already awaiting connection : 5000 <- 0:0600=192.168.1.32:1600/*
21:40:29 HHC02245I 0:0530 device initialized
21:40:29 HHC05081I 0:0530 CTCE: Already awaiting connection : 5000 <- 0:0600=192.168.1.32:1600/*
21:40:29 HHC90013E 'initialize lock(&dev->ctceEventLock)' failed: rc=17: already init'ed; tid=00008808, loc=ctcadpt.c:3530
21:40:29 HHC00007I Previous message from function 'loglock' at hthreads.c(104)
21:40:29 HHC90028I lock &dev->ctceEventLock was already initialized at ctcadpt.c:3530
21:40:29 HHC02245I 0:0530 device initialized
21:40:29 HHC05054I 0:0530 CTCE: Started outbound connection :64488 -> 0:0600=192.168.1.32:1600
21:40:29 HHC05054I 0:0530 CTCE: Renewed outbound connection :64488 -> 0:0600=192.168.1.32:1600
21:41:38 HHC05076I 0:0531 CTCE: Connection closed; 0 MB received in 4 packets from 0:0601=192.168.1.32:2001/64010; shutdown=0
21:42:12 HHC00822S PROCESSOR CP00 APPEARS TO BE HUNG!
21:42:12 HHC00007I Previous message from function 'watchdog_thread' at impl.c(527)
21:42:12 HHC00822S PROCESSOR CP01 APPEARS TO BE HUNG!
21:42:12 HHC00007I Previous message from function 'watchdog_thread' at impl.c(527)

At this point, JS05 recorded a crash dump.

JWS1:

21:40:29 HHC05076I 0:0600 CTCE: Connection closed; 0 MB received in 6 packets from 0:0530=192.168.1.32:5000/64004; shutdown=0
21:40:29 HHC05086I 0:0600 CTCE: Recovery is about to issue Hercules command: DEVINIT 0:0600
21:40:29 HHC02231E 0:0600 busy or interrupt pending
21:40:29 HHC00007I Previous message from function 'devinit_cmd' at hsccmd.c(6830)
21:40:29 HHC05070I 0:0600 CTCE: Accepted inbound connection : 1600 <- 0:0530=192.168.1.32:64487 (bufsize=62552,16)
21:40:29 HHC05070I 0:0600 CTCE: Renewing inbound connection : 1600 <- 0:0530=192.168.1.32:64488 (bufsize=62552,16)

I was then able to shut down JWS1 normally:

21:41:27 HHC00809I Processor CP00: disabled wait state 000A0000 00000008
21:41:31 HHC01022I 0:0010 COMM: client 192.168.1.32 devtype 3270: connection closed by client
21:41:38 HHC01603I quit

My estimation of what might be going on:

By changing the OS configurations to use unique device addresses, I have been able to run for almost 24 hours with these 3 and several more images all interconnected. Instances have been brought up and down without affecting other instances (except their connection peers).

It looks like CTCE is using the remote device number, possibly in addition to the remote IP address as some kind of lookup value to keep track of multiple TCP sessions. In my case, since all the Hercules instances run on the same computer, the IP address does not vary, so multiple instances with the same device number result in the same lookup value.

If you have any questions or I can provide any additional information, please contact me at jsnyder1369 at google dot com.

Please note, the crash dump taken by JS05 is 611 MB and compresses down to 49 MB. GitHub will only allow attachments of 10 MB or less, so if you need access to the dump, please let me know and we can work out other means to get it to you.

Thanks for your help!

@jeff-snyder
Author

jeff-snyder commented Feb 17, 2021

Sorry, my file names in the zip did not match my commentary. I have corrected that in this file.

For reference: JS05 = hub, JWS1 = peer1 and JS02 = peer2.

@Fish-Git
Member

Fish-Git commented Feb 17, 2021

I cannot explain what is going on or why. Peter Jansen (@Peter-J-Jansen) is the person to help you. He wrote our CTCE support and knows it inside and out.

What I will say is this:

While I know ports below 1024 require the user to be authorized, they are still valid port numbers and should be allowed. If the user tries to use a reserved port while unauthorized, an appropriate error could be generated at that time. Since I have to run authorized to support my networking requirements, those ports should be available to me.

Hercules CTCE documentation clearly states:

"Any port number > 1024 and < 65534 is allowed."

Complaining that "ports below 1024 should be allowed!" does not somehow magically change Hercules's existing already written code to somehow magically allow it. Whether you like it or not (whether you agree with it or not), you must nonetheless abide by the rules (constraints) that CTCE imposes upon you.

I would personally try fixing your port number problem first before reporting any type of Hercules problem. Valid or not, Hercules does currently require CTCE port numbers to be >= 1024, so regardless of whether you like that or not (regardless of whether you agree with that or not), you must nevertheless abide by it, or things are obviously not going to work correctly.
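
For anyone checking a configuration before starting Hercules, the rule can be sketched roughly as follows. This is an illustration only: the exclusive bounds are taken from the documentation sentence quoted above, the function name is hypothetical, and the actual check in `ctcadpt.c` may differ (for example on whether 1024 itself is accepted).

```python
def parse_ctce_port(text, low=1024, high=65534):
    """Return the port as an int, or raise ValueError if it is out of range.

    The exclusive bounds are an assumption based on the quoted documentation
    ("> 1024 and < 65534"), not on the Hercules source code.
    """
    port = int(text, 10)  # leading zeros such as "01600" are accepted
    if not (low < port < high):
        raise ValueError(f"Invalid port number: {text}")
    return port

print(parse_ctce_port("01600"))
# prints 1600
```

Calling `parse_ctce_port("01000")` raises `ValueError`, which mirrors the `HHC05059E` message in the report.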

Fix your port problem and try again. THEN if things still do not work correctly, then you have a valid problem to report. Otherwise your "problem" report IMHO is invalid.

But I will let Peter decide whether to accept your problem report (not recommended), or whether to reject and close it as "User Error" (recommended).

Peter?

@jeff-snyder
Author

Fish,

If you'll read the whole report, you'll see that the port number issue was only one of 3. In the other two issues, I used a port number above 1024 in the configuration and the port number played no part in the major problem, i.e. issue 2 - port numbers moving from one device to another.

Peter,

I had not noticed the port limitation in the documentation, and while I may not agree with it, I will obviously abide by it. Please disregard issue 1 in my report and look at issues 2 and 3. Thanks!

@Peter-J-Jansen
Collaborator

Peter-J-Jansen commented Feb 17, 2021

Hi Jeff,

Glad to see the Hercules CTCE's are put to good use!

Concerning the issues you encountered:

  1. Port numbers < 1024 are indeed currently not supported. This was a decision I took in order to not complicate matters with requiring root or administrator privileges, as Linux Hercules users frequently try to avoid that, and there is no real need to use the low ports. I would rather therefore leave it as it is, as suggested also by Fish. Hopefully you can live with that.

  2. So you have three Hercules instances on a single Windows 10 host, which is supported, as you have experienced yourself as well. I think however I see a configuration error for the JWS1 host:

600 CTCE $(IMAGE)600 530=192.168.1.32 05001 ATTNDELAY 200 # VTAM link to JS05 (SNA to OS/390 2.10)
601 CTCE $(IMAGE)601 531=192.168.1.32 05002 ATTNDELAY 200 # VTAM link to JS05

Should that not be, noting 05000 and 05001:

600 CTCE $(IMAGE)600 530=192.168.1.32 05000 ATTNDELAY 200 # VTAM link to JS05 (SNA to OS/390 2.10)
601 CTCE $(IMAGE)601 531=192.168.1.32 05001 ATTNDELAY 200 # VTAM link to JS05

This could explain the errors you described under item 2. As you explained that changing to unique device address (devnum's) fixed your problem, did you perhaps also correct this "rport" number error?

  3. The shutdown problems you encountered after 2. were probably caused by the CTCE configuration error. Trying to catch CTCE configuration errors between multiple Hercules instances like this one, whilst theoretically possible, would be quite an undertaking. I think it's unrealistic to think that would happen soon though (I have more urgent priorities around TXF).

When an incoming CTCE connection is received by the CTCE listener, a decision must be made whether it matches a connection waiting to be connected to. That decision is made using the remote IP address ("raddress") of the incoming connection attempt combined with the remote device number ("rdevnum"), which is the preferred method. (Alternatively, the older method uses the remote port number ("rport") when "rdevnum" is not specified. It is better to avoid that method by always specifying "rdevnum", or at least the equal sign (=) before the "raddress".)
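
As a rough model of that decision (not the actual Hercules code), the two methods can be sketched like this. The `Link` class and `matches` function are hypothetical, and the sketch assumes the incoming connection announces the sender's devnum and listening port.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Link:
    devnum: int                    # local device number
    raddress: str                  # remote IP address
    rport: int                     # remote port number
    rdevnum: Optional[int] = None  # None means the older rport-only method

def matches(link, peer_addr, peer_devnum, peer_lport):
    """Decide whether an incoming connection belongs to this waiting link."""
    if link.raddress != peer_addr:
        return False
    if link.rdevnum is not None:
        # Preferred method: raddress plus rdevnum; rport plays no part.
        return link.rdevnum == peer_devnum
    # Older method: raddress plus rport.
    return link.rport == peer_lport

new_style = Link(0x530, "192.168.1.32", 1600, rdevnum=0x600)
old_style = Link(0x530, "192.168.1.32", 1600)
print(matches(new_style, "192.168.1.32", 0x600, 2000))
print(matches(old_style, "192.168.1.32", 0x600, 2000))
```

With "rdevnum" specified, a peer announcing devnum 600 matches whatever port it listens on; without it, only the port is compared.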

A feature which I nearly always use is the exclusive-or operation on "rdevnum" so that each Hercules side uses the even devnum addresses for reading, and the odd ones for writing (or the other way around). The resulting CTCE configuration could then for example be:

   JS05 (the hub) -
   # CTC links
   DEFSYM IMAGE 05
   530.2 CTCE 01100 1=192.168.1.32 01100
   532.2 CTCE 02100 1=192.168.1.32 02100

   JWS1 (peer 1) -
   DEFSYM IMAGE 01
   # CTC links
   530.2 CTCE $(IMAGE)100 1=192.168.1.32 $(IMAGE)100 ATTNDELAY 200 # VTAM link to JS05 (SNA to OS/390 2.10)

   JS02 (peer 2) -
   DEFSYM IMAGE 02
   # SNA CTC links
   532.2 CTCE $(IMAGE)100 1=192.168.1.32 $(IMAGE)100 ATTNDELAY 200 # VTAM link to JS05
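
The arithmetic behind the `1=` prefix can be sketched as below. This assumes the value in front of the equal sign is exclusive-or'ed with each local devnum in the `530.2` group to give the remote devnum; `xor_rdevnum` is a hypothetical helper, not a Hercules function.

```python
def xor_rdevnum(local_devnum, mask):
    """Remote devnum obtained by exclusive-or'ing the local devnum with mask."""
    return local_devnum ^ mask

for dev in (0x530, 0x531):
    print(f"{dev:04X} -> {xor_rdevnum(dev, 1):04X}")
# prints:
# 0530 -> 0531
# 0531 -> 0530
```

Under that assumption each even device pairs with the odd device on the other side, and vice versa.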

When using connections between different hosts and thus different IP addresses, the port numbers ("lport" and "rport") can be omitted completely. Hercules instances on the same host, though, will need them.

Jeff, I hope this helps.

I would like to close this issue, but will await your OK to do so.

Thanks, and let's stay healthy!

Cheers,

Peter

@Fish-Git Fish-Git added (Invalid/PEBKAC) Likely user error. The described problem does not exist or was otherwise determined to be bogus. Waiting to close... Waiting for user to report back whether problem still exists or not before closing as resolved. labels Feb 17, 2021
@Fish-Git
Member

Peter said:

  3. The shutdown problems you encountered after 2. were probably caused by the CTCE configuration error.

Regardless of whether there was a user configuration error or not, Hercules should NOT be crashing.

It looks to me like you have a locking problem somewhere:

21:40:29 HHC05076I 0:0530 CTCE: Connection closed; 0 MB received in 0 packets from 0:0600=192.168.1.32:2000/64011; shutdown=0
21:40:29 HHC05086I 0:0530 CTCE: Recovery is about to issue Hercules command: DEVINIT 0:0530
21:40:29 HHC05076I 0:0530 CTCE: Connection closed; 0 MB received in 74 packets from 0:0600=192.168.1.32:2000/64011; shutdown=0
21:40:29 HHC05086I 0:0530 CTCE: Recovery is about to issue Hercules command: DEVINIT 0:0530
21:40:29 HHC05081I 0:0530 CTCE: Already awaiting connection : 5000 <- 0:0600=192.168.1.32:1600/*
21:40:29 HHC02245I 0:0530 device initialized
21:40:29 HHC05081I 0:0530 CTCE: Already awaiting connection : 5000 <- 0:0600=192.168.1.32:1600/*
21:40:29 HHC90013E 'initialize lock(&dev->ctceEventLock)' failed: rc=17: already init'ed; tid=00008808, loc=ctcadpt.c:3530
21:40:29 HHC00007I Previous message from function 'loglock' at hthreads.c(104)
21:40:29 HHC90028I lock &dev->ctceEventLock was already initialized at ctcadpt.c:3530
21:40:29 HHC02245I 0:0530 device initialized
21:40:29 HHC05054I 0:0530 CTCE: Started outbound connection :64488 -> 0:0600=192.168.1.32:1600
21:40:29 HHC05054I 0:0530 CTCE: Renewed outbound connection :64488 -> 0:0600=192.168.1.32:1600

[...]

21:42:57 HHC01603I locks held sort tid
21:42:57 HHC90017I Lock 00000000017b4350 (&sysblk.ioqlock) created by 00008a4c (panel_display) on 21:24:08.982615 at impl.c:862
21:42:57 HHC90029I Lock 00000000017b4350 (&sysblk.ioqlock) obtained by 00000eec (idle dev thrd) on 21:41:44.644110 at channel.c:2473
21:42:57 HHC90017I Lock 0000000001819f90 (&logger_lock) created by 00008a4c (panel_display) on 21:24:08.982848 at logger.c:484
21:42:57 HHC90029I Lock 0000000001819f90 (&logger_lock) obtained by 000048f8 (logger_thread) on 21:42:57.085613 at logger.c:383
21:42:57 HHC90017I Lock 0000000004f05038 (&dev->lock 0:0531) created by 00008a4c (panel_display) on 21:24:09.056089 at config.c:657
21:42:57 HHC90029I Lock 0000000004f05038 (&dev->lock 0:0531) obtained by 00006f44 (CTCE 0531 RecvT) on 21:41:38.807221 at ctcadpt.c:2719
21:42:57 HHC90017I Lock 000000000177da08 (&cckdblk.gclock) created by 00008a4c (panel_display) on 21:24:09.337180 at cckddasd.c:60
21:42:57 HHC90029I Lock 000000000177da08 (&cckdblk.gclock) obtained by 00009084 (cckd_gcol) on 21:41:42.463992 at null:0
21:42:57 HHC01603I threads waiting sort tid
21:42:57 HHC90023W Thread Processor CP00 tid=0000788c waiting since 21:41:38.898873 for lock &dev->lock 0:0531 = 0000000004f05038
21:42:57 HHC00007I Previous message from function 'threads_cmd' at hthreads.c(1614)

 
As I recall, at least one (more?) CTCE lock issue was corrected since Hercules 4.3 was released, so the above problem(s) may already have been fixed.

I just wanted to point out that regardless of the cause, Hercules should never be crashing.
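
For reference, the pairing of waiting threads with lock holders in output like the above can be done mechanically. The following is only a sketch: the patterns are inferred from the `HHC90029I` and `HHC90023W` lines shown here, and `blocked_threads` is a hypothetical helper.

```python
import re

# Patterns inferred from the log lines quoted above (an assumption).
HELD = re.compile(
    r"HHC90029I Lock (?P<addr>[0-9a-f]+) \((?P<name>[^)]*)\) "
    r"obtained by (?P<tid>[0-9a-f]+) \((?P<thread>[^)]*)\)"
)
WAIT = re.compile(
    r"HHC90023W Thread (?P<thread>.+?) tid=(?P<tid>[0-9a-f]+) "
    r"waiting since \S+ for lock .* = (?P<addr>[0-9a-f]+)"
)

def blocked_threads(log_lines):
    """Return (waiting thread, lock address, holding thread) triples."""
    holders = {}
    waiters = []
    for line in log_lines:
        held = HELD.search(line)
        if held:
            holders[held.group("addr")] = held.group("thread")
        wait = WAIT.search(line)
        if wait:
            waiters.append((wait.group("thread"), wait.group("addr")))
    # Join after the scan so the order of the two sections does not matter.
    return [(t, addr, holders.get(addr, "unknown")) for t, addr in waiters]

sample = [
    "21:42:57 HHC90029I Lock 0000000004f05038 (&dev->lock 0:0531) obtained by 00006f44 (CTCE 0531 RecvT) on 21:41:38.807221 at ctcadpt.c:2719",
    "21:42:57 HHC90023W Thread Processor CP00 tid=0000788c waiting since 21:41:38.898873 for lock &dev->lock 0:0531 = 0000000004f05038",
]
print(blocked_threads(sample))
# prints [('Processor CP00', '0000000004f05038', 'CTCE 0531 RecvT')]
```

Applied to the two relevant lines, it shows Processor CP00 waiting on the device lock for 0:0531, which was obtained by the CTCE 0531 receiver thread.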

Jeff said:

Fish,

If you'll read the whole report, you'll see that the port number issue was only one of 3. In the other two issues, I used a port number above 1024 in the configuration and the port number played no part in the major problem, i.e. issue 2 - port numbers moving from one device to another.

I apologize for the misunderstanding, Jeff. But the issue is now moot: Peter has identified the cause for your problems, just as I knew he would.   :)

Take care both of you.

(Jeff? Please close this issue whenever you feel comfortable to do so, OR let us know why you feel it should remain open. Thanks.)

@jeff-snyder
Author

jeff-snyder commented Feb 18, 2021

Hi Peter,

Glad to see the Hercules CTCE's are put to good use !

Yep, I'm getting a lot of use out of it. Thanks for creating it!

Concerning the issues you encountered :
(1) Port numbers < 1024 are indeed currently not supported.... Hopefully you can live with that.

No problem. Hey, it's your tool, so your rules. :) It just ruined a great port numbering scheme I had going... :(

(2) So you have three Hercules instances on a single Windows 10 host, which is supported, as you have experienced yourself as well. I think however I see a configuration error for the JWS1 host: ...

Good catch! You are correct, I must have introduced a typo in my last day's testing.

This could explain the errors you described under item 2. As you explained that changing to unique device address (devnum's) fixed your problem, did you perhaps also correct this "rport" number error ?

I did correct it without even realizing it when I switched to unique device numbers. Unfortunately, once I switched the device numbers back using the corrected port numbers, I still get the same errors.

Using shortened configuration files and starting the images with no "guest" OS IPLs, I still see the connection crossovers. (Note: the configurations and full log files are in the attached zip file at the end.)

  • Configurations:
PANTITLE JS05
530 CTCE 05000 600=192.168.1.32 01600 # link to JWS1/600
531 CTCE 05001 601=192.168.1.32 01601 # link to JWS1/601
532 CTCE 05002 600=192.168.1.32 02000 # link to JS02/600
533 CTCE 05003 601=192.168.1.32 02001 # link to JS02/601
PANTITLE JWS1
600 CTCE 01600 530=192.168.1.32 05000 ATTNDELAY 200 # link to JS05/530
601 CTCE 01601 531=192.168.1.32 05001 ATTNDELAY 200 # link to JS05/531
PANTITLE JS02
600 CTCE 02000 532=192.168.1.32 05002 ATTNDELAY 200 # link to JS05/532
601 CTCE 02001 533=192.168.1.32 05003 ATTNDELAY 200 # link to JS05/533

Results in JS05:

  • When I started JWS1:
16:00:45 HHC01603I * now starting JWS1
16:00:55 HHC05070I 0:0530 CTCE: Accepted inbound connection : 5000 <- 0:0600=192.168.1.32:62996 (bufsize=62552,16)
16:00:55 HHC05070I 0:0531 CTCE: Accepted inbound connection : 5001 <- 0:0601=192.168.1.32:62997 (bufsize=62552,16)
16:00:55 HHC05054I 0:0531 CTCE: Started outbound connection :62998 -> 0:0601=192.168.1.32:1601
16:00:55 HHC05054I 0:0530 CTCE: Started outbound connection :63001 -> 0:0600=192.168.1.32:1600
16:01:10 HHC01603I devlist
16:01:10 HHC02279I 0:0530 3088 CTCE 05000/63001 <=> 0:0600=192.168.1.32:1600/62996 IO[0] open 
16:01:10 HHC02279I 0:0531 3088 CTCE 05001/62998 <=> 0:0601=192.168.1.32:1601/62997 IO[0] open 
16:01:10 HHC02279I 0:0532 3088 CTCE 05002/63010 !=! 0:0600=192.168.1.32:2000/* IO[0] 
16:01:10 HHC02279I 0:0533 3088 CTCE 05003/63011 !=! 0:0601=192.168.1.32:2001/* IO[0] 
  • When I started JS02:
16:01:57 HHC05070I 0:0530 CTCE: Renewing inbound connection : 5000 <- 0:0600=192.168.1.32:63056 (bufsize=62552,16)
16:01:57 HHC05054I 0:0530 CTCE: Renewed outbound connection :63057 -> 0:0600=192.168.1.32:1600
16:01:57 HHC05070I 0:0531 CTCE: Renewing inbound connection : 5001 <- 0:0601=192.168.1.32:63058 (bufsize=62552,16)
16:01:57 HHC05054I 0:0531 CTCE: Renewed outbound connection :63059 -> 0:0601=192.168.1.32:1601
16:01:57 HHC05054I 0:0533 CTCE: Started outbound connection :63049 -> 0:0601=192.168.1.32:2001
16:01:57 HHC05054I 0:0532 CTCE: Started outbound connection :63048 -> 0:0600=192.168.1.32:2000
16:01:57 HHC05070I 0:0531 CTCE: Renewing inbound connection : 5001 <- 0:0601=192.168.1.32:63060 (bufsize=62552,16)
16:01:57 HHC05070I 0:0530 CTCE: Renewing inbound connection : 5000 <- 0:0600=192.168.1.32:63061 (bufsize=62552,16)
16:02:25 HHC01603I devlist
16:02:25 HHC02279I 0:0530 3088 CTCE 05000/63057 <=> 0:0600=192.168.1.32:2000/63061 IO[0] open 
16:02:25 HHC02279I 0:0531 3088 CTCE 05001/63059 <=> 0:0601=192.168.1.32:2001/63060 IO[0] open 
16:02:25 HHC02279I 0:0532 3088 CTCE 05002/63048 !=> 0:0600=192.168.1.32:2000/* IO[0] open 
16:02:25 HHC02279I 0:0533 3088 CTCE 05003/63049 !=> 0:0601=192.168.1.32:2001/* IO[0] open 

Results in JWS1:

16:00:55 HHC05063I 0:0600 CTCE: Awaiting inbound connection : 1600 <- 0:0530=192.168.1.32:5000/*
16:00:55 HHC05063I 0:0601 CTCE: Awaiting inbound connection : 1601 <- 0:0531=192.168.1.32:5001/*
16:00:55 HHC05054I 0:0600 CTCE: Started outbound connection :62996 -> 0:0530=192.168.1.32:5000
16:00:55 HHC05054I 0:0601 CTCE: Started outbound connection :62997 -> 0:0531=192.168.1.32:5001
16:00:55 HHC05070I 0:0601 CTCE: Accepted inbound connection : 1601 <- 0:0531=192.168.1.32:62998 (bufsize=62552,16)
16:00:55 HHC05070I 0:0600 CTCE: Accepted inbound connection : 1600 <- 0:0530=192.168.1.32:63001 (bufsize=62552,16)
16:01:03 HHC01603I devlist
16:01:03 HHC02279I 0:0600 3088 CTCE 01600/62996 <-> 0:0530=192.168.1.32:5000/63001 IO[0] open 
16:01:03 HHC02279I 0:0601 3088 CTCE 01601/62997 <-> 0:0531=192.168.1.32:5001/62998 IO[0] open 
16:01:50 HHC01603I * now starting JS02
16:01:57 HHC05070I 0:0600 CTCE: Renewing inbound connection : 1600 <- 0:0530=192.168.1.32:63057 (bufsize=62552,16)
16:01:57 HHC05070I 0:0601 CTCE: Renewing inbound connection : 1601 <- 0:0531=192.168.1.32:63059 (bufsize=62552,16)
16:02:19 HHC01603I devlist
16:02:19 HHC02279I 0:0600 3088 CTCE 01600/62996 <-> 0:0530=192.168.1.32:5000/63057 IO[0] open 
16:02:19 HHC02279I 0:0601 3088 CTCE 01601/62997 <-> 0:0531=192.168.1.32:5001/63059 IO[0] open 

Results in JS02:

16:01:57 HHC05063I 0:0600 CTCE: Awaiting inbound connection : 2000 <- 0:0532=192.168.1.32:5002/*
16:01:57 HHC05063I 0:0601 CTCE: Awaiting inbound connection : 2001 <- 0:0533=192.168.1.32:5003/*
16:01:57 HHC05054I 0:0600 CTCE: Started outbound connection :63056 -> 0:0532=192.168.1.32:5002
16:01:57 HHC05054I 0:0601 CTCE: Started outbound connection :63058 -> 0:0533=192.168.1.32:5003
16:01:57 HHC05070I 0:0601 CTCE: Accepted inbound connection : 2001 <- 0:0533=192.168.1.32:63049 (bufsize=62552,16)
16:01:57 HHC05070I 0:0600 CTCE: Accepted inbound connection : 2000 <- 0:0532=192.168.1.32:63048 (bufsize=62552,16)
16:01:57 HHC05054I 0:0601 CTCE: Renewed outbound connection :63060 -> 0:0533=192.168.1.32:5003
16:01:57 HHC05054I 0:0600 CTCE: Renewed outbound connection :63061 -> 0:0532=192.168.1.32:5002
16:02:14 HHC01603I devlist
16:02:14 HHC02279I 0:0600 3088 CTCE 02000/63061 <-> 0:0532=192.168.1.32:5002/63048 IO[0] open 
16:02:14 HHC02279I 0:0601 3088 CTCE 02001/63060 <-> 0:0533=192.168.1.32:5003/63049 IO[0] open 

And, like before, I encountered shutdown problems due to the crossovers.

(3) The shutdown problems you encountered after 2. were probably caused by the CTCE configuration error...

Yeah, that's what I figured. I mostly included this just for awareness and so you'd know there was a dump, in case it might prove useful.

A feature which I nearly always use is the exclusive-or operation on "rdevnum" so that each Hercules side uses the even devnum addresses for reading, and the odd ones for writing (or the other way around). The resulting CTCE configuration could then be for example : ...

In looking at your suggested configuration, it doesn't use the correct remote device numbers for JWS1 or JS02. I set it up and when I tried it, none of the instances even seemed aware of each other. I've since tried a couple variations on it, trying to get it to work with non-matching local and remote devnums, without success. The remote device number does not increment for me. Could you provide an example that uses local device numbers of 530-531 and remote device numbers of 600-601? Once I get that going, I should be able to extrapolate my other links from it. Thanks!

Hi Fish,

Unfortunately, I'm still experiencing the problem, so I'm not ready to close this issue yet. Please leave it open for now. Thanks!

@Fish-Git
Member

Hi Fish,

Unfortunately, I'm still experiencing the problem, so I'm not ready to close this issue yet. Please leave it open for now. Thanks!

No problem! You are, as you said, still experiencing problems, so we will label this issue as a "Bug" and keep it open until it is resolved.

@Fish-Git Fish-Git added BUG The issue describes likely incorrect product functionality that likely needs corrected. IN PROGRESS... I'm working on it! (Or someone else is!) and removed (Invalid/PEBKAC) Likely user error. The described problem does not exist or was otherwise determined to be bogus. Waiting to close... Waiting for user to report back whether problem still exists or not before closing as resolved. labels Feb 18, 2021
@Fish-Git
Member

Yeah, that's what I figured. I mostly included this just for awareness and so you'd know there was a dump, in case it might prove useful.

Please compress the dump and FTP upload it to my "incoming"(*) folder at:

  • ftp://www.softdevlabs.com
  • ftp://www.softdevlabs.com/incoming/

and then send me an email letting me know when you have done so. I will then download it and analyze it to try and determine what went wrong.

Thanks!


(*) PLEASE NOTE that my FTP "incoming" folder is marked write-only for security purposes. What this means is anyone can upload files into the folder, but no one will be able to download anything from that folder, even if they know the name!

In fact, you will not be able to even list the contents of the folder (i.e. you will not be allowed to do a "dir" or "ls" of the folder's contents) since that necessarily requires read-access, and as just explained, the folder does not have read-access, only write access, so you can be assured whatever you upload to there is 100% secure. Only I will be able to download from that folder since only I am the administrator.

@Peter-J-Jansen
Collaborator

Peter-J-Jansen commented Feb 18, 2021

Jeff and Fish,

Thanks for your follow-up comments. Jeff certainly uncovered what I consider 2 bugs :

The remote device number does not increment for me.

  1. That is indeed a bug. As it turns out, I must have never tested this without specifying a remote CCUU or "exclusive or" parameter in front of the equal sign (=). So that feature is not usable until I have corrected it. In Jeff's case, this = feature is not very useful, so hopefully he will not have to wait for that fix.

  2. The 2nd bug might be debatable, but I'd rather accept it as a bug. The problem occurs when remote CTCE incoming connection requests are being matched against local CTCE link configurations. If the local CTCE link configuration specifies a remote devnum, then the matching is currently based only on that remote devnum plus the remote IP address; the remote port number, whether specified or defaulted (3088), is not used. Hence the current JS05 configuration for the remote CTCEs will not work, as the same remote devnum and IP address are specified for both JWS1 and JS02.
     
    One solution is to specify different devnums, which Jeff already confirmed to work. A solution so that the same devnums can be used is however also possible, by using the CTCE v1 method to identify the remote CTCE's using only the remote port numbers, and leaving out the remote devnums including the equal sign (=) in front of the remote IP address.
     
    I successfully tested this configuration :

PANTITLE JS05
# CTC links
# using localhost and default lport=3088 on JS05, remote side rport identified
0530   CTCE           localhost    01600 # link to JWS1/600
0531   CTCE           localhost    01601 # link to JWS1/601
0532   CTCE           localhost    02000 # link to JS02/600
0533   CTCE           localhost    02001 # link to JS02/601

PANTITLE JWS1
# CTC links
# using localhost and default rport=3088 to match JS05 default lport
0600   CTCE 01600 530=localhost          ATTNDELAY 200 # link to JS05/530
0601   CTCE 01601 531=localhost          ATTNDELAY 200 # link to JS05/531

PANTITLE JS02
# CTC links
# using localhost and default rport=3088 to match JS05 default lport
0600   CTCE 02000 532=localhost          ATTNDELAY 200 # link to JS05/532
0601   CTCE 02001 533=localhost          ATTNDELAY 200 # link to JS05/533

The above omits JS05 lport specifications in favor of using the 3088 default, as well as the JWS1 and JS02 rports (thus also defaulting to 3088), but that is not important, they can still be specified as well. Also, I've used "localhost" instead of actual IP addresses, but that is immaterial as well. The only important thing is leaving out the remote devnum specifications 600= and 601= for JS05.

I think I should correct this by ensuring that a remote CTCE incoming connection should always include a matching remote port number, whether specified or defaulted (3088), and never rely on a matching remote devnum alone. That should actually be simpler to fix than the 1st bug.
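
To illustrate why adding the remote port to the match would resolve the ambiguity, here is a small model using Jeff's JS05 links. It is a sketch of the idea only: `ambiguous` is a hypothetical function and the tuples simply restate the configuration lines, not any Hercules data structure.

```python
from collections import defaultdict

JS05_LINKS = [
    # (local devnum, remote devnum, remote address, remote port)
    (0x530, 0x600, "192.168.1.32", 1600),
    (0x531, 0x601, "192.168.1.32", 1601),
    (0x532, 0x600, "192.168.1.32", 2000),
    (0x533, 0x601, "192.168.1.32", 2001),
]

def ambiguous(links, use_rport):
    """Group local devnums whose match key is identical."""
    groups = defaultdict(list)
    for devnum, rdevnum, raddr, rport in links:
        key = (raddr, rdevnum, rport) if use_rport else (raddr, rdevnum)
        groups[key].append(devnum)
    return [sorted(devs) for devs in groups.values() if len(devs) > 1]

for use_rport in (False, True):
    clashes = ambiguous(JS05_LINKS, use_rport)
    print(use_rport, [[f"{d:04X}" for d in devs] for devs in clashes])
# prints:
# False [['0530', '0532'], ['0531', '0533']]
# True []
```

Matching on devnum and address alone cannot tell 0530 from 0532 (or 0531 from 0533); once the remote port is part of the key, every link is unique.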

Fish wrote about another possible issue :

It looks to me like you have a locking problem somewhere:

That might be correct, although I think that in Jeff's case it's caused by the 2nd bug. At least with the circumvention for that bug as shown above, I could no longer reproduce it. All 3 Hercules instances closed down correctly, in whatever order I tried that.

That being said, I do recall that during CTCE development I sometimes encountered CTCE lock issues as a result of either manual devinit <ccuu> commands or the CTCE built-in recovery triggering them. However, I considered them cosmetic only, and was actually unable to reliably avoid them. If I remember correctly, they were caused by such devinit commands not being able to revert to the same initial situation as before the Hercules configuration file was processed, because these devinit commands must be able to rely on the previous configuration parameters. The problem only became apparent after some updates related to the lock processing. When I work on the CTCE bugs identified, I will revisit this possible issue. OK?

Cheers,

Peter

@wably
Member

wably commented Feb 18, 2021

Hi Peter,

Regarding the issue of trying to properly match up identically numbered devnums with multiple Hercules instances on the same host, I have a different suggestion. What if you disassociated the devnum value from the actual physical devnum and instead treated it as a symbolic identifier? This means instead of specifying 530=localhost, specify something unique, like HOST05=localhost, or in Jeff's case, JS05530=localhost. In other words, use the value as a name to provide uniqueness rather than tying it to an actual device number (which may not be unique with three or more Hercules instances on the same host).

The chosen name could be anything the user wants that helps him identify the connection to both sides. It could also be a CDRM name, or NJE node name, or whatever helps to identify which connection is intended to go where. This of course means the other side's configuration must have that same name coded as well. This would most probably mean you would need an additional configuration parameter on the CTCE statement.

This method is exactly what the TCPNJE 2703 device uses. Its parameter specifications include RNODE= and LNODE= values. Most of us use the actual NJE node names on each end for these values, because they are self-documenting. But they are not required to be the NJE node names. It is perfectly OK to code RNODE=A and LNODE=B while the actual NJE node names used on that connection are something else entirely. TCPNJE uses the A and B values only to associate the right connection; they have no bearing on the actual data traffic that will flow.

By disassociating the devnum from the actual devnum, users could still code the devnum value if they wish - it is now just a symbolic name. But in cases where further uniqueness is required, this offers a way to specify it.

Perhaps something like this could be used to resolve the duplicative devnum problem?

Regards,
Bob

@Peter-J-Jansen
Collaborator

Peter-J-Jansen commented Feb 18, 2021

Hi Bob,

Thanks for your suggestion on how to differentiate multiple Hercules instances on the same host.

Originally, in the first CTCE implementation (which I've been referring to as CTCE v1), the only method was based on IP address and CTCE listening port number. With many CTCE connections, managing those listening port numbers becomes cumbersome, hence in the second implementation (a.k.a. CTCE v2) these port numbers can be replaced with the devnums. This comes much more naturally to those of us configuring the OSes using those devnums. But, as I specifically wanted to continue supporting the CTCE v1 approach, the port number identification still works and is supported. So, effectively, it is still possible to use that method, and my tests, as explained in my comments to Jeff, do work; I tested them.

OK, the identification with using port numbers instead of say NJE node names, is indeed cumbersome, and needs to be carefully specified, as each port number on a given host must be unique. But it does work.

The problem as experienced by Jeff is that, if one uses that differentiating / identification port number method, currently the devnums must be omitted completely. That's the workaround I tested and provided to Jeff. The fix for that problem I have already coded, and will be testing the next few days. The beauty of that fix is that it is trivially simple.

As for providing another, additional differentiating / identification method, e.g. using additional parameters like your suggested RNODE= and LNODE= specifying NJE node names (which would need to be unique): I am a bit worried about the complications, the effort required to test it all, and the continued support for the current port number method. And all that because of Hercules instances running on the same PC (Windows, Linux or macOS). I'd rather not add that complexity, but yes, I admit, I'm a bit lazy.

But the fix that allows the devnums to be left in, so that Jeff's original configuration will work, I believe is an easy thing for me to do.

But please feel free to contradict me Bob!

Cheers,

Peter

@Fish-Git
Member

Fish wrote about another possible issue :

It looks to me like you have a locking problem somewhere:

That might be correct, although I think that in Jeff's case it's caused by the 2nd bug. At least with the circumvention for that bug as shown above, I could no longer reproduce it. All 3 Hercules instances closed down correctly, in whatever order I tried that.

Were you able to reproduce it without the circumvention? (i.e. without your fix? i.e. with stock v4.3?)

I am unable to try doing so myself due to not having VM/SP 5 (and/or whatever other guest operating systems are involved). If I could reproduce it on my own then I could properly look into it. Since I can't, I cannot.

When I work on the CTCE bugs identified, I will revisit this possible issue. OK?

10-4. Please keep me informed of your progress and PLEASE let me know if you need any help. As I said earlier, no matter what "goes wrong" in Hercules, it should not ever crash!   (or hang, etc...)

Thanks.

p.s. No rush!

@jeff-snyder
Author

Hi Fish,

If you look back at my most recent prior post, you'll see that the stripped down configurations I used to recreate the issue have no DASD in them and no OS IPLs were performed. You should be able to recreate the issue with those configs.

@jeff-snyder
Author

Hi Peter,

Thanks! Based on your suggested configuration changes, I have everything up with duplicate devnums and no issues. I've IPLed all the various affected OSes and confirmed that communication works across all the links.

I used the configuration as written (i.e. leaving out the lports in JS05). Once that worked, I recreated my desired 6 instance configuration and tested that using the same format. The only deviation I made was to eliminate localhost. I've been burned too many times by Windows resolving that to an IPv6 address. I brought everything up and after changing the OS configurations back to the original, duplicate device numbers, everything is up and working great.

Thanks for your help!

Please let me know if I can help test anything as you work through solutions for the issues we discovered.

@Fish-Git
Member

Hi Fish,

If you look back at my most recent prior post, you'll see that the stripped down configurations I used to recreate the issue have no DASD in them and no OS IPLs were performed.

For your most recent prior post, yes, that is true, but it was only in your original report that the watchdog thread on JS05 detected that Processor CP00 was hung (and automatically created a crash dump as a result), and in that particular specific instance, a guest operating system was indeed IPL'ed (VM/SP 5 from device 1C0 in this particular case).

Neither of the other two systems was IPL'ed, true, but system JS05 certainly was, and since that is where the problem I want to research (the hung processor) occurred, I need either a copy of VM/SP 5 to be able to IPL on JS05, or else some other guest operating system in order to be able to recreate your hang (but I would feel more comfortable using the same operating system that you were using that caused the problem in the first place).

You should be able to recreate the issue with those configs.

I don't think so.   :(

Looking at the configs and logs from your most recent prior post's attached file, none of the systems appear to have been IPLed.

The only way to recreate your original Processor CP00 hang (and resulting crash) as reported in your original post, is to use the exact same configs as provided in your original report, and to IPL the same guest that you did: VM/SP 5. (Or, as explained, some other guest operating system that, when IPLed, is also able to recreate the problem/hang.)

@Fish-Git
Member

The only deviation I made was to eliminate localhost. I've been burned too many times by Windows resolving that to an IPv6 address.

I too make it a habit of never using "localhost". Instead, I always use "127.0.0.1", which is essentially the exact same thing.

@jeff-snyder
Author

Hi Fish,

For your most recent prior post, yes, that is true, but it was only in your original report that the watchdog thread on JS05 detected that Processor CP00 was hung (and automatically created a crash dump as a result), and in that particular specific instance, a guest operating system was indeed IPL'ed (VM/SP 5 from device 1C0 in this particular case).

OK. I wasn't clear on what you were trying to recreate. Hercules hangs in both cases, but since there was no IPL to activate a CP in the second case, there is no CP hang. I thought you were just trying to recreate the hang. To be honest, I don't know if it would eventually crash without the CP being hung, since there's no active CP for the watchdog thread to monitor. I've never waited more than a minute or two after it hangs, I just use Windows' Task Manager to end the task.

@jeff-snyder
Author

Hi Peter,

Fish's comment about using 127.0.0.1 spurred a thought.

Another possible resolution for this would be to allow the user to configure the local IP address. Since all the IPs in the 127.0.0.0 range refer to the local host, the user could assign unique "localhost addresses" to each instance and achieve the unique lookup value that way. For example, I could use 127.0.0.1 for JWS1, 127.0.0.2 for JS02 and 127.0.0.5 for JS05 in my configurations and the CTCE code would see them all as unique host/devnums.
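To make the idea concrete, here is a small Python sketch, assuming the instance-to-address assignment suggested above. The `instances` table and both helper functions are hypothetical and have nothing to do with the real CTCE code; the sketch only shows that every address is a loopback address and that the resulting (address, devnum) lookup keys no longer collide. It does not show whether a given platform accepts such addresses without extra setup.

```python
import ipaddress

# Hypothetical per-instance loopback addresses, as proposed in this comment.
instances = {
    "JWS1": "127.0.0.1",
    "JS02": "127.0.0.2",
    "JS05": "127.0.0.5",
}


def all_loopback(addresses):
    """True if every address lies in the loopback range (127.0.0.0/8 for IPv4)."""
    return all(ipaddress.ip_address(a).is_loopback for a in addresses)


def peer_keys(peer_names, devnum):
    """Build the (address, devnum) keys a matcher could use for these peers."""
    return {(instances[name], devnum) for name in peer_names}


# Both peers use devnum 600, yet their keys stay distinct.
keys = peer_keys(["JWS1", "JS02"], 0x600)
print(all_loopback(instances.values()), len(keys))
```

With a single shared address such as 127.0.0.1 for both peers, the same set would collapse to one key, which is the collision seen in the original configuration.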

@Peter-J-Jansen
Collaborator

Hi Jeff,

Thanks for your positive feedback re. my workaround.

Yes, that would work, but I do not know how to ensure that packets to 127.0.0.1 / .2 / .5 would be correctly routed / delivered to the correct Hercules instances / processes, or put differently, how to establish these localhost addresses to ensure that. And to top it off, how to ensure the same technique could work for all of Hercules' supported platforms: Windows, Linux, and macOS.

One approach which I think could be made to work, at least under Linux and MacOS (up to MacOS version 10.x.y, but not 11.x.y), is establishing additional TAP interfaces, and bridge these together (also the hardware NIC) under a master, and give each TAP interface its own unique address in the same LAN as the NIC, and then configure the Hercules instances with those addresses accordingly. But whether this configuration overhead is less effort than managing unique lport addresses (after my upcoming patch for it that is), is, I think, questionable.

As soon as my upcoming patch is available, your very initial configuration with all remote devnums specified should also work, so my workaround should then no longer be needed. I propose to keep this issue open until we've been able to confirm that.

Cheers,

Peter

@Peter-J-Jansen
Collaborator

Fish wrote :

The only deviation I made was to eliminate localhost. I've been burned too many times by Windows resolving that to an IPv6 address.

I too make it a habit of never using "localhost". Instead, I always use "127.0.0.1", which is essentially the exact same thing.

An interesting workaround I saw by Rob Prins in turnkey-mvs@groups.io to ensure "localhost" is always the IPv4 127.0.0.1, is to add an entry for that in the etc/hosts file (thanks Rob !) :

127.0.0.1 localhost
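Before editing the hosts file, it may help to see what "localhost" currently resolves to. The following is a small Python sketch using the standard `socket.getaddrinfo` call; the helper names `resolve_order` and `prefers_ipv4` are made up for this example, and the result depends entirely on the local machine.

```python
import socket


def resolve_order(host):
    """Return (family name, address) pairs in the order the resolver gives them."""
    infos = socket.getaddrinfo(host, None, type=socket.SOCK_STREAM)
    return [(socket.AddressFamily(family).name, sockaddr[0])
            for family, _type, _proto, _canon, sockaddr in infos]


def prefers_ipv4(entries):
    """True if the first resolved entry is an IPv4 address."""
    return bool(entries) and entries[0][0] == "AF_INET"


if __name__ == "__main__":
    entries = resolve_order("localhost")
    for family, address in entries:
        print(family, address)
    print("IPv4 first:", prefers_ipv4(entries))
```

If an IPv6 entry such as ::1 comes first, either the hosts entry above or coding 127.0.0.1 directly in the Hercules configuration avoids the ambiguity.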

Cheers,

Peter

@Fish-Git
Member

Fish-Git commented Feb 19, 2021

(paraphrased):

An interesting workaround is to ensure "localhost" always resolves to the IPv4 address "127.0.0.1" by adding an entry for it in your etc/hosts file:

127.0.0.1   localhost

Yes, that would work too. But why go to that trouble when IMO it's easier to simply use 127.0.0.1 in your Herc config file instead?

Bottom line: you can either: a) update your etc/hosts file leaving your existing Herc config file alone, or b) simply change your Herc config file instead. Either way, you still need to change something, and IMO it's easier to simply always use 127.0.0.1 in your Herc config file instead.

Six of one, half a dozen of the other.   <shrug>

@Fish-Git
Member

Fish-Git commented Feb 19, 2021

I thought you were just trying to recreate the hang.

Well, I'm marginally interested in that too, but I believe Peter probably has that well in hand. What I'm mostly interested in right now is the original Processor CP00 hang.

To be honest, I don't know if it would eventually crash without the CP being hung, since there's no active CP for the watchdog thread to monitor.

Correct: the crash would not occur since no processors were hung. Now, if a deadlock was detected, then that would certainly cause a crash. But since no deadlock was reported we know that's not the cause for the hang, and as I said it is the Processor CP00 hang that I'm mostly interested in at this point.

When I get a chance (I keep getting distracted (torn away) back and forth between several different things I'm looking into) I'll try to see if I can recreate the Processor hang using a different guest operating system (such as VM/370 SixPack maybe). If I discover anything I'll let you know.

(Oh yeah! That original crash dump you sent me? It was a bust. It unfortunately told me nothing. That's why I need to fall back to Plan B: try recreating the hang/crash for myself)

@jeff-snyder
Author

Peter,

Yes, that would work, but I do not know how to ensure that packets to 127.0.0.1 / .2 / .5 would be correctly routed / delivered to the correct Hercules instances / processes, or put differently, how to establish these localhost addresses to ensure that. And to top it off, how to ensure the same technique could work for all of Hercules' supported platforms: Windows, Linux, and macOS.

There is no additional configuration required. The packets will be delivered to the existing TCP/IP stack, just as if 127.0.0.1 had been used, with the receiving Hercules instance being determined by the destination port number. For example, with no changes to my PC, I can ping 127.0.0.1, 127.0.0.2, 127.0.0.3, etc. You can just think of all the other 127.0.0.x addresses as aliases for 127.0.0.1. The advantage is that when the packet arrives, it has a different "label" on it: the unique IP address.

Fish,

Thanks for the explanation. I was pretty sure that dump wouldn't be useful for the original issue, but I'm disappointed it wasn't helpful on the CP hang. Let me know if there's any way I can help.

Peter-J-Jansen added a commit that referenced this issue Feb 20, 2021
The 2 problems identified in Issue #369 for CTCE configurations are now corrected :
1. The "rport" parameter is now always taken into account when matching
    incoming CTCE connections, and no longer ignored when "rdevnum" is specified.
2. The "rdevnum" when specified as "=" was not incremented correctly when the
    "ldevnum.n" format was used to specify "n" multiple CTCE devices.
@Peter-J-Jansen
Collaborator

Peter-J-Jansen commented Feb 20, 2021

I have just committed the fixes for the 2 issues that were identified. As a result, Jeff's original configuration will now work as well, as my test confirmed.

The second problem identified, the non-incrementing remote devnums when specifying multiple CTCE devices using a single config entry, is now also fixed. I also successfully tested these configurations :

PANTITLE JS05
# CTC links
# using localhost and default lport=3088 on JS05, single rport per remote host
0530.2 CTCE       600=127.0.0.1    01600 # link to JWS1/600
0532.2 CTCE       600=127.0.0.1    02000 # link to JS02/600

PANTITLE JWS1
# CTC links
# using localhost and default rport=3088 to match JS05 default lport, single lport
0600.2 CTCE 01600 530=127.0.0.1          ATTNDELAY 200 # link to JS05/530

PANTITLE JS02
# CTC links
# using localhost and default rport=3088 to match JS05 default lport
0600.2 CTCE 02000 532=127.0.0.1          ATTNDELAY 200 # link to JS05/532

CTCE port number specifications are only needed for multiple Hercules instances on the same PC, but one of them (in my examples JS05) can just use the default 3088 (if that port number isn't used for something else, that is). As the device numbers on a given Hercules instance have to be unique, a single lport per instance is sufficient, which is a wee bit more efficient than an lport per CTCE device.
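The incrementing behaviour of the "ldevnum.n" format used in the configurations above can be sketched as follows. This is a Python illustration with a hypothetical `expand_ctce` helper, not the Hercules parser; it assumes only what the commit message states, namely that both the local and the remote devnum advance by one for each of the "n" devices.

```python
def expand_ctce(ldev_spec, rdevnum):
    """Expand an 'ldevnum.n' spec into (ldevnum, rdevnum) pairs.

    ldev_spec is a hex devnum, optionally followed by '.n' for n devices.
    Both devnums are incremented together, one step per device.
    """
    base, _, count = ldev_spec.partition(".")
    n = int(count) if count else 1
    first = int(base, 16)
    return [(first + i, rdevnum + i) for i in range(n)]


# JS05 entry "0530.2 CTCE 600=127.0.0.1 01600" from the configuration above.
for ldev, rdev in expand_ctce("0530.2", 0x600):
    print(f"{ldev:04X} <-> {rdev:04X}")
# prints:
# 0530 <-> 0600
# 0531 <-> 0601
```

The bug fixed by the commit corresponds to the remote side not advancing correctly, so that the devices of one entry were not paired with consecutive remote devnums.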

The devlist output from the 3 log files shows (noting that the first "3088" is not a port number, but the CTC device type 3088) :

18:29:44 HHC01603I devlist
18:29:44 HHC02279I 0:0530 3088 CTCE 03088/52416 <=> 0:0600=127.0.0.1:1600/52435 IO[0] open 
18:29:44 HHC02279I 0:0531 3088 CTCE 03088/52415 <=> 0:0601=127.0.0.1:1600/52436 IO[0] open 
18:29:44 HHC02279I 0:0532 3088 CTCE 03088/52818 <=> 0:0600=127.0.0.1:2000/52832 IO[0] open 
18:29:44 HHC02279I 0:0533 3088 CTCE 03088/52816 <=> 0:0601=127.0.0.1:2000/52833 IO[0] open 

18:29:18 HHC01603I devlist
18:29:18 HHC02279I 0:0600 3088 CTCE 01600/52435 <-> 0:0530=127.0.0.1:3088/52416 IO[0] open 
18:29:18 HHC02279I 0:0601 3088 CTCE 01600/52436 <-> 0:0531=127.0.0.1:3088/52415 IO[0] open 

18:29:52 HHC01603I devlist
18:29:52 HHC02279I 0:0600 3088 CTCE 02000/52832 <-> 0:0532=127.0.0.1:3088/52818 IO[0] open 
18:29:52 HHC02279I 0:0601 3088 CTCE 02000/52833 <-> 0:0533=127.0.0.1:3088/52816 IO[0] open 

In case anyone wonders why some CTCE connections are shown with a "<=>" and others with "<->", well, the "<=>" sides of CTCE connections are contention winner sides, the "<->" are the contention loser sides.

If we're all pleased with this then I propose to close this issue. OK?

Cheers,

Peter

@Fish-Git
Member

If we're all pleased with this then I propose to close this issue. OK?

I agree if Jeff agrees.

I can work on my Processor CP00 hang issue offline at my own leisure, and simply add a new additional comment if/whenever I have something to report. There's no need to keep it open for my sake. Is that fine with you, Jeff?

@Fish-Git Fish-Git added Waiting to close... Waiting for user to report back whether problem still exists or not before closing as resolved. and removed IN PROGRESS... I'm working on it! (Or someone else is!) labels Feb 20, 2021
@jeff-snyder
Author

Hi Peter,

Thanks for the quick work! I'll get it tested and let you know about closing the issue ASAP.

@jeff-snyder
Author

Hi everybody.

I was able to get Hercules rebuilt and tested everything. It all looks good.

Peter,
Thanks for all your help!

Fish,
Please close this issue. Thanks! (I would have "close(d) with comment", but I wasn't sure if you had some other process you follow.)

@Peter-J-Jansen
Collaborator

OK, thanks for the positive feedback. I'll close #369 now.

Cheers,

Peter

@Fish-Git Fish-Git removed the Waiting to close... Waiting for user to report back whether problem still exists or not before closing as resolved. label Feb 22, 2021