[Bug] HAConnection poses a risk of leakage #8875

Closed
3 tasks done
crazywen opened this issue Oct 30, 2024 · 0 comments · Fixed by #8876

crazywen commented Oct 30, 2024

Before Creating the Bug Report

  • I found a bug, not just asking a question, which should be created in GitHub Discussions.

  • I have searched the GitHub Issues and GitHub Discussions of this repository and believe that this is not a duplicate.

  • I have confirmed that this bug belongs to the current repository, not other repositories of RocketMQ.

Runtime platform environment

CentOS Linux 7 (Core)

RocketMQ version

rocketmq-all-5.1.4

JDK Version

java version "1.8.0_251"
Java(TM) SE Runtime Environment (build 1.8.0_251-b08)
Java HotSpot(TM) 64-Bit Server VM (build 25.251-b08, mixed mode)

Describe the Bug

In production, the broker hit a heap out-of-memory error even though the traffic volume was low.
image

Heap dump:
image

image

Steps to Reproduce

Run high-frequency liveness probes against the HA port 10912. The problem appears once the probing has run for long enough.
```python
# -*- coding:UTF-8 -*-

import socket
import time


def tcp_health_check(host, port):
    """Open a TCP connection, send a timestamp, then close it immediately."""
    now = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
    sock = None  # defined up front so the except branch can test it safely
    try:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.settimeout(1)
        sock.connect((host, port))
        sock.sendall(now.encode())
        print("{} TCP connection succeeded: {}:{}".format(now, host, port))
        sock.close()
        return True
    except socket.error as e:
        print("TCP connection failed: {}".format(e))
        if sock:
            sock.close()
        return False


def main():
    host = '127.0.0.1'
    port = 10912  # HA port of the broker

    # Probe the HA port twice per second, indefinitely
    while True:
        tcp_health_check(host, port)
        time.sleep(0.5)


if __name__ == "__main__":
    main()
```

What Did You Expect to See?

Liveness probes should not cause the broker to run out of memory.

What Did You See Instead?

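This abnormal scenario should be guarded against. Based on the heap dump and the HA logic, we confirmed a risk in the order of operations when an HAConnection is established: the Read/WriteSocketService threads are started before the connection is appended to the connection list. If the peer disconnects right away, the removal (performed in the run method of Read/WriteSocketService) can execute first and the append afterwards, so the connection is never removed from the list again and leaks.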
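The ordering problem described above can be illustrated with a small simulation. This is only a sketch in Python, not RocketMQ code: `Registry`, `Connection`, `accept_start_then_add`, `accept_add_then_start`, and `simulate` are hypothetical names introduced for illustration. The `join()` call forces the worst-case interleaving (the worker's cleanup finishes before the append), which in a real broker would only happen when the peer disconnects quickly, as a probe does.

```python
import threading


class Registry:
    """Hypothetical stand-in for the list that tracks open HA connections."""

    def __init__(self):
        self._lock = threading.Lock()
        self._connections = []

    def add(self, conn):
        with self._lock:
            self._connections.append(conn)

    def remove(self, conn):
        with self._lock:
            if conn in self._connections:
                self._connections.remove(conn)

    def size(self):
        with self._lock:
            return len(self._connections)


class Connection:
    """Hypothetical connection whose peer has already disconnected."""

    def __init__(self, registry):
        self.registry = registry
        self.worker = threading.Thread(target=self._run)

    def _run(self):
        # The peer is gone, so the service loop exits at once and cleans up
        # by removing this connection from the registry.
        self.registry.remove(self)

    def start(self):
        self.worker.start()


def accept_start_then_add(registry):
    """Risky order: start the worker first, register the connection second."""
    conn = Connection(registry)
    conn.start()
    conn.worker.join()  # worst case: cleanup completes before the append
    registry.add(conn)  # appended after its own removal, so it is never removed


def accept_add_then_start(registry):
    """One possible safer order: register first, then start the worker."""
    conn = Connection(registry)
    registry.add(conn)
    conn.start()
    conn.worker.join()


def simulate(accept, probes):
    """Run `probes` short-lived connections and return how many stay registered."""
    registry = Registry()
    for _ in range(probes):
        accept(registry)
    return registry.size()


if __name__ == "__main__":
    print("start then add:", simulate(accept_start_then_add, 100))
    print("add then start:", simulate(accept_add_then_start, 100))
```

Under these assumptions, every probe handled in the risky order leaves one entry behind, so the list grows with the number of probes, while registering before starting the worker leaves the list empty.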

Additional Context

企业微信截图_1730255620288
企业微信截图_17302556638843
