Skip to content

Commit 54fe0fd

Browse files
committed
runtime: store bp on cgocallback as unsafe.Pointer
As of CL 580255, the runtime tracks the frame pointer (or base pointer, bp) when entering syscalls, so that we can use fpTracebackPCs on goroutines that are sitting in syscalls. That CL mostly got things right, but missed one very subtle detail. When calling from Go->C->Go, the goroutine stack performing the calls when returning to Go is free to move around in memory due to growth, shrinking, etc. But upon returning back to C, it needs to restore gp.syscall*, including gp.syscallsp and gp.syscallbp. The way syscallsp currently gets updated is automagically: it's stored as an unsafe.Pointer on the stack so that it shows up in a stack map. If the stack ever moves, it'll get updated correctly. But gp.syscallbp isn't saved to the stack as an unsafe.Pointer, but rather as a uintptr, so it never gets updated! As a result, in rare circumstances, fpTracebackPCs can correctly try to use gp.syscallbp as the starting point for the traceback, but the value is stale. This change fixes the problem by just storing gp.syscallbp to the stack on cgocallback as an unsafe.Pointer, like gp.syscallsp. It also adds a comment documenting this subtlety; the lack of explanation for the unsafe.Pointer type on syscallsp meant this detail was missed -- let's not miss it again in the future. Now, we have a fix, what about a test? Unfortunately, testing this is going to be incredibly annoying because the circumstances under which gp.syscallbp are actually used for traceback are non-deterministic and hard to arrange, especially from within testprogcgo where we don't have export_test.go and can't reach into the runtime. So, instead, add a gp.syscallbp check to reentersyscall and entersyscallblock that mirrors the gp.syscallbp consistency check. This probably causes some miniscule slowdown to the syscall path, but it'll catch the issue without having to actually perform a traceback. Fixes #69085. Change-Id: Iaf771758f1666024b854f5fbe2b2c63cbe35b201 Reviewed-on: https://go-review.googlesource.com/c/go/+/608775 Reviewed-by: Nick Ripley <nick.ripley@datadoghq.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
1 parent 7b4ecaa commit 54fe0fd

File tree

2 files changed

+22
-5
lines changed

2 files changed

+22
-5
lines changed

src/runtime/cgocall.go

+7-2
Original file line numberDiff line numberDiff line change
@@ -338,9 +338,14 @@ func cgocallbackg(fn, frame unsafe.Pointer, ctxt uintptr) {
338338
// stack. However, since we're returning to an earlier stack frame and
339339
// need to pair with the entersyscall() call made by cgocall, we must
340340
// save syscall* and let reentersyscall restore them.
341+
//
342+
// Note: savedsp and savedbp MUST be held in locals as an unsafe.Pointer.
343+
// When we call into Go, the stack is free to be moved. If these locals
344+
// aren't visible in the stack maps, they won't get updated properly,
345+
// and will end up being stale when restored by reentersyscall.
341346
savedsp := unsafe.Pointer(gp.syscallsp)
342347
savedpc := gp.syscallpc
343-
savedbp := gp.syscallbp
348+
savedbp := unsafe.Pointer(gp.syscallbp)
344349
exitsyscall() // coming out of cgo call
345350
gp.m.incgo = false
346351
if gp.m.isextra {
@@ -372,7 +377,7 @@ func cgocallbackg(fn, frame unsafe.Pointer, ctxt uintptr) {
372377
osPreemptExtEnter(gp.m)
373378

374379
// going back to cgo call
375-
reentersyscall(savedpc, uintptr(savedsp), savedbp)
380+
reentersyscall(savedpc, uintptr(savedsp), uintptr(savedbp))
376381

377382
gp.m.winsyscall = winsyscall
378383
}

src/runtime/proc.go

+15-3
Original file line numberDiff line numberDiff line change
@@ -4426,7 +4426,13 @@ func reentersyscall(pc, sp, bp uintptr) {
44264426
}
44274427
if gp.syscallsp < gp.stack.lo || gp.stack.hi < gp.syscallsp {
44284428
systemstack(func() {
4429-
print("entersyscall inconsistent ", hex(gp.syscallsp), " [", hex(gp.stack.lo), ",", hex(gp.stack.hi), "]\n")
4429+
print("entersyscall inconsistent sp ", hex(gp.syscallsp), " [", hex(gp.stack.lo), ",", hex(gp.stack.hi), "]\n")
4430+
throw("entersyscall")
4431+
})
4432+
}
4433+
if gp.syscallbp != 0 && gp.syscallbp < gp.stack.lo || gp.stack.hi < gp.syscallbp {
4434+
systemstack(func() {
4435+
print("entersyscall inconsistent bp ", hex(gp.syscallbp), " [", hex(gp.stack.lo), ",", hex(gp.stack.hi), "]\n")
44304436
throw("entersyscall")
44314437
})
44324438
}
@@ -4564,14 +4570,20 @@ func entersyscallblock() {
45644570
sp2 := gp.sched.sp
45654571
sp3 := gp.syscallsp
45664572
systemstack(func() {
4567-
print("entersyscallblock inconsistent ", hex(sp1), " ", hex(sp2), " ", hex(sp3), " [", hex(gp.stack.lo), ",", hex(gp.stack.hi), "]\n")
4573+
print("entersyscallblock inconsistent sp ", hex(sp1), " ", hex(sp2), " ", hex(sp3), " [", hex(gp.stack.lo), ",", hex(gp.stack.hi), "]\n")
45684574
throw("entersyscallblock")
45694575
})
45704576
}
45714577
casgstatus(gp, _Grunning, _Gsyscall)
45724578
if gp.syscallsp < gp.stack.lo || gp.stack.hi < gp.syscallsp {
45734579
systemstack(func() {
4574-
print("entersyscallblock inconsistent ", hex(sp), " ", hex(gp.sched.sp), " ", hex(gp.syscallsp), " [", hex(gp.stack.lo), ",", hex(gp.stack.hi), "]\n")
4580+
print("entersyscallblock inconsistent sp ", hex(sp), " ", hex(gp.sched.sp), " ", hex(gp.syscallsp), " [", hex(gp.stack.lo), ",", hex(gp.stack.hi), "]\n")
4581+
throw("entersyscallblock")
4582+
})
4583+
}
4584+
if gp.syscallbp != 0 && gp.syscallbp < gp.stack.lo || gp.stack.hi < gp.syscallbp {
4585+
systemstack(func() {
4586+
print("entersyscallblock inconsistent bp ", hex(bp), " ", hex(gp.sched.bp), " ", hex(gp.syscallbp), " [", hex(gp.stack.lo), ",", hex(gp.stack.hi), "]\n")
45754587
throw("entersyscallblock")
45764588
})
45774589
}

0 commit comments

Comments
 (0)