SIGSEGV crashes in ponyint_heap_mark() #2850

Closed
slfritchie opened this issue Aug 2, 2018 · 1 comment

@slfritchie
Contributor

slfritchie commented Aug 2, 2018

Howdy, I've got a not-exactly-tiny program that can reliably crash ponyint_heap_mark() on OS X and Linux (Ubuntu Xenial) with LLVM 3.9.1.

Fetch the source with git clone --depth=1 -b slf-file-io-journal5-gh2850 https://github.com/slfritchie/wallaroo.git, then cd wallaroo and follow the instructions at https://github.com/slfritchie/wallaroo/blob/slf-file-io-journal5/utils/dos-dumb-object-service2/main2.pony#L11-L59.

You'll also need a Python 2 runtime in order to run the TCP service that this thingie requires. The TCP use appears to be necessary.

If I compile with a Valgrind + debug version of Pony 0.24.4 (e.g., make install config=debug ponydir=/usr/local/pony/0.24.4+valgrind use=valgrind), then I see the following when running under LLDB:

% lldb -- ./dos-dumb-object-service2 use-dir-foo j-file-name
(lldb) target create "./dos-dumb-object-service2"
Current executable set to './dos-dumb-object-service2' (x86_64).
(lldb) settings set -- target.run-args  "use-dir-foo" "j-file-name"
(lldb) run
Process 12712 launched: './dos-dumb-object-service2' (x86_64)
Process 12712 stopped
* thread #2: tid = 12716, 0x0000000000487aa5 dos-dumb-object-service2`ponyint_heap_mark(chunk=0x0000000000000000, p=0x00007fffe55ec7e0) + 16 at heap.c:551, name = 'dos-dumb-object', stop reason = signal SIGSEGV: invalid address (fault address: 0x10)
    frame #0: 0x0000000000487aa5 dos-dumb-object-service2`ponyint_heap_mark(chunk=0x0000000000000000, p=0x00007fffe55ec7e0) + 16 at heap.c:551
   548 	  // external pointer in the same pass.
   549 	  bool marked;
   550 	
-> 551 	  if(chunk->size >= HEAP_SIZECLASSES)
   552 	  {
   553 	    marked = chunk->slots == 0;
   554 	
(lldb) p chunk
(chunk_t *) $0 = 0x0000000000000000
(lldb) up
frame #1: 0x000000000048c0c2 dos-dumb-object-service2`mark_local_object(ctx=0x00007ffff6fc2a48, chunk=0x0000000000000000, p=0x00007fffe55ec7e0, t=0x0000000200000000, mutability=1) + 53 at gc.c:191
   188 	  if(mutability != PONY_TRACE_OPAQUE)
   189 	  {
   190 	    // Mark in our heap and recurse if it wasn't already marked.
-> 191 	    if(!ponyint_heap_mark(chunk, p))
   192 	      recurse(ctx, p, t->trace);
   193 	  } else {
   194 	    // Do a shallow mark. If the same address is later marked as something that
(lldb) up
frame #2: 0x000000000048cf48 dos-dumb-object-service2`ponyint_gc_markimmutable(ctx=0x00007ffff6fc2a48, gc=0x00007fffe5fc14a8) + 146 at gc.c:605
   602 	      void* p = obj->address;
   603 	      chunk_t* chunk = ponyint_pagemap_get(p);
   604 	      pony_type_t* type = *(pony_type_t**)p;
-> 605 	      mark_local_object(ctx, chunk, p, type, PONY_TRACE_IMMUTABLE);
   606 	    }
   607 	  }
   608 	}
(lldb) p chunk
(chunk_t *) $1 = 0x0000000000000000
(lldb) p type
(pony_type_t *) $2 = 0x0000000200000000
(lldb) p *type
error: Couldn't apply expression side effects : Couldn't dematerialize a result variable: couldn't read its memory
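
A note on the fault address in frame #0: 0x10 is what one would expect when a field 16 bytes into a struct is read through a NULL pointer, which matches chunk=0x0 and the chunk->size access at heap.c:551. The following is only a minimal sketch of that arithmetic. The struct fake_chunk_t and its field names are hypothetical stand-ins, assuming two pointer-sized fields sit in front of size; the real chunk_t in heap.c may be laid out differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, NOT copied from the Pony runtime: two pointer-sized
 * fields followed by the size field that the crashing line reads. */
typedef struct fake_chunk_t
{
  void* owner;    /* assumed pointer-sized field */
  char* memory;   /* assumed pointer-sized field */
  size_t size;    /* the field read by chunk->size */
  uint32_t slots;
} fake_chunk_t;

/* Address touched by chunk->size when chunk is NULL: NULL plus the offset
 * of size within the struct. */
size_t fault_address_for_null_chunk(void)
{
  return offsetof(fake_chunk_t, size);
}

/* Sanity check: on typical ABIs the offset equals two pointer widths,
 * i.e. 16 (0x10) on a 64-bit target. */
void check_fault_address(void)
{
  assert(fault_address_for_null_chunk() == 2 * sizeof(void*));
}
```

Under this assumed layout, the NULL chunk returned by ponyint_pagemap_get(p) in frame #2 is enough to explain the fault address on its own; it says nothing yet about why the pagemap lookup came back empty.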

At other times, both @dipinhora and I have seen this for *type:

(lldb) p *type
(pony_type_t) $2 = {
  id = 5
  size = 8
  field_count = 0
  field_offset = 0
  instance = 0x00000000004a7ad8
  trace = 0x0000000000000000
  serialise_trace = 0x0000000000000000
  serialise = 0x0000000000477680 (dos-dumb-object-service2`None_Serialise)
  deserialise = 0x0000000000000000
  custom_serialise_space = 0x0000000000000000
  custom_deserialise = 0x0000000000000000
  dispatch = 0x0000000000000000
  final = 0x0000000000000000
  event_notify = 4294967295
  traits = 0x00000000004a9d38
  fields = 0x0000000000000000
  vtable = 0x0000000000000000
}

... which led Dipin on a wild goose chase trying to figure out why a primitive like None would be involved in GC like this.

@slfritchie
Contributor Author

Closing, due to pilot error. I now believe that this is expected(*) behavior when the Pthreads stack size is being overrun by a greedy & deep Pony behavior function that uses too much recursion.

(*) "expected" = undefined behavior = WTF OS-specific zaniness
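
For readers who want to poke at the stack-size explanation, here is a minimal C sketch of the general mechanism, not code from the Pony runtime. It runs a recursive function on a pthread whose stack size is set explicitly with pthread_attr_setstacksize. The names recurse_depth, job_t, thread_main, and run_with_stack are hypothetical, and the 256-byte frame padding is an arbitrary stand-in for whatever a deep Pony behavior keeps on the stack.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a deeply recursive function. The volatile
 * buffer makes every frame occupy real stack space. */
static int recurse_depth(int remaining)
{
  volatile char pad[256];
  pad[0] = (char)(remaining & 0x7f);
  if(remaining == 0)
    return 0;
  return 1 + recurse_depth(remaining - 1) + (pad[0] - pad[0]);
}

typedef struct job_t
{
  int depth;   /* requested recursion depth */
  int result;  /* depth actually reached, or -1 if the thread never ran */
} job_t;

static void* thread_main(void* arg)
{
  job_t* job = (job_t*)arg;
  job->result = recurse_depth(job->depth);
  return NULL;
}

/* Run recurse_depth(depth) on a thread with stack_bytes of stack.
 * Returns the depth reached, or -1 if the thread could not be set up.
 * If depth needs more stack than stack_bytes, the behavior is undefined
 * (typically a SIGSEGV, possibly far from the real cause). */
int run_with_stack(size_t stack_bytes, int depth)
{
  pthread_attr_t attr;
  pthread_t tid;
  job_t job = { depth, -1 };

  assert(depth >= 0);

  if(pthread_attr_init(&attr) != 0)
    return -1;

  if(pthread_attr_setstacksize(&attr, stack_bytes) != 0)
  {
    pthread_attr_destroy(&attr);
    return -1;
  }

  int rc = pthread_create(&tid, &attr, thread_main, &job);
  pthread_attr_destroy(&attr);

  if(rc != 0)
    return -1;

  pthread_join(tid, NULL);
  return job.result;
}
```

Calling run_with_stack(8 * 1024 * 1024, 1000) should complete comfortably. Shrinking the stack or raising the depth until the frames no longer fit is the situation described in the closing comment, and what happens then depends on the OS and on what memory sits next to the thread's stack.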
