It's my understanding that the constructors of a type which have no fields are "statically allocated" and GHC shares these between all uses, and that the GC will not move these.

If that's correct then I would expect uses of `reallyUnsafePtrEquality#` on values like `False` and `Nothing` to be very safe (no false negatives or positives), because they can only be represented as identical pointers to the single instance of that constructor.

Is my reasoning correct? Are there any potential gotchas, or reasons to suspect that this could become unsafe in near future versions of GHC?

jberryman
  • Might be true. On the other hand, seems like for constructors with no fields the performance win over boring old `(==)` is going to be pretty minimal... – Daniel Wagner Jul 06 '14 at 18:42
  • @DanielWagner My actual use case is working with the new CAS primops on boxed references. When using the `atomic-primops` library I'd like to be able to cache a `Ticket Nothing` (for instance) and be sure it never goes stale. – jberryman Jul 06 '14 at 18:45
  • You might also need to be careful about plugins and the like. – GS - Apologise to Monica Jul 06 '14 at 18:45
  • @GaneshSittampalam Can you elaborate on that? You mean GHC plugins that could change those kind of low-level details and make my assumptions incorrect? – jberryman Jul 06 '14 at 18:47
  • No, I meant that plugins that load code into a running Haskell program might wind up with their own copies of those nullary constructors. I don't know either way, but it's something I'd be worried about. – GS - Apologise to Monica Jul 06 '14 at 18:48
  • I think the only times that will happen are times when GHC won't consider them to be the same type anyway. It might still be a safe assumption. – Carl Jul 06 '14 at 19:27
  • I'm pretty sure nullary constructor pointers are re-written to point at the .TEXT only after they survive a GC. Their initial allocation and pointer is still to dynamically allocated space which makes the technique proposed here unsafe. – Thomas M. DuBuisson Jul 06 '14 at 20:04
  • I'm with Ganesh. I'd worry about plugins, and shared libs. – augustss Jul 06 '14 at 20:06

1 Answer

I actually managed to get `reallyUnsafePtrEquality#` to do the wrong thing.

Here's my minimal code example:

{-# LANGUAGE MagicHash #-}
import GHC.Exts (isTrue#, reallyUnsafePtrEquality#)

-- Package it up nicely: turn the primop's Int# result into a Bool
ptrCmp :: a -> a -> Bool
ptrCmp a b = isTrue# (reallyUnsafePtrEquality# a b)

main :: IO ()
main = do
  b <- readLn
  let a  = if b then Nothing else Just ()
      a' = Nothing
  print $ a == a'     -- Normal
  print $ ptrCmp a a' -- Evil

Compiling and running it, with `True` typed as the input on the first line after `./unsafe`:

 $ ghc --version
   The Glorious Glasgow Haskell Compilation System, version 7.8.2
 $ ghc unsafe.hs
 $ ./unsafe
   True
   True
   False

So... yes, `reallyUnsafePtrEquality#` is still evil.
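
For anyone who wants to poke at this further, here is a sketch of the same program with a `performMinorGC` call (from `System.Mem`) between two pointer comparisons, following the experiment reported in the comments. Using `isTrue#` from `GHC.Exts` instead of an explicit `case` is my own substitution. One plausible reading, not confirmed in this thread, is that `a` keeps pointing at the heap object that used to be the thunk until a GC pass redirects it to the shared `Nothing` closure. The results depend on the GHC version and optimization flags, so no output is shown.

```haskell
{-# LANGUAGE MagicHash #-}
module Main (main) where

import GHC.Exts (isTrue#, reallyUnsafePtrEquality#)
import System.Mem (performMinorGC)

-- Raw pointer comparison; the result is not guaranteed to be stable.
ptrCmp :: a -> a -> Bool
ptrCmp a b = isTrue# (reallyUnsafePtrEquality# a b)

main :: IO ()
main = do
  b <- readLn
  let a  = if b then Nothing else Just ()
      a' = Nothing
  print (a == a')      -- structural equality; also forces both values
  print (ptrCmp a a')  -- pointer comparison before any explicit GC
  performMinorGC       -- request a minor collection
  print (ptrCmp a a')  -- pointer comparison after the GC
```

If the two `ptrCmp` lines disagree on your setup, that would suggest the pointer stored in `a` changed across the collection even though the value did not.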

Daniel Gratzer
  • Worth noting that I have absolutely no clue **why** this happens. Just disproved this conjecture through a combination of dumb luck and trying pathological cases. If anyone could shed some light.. – Daniel Gratzer Jul 07 '14 at 02:06
  • oh awesome, thanks for this. I got a little ahead of myself when I suggested it might be "very safe", since we still at least have the problem of comparing thunks with values, and all the complications that inlining might cause there. I would guess something like that is going on here, but I can't exactly see what... – jberryman Jul 07 '14 at 03:06
  • I thought using `if b then a' else Just ()` would surely make it work. I was wrong. It's indeed really evil :) – chi Jul 07 '14 at 10:16
  • Oddly enough, adding a bang pattern to `a` makes `ptrCmp` return True. I think I understand why it doesn't work without the bang pattern: it's evil. – John L Jul 07 '14 at 23:42
  • It's because of laziness! The ternary expression is a suspended computation. When you compare the pointers, `a` is a pointer to a thunk, not a pointer to `Nothing`. Forcing `a` by putting a bang in `ptrCmp` forces the thunk, which explains @JohnL's result. – Benjamin Hodgson Feb 07 '17 at 19:59
  • @BenjaminHodgson That explanation doesn't seem complete to me; doesn't `print (a == a')` force `a` and `a'`, leaving them non-thunks by the time `ptrCmp` is called? – Daniel Wagner Feb 07 '17 at 20:16
  • Hmm, that's a good point, I guess I didn't read the code carefully enough. In that case, I dunno. – Benjamin Hodgson Feb 07 '17 at 20:40
  • For what it's worth: putting a `performMinorGC` after `print (a == a')` but before `print (ptrCmp a a')` causes it to print `True` for the second one. So it looks like @ThomasMDuBuisson had it right in his comment above. – Daniel Wagner Feb 07 '17 at 21:22
  • Ok, but is it possible to get a false positive (i.e. `ptrCmp a b == True && (a == b) == False`)? Maybe equality performance can be improved by `fastSafeEq a b = ptrCmp a b || a == b` (for deterministic objects)? – fakedrake Nov 25 '19 at 19:08
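
On that last question, a rough sketch of the proposed `fastSafeEq` might look like the following. It assumes `ptrCmp` is defined with `isTrue#` from `GHC.Exts` (my choice, not something the thread prescribes). The pointer check is used only as a shortcut, so a miss falls back to `(==)`. The shortcut agrees with `(==)` only when the `Eq` instance is reflexive: for a NaN `Double`, `x == x` is `False`, so a pointer hit could report `True` where `(==)` would not.

```haskell
{-# LANGUAGE MagicHash #-}
module Main (main) where

import GHC.Exts (isTrue#, reallyUnsafePtrEquality#)

-- Raw pointer comparison; may be False even for equal values.
ptrCmp :: a -> a -> Bool
ptrCmp a b = isTrue# (reallyUnsafePtrEquality# a b)

-- Try the pointer shortcut first, then fall back to structural equality.
-- Assumes a reflexive Eq instance (x == x is True for every x).
fastSafeEq :: Eq a => a -> a -> Bool
fastSafeEq a b = ptrCmp a b || a == b

main :: IO ()
main = do
  let xs = [1 .. 100000] :: [Int]
  print (fastSafeEq xs xs)                    -- True
  print (fastSafeEq xs (map id xs))           -- True (via == if pointers differ)
  print (fastSafeEq [1, 2, 3] [1, 2, 4 :: Int]) -- False
```

Whether the shortcut actually fires in the first call is up to GHC; the answer is the same either way, only the cost differs.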