| Background: Arc is implemented in Racket/MzScheme, which recently introduced immutable pairs. However, Arc is supposed to allow for list mutation. This was hacked around (by Eli Barzilay of PLT) using some unchecked pointer operations in MzScheme to alter the supposedly immutable pairs. That seemed to work well, but a year ago, akkartik noticed something going wrong with queues. Occasionally, adding or deleting an item from the queue would fail. This suggested something was going wrong, somehow, with the list mutation. Here is his original thread: http://arclanguage.org/item?id=11347 Recently, I conducted a more detailed investigation, and found that the problem was caused when garbage collection happened in the middle of the execution of the hack "set-car!" or "set-cdr!" functions. Here was my thread: http://arclanguage.org/item?id=13518 Today, I continued my investigation and found that, specifically, the Racket function "ptr-ref", which was used to return a C pointer to the pair to be mutated, caused memory allocation (and, therefore, possibly garbage collection). This is the main part of Eli's code: (define (set-ca/dr! offset who p x)
(if (pair? p)
(let ([p* (get-ptr)])
(ptr-set! p* _scheme p)
(ptr-set! (ptr-ref p* _pointer 0) _scheme offset x))
(raise-type-error who "pair" p)))
(define (n-set-car! p x) (set-ca/dr! 1 'set-car! p x))
(define (n-set-cdr! p x) (set-ca/dr! 2 'set-cdr! p x))
Here, Eli uses a global C pointer, (get-ptr), and sets that to point to the pair to be modified. The entire 'let expression could be replaced by this: (ptr-set! (cast p _scheme _pointer) _scheme offset x))
Here, 'cast returns a new pointer. It's slightly more efficient to reuse a global pointer, hence what Eli did. But either way, if you want to use ptr-set!, you'll need to have something return a C pointer to the pair-to-be-modified, which means malloc'ing. Apparently all C pointers
returned as values are malloc'd objects.[To be precise about what goes wrong, the set-ca/dr! function first makes a C pointer to the pair-to-be-modified, and then, if a garbage collection happens at this point, the function will follow the now-invalid pointer and modify the car or cdr of what used to be the pair. That location might be
garbage, causing the operation to have no effect--or it might even be memory no longer owned by the Racket process, causing a segmentation fault. This completely explains my previous results.] So I looked for another way to do it, one that didn't malloc. And I found one. I have found a flawless way to implement set-car! and set-cdr! on Racket's immutable pairs. The solution is to treat them as vectors and use Racket's unsafe vector ops. [Edit: rntz points out below that one can also use unsafe-set-mcar! and unsafe-set-mcdr!, which work just as well, and are
likely to be better supported in future versions of Racket. That seems much better and easier and simpler to me, and clearly I need to RTFM more thoroughly. For posterity, I've left some information on how the unsafe-vector-set! version works.] > (define xs '(1 2 3))
> (define ys '(1 2 3))
> (require racket/unsafe/ops)
> xs
'(1 2 3)
> (unsafe-vector-set! xs 0 ys)
> xs
'(1 1 2 3)
> (unsafe-vector-set! xs -1 ys) ;yeah, -1
> xs
'((1 2 3) 1 2 3)
> (unsafe-set-mcar! xs 22) ;this works too!
> xs
'(22 1 2 3)
We see, using a Racket version of my 'mem-use macro, that this causes no malloc'ing. (define-syntax mem-use
(syntax-rules ()
((_ body ...)
(let ((u (current-memory-use)))
body ...
(- (current-memory-use) u)))))
> (mem-use (unsafe-set-mcar! xs 15))
0
> xs
'(15 1 2 3)
> (mem-use (unsafe-vector-set! xs -1 300))
0
> xs
'(300 1 2 3)
And now for the implementation in ac.scm. I will present the unsafe-set-mca/dr! version. Throw away (or comment out) the old definitions of ptr, get-ptr, set-ca/dr!, n-set-car!, n-set-cdr!, x-set-car!, and x-set-cdr!, and put this in: (require racket/unsafe/ops)
(define (x-set-car! p x)
(if (pair? p)
(unsafe-set-mcar! p x)
(raise-type-error 'set-car! "pair" p)))
(define (x-set-cdr! p x)
(if (pair? p)
(unsafe-set-mcdr! p x)
(raise-type-error 'set-cdr! "pair" p)))
The fact that it explicitly says (require racket...) implies that this must be Racket, so I have dropped the check for native set-ca/dr!.This fixes the queue bug. I've run akkartik's test program and a few versions thereof, and there have been no problems. Additionally, list mutation is faster, to the point where sorting a large list is about 3x as fast as it was before (5.5 secs instead of 18.7 secs for 500k random numbers), and the destructive list reversal function 'nreverse (from Common Lisp) is actually faster than 'rev, not to mention it creates almost zero garbage: (def nrev (xs)
(let u nil
(whilet head xs
(= xs cdr.xs)
(scdr head u) ; faster than (= cdr.head u) for various reasons
(= u head))
u))
arc> (mem-use:no:= xs nrev.xs) ;where xs is a 100k-element list
168
It's possible that the implementation of Racket pairs or mutable pairs (or vectors if we prefer that version) will change in the future. And it would be nicer if what we're doing were officially supported by the language. But I see no reason to expect either of those changes anytime soon, and right now, this works, as far as I've tested it. So we can rejoice and be merry. |