I'm curious, why do you think this is inapplicable to Arc? In a prototype-based class system, you could use these hidden classes to model namespaces for a purely functional space, and then function dispatch would be really fast.
Further, it opens up interesting possibilities for modules.
It's hard to imagine a more sub-optimal implementation than the one we currently have (short of porting it to Ruby 1.8's interpreter, lolslow!) :)
Hmm... the reason I was thinking it wouldn't help is that Arc objects are just glorified hash tables. This means that if we implemented an arc2js compiler, it would use a hash table on the JS end...
It's possible, though, that you could implement Arc hash tables using JS objects, since they, too, are just glorified hash tables in a different way.
Looking briefly around the web, it looks like it might be possible to fully implement hash tables with JS objects. In that case, Arc objects could benefit from hidden classes. OTOH, regular hash tables might underperform due to inappropriate attempts to force hidden classes on them.
; scheme!!
(define (ar-funcaller1)
  (lambda (f arg1)
    ; dispatch based on f, argument types, etc.
    ; (sketch: call procedures directly; treat tables as
    ; lookup functions, the way Arc does)
    (cond ((procedure? f) (f arg1))
          ((hash-table? f) (hash-table-get f arg1 (lambda () #f)))
          (else (error "ar-funcaller1: not callable" f)))))
(I've actually already implemented this experimentally, but the performance didn't improve much. However, that was before I implemented serious multimethods (which I haven't actually finished implementing anyway...), so I suppose I can bring this back to amortize multimethod lookup.)
By using a callsite lambda object, we can keep track of previous multimethod etc. lookups.
And then we can make the compiler emit code especially for (foo 'bar) forms:
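Something like this, say (a sketch; callsite1 stands for a compiler-generated binding, one per callsite):
(define callsite1 (ar-quotecaller1 'bar))
; ... so each occurrence of (foo 'bar) compiles to:
(callsite1 foo)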
callsite1 can then cache lookups on hidden classes.
(define (always-false . rest) #f)
(define (ar-quotecaller1 quoted)
; use only a single variable to prevent
; race conditions in multi-threaded code
; (might not always work if using true
; threads instead of green ones though)
(let ((checker-and-dispatch (cons always-false #f)))
(lambda (ob)
; copy to our own local variable to
; prevent race conditions (assuming
; the read is atomic anyway)
(let* ((my-checker-and-dispatch checker-and-dispatch)
(checker (car my-checker-and-dispatch)))
(if (not (checker ob))
(begin
(set! my-checker-and-dispatch
(patch-quotecaller1 ob quoted))
(set! checker-and-dispatch my-checker-and-dispatch)))
((cdr my-checker-and-dispatch) ob quoted)))))
(define (patch-quotecaller1 ob quoted)
(cond
((table-with-hidden-class? ob) ; implemented however you will
(let* ((hidden-class (hidden-class-get ob))
(hidden-class-index (hidden-class-index-lookup hidden-class quoted)))
(cons
     ; checker: takes just the object; the hidden class
     ; it tests against is captured in this closure
     (lambda (ob)
       (and (table-with-hidden-class? ob)
            (eq? hidden-class (hidden-class-get ob))))
; dispatcher
(lambda (ob _)
(hidden-class-ref ob hidden-class-index)))))
   (else
    ; other conditions: functions, methods, other data types...
    (error "patch-quotecaller1: don't know how to call" ob))))
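For concreteness, here is one way the hidden-class helpers assumed above might look (my own sketch, using MzScheme structs; a hidden class here is just a key-to-slot-index map shared by every table built with the same key-insertion history):
(define-struct hc-table (class slots))
; class: assoc list mapping keys to slot indices, shared by
;        all tables with the same insertion history
; slots: vector holding the actual values
(define (table-with-hidden-class? ob) (hc-table? ob))
(define (hidden-class-get ob) (hc-table-class ob))
(define (hidden-class-index-lookup class key)
  (cond ((assq key class) => cdr)
        (else #f)))
(define (hidden-class-ref ob index)
  (vector-ref (hc-table-slots ob) index))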
I was actually going to implement this, but then I thought that using Arc hash tables as objects - and in particular the (tb 'key) form - isn't actually common enough (at least in Arc) to justify optimizing for it. Hmm.
Am I the only person who thinks something is wrong when a simple blog like that one evidently requires 212 different Perl modules?
One thing I've noticed about CPAN and the Perl community is that they take module and library creation to obsessive lengths. Even the most banal and trivial of things has multiple implementations on CPAN, of varying quality.
Personally, I find a system like MzScheme or Ruby, which has fewer libraries of higher quality (solving harder problems), more valuable than sifting through years of accreted code and installing multiple libraries that do the same thing just to integrate two disparate components.
Besides the fact that he used 212 different modules (I wonder what kinds of things his blog does behind the curtains...), the point of the post is that the Perl community has a real library culture: when someone solves a task, he or she puts the solution in a module and usually documents it. This is a great thing, because it encourages code reuse more than any other programming technique (e.g. OO programming), and it is something the Lisp community has always lacked, particularly the documentation part.
If the Arc community could learn that important lesson from Perl, that would be a huge leap forward for the Lisp world.
People do this in Ruby, and it can be fairly useful when building distributed systems. Another approach is to make something like
(load 'foobar)
check all the local code paths for foobar and, if those fail, check the remote code paths. I used this approach in one of my projects; here's a link to the Ruby code that did it: http://github.com/KirinDave/fuzed/tree/376747981d7801d3e8ba2...
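In Scheme terms the idea might look like this (a sketch only: remote-fetch is a hypothetical helper that downloads a URL into a local cache and returns the cached path, or #f on failure, and the path lists are made up):
(define local-load-paths '("./lib" "/usr/local/lib/arc"))
(define remote-load-paths '("http://example.org/arc-lib/"))
(define (try-load name)
  (let ((file (string-append (symbol->string name) ".arc")))
    (or ; local code paths first...
        (ormap (lambda (dir)
                 (let ((path (build-path dir file)))
                   (and (file-exists? path)
                        (begin (load path) #t))))
               local-load-paths)
        ; ...then fall back to the remote ones
        (ormap (lambda (base)
                 (let ((cached (remote-fetch (string-append base file))))
                   (and cached (begin (load cached) #t))))
               remote-load-paths)
        (error "library not found:" name))))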
The thing is, tech-wise, Common Lisp is in very good shape. It could very much take C++'s place, and everyone would be happier and probably get similar performance with cleaner code.
Likewise, scsh shows that a Lisp-1 can encroach on territory typically occupied by Ruby and Perl.
These aren't Lisp's problems. Lisp's problems are social. It has no anchor point, whereas most modern, successful languages are anchored by a tightly knit community, a corporate backer, or a charismatic personality. pg could assume this role, but thus far he hasn't opened up Arc's development enough to really allow that to happen. Given his duties at Y Combinator, this is understandable, but it's not helping Arc at all.
Arc seems pretty open to me. There's nothing stopping people from creating countless useful libraries. I'm guessing that many people have turned away from Arc after seeing the dearth of libraries. Paradoxically, that's one of the things that attracts me to Arc: there are many opportunities to make an early impact within the Arc community.
An associative array primitive is one of the major benefits of Perl and Ruby for writing readable code that still runs in a reasonable amount of time (not fast).
Removing it for conceptual purity weakens the language. The hash table counted as complex years ago, but by now we should regard it as a primitive.
In the case of Python, yes, it is prejudice. Have you ever coded in Python? If anything throws you off, it's going to be thinking in Python idioms, not the whitespace. In earlier languages, "significant indentation" meant something entirely different (columns 0-20 are comments, etc.). Python's use of indentation means that the interpreter uses the same code-block delimiters that the people reading and writing the code use.
Significant indentation is great. Wait, come back! Don't put this in the language yet! I'm not done! Significant indentation is great for languages like Python, in which code blocks are different from expressions and never go smack in the middle of a function call.
I've seen attempts to do significant indentation in Lisp, and instead of getting simple and pretty like Python, it gets all tangled and ugly. Ideas that are great in one context may be horrible in another. Need I remind you what happened when Larry Wall tried to combine all the great ideas from a bunch of different languages into one?
You claim it's great for languages like Python, and that's fine if you want it. I do know Python, but I don't use it actively because I detest its over-engineering and its significant whitespace.
But Lisps and Arc are not Python, or even a language like Python. And making a terminator or separator character that is both invisible and munged differently on many platforms seems like an extremely bad idea to me.
No, it's common sense. Call me old-fashioned, but I prefer being able to tell if code is syntactically correct just by looking at it instead of having to perform a hex dump. Never mind being able to reconstruct indentation from syntax when it eventually gets mangled in email/forum postings/code restructuring.
> Have you ever coded in Python?
Yes I have. Apart from the above caveat it's quite a nice language, although I find Ruby more elegant.
> Need I remind you what happened when Larry Wall tried to combine all the great ideas from a bunch of different languages into one?
Yes, it got hugely popular. You got a problem with that?
> No, it's common sense. Call me old-fashioned, but I prefer being able to tell if code is syntactically correct just by looking at it instead of having to perform a hex dump.
Are you trying to say that source code (either with or without meaningful indentation) requires a hex dump to read? Or are you talking about object code (in which case I am pretty sure indentations is entirely irrelevant).
>> Need I remind you what happened when Larry Wall tried to combine all the great ideas from a bunch of different languages into one?
> Yes, it got hugely popular. You got a problem with that?
I don't know about you, but Perl's conglomerate syntax, combined with all that extra magic, makes Perl pretty unmanageable, in my opinion, after a couple hundred lines of code.
From pg's essay Arc at 3 Weeks:
"We're also going to try not to make the language look like a cartoon character swearing. [...] (Dan Giffin recently observed that if you measure Perl programs by the number of keys you have to press, they don't seem so short.)"
hehe sure you can borrow that part. i don't know any python so you'd be burning the null brain. the indentation sensitivity is actually the reason i never got interested in that language. it's probably just the fact that things aren't explicitly closed -- something i wouldn't mind in lisp but feels like standing on a plank in the middle of the ocean in a structured language like python, probably due to the mixture of symbolic and visual notation which totally weirds me out man
though i'm sure it's really just an exaggeration in my mind due to only hearing about that aspect of the language, and i probably would have gotten used to it easily enough
Please, let's not get into a "yes it is"/"no it isn't" argument here!
I am aware of some rather horrid languages that forced the use of meaningful indentation, but that doesn't make indentation evil in and of itself.
As for Lisp, a number of people have tried to add it to Arc (including pg) without much success. It seems that meaningful indentation lends itself much better to statement-based languages than to expression-based ones.
I hope that Arc keeps a canonical and authoritative implementation. One of the big problems keeping many people from deploying Lisp applications is the uncertainty and difficulty of choosing a particular implementation. It can also fragment developer time, and you will inevitably end up with incompatible but extremely similar Arc implementations everywhere.
One or two canonical implementations is fine, but there should be one authority for the language and it should be a fairly good implementation.
I think the axiomatic approach is intended to stop this. Lisp suffered because the spec was complex and ambiguous. To port Arc, just port the axioms, run the "spec", et voilà, you have a conforming implementation.
It should be noted that pg intends to implement more of Arc in Arc, when he gets the time.
In fact, I have an implementation in SBCL that is halfway there (it can basically load arc.arc, but I haven't bothered with network/thread functions yet). To solve the call/cc problem I got to the point of doing a CPS transform, and since Arc has only 4 special forms (very few axioms, as you say) this was 'easy'...
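To give the flavor (a minimal sketch, not the actual implementation): the CPS transform threads an extra continuation argument k through every call, so (f (g x)) becomes (g x (lambda (gx) (f gx k))). Once everything is in that form, call/cc needs no stack tricks at all:
; call/cc after CPS conversion: the captured continuation is
; just a function that discards its own continuation and
; invokes the saved k instead
(define (cps-call/cc f k)
  (f (lambda (v k-ignored) (k v)) ; the reified continuation
     k))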
And to expand on that, the answer is generally yes. MzScheme is a great version of Scheme for its completeness, not its speed. However, several Schemes are directly competitive with SBCL and CMUCL. Arc could be ported to one of these without too much effort.
As for speed, Stalin Scheme is amazing, for example, though it is neither very complete nor well documented. CL, however, is full of "efficiency hacks", while Scheme is full of purity that is not always easy to implement efficiently. Optimizing CL is easier than optimizing Scheme, IMHO.
Why do you think CL is easier to optimize than Scheme? This isn't a hostile question; I'm just curious. Intuitively, I feel like "purer" languages should be easier to optimize.
First, the CL standard offers many ways to declare optimization intent: for the functions you choose, you can compile code, declare types (e.g. that a variable only holds fixnums), declare that you want speed at the expense of type safety, etc. This way, you end up writing code the way you would write it in C. There is no such thing in the Scheme standard. Individual implementations could offer it, of course, but as far as I know none do.
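For example, in Common Lisp (standard declare/optimize forms; the actual speedup of course depends on the implementation):
(defun sum-fixnums (xs)
  ;; favor speed, drop runtime type checks
  (declare (optimize (speed 3) (safety 0))
           (type list xs))
  (let ((acc 0))
    (declare (type fixnum acc))
    (dolist (x xs acc)
      (declare (type fixnum x))
      (incf acc x))))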
Another example is call/cc. This is a very interesting beast, existing only in Scheme. But it is hard to implement efficiently.
The last example I can think of is 'nil. Using nil as both false and the empty list is very interesting in this regard: you can implement nil as the NULL pointer, which is also 0, the false boolean. Fewer manipulations to do on the bare metal. Distinguishing between #f and '(), on the contrary, implies making more tests at the lower levels.
Re: call/cc - I think a bit of the Lambda the Ultimate series of papers eventually boils down to the realization that a machine-language jump-to-subroutine is equivalent to a call/cc, where the target of the call/cc just has to access the return address on the stack as a function address.
I don't get it. There's the whole stack copying for call/cc, so call/cc is much more expensive.
(I read the "Lambda the ultimate GOTO" paper you referenced earlier; it's about goto vs structured programming, not call/cc. As an aside, it's very interesting to reflect on just how controversial structured programming was.)
Implementing call/cc efficiently has been well-researched in the Scheme community. For a very well-written account of a non-stack-copying implementation, see
Then continue at ReadScheme under "Compiler Technology/Implementation Techniques and Optimization" to see further developments (look especially for Clinger's papers).
In a function call, the state (the current computation being done) is saved anyway, and therefore "copied" if that is your preferred term. So ideally, call/cc should have the same overhead as an ordinary function call; the only difference is that in call/cc the continuation state is the value given to the function, while in a function call it's just one of the values given to the function.
Note, however, that much of the theoretical analysis of call/cc assumes a "spaghetti stack", meaning that partially unwound stacks are kept implicitly as long as any continuation refers to them, with the stacks themselves subject to garbage collection. Most machines don't actually have a spaghetti stack and can't easily be given one ^^. That said, a spaghetti stack could be implemented as a simple list, with push == cons and pop == cdr.
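As a toy illustration of that last point (frames in the heap, the "stack" as a plain list; illustrative only):
(define current-frames '())
(define (push-frame! f) (set! current-frames (cons f current-frames)))
(define (pop-frame!)
  (let ((f (car current-frames)))
    (set! current-frames (cdr current-frames))
    f))
; capturing a continuation is O(1): just keep a reference; the
; GC keeps any shared tail alive as long as a capture exists
(define (capture-stack) current-frames)
(define (reinstate-stack! k) (set! current-frames k))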
Alternatively store the local variables on a garbage-collected heap, and include a reference to the local variables with the continuation/return address (you'll probably need to save the pointer-to-local-variables anyway, since the target function is likely to use that pointer for its own locals). Again, no additional overhead over plain function calls, except perhaps to restructure the return address and the pointer-to-local-variables.
Don't know about MIT Scheme, but if I were to implement call/cc on stock hardware and compiling down to machine language that's what I'd do ^^
I'm still totally not understanding your claim that call/cc should have the same overhead as an ordinary function call.
I read the Clinger "Implementation Strategies for Continuations" paper and they found call/cc about 10 times slower than function calls on the tak/ctak tests. I tried those tests on PLT Scheme and the overhead I saw is even worse: .7 seconds with function calls vs 51.8 seconds with continuations on (tak 24 16 8).
Clinger says about the stack strategy: "When a continuation is captured, however, a copy of the entire stack is made and stored in the heap. ... Variations on the stack strategy are used by most implementations of Scheme and Smalltalk-80."
Components of function call: (1) put arguments somewhere (2) put return address somewhere (3) jump to function.
Components of call/cc: (1) put return address as argument somewhere (2) put return address somewhere (3) jump to function.
That said, continuations generally assume that "put X somewhere" means somewhere != stack. Yes, even return addresses are not intended to be on the stack when using continuations. The main problem is when compiling down to C, which always assumes that somewhere == stack. If you're compiling down to machine language where you have access to everything, then you can just ignore the stack, but not in C, which inherently expects the stack.
Ah, I get it. I suppose typing is one of the big issues affecting speed. If the language standard insists on dynamic typing, there might be no way to get certain kinds of optimizations.
And yeah, call/cc is probably always going to be a bear. But man is it cool. :)
I suppose this goes against what a few of us (including me) were saying above -- that the language and the speed are really separate issues. Or maybe it's more coherent to say that language standards (as opposed to "languages" generally understood) can have a profound effect on speed. If they don't give implementors a lot of choice, they can box people into certain corners.
It's interesting that Scheme actually mandates optimization in at least one case (tail recursion). I don't know how many language standards make those kinds of demands, but I suspect there aren't many.
Tail recursion is interesting because it is not primarily a speed optimization but a way to make programmers rely on functional style: without it, functional programming quickly hits a dead end, since you can blow the stack really fast. As a bonus, it is faster :)
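The standard illustration: because the recursive call below is in tail position, any conforming Scheme runs it in constant stack space, no matter how large n is:
(define (sum-to n acc)
  (if (zero? n)
      acc
      (sum-to (- n 1) (+ n acc)))) ; tail call: no frame piles up
(sum-to 1000000 0) ; => 500000500000, with no stack overflow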