Hi, finally replying to this thread. So, I, a long time ago, found that the + function was allocating memory every time it was called; this a) made things slower than they had to be (according to my old thread, incrementing a variable was half as fast as it should have been), b) triggered garbage collections, which were annoying pauses at the REPL, and c) triggered garbage collections when I was doing, like, (time:repeat 100000 <test procedure>) (note that "repeat" is implemented with "for", which uses +).
The problem with (c) is that it made the time to completion rather variable (depending on whether a GC happened, or sometimes on how many GCs happened), so the results were difficult to interpret. It interfered with my measuring and comparing the performance of my procedures.
So it annoyed me. And I found that just using the Scheme + fixed it, stopped the malloc'ing. Knowing that this crap was caused by a feature I didn't use, a feature Paul Graham seemed to have declared a bad idea, a feature that seemingly shouldn't even have been there anymore... I took it upon myself to eradicate it from the source code. And I felt better afterwards.
But now this case-lambda version of + (which is polymorphic) seems to not malloc at all, and a bit of testing by me indicated it performed approximately as well as the raw Racket +; I couldn't see any difference. (Btw, I think I perceived that Racket doesn't malloc for argument lists when you call "apply" or something, but it does malloc--probably copies the arglist--if you access it with car or cdr.)
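(For reference, the case-lambda shape I'm describing is roughly the following--a sketch, not the exact code from the Optimizations page. It dispatches on arity, so the common small calls never cons a rest list, and it folds with a two-argument ar-+2 like the one defined below.)

(define ar-+
  (case-lambda
    (()         0)
    ((x)        x)
    ((x y)      (ar-+2 x y))
    ((x y . zs) (apply ar-+ (ar-+2 x y) zs))))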
So polymorphic + is no longer so obnoxious to me, and I no longer feel the need to eradicate it. I'd probably still prefer to use "string" to join/coerce to strings and "join" to join lists, myself, and I'd probably prefer "join" as a generic join operator. But I won't kill concatenate-+ on sight. Since overloading itself seems not to be the cause of the performance problems, I might warm to the idea... maybe overload + for my polynomials--or for matrices, I've been playing with them recently--though I've actually only multiplied them, never added them--perhaps I should overload *...
You may find the following definitions extremely useful. (I do.) Evolved from my previous "mem-use" macro.
;note that "memory" = $.current-memory-use
; and "current-gc-milliseconds", "current-process-milliseconds"
; were supplied in ac.scm
(= gc-msec      current-gc-milliseconds
   process-msec current-process-milliseconds)

(mac utime body
  (w/uniq (gtime ggc gmem)
    `(with (,gtime (msec) ,ggc (gc-msec) ,gmem (memory))
       (do1 (do ,@body)
            (prn "time: " (- (msec) ,gtime)
                 " gc: "  (- (gc-msec) ,ggc)
                 " mem: " (- (memory) ,gmem))))))

(= time utime)
You might expect some memory or time overhead just from using this, but in fact there doesn't seem to be any:
arc> (time:- 1 2)
time: 0 gc: 0 mem: 0 ;zero overhead
-1
;now I copy the old definition of + from ac.scm to clipboard
arc> (= + (eval:list '$ (read:pbpaste)))
#<procedure>
arc> (time:+ 1 2)
time: 0 gc: 0 mem: 352 ;probably from JIT-compiling +
3
arc> (time:+ 1 2)
time: 0 gc: 0 mem: 64
3
arc> (time:+ 1 2)
time: 0 gc: 0 mem: 64 ;by now this is clearly the usual mem-use
3
arc> (time:+ 1 2)
time: 0 gc: 0 mem: 64
3
arc> (= + $.+) ;Racket +
#<procedure:+>
arc> (time:+ 1 2)
time: 0 gc: 0 mem: 0
3
arc> (time:+ 1 2)
time: 0 gc: 0 mem: 0
3
;now I copy this definition to clipboard:
(define (ar-+2 x y)
  (cond ((char-or-string? x)
         (string-append (ar-coerce x 'string) (ar-coerce y 'string)))
        ((and (arc-list? x) (arc-list? y))
         (ac-niltree (append (ar-nil-terminate x) (ar-nil-terminate y))))
        (#t (+ x y))))
arc> (eval:list '$ (read:pbpaste))
#<void>
arc> (time ($.ar-+2 1 2))
time: 0 gc: 0 mem: 416 ;JIT
3
arc> (time ($.ar-+2 1 2))
time: 0 gc: 0 mem: 0 ;so it allocates zero memory
3
arc> (time ($.ar-+2 1 2))
time: 0 gc: 0 mem: 0
3
Now for more playing around. A Racket loop that counts to 1 million.
arc> (time:$:let loop ((n 0)) (if (> n 1000000) 't (loop (+ n 1))))
time: 3 gc: 0 mem: 760
t
arc> (time:$:let loop ((n 0)) (if (> n 1000000) 't (loop (+ n 1))))
time: 5 gc: 0 mem: 920 ;kinda weird, it alternates 760 and 920
t
;now up to 10 million
arc> (time:$:let loop ((n 0)) (if (> n 10000000) 't (loop (+ n 1))))
time: 36 gc: 0 mem: 760
t
;now we test ar-+2
arc> (time:$:let loop ((n 0)) (if (> n 10000000) 't (loop (ar-+2 n 1))))
time: 1020 gc: 0 mem: 1096
t
arc> (time:$:let loop ((n 0)) (if (> n 10000000) 't (loop (ar-+2 n 1))))
time: 1019 gc: 0 mem: 776
t
Apparently it makes a significant difference (30x) inside a tight Racket loop. For fun, let's try the unsafe ops too.
arc> ($:require racket/unsafe/ops)
#<void>
arc> (time:$:let loop ((n 0)) (if (unsafe-fx> n 10000000) 't (loop (unsafe-fx+ n 1))))
time: 28 gc: 0 mem: 760
t
Hmm, I had an idea. Put the number check first. New definition:
(define (ar-+2 x y)
  (cond ((number? x) (+ x y))
        ((char-or-string? x)
         (string-append (ar-coerce x 'string) (ar-coerce y 'string)))
        ((and (arc-list? x) (arc-list? y))
         (ac-niltree (append (ar-nil-terminate x) (ar-nil-terminate y))))
        (#t (+ x y))))
arc> (eval:list '$ (read:pbpaste))
#<void>
arc> (= + $.ar-+2)
#<procedure:ar-+2>
arc> (time:repeat 1000000 nil)
time: 122 gc: 0 mem: 3256
nil
arc> (time:repeat 1000000 nil)
time: 121 gc: 0 mem: 1880
nil
arc> (time:$:let loop ((n 0)) (if (> n 10000000) 't (loop (ar-+2 n 1))))
time: 310 gc: 0 mem: 776
t
arc> (time:$:let loop ((n 0)) (if (> n 10000000) 't (loop (ar-+2 n 1))))
time: 323 gc: 0 mem: 936
t
What, now the Arc loop goes faster than it did with the Racket +. Probably because this function assumes exactly two arguments, while Racket's + accepts any number of arguments. And now the Racket loop is only 10x slower than with Racket +. (I expect Racket knows how to optimize its own 2-argument +.) By the way, in case you think the Arc loop goes faster than Racket, note that the Arc loop goes to 1 million and the Racket loop to 10 million; Racket is here going 3x as fast as Arc (and if Racket used Racket +, it'd be going 30x as fast).
Oh, um, I should test the case-lambda thing. I copy to clipboard my definition of + from the Optimizations page:
Virtually no difference from the raw ar-+2. Just for reference, so I can be sure ar-+2 only goes faster than Racket + in the Arc loop because ar-+2 effectively declares that it takes only two arguments:
Good, I'm not going crazy. Anyway, that final version of ar-+2 is my official recommendation.
Oh, incidentally, this "time" thing (though I think it was just "mem-use" back then) also helped me figure out that "<", ">", and "is" were allocating memory (provoking my work behind the Optimizations page). For example, I found that (no x) malloc'd; I eventually realized that this is because "is" mallocs, and (no x) is defined as (is x nil).
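(The fix there has the same shape as the fix for +: give "is" an arity-2 fast path so the common case doesn't cons a rest-argument list. A sketch, assuming ac.scm's existing ar-is2 and pairwise helpers:)

(xdef is (case-lambda
           ((a b) (ar-is2 a b))           ; fast path: no rest list
           (args  (pairwise ar-is2 args))))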
Also, here's some further relevant usage of "time":
arc> (time:no:= xs (n-of 1000000 0))
time: 2351 gc: 331 mem: 32513488
nil
arc> (time:no:zap nrev xs)
time: 152 gc: 0 mem: 632 ;hell yeah, nrev works well
nil
arc> (time:no:zap rev xs)
time: 254 gc: 0 mem: 32000584
nil
;now here's when it GC's
arc> (time:no:zap rev xs)
time: 371 gc: 117 mem: 31824248
nil
arc> time:len.xs ;this is my fixed "len"
time: 8 gc: 0 mem: 0
1000000
;now, for comparison, let's see the old definition of "len" from ac.scm
arc> (= len (eval:list '$ (read:pbpaste)))
#<procedure>
arc> time:len.xs
time: 461 gc: 261 mem: 68936656 ;OMG this is ridiculous
1000000
aw's thread (bugs and gotchas) has nudged me into putting this up. :-)
As a general note, I think I'll want to put up a bunch of stuff on the wiki (eventually). Seems like a good place to post code and explanation, and to allow for future people (who might or might not be me) to improve upon it. Sort of like my old thread[1], except this allows for editing.
Where should discussion take place, though? I think the answer is "on this forum". The arclanguagewiki site has comments, but I dislike that option, because (if I perceive things correctly) the comments will be there forever, and if someone posts a comment to say "X should be this way", and people discuss it, agree, and incorporate the change into the wiki, then those comments will be obsolete but they'll still be there. Come to think of it, the Arc Forum is a forum and it's optimized for discussion (and sharing of links) in the first place, so this seems a good state of affairs.
All Lisp readers I know of treat '(a . (b c)) the same as '(a b c).
arc> (read "(a . (b c))")
(a b c)
The reason is, '(a . something) = (cons 'a 'something), and if "something" is a list like '(b c d), then '(a . (b c d)) = (cons 'a '(b c d)) = '(a b c d). You can even write code like this.
* (+ 1 . (2 3)) ;SBCL
6
This has the advantage of making it really easy to write a basic print function:
(def print (x)
  (if (acons x)
      (do (disp "(")
          (print (car x))
          (disp " . ")
          (print (cdr x))
          (disp ")"))
      <handle remaining cases: x is a symbol, number, string...>))
I'm not too strongly opposed to this change, but it would break compatibility with probably all existing Lisp readers, including Racket's.
This is a bit off-topic, but technically, Racket's syntax reader treats (a b c) and (a . (b c)) differently. It wraps just about every node of the parse tree in a syntax object, but the exceptions are the nodes which, from a certain point of view, aren't even part of the parse tree: The list tails which aren't written using their own parentheses. In particular, the (b c) part of (a b c) isn't wrapped, but the (b c) part of (a . (b c)) is.
Welcome to Racket v5.1.
> (define abc (read-syntax #f (open-input-string "(a b c)")))
> (define a.bc (read-syntax #f (open-input-string "(a . (b c))")))
> (define abc.null (read-syntax #f (open-input-string
                                    "(a b c . ())")))
> abc
#<syntax::1 (a b c)>
> a.bc
#<syntax::1 (a b c)>
> abc.null
#<syntax::1 (a b c)>
> (syntax-e abc)
'(#<syntax::2 a> #<syntax::4 b> #<syntax::6 c>)
> (syntax-e a.bc)
'(#<syntax::2 a> . #<syntax::6 (b c)>)
> (syntax-e abc.null)
'(#<syntax::2 a> #<syntax::4 b> #<syntax::6 c> . #<syntax::10 ()>)
True, the first version is more explicit, and the parens make it clearly distinct from the required arguments, but then again, it does add an extra level of (arguably useless) parens. I still think it looks nicer than Arc's current version, though:
If you're going to create a special syntax for optional arguments, then I agree with you. But why uphold the required arg / optional arg distinction at all? If you make function arguments always optional (as they are in JavaScript) then you need no special syntax, and it is more concise.
As somebody who has programmed for years in JavaScript, I am certainly used to that style, but I actually like required arguments, even if simply for error checking. If I call a function, and it spits back an error, "function expected 3 arguments (2 given)" then I have a pretty good idea of what to do. When the function call fails silently, however, it can lead to some icky debugging situations.
On the other hand, even if you made all arguments default to nil, you could still have some error checking:
(def foo (a b c))
(foo) ; (a b c) are (nil nil nil)
(foo 1 2) ; (a b c) are (1 2 nil)
(foo 1 2 3) ; (a b c) are (1 2 3)
(foo 1 2 3 4) ; error: expected 0 to 3 arguments (4 given)
In other words, it would still be possible to throw an error when calling with too many arguments, but not when calling with too few.
I'm pretty okay with that tradeoff, actually. It gives up a bit of safety and control, but in exchange it avoids the whole icky mess with "what syntax do we use for optional arguments?" I might try that approach with my interpreter.
In fact, we could reverse the question. Rather than asking, "how do we define optional arguments?" we can instead ask "how do we define required arguments?"
In other words, all arguments default to nil, but it's possible to specify certain arguments as required, or to change what the default is. That may be a better approach than specifying what arguments are optional.
There is one problem with that approach, though. You'll likely want a way to specify what the default for an argument is, which would require some sort of construct. But then if you want to specify a required parameter, you'll need a second (different) construct.
If you have required arguments by default, then you can use a single construct to specify both optional arguments and what their defaults are. But if you have optional arguments by default, you need two constructs.
Unless you're suggesting to not have required arguments at all, and require the programmer to manually write an (if (no a)) check and manually throw an error? That could work, assuming most people don't need required arguments most of the time.
You could even wrap it up in a macro or something, like this:
(def foo (a b c)
  (require a))
Which could be extended for type information as well:
(def foo (a b c)
  (require a 'sym))
Which would specify that the first argument is required, and must be a symbol. I kinda like that approach, though it does mean running checks at runtime. On the other hand, how frequently do you actually need those checks? Not very often, right? So that could work, I think.
Here's a more functional version of require that works in the current version of arc:
(mac require (n (o typ))
  `(if (and ,typ ,n (no:isa ,n ,typ))
         (err:string "parameter " ',n
                     " must be of type " ,typ)
       (no ,n)
         (err:string "parameter " ',n " is required")))
That's because ((fn (nil) nil) 10) is supposed to be the same as ((fn (()) nil) 10), which destructures the first 0 elements from 10, treating it as a degenerate cons list.
I actually use this behavior in practice, 'cause it's useful to have at least some parameter syntax that doesn't bind any variables, rather than just using a conventional name like "ignored" or "_". Most of the time I use it in a much-too-big 'withs block, where I abuse one of the bindings for sanity-checking or other side effects. ^^ I wouldn't worry about it too much, though.
On the other hand, nil's behavior makes it more verbose to determine whether something's usable as a local variable; you have to check (and x (isa x 'sym) (~ssyntax x)). In Lathe, I define 'anormalsym to do just that. (It might be more accurate to change it to (and (isa x 'sym) (~in x nil t 'o) (~ssyntax x)).)
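For reference, a sketch of that helper, built from the check just described (the actual Lathe definition may differ):

(def anormalsym (x)
  (and x (isa x 'sym) (~ssyntax x)))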
Speaking of (~ssyntax x), ssyntax is usable for local variable names, but naturally, you can only get the values from the Racket side:
Personally, I'd like for ssyntax-burdened names in parameter lists to use user-defined parameter syntaxes, like custom forms of destructuring and whatnot. And then, just for axiomatic simplicity's sake, I'd like things like 'o to be implemented using the same system, using ssyntax-burdened names like ".o".
"That's because ((fn (nil) nil) 10) is supposed to be the same as ((fn (()) nil) 10)"
nil is such a weird value... it's false, a symbol, and an empty list all at once. I actually had to do some hacky stuff to get () and '() and nil to behave properly.
"Speaking of (~ssyntax x), ssyntax is usable for local variable names, but naturally, you can only get the values from the Racket side:"
I plan to implement ssyntax at the reader level in my interpreter, so that shouldn't be an issue.
"nil is such a weird value... it's false, a symbol, and an empty list all at once."
Perhaps nil should serve as the base case for every type, rather than just some of them. In Arc, it already does this for booleans, lists and strings (""), but it could be extended to support tables (#hash()), numbers (0), symbols (||) etc. Then it would have more consistency as a concept.
I've run into issues with nil the symbol vs nil the value. It's caused some problems when I wanted to use Arc's sym type to represent variable names in a compiler project I'm working on for a class. If a variable's name is nil, all kinds of things stop working, and I've had to resort to calling some Scheme functions directly to get proper handling of symbols.
I wish that 'nil and nil could be kept somewhat separate, with the first being just a symbol, and the second being equivalent to the base value for all data structures. Otherwise the symbol type is broken and inconsistent, since you cannot accurately (or at least completely) represent code as data.
However, I can easily define nil so that its type is 'cons, but it would throw an error if you tried to assign to the car or cdr. That may break code that assumes that 'cons != nil, though.
Actually, I could represent nil as an actual 'cons cell, so that assigning to the car or cdr would work. Crazy? Probably. Especially since nil is a singleton and you can't create more nil's, so it would be a global change.
Right now, nil does have a type of 'sym, but it seems weird to treat it mostly like an empty cons cell, but not have its type be 'cons. So I figured I could play around with it and try giving it a type of 'cons and see how badly it breaks stuff.
"Actually, I could represent nil as an actual 'cons cell, so that assigning to the car or cdr would work. Crazy?"
That's a bit crazy. :)
PicoLisp does something reminiscent. Every one of its data structures (numbers, symbols, nil and conses) is implemented using the low-level cons cell structure (i.e. a pair of machine words). [1] They talk about nil's representation fulfilling its dual nature as both a symbol whose value is nil and a list whose car and cdr are nil; both the symbol predicate and the list predicate return true when applied to nil:
: (sym? NIL)
-> T
: (lst? NIL)
-> T
I'm not sure that they let nil's car and cdr be assignable though, because "NIL is a special symbol which exists exactly once in the whole system." [2]
My code has a lot of [string:or _ "nil"] to handle that kind of stuff. ^_^ I've just kinda accepted it ever since http://arclanguage.org/item?id=10793.
I do 'string lists into strings a lot, but that would continue to work if 'nil and '() were separate values.
If the variable's name is nil, all kinds of things stop working
Can you give an example? Are you trying to use nil as a variable name in Arc, or simply to have a data structure representing variables where some of those variables are named nil?
I'm creating a compiler for a class, and I've been representing variable names in the ast with symbols. I could use strings instead, and may end up doing so, but symbols seemed a more elegant solution.
So nil would be an ubertype that basically just means "empty"? One issue with that is (is 0 nil) would be t... I think it's useful to not always treat 0 as the same as nil.
Not sure about empty tables, though... maybe it would be fine to treat those as nil. We already treat an empty list as nil, after all.
Actually, what you're suggesting is very similar to how languages like JavaScript and Python do it, with 0 null undefined NaN "" etc. being falsy. So in JS, you can do this:
var foo = 0;
if (foo) {} // will not be evaluated
foo = "";
if (foo) {} // will not be evaluated
foo = NaN;
if (foo) {} // will not be evaluated
On the other hand, both those languages support actual Booleans, and they treat an empty array as true.
As far as prior work goes, I think this topic is the last time "can't rebind t" came up: http://arclanguage.org/item?id=13080 It's mainly just me linking back to those other two pages, but it's also a pretty good summary of my own opinion of the issues involved.
In my runtime project, nil and t are ordinary global variables... if someone turns out to want the rebind protection feature, I'll add it as a compiler extension (which other people could then choose to apply, or not, as they wished).
What I would do is make it print a warning, but still allow it. Something like, "hey, you! it's a bad idea to rebind nil; you'll probably break everything! use (= nil ()) to fix the mess you probably made"
Printing a warning is a good idea. Whether rebinding nil could be fixed with "(= nil ())" is an interesting question, you might (or might not, I haven't tried it) find that rebinding nil breaks Arc so badly that = no longer works... :-)
I see a potential for a contest here: how to permanently break Arc (with "permanently" meaning you can't recover by typing something at the REPL), in the most interesting way, using the fewest characters. (Non-interesting being, for example, going into an infinite loop so that you don't return to the REPL).
Quite possibly. It should work in my interpreter, though. Actually, () and nil are two different values, but they're special-cased to be eq to each other. It was the only way I found to make one print as "nil" and the other print as "()". From an external point of view, though, they should seem the same.
Hmm... the issue with that is that you might start having to quote nil or t whenever you want to actually mean nil or t, instead of just typing them in normally.
I do wish there was an easier way to tell whether or not a value was provided as nil, or was left empty and defaults to nil. Maybe doing destructuring on rest args would help solve that problem in most cases?
"I do wish there was an easier way to tell whether or not a value was provided as nil, or was left empty and defaults to nil."
I too have sometimes wished for that in JavaScript, but let me tell you a little story. I was writing a syntax highlighter, and got it working fine in Chrome and Firefox 3.5, but there was a bug in Firefox 3.0.
You see, I was using this bit of code here:
output.push(text.slice(curr.index[1], next && next.index[0]));
If `next` doesn't exist, it will pass the value `undefined` to the `slice` method. In JS, if you don't pass an argument, it defaults to `undefined`, so this is supposed to behave as if I hadn't passed in the argument at all.
But in Firefox 3.0, the slice method behaves differently depending on whether you pass it `undefined`, or don't pass it any arguments. So, I had to use this instead:
if (!next) {
  output.push(text.slice(curr.index[1]));
} else {
  output.push(text.slice(curr.index[1], next.index[0]));
}
This was (thankfully) fixed in 3.5. The moral of the story: most of the time it doesn't matter whether the caller passed nil, or didn't pass anything. You can treat the two situations as the same.
Consider this hypothetical example in Arc:
(your-func 5 (and x y z))
If x, y, and z are all non-nil, the result will be passed in as usual. On the other hand, if any of them is nil, it will be as if you had used (your-func 5 nil).
By behaving differently when nil is passed in vs. not passing in an argument, you might cause the above example to break. Or perhaps it would work, but the behavior would be subtly different... introducing bugs.
By having different behavior depending on whether an argument is passed or not, you force callers to do this, instead:
(iflet val (and x y z)
  (your-func 5 val)
  (your-func 5))
Note the redundancy. In fact, this is even more important in Arc (compared to JavaScript) because you can use any expression, such as (if), a macro call, etc.
So... let me ask: what situations do you really need to know whether the caller actually passed in nil, or didn't pass anything at all?
Great point. In fact, I don't check whether an optional argument was passed very often, and the times I do, I usually expect to regret it at some point, for exactly that reason. ^_^
"I do wish there was an easier way to tell whether or not a value was provided as nil"
I share this sentiment. One thing we could do is have a space of hidden-from-the-programmer variables which tell you whether other variables have been bound. They can be accessed using a macro:
(= given-prefix* (string (uniq) "-given-"))

(mac given (var)
  ; NOTE: I don't think this will work properly for nil, but nil is
  ; never a local variable name anyway.
  (sym:+ given-prefix* var))
The implementation of argument lists would need to be aware of 'given-prefix* and bind the prefixed variables at the same time as the regular ones.
---
"Maybe doing destructuring on rest args would help solve that problem in most cases?"
Well, if you use a rest arg for all optional values, and then use some form of destructuring bind on that list to extract your optional arguments, then you can tell whether or not they were passed in or merely defaulted to nil by just searching the arg list.
(def test args
  (if (assoc 'c args)
      (pr "c was passed")
      (pr "c was not passed")))
I still don't follow. We can already manage the argument list manually, but in most of the suggestions here, we can only do it if we don't destructure it in the signature (unless we use more complicated kinds of destructuring).
; Current options:

(def test args
  (let ((o c)) args
    (pr:if (len> args 0)
      "c was passed"
      "c was not passed")))

(let missing list.nil ; This is just something unique.
  (def test ((o c missing))
    (pr:if (is c missing)
      "c was not passed"
      "c was passed")))

; Some hypothetical options and non-options:

(def test (& (c))
  (pr "no way to tell if c was passed"))

(let missing list.nil
  (def test (& (c))
    (pr "still no way to tell if c was passed")))

(def test (& args)
  (let ((o c)) args
    (pr:if (len> args 0)
      "c was passed"
      "c was not passed")))

(def test (& (&both args (c))) ; Destructure twice.
  (pr:if (len> args 0)
    "c was passed"
    "c was not passed"))

(def test ((o c nil c-passed))
  (pr:if c-passed
    "c was passed"
    "c was not passed"))

(def test ((o c))
  (pr:if given.c
    "c was passed"
    "c was not passed"))

(def test (c) ; Parameter lists are just destructuring.
  (pr:if given.c
    "c was passed"
    "c was not passed"))

(def test (&both args (c))
  (pr:if (len> args 0)
    "c was passed"
    "c was not passed"))
It fails on functions that take required and rest args, though:
(defreq foo (x y . args) (list x y args)) -> error
Err... right, you were talking about detecting if an argument was nil or not given... but I realized that the same technique could be used to write a version of def that implements required arguments even in a language where every argument is optional.
Only if you actually rebind them. It's like using `quote` as a variable name: you can do it, but most people won't because that's silly. I just think it's nice to allow it, on the off chance it's actually useful. It just feels weird to arbitrarily say "you can't rebind nil and t" but allow rebinding of everything else.
The and is required to make sure that a type was provided, otherwise it will always fail the type check. Also, if you leave n out of the and clause, it will still pass if the required type is sym. Maybe that should be fixed.
Hm... yes, you're right. Odd, I remember it working fine when I tested it earlier. This should work correctly:
(mac require (n (o t))
  `(if (no ,n)
         (err:string "parameter " ',n " is required")
       (and ,t (no:isa ,n ,t))
         (err:string "parameter " ',n " must be of type " ,t)))
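Hypothetical usage, assuming the definition above (the names here are made up):

(def greet ((o name))
  (require name 'sym) ; errs if name is nil, or bound to a non-symbol
  (prn "hello, " name))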
Is it possible to store, view, and diff previous versions of each page? When I hear "wiki", I expect that ability (though I might conceivably be persuaded it's not necessary here).
Yes. If you're signed in as a user with write access, you will see buttons "Create Page", "Edit Page", and "More actions", and under more actions is "Revision History".
If you email me at andrew.wilcox@gmail.com, I'll add you as a user with write access :-)
Ah, I found a setting which enables viewing the revision history even for people who aren't logged in. In the page footer there's now a "Revision History" link.
I love defmemo. It is amazingly convenient. I also like my math library. Here's the code I wrote. It is equivalent to your last solution.
(defmemo f (n d)
  (if (is d 1)
      1
      (- (expt d n)
         (sum (fn (i)
                (* (choose d i)
                   (f n i)))
              1 dec.d))))
arc> (time:f 12 5)
time: 1 gc: 0 mem: 27928
165528000
Note that you don't really need the "if d > n, return 0" part. An empty sum--or, in your code, (reduce + (map <something> <an empty range>))--will be 0 anyway.
It is a big fat Arc file on my computer. :-) In fact, the math stuff is just a subset of what's in that file. I suppose "library" probably isn't a good term for it. Should I perhaps put it in a form useful to the general public?
Well, here is a link to my arc3.1 folder (with some irrelevant things removed). The "a" file is the abovementioned big fat Arc file that has everything.
I've done some nonstandard things, like changing the order of expansion of ssyntax so a:b.c = (a:b c), and importing a bunch of Racket functions with $ (which at least is Anarki standard), so you probably can't paste in all my code and have it work, without, so to speak, pasting my version of ac.scm as well. I therefore supply this link, so you can run things directly. The code should probably be fairly understandable (if you know what the function's supposed to do) and it probably won't be very hard to port individual things you want to standard Arc.
By the way, my version of "sum" is simple: (sum f a b) = (reduce + (map f (range a b))), although it's implemented with a loop rather than map/reduce, which is more efficient. arc3.1 has a built-in function called "sum", which is different: (sum f xs) = (reduce + (map f xs)). Somewhere, akkartik had the idea of calling my "sum" function "sumn" (probably similar to "mapn", as in (mapn f a b) = (map f (range a b))), which is probably a reasonable compromise for standardization.
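For concreteness, the loop version is something like this (a sketch using the proposed "sumn" name; my actual code may differ slightly):

(def sumn (f a b)
  (let tot 0
    (for i a b
      (++ tot (f i)))
    tot))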
"changing the order of expansion of ssyntax so a:b.c = (a:b c)"
I've wondered about this myself. When wart started out I did what you do just because that was simplest (no precedence). Now I think I do as arc does. Do people have an opinion on what a:b.c should expand to?
I have a very strong opinion that it should expand to (a:b c), rather than (compose a b.c). I think I've once had a use for the latter, whereas I use the former all the time. Some examples (all of them were run with the former interpretation of a:b.c), found with this shell command:
Have you used ~f.g much? I ended up switching because arc.arc uses ~testify.f in a couple of places. And it seems inconsistent for that to work the other way..
Interesting case. No, I've never used that; actually, I never use the ~ operator, preferring "no:" (as you can see above). (I take it this is a different arc.arc than the one in arc3.1, as I don't find ~testify.f in there.)
Perhaps you could make ~ a function by itself: the "complement" function.[0] Then you could write ~:testify.f.
[0] Hmm, for some reason "complement" is defined as a macro in arc.arc. I don't think it needs to be that way. I think this would work fine:
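(Presumably something like this--a plain function that applies f:)

(def complement (f)
  (fn args (no (apply f args))))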
I think it's that way so that 'complement can be used with macros. I don't think I ever use it that way, though. Either I use (~foo ...), where the metafn meaning takes control, or I use ~foo by itself with foo being a function. I don't give it much thought, 'cause there are good alternatives to ~, like the ones you're talking about.
I don't think this can be used with a macro. It applies f:
(mac complement (f)
  (let g (uniq)
    `(fn ,g
       (no (apply ,f ,g)))))
... Yet it actually does work.
arc> (mac no-mac (x) `(no ,x))
#(tagged mac #<procedure: no-mac>)
arc> ((complement no-mac) 2)
t
It's clearly using some kind of black magic with the Arc compiler, because macex-ing it makes it fail.
arc> (macex1 '(complement no-mac))
(fn gs2704 (no (apply no-mac gs2704)))
arc> ((fn gs2704 (no (apply no-mac gs2704))) 2)
Error: "Function call on inappropriate object #(tagged mac #<procedure: no-mac>) (2)"
...Ohai, special cases, printed right there at the top of ac.scm.
; the next three clauses could be removed without changing semantics
; ... except that they work for macros (so prob should do this for
; every elt of s, not just the car)
((eq? (xcar (xcar s)) 'compose) (ac (decompose (cdar s) (cdr s)) env))
((eq? (xcar (xcar s)) 'complement)
 (ac (list 'no (cons (cadar s) (cdr s))) env))
((eq? (xcar (xcar s)) 'andf) (ac-andf s env))
Well. That explains it, at any rate...
arc> ((complement no-mac 2 3 4) (is 2 3))
nil
Heh.
I do like the side effect that (a:b:c x) is the same as (a (b (c x))) even if a, b, and c are macros. That's something I'd hate to give up. It seems this could be implemented in any of several ways:
1. [current] A call to 'compose in functional position is special-cased by the compiler. This seems like a bit of a hack, although it's served pretty well and isn't likely to fail in the near future.
2. The thing that expands ssyntax will expand (a:b c ...) to (a (b c ...)). This would be hard and would suck, because then ssyntax would no longer be just intra-symbol. (An "ssexpand" function that just takes a symbol argument would be insufficient.)
2a. If ssexpansion is done by the reader (which I do believe it should be), like how [f _] becomes (fn (_) (f _)), then this might be tolerable. Being a reader change, this will and should apply everywhere, even inside quotes; for example, "(a '(b:c d.e))" would get read as (a (quote (b (c (d e))))). I think this might be something to aim for.
3. We make macros first class, so they can be applied, and probably make the compiler capable of optimizing such cases. I think this, combined with (2a) as a cheap way to optimize, would be good.
That's the metafn meaning I was talking about. >.>
---
It seems this could be implemented in any of several ways
My preferred way, which you didn't mention, is to have macros return intermediate things that can either be unwrapped into expressions or applied as syntax operators themselves. That way "compose" can be a syntax operator and "(compose a b c)" can also be a syntax operator. I describe this idea in context with several of my other ideas in http://arclanguage.org/item?id=13071, "A rough description of Penknife's textual syntax."
In the past couple of days, I've also been toying with the idea of having : be a read macro. It can start a list and stop once it reaches an ending bracket, without consuming that bracket. This would be similar in effect to http://arclanguage.org/item?id=13450.
(accum acc: each x args: acc:* 2 x)
One thing it gives up is the ability to say (map a:b c); you have to say (map [a:b _] c) instead. Also, it doesn't give you a:b.c, and it's not clear to me how to indent it. :-p
I don't see a good read macro equivalent for a.b; a read macro has to come before the things it reads, like "!a b", but chaining with "!!!!a b c d e" wouldn't be nearly as nice. Any reader-level solution for a.b ssyntax would probably need to ditch the Racket reader (or at least ditch it within certain delimiters), and it would probably use something like Penknife's or ar's approach, biting off a block of text somehow--using brackets, line breaks, whitespace, or whatnot--then parsing it all at once. ...Technically we might be able to replace Racket's list syntax with a syntax that keeps a public stack of subexpressions, and then have . pop off of that stack, but that strategy doesn't help at the top level.
Yeah, in wart, call started out as just funcall, then turned into a macro so it could handle macros. Hmm, would I need a transform to get that if I somehow make wart a lisp-1?
Also, it always bugged me that call needed to be aware of compose. But now that I have open def it's cleaner.
1) I don't find it a problem to quote symbols and lists that should be quoted.
2) It would be extremely bad to write code that depends on unquoted symbols getting treated as quoted symbols when they happen to be unbound. There's a reason people are generally careful about picking global variable names (to the extent of sometimes putting * at the end of the name), and why "def" and "mac" give warnings if the new definition clobbers an old one. (It'd probably be a good idea to make a global-assignment thing called something like "var" warn when assigning to an already-bound variable; currently this is called "safeset", and "def" and "mac" are implemented with it.)
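(A sketch of what such a "var" might look like, modeled on safeset; hypothetical, not existing code:)

(mac var (name val)
  `(do (when (bound ',name)
         (ero "*** warning: " ',name " is already bound"))
       (assign ,name ,val)))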
If, say, you use (a b c) to mean the list '(a b c) in your code, and then you test things out at the REPL, you'd better be careful not to give "a" a value, or it'll break your code. I think the cognitive overhead of worrying about that far outweighs the cost of using the ' operator.
3) I have similar objections to the other ideas. I think my comment here[1] directly addresses #6, and #8 is extremely similar to #6; the pedagogical digression addresses #4, and the whole comment applies in general. Also, get off my lawn.
I'm really glad you linked to that thread. I did remember it, and I meant to link there myself.
> 2) It would be extremely bad to write code that depends on unquoted symbols getting treated as quoted symbols when they happen to be unbound.
Yes. It would be a sort of inadvertent variable capture and could be the cause of some insidious bugs. Not any more insidious than the bugs you can get from an abuse of unhygienic macros or mutation though.
> (It'd probably be a good idea to make a global-assignment thing called something like "var" warn when assigning to an already-bound variable; currently this is called "safeset", and "def" and "mac" are implemented with it.)
That is a good idea! How about warning you when you assign to a variable that existing code depends on? For example:
> (def foo () (a b c))
#<procedure: foo>
> (= a 5)
Warning: Oh no, what are you doing? You're either
being really clever or you forgot that your previous
code depends on this variable being unbound. Why
didn't you just use the ' operator? It's only one
character, for god's sake.
Of course, at this point you're not really screwed. You can save yourself by either resetting the variable with (= a 'a) or redefining foo to use explicit quotation.
> If, say, you use (a b c) to mean the list '(a b c) in your code, and then you test things out at the REPL, you'd better be careful not to give "a" a value, or it'll break your code. I think the cognitive overhead of worrying about that far outweighs the cost of using the ' operator.
I think the warning we just talked about could go a long way toward eliminating this cognitive overhead. Don't worry about it, just deal with it if you get the warning. Or don't use it in the first place because...
The proposed changes are backwards-compatible with Arc 3.1, since all they attempt to do is provide sensible defaults for things that presently raise errors. I want to emphasize that the goal here is not to eliminate or replace quote and quasiquote. Rather it's to enhance these operators by giving you the ability to use them implicitly sometimes, instead of requiring that you be explicit in cases where it's the only useful meaning possible.
My new way of thinking about this is quote inference, a la type inference.
> How about warning when you assign to a variable that existing code depends on?
Sometimes you do that on purpose, though. E.g. you define a function that relies on a second function, and define the second function later. (rocketnia gives an example, but I'll proceed with this one anyway.)
> (def mod-expt (a n m)
    (fast-expt-* a n 1 (fn (x y) (mod (* x y) m))))
#<procedure: mod-expt>
> (def fast-expt-* (a n one *)
    (xloop (a a n n tt one)
      (if (is n 0)
            tt
          even.n
            (next (* a a) (/ n 2) tt)
          (next a dec.n (* a tt)))))
Warning: ...
Perhaps you could make "def" smart so it wouldn't set off the warning. What if you happened to give a function the same name as an unquoted symbol, though? Maybe you'd be careful not to do that. And what if you used a global variable that you planned to define later? The warning would be inappropriate. Perhaps you'd learn to ignore it. Or perhaps you could name your global variables in a certain way and have the warning thing recognize it. And tell everyone who uses Arc to name global variables the same way, or to at least come up with a naming scheme that can be mechanically understood by the warning thing--and to stick to it. Bah humbug.
> The proposed changes are backwards-compatible with Arc 3.1, since all they attempt to do is provide sensible defaults for things that presently raise errors.
It is nice when you can introduce something without breaking old things. However, I think this thing is bad: it's fragile and shallow, and I think most programmers would just not use it and resent the time it took to understand it.
Imagine if, say, whenever mathematical operations (e.g. sqrt) were called with a list argument, then, instead of throwing an error, the function was instead applied to the car of the list. (Mapping it over the list is more likely to be useful; applying it to the average is also possible.) Or if the expression (a < b) evaluated to (< a b) when "a" evaluated to a number or anything else passable to the < function. Or if, whenever you used the variable "it" inside a then-expression in a call to "if", and "it" was otherwise unbound, it bound "it" to the value of the test expression (as in "aif")? Or why not all of these at once, and more?
In principle, you might be able to ignore extra little "features" like this. I think it'd annoy me, though--in the case of "sqrt" et al. being applied to lists, I'd probably think about it every time I dealt with math and lists (which I do a lot). It adds one more case to deal with to mentally evaluate any mathematical function call. The best thing that could happen is that I'd never use it, or encounter it in anyone else's code, and my mind would freed of the impulse to worry. But even if it was never used in correct code, I'd still have to think about it whenever I made a mistake and had to diagnose a problem.
By the way, perhaps you are just looking for something you'd use at the REPL, instead of a new language feature. Maybe a REPL with capabilities like Clisp's:
[1]> (+ 1 achtung)
*** - SYSTEM::READ-EVAL-PRINT: variable ACHTUNG has no value
The following restarts are available:
USE-VALUE :R1 Input a value to be used instead of ACHTUNG.
STORE-VALUE :R2 Input a new value for ACHTUNG.
ABORT :R3 Abort main loop
Break 1 [2]>
It'd make it easy to put in the symbol as the value of the unbound variable. You could tweak it so that would be the default option, and you'd just have to press return again or something; or you could even make it set the unbound variable to the symbol by default. This would be on your customized REPL, of course. :-P
> Not any more insidious than the bugs you can get from an abuse of unhygienic macros or mutation though.
There's an entire style of programming devoted to minimizing and isolating mutation, and languages exist which try to disallow it entirely. There has been a lot of work done about trying to implement hygienic macros. But these things are useful enough that it's difficult to get rid of them entirely (mutation more so; many languages don't have macros at all). This idea seems it would make every variable reference (in foreign code) and every global assignment (in code you write) a potential headache, and the payoff seems to me almost zero.
Hmm... it seems that each of us has certain conveniences we want and certain sacrifices we're willing to make in order to allow for the conveniences. But which things are the conveniences and which are the sacrifices is different depending on our personal preferences. Taking your example:
> Sometimes you do that on purpose, though. E.g. you define a function that relies on a second function, and define the second function later. (rocketnia gives an example, but I'll proceed with this one anyway.)
> (def mod-expt (a n m)
(fast-expt-* a n 1 (fn (x y) (mod (* x y) m))))
#<procedure: mod-expt>
> (def fast-expt-* (a n one *)
I would have always defined fast-expt-* before mod-expt and not the other way around. I think it was aw's essay on linearizing code dependencies [1] that finally persuaded me this is a good guarantee to have. When I'm reading my code, I value the confidence of knowing that everything below line n is unnecessary for getting everything above line n to work [2].
I can't remember the last time I intentionally wrote code that didn't conform to this principle. And if you're willing to write your code this way, then it eliminates the largest problems you all have identified with quote inference. But you and rocketnia seem to place value on being able to reference functions before they've been defined. So while I would be willing to trade that ability for implicit quotation, it seems you would prefer the reverse.
I do usually put definitions of dependencies first (fast-expt-* is in fact before mod-expt in my Arc file), but sometimes I shuffle my code around, and I'd be annoyed if it complained when I did that. I like having the freedom to do it, though I don't use it all that much. But mutually recursive functions are a good example too--it's impossible to have both functions come after their dependency.
> But you and rocketnia seem to place value on being able to reference functions before they've been defined. So while I would be willing to trade that ability for implicit quotation, it seems you would prefer the reverse.
It is true that I'd prefer being able to permute my definitions over having implicit quotation, but that's by far not my only reason for disliking the idea. As I said, it's a fragile, shallow add-on feature that would confuse me (by giving me weird results for erroneous code) and that would make code harder for me to reason about (whenever I see a variable reference, either that refers to something that's been defined, OR it refers to a symbol! and I can establish the latter only by ensuring that it's not defined anywhere).
By the way, why do you think this feature is a good idea? I find two quotes that seem to suggest your reasoning:
1. "My sense is that something like this would rate highly in both complexity of implementation and convenience for programming." How do you get this sense? Do you write a lot of code that uses quoted symbols? Let's take a count in my big fat Arc file:
$ grep -o "'" a | wc -l
107
$ egrep -o "[^' ()]" a | wc -l
31362
$ egrep -o "[a-zA-Z-+$*/]+" a | wc -l
8261
Even discounting whitespace and parentheses, quotes account for about 0.3% of the characters I type. If we count symbols, about 1.25% of the symbols I use are quoted. Are you working on something that uses a bazillion quoted symbols--symbols which would have to be quoted individually (e.g. the list '(a b c d) just requires one quote)?
2. "I just think we're missing out on such valuable real estate here!" See the paragraphs in my grandparent post beginning with "Imagine if" and "In principle". Tacking things on just because you can isn't a good idea.
> But mutually recursive functions are a good example too--it's impossible to have both functions come after their dependency.
You can do it by extending one of them:
(def foo ())
(def bar () (foo))
(extend foo () t (bar))
I'm not saying this is a better way to define mutually recursive functions. Just pointing out that it's possible.
> add-on feature
Not sure what you mean by "add-on". Is xs.0 list access an add-on? This is whatever that is.
> why do you think this feature is a good idea?
It could unify the "." and "!" ssyntaxes to some degree by allowing table access with h.k instead of h!k in most cases. alists that you presently have to express with '((x 1) (y 2)) could be contracted to (x.1 y.2).
I haven't worked out all the useful applications of this yet but I'm finding it interesting and think there's some potential.
> fragile, shallow [...] Tacking things on just because you can isn't a good idea.
I'm generally disappointed by your flamey response to my interest in exploring a core language possibility that could make arc programs shorter. Your ideas and even complete disagreement are very welcome, but your overall tone is insulting. Perhaps I misread you.
> I'm generally disappointed by your flamey response... your overall tone is insulting.
I'm sorry, my intent wasn't to insult you. (I'm glad you explained that, though.) I thought my words were clear. Let me explain:
I called the idea "fragile" because it would be easy to break code that depended on it--just by defining a new variable. You suggested a thing that would warn upon defining a previously-used-as-unquoted-symbol variable, but rocketnia and I brought up cases where the warning would be a false positive. I considered a more sophisticated warning system--one that had some kind of mechanical procedure for guessing whether an unbound variable was supposed to be an unquoted symbol or a function to be defined later--one that required the programmer to follow one naming scheme for unquoted symbols and another for functions to be defined later. My conclusion was that for this system to work, the programmer would basically have to tiptoe around it and be very careful, or else things would break. Hence, I thought the word "fragile" was appropriate, and used it.
I called it "shallow" because the maximum benefit, in the best possible case, is that you don't have to type the ' character most of the time. Contrast this with, say, learning to use lists when you're used to naming every single variable. Not only does it become easier to, say, compute the sum of the cubes of five numbers, but you can write a single function to compute the sum of the cubes of any number of numbers! And then you can make matrices--again, referencing them with a single variable--and writing a single function to deal with n x n matrices for all n! Without lists or other compound data structures, it'd seem really hard and annoying just to deal with 2 x 2 matrices, and 10 x 10 would seem impossible. There are deep and rich benefits to using compound data structures. But the only thing this unquoted symbols idea can possibly be good for is letting you omit the ' character; I therefore thought the word "shallow" was a good descriptor.
Regarding "Tacking things on just because you can isn't a good idea". Your motivation seemed like it might be, at least partially, something like this: "We can get strictly greater functionality out of Arc if we take some things that are errors and assign them a meaning in the language. Therefore, we should do it. Let's start looking for error cases we can replace with functionality!" Your comment "I just think we're missing out on such valuable real estate here!" added weight to this interpretation. And so I attacked it with a reductio ad absurdum, giving several examples of how one might "add functionality" in this way. I hoped to show that the line of reasoning "You get strictly greater functionality, therefore it can't be a bad idea" was wrong. I summed it up by saying "Tacking things on just because you can isn't a good idea."
> I'm not saying this is a better way to define mutually recursive functions. Just pointing out that it's possible.
And there are other ways to do it[0]. But it would be impossible to do it by just writing (def foo () (bar)) and (def bar () (foo)) and putting them in the right order. Hence, this idea would make such programs more complex/verbose. (Eh, perhaps you could set up a warning system and teach it to recognize mutual recursion. I think learning about this would distract the programmer somewhat, which isn't by itself a deal-breaker but is an undesirable aspect. I suspect there are more cases yet to be covered; and you'd still have to order your definitions properly--I don't think a compiler without psychic abilities could always tell what you were going to define later; and even if you were warned when you made a function with a conflicting name, it'd be annoying to have to give your function a new name, or to change the code that used that name.)
> alists that you presently have to express with '((x 1) (y 2)) could be contracted to (x.1 y.2).
Incidentally, I do find it annoying to type out a lot of such things, and I have a routine for dealing with that. Perhaps you'd find it sufficient? So whenever I want to make a big alist, I do something like this:
(tuples 2 '(Jan 1 Feb 2 Mar 3 Apr 4 May 5 Jun 6
            Jul 7 Aug 8 Sep 9 Oct 10 Nov 11 Dec 12))
;instead of '((Jan 1) (Feb 2) ...)
Note that I've redefined "(tuples xs n)" as "(tuples n xs)". It is much better this way. :-} I could also use "pair", I suppose.
Oh, and, by the way, if ssyntax were implemented at the reader level--which I think it should be; I think the current situation is just a hack that will be changed eventually--you could write '((x 1) (y 2)) as '(x.1 y.2). [I see you address this in a sister post.]
And this example suggests that your intent goes beyond merely having unbound symbols evaluate to themselves. In my post I cite at the top of this thread, I addressed problems with trying to have ((x 1) (y 2)) evaluate as '((x 1) (y 2)).
> make arc programs shorter.
That is a worthy goal, one I'd forgotten about. It's good that you brought it up. I suppose the shortness of a program is kind of a good static measure, whereas objections like "It'd confuse me" are usually only temporary, and the programmer gets used to it.
But I do believe that a) using it would create either horrible risks of breaking things or annoying false-positive compiler warnings, b) therefore I'd never use it, so it wouldn't actually make my programs any shorter, and c) it would, inevitably, make debugging harder--instead of UNBOUND-VARIABLE errors I'd get diverse results, depending on precisely what happens when a symbol is put in the place of the unbound variable.
Now, (c) also applies to having xs.0 list/table/string reference work. But (a) and (b) don't. I do use it[1], and relying on it doesn't cause any problems like the fragility I've described. And the payoffs are pretty good. Many things are significantly shorter--e.g. m.i.j for reaching into a matrix, instead of (aref m i j) or, worse, (vector-ref (vector-ref m i) j).[2] The error-obfuscation objection still applies, but I think the benefits override the objection.
[0] I was thinking you could define them in the same lexical context:
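(Presumably something like this: create both bindings first, then fill them in, so each closure sees the other.)

(with (foo nil bar nil)
  (= foo (fn () (bar))
     bar (fn () (foo))))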
[1] In fact, arc3.1 doesn't even provide "hash-ref" or "string-ref" functions, so you kinda have to use (x n). "list-ref" at least could be implemented by the user in terms of car and cdr.
[2] I was going to add: "And since I don't need to specify the type of the data structure, sometimes I or my code can forget that detail. I could change matrices to be implemented as nested hash tables or vectors, and m.i.j would still be correct." However, this part could be done with a unified "ref" function that reached into all data structures.
My reply has been delayed because I wanted to respond to a lot of your points in detail. I haven't found time to do that yet, but I thought I should at least say this before waiting any longer.
> alists that you presently have to express with '((x 1) (y 2)) could be contracted to (x.1 y.2).
Is there a general interest in moving ssyntax functionality to the reader? I've found some past discussions about it but wanted to know the present sentiment.
If ssyntax was processed at the reader-level, it would help in this particular scenario because then '(x.1 y.2) could evaluate to ((x 1) (y 2)) instead of (x.1 y.2).
Is there a general interest in moving ssyntax functionality to the reader?
In the Arc runtime project, that assumption was behind my choosing to implement the reader in Arc with my matching library. The matching library is way more powerful than what would be needed to simply replace the Racket reader as-is; the goal is that when people want to experiment with different kinds of syntaxes, or with extending ssyntax to work in more cases, it will be easy to do.
Yeah, I like being able to reference a function before it's defined. (Macros annoy me a little for not allowing that.) For me it's a matter of the concept of "definition" being an ambient thing, where something counts as being defined if it's defined anywhere, even later on. It's like how, in a mathematical proof or prose argument, a broad claim may be reduced into a bunch of littler inferences, some handled one-by-one systematically and some left to the reader to fill in. I've read (or tried to read) a bunch of mathematical papers or books that start out building lemma after lemma and climax in a theorem, and those might even be in the majority, but sometimes I have to approach them backwards, and then I have to backtrack to figure out what their terminology means, and it's pretty frustrating.
In education, lots of the time new topics are built upon the foundations the old topics provided, but sometimes they're built upon established motivations and provide all-new foundations, like an analysis course justifying the calculus courses that came before it, or a mechanics course casting Newton's laws in a new light.
For me, the motivation comes pretty early on relative to the implementation. I could decide to put the main control flow algorithm at the top to set the stage for the rest, or I could decide to arrange things according to the order they'll be applied--or in fact I might like having them in the reverse order, the order in which they're needed to get the result from more and more convenient starting positions. That last strategy is probably closest to dependencies-come-first coding, but I don't want to be limited to it, even if I risk choosing a frustratingly haphazard strategy.
One reason to favor the dependencies-first approach in Arc is that we have mutability.
If you're only doing single assignment and side-effect-free programming, then your code doesn't have a significant order [1]. But insofar as your program is imperative and performs mutations, the order is significant.
A consequence of this is that if you want to be able to take advantage of imperative features, you're making it harder by ordering your code any other way. I say this because even if your code is purely functional right now, when you try to insert some imperative code later, the order is going to start mattering more. And it's going to start seeming tangled and confused if it doesn't build up in the order of execution (at least it does for me).
So dependencies-first programming plays especially well with imperative code. I'm also particularly interested in it at this moment because I'm working on a refined auto-quote mechanism that could be hard to take advantage of if you're not programming this way. ;)
Yeah, I agree: I like to see the 'business end' of code up front. aw's article made some good points I'm still mulling over[1], but upgrading things seems like such a rare event compared to the day-to-day use of code. Especially if I manage to keep up my resolution[2] to never rely on any libraries :)
---
[1] http://github.com/awwx/ar now keeps tests in a separate file. Does that weaken the case for defining things before using them? Perhaps you could define your tests bottom-up but write your code top-down, or something.
I still want to try out a test harness that analyzes dependencies and runs tests bottom-up: http://arclanguage.org/item?id=12721. That way you could write your tests in any order and they'd execute in the most convenient order, with test failures at low levels not triggering noisy failures from higher-level code.
Not by design, as it happens. I wrote some new tests for code written in Arc, and stuck them into a separate file because I hadn't gotten around to implementing a mechanism to load Arc code without running the tests.
Though I do view writing dependencies-first as a form of scaffolding. You may need or want scaffolding for safety, or because you're working on a large project, or because you're in the midst of rebuilding.
Does that mean that you always need to use scaffolding when you work on a project? Of course not. If you're getting along fine without scaffolding, then you don't need to worry about it.
Nor does the possibility of needing scaffolding in the future mean that you have to build it right now. For example, if I had some code that I wanted to rebase on top of a different library, and it wasn't in dependency order, and the rebasing looked hard, I'd probably put my code into dependency order first to make the task easier. But if the rebasing looked easy, I might not bother; if I then ran into trouble, perhaps I'd backtrack, build my scaffolding, and try again.
Especially if I manage to keep up my resolution to never rely on any libraries :)
I have effectively the same resolution, but only 'cause of Not Invented Here syndrome. :-p Nah, I use plenty of libraries; they just happen to be the "libraries" that implement Arc. I use all kinds of those. :-p
---
http://github.com/awwx/ar now keeps tests in a separate file. Does that weaken the case for defining things before using them?
That file is loaded after the things it depends on, right?
---
...you could write your tests in any order and they'd execute in the most convenient order, with test failures at low levels not triggering noisy failures from higher-level code.
I'm not sure I understand. Do you mean if I define 'foo and then call 'foo in the process of defining 'bar (perhaps because 'foo is a macro), then the error message I get there will be less comprehensible than if I had run a test on 'foo before trying to define 'bar?
---
In any case, aw's post mostly struck me as a summary of something I'd already figured out but hadn't put into words: If a single program has lots of dependencies to manage, it helps to let the more independent parts of the program bubble together toward the top, and--aw didn't say this--things which bubble to the top are good candidates for skimming off into independent libraries. If you're quick enough to skim them off, the bubbling-to-the-top can happen mentally.
Lathe has been built up this way from the beginning, basically. It's just that the modules are automatically managed, and it acts as a dependency tree with more than one leaf at the "top," rather than something like yrc or Wart with a number on every file.
I'm interested in making a proper unit test system for Lathe, so we may be looking for the same kinds of unit-test dependency management, but I'm not sure yet about many things, like whether I want the tests to be inline or not.
Well, Lathe has an examples/ directory, which I've ended up using for unit tests. It's kind of interesting. Lathe's unit tests have become just like its modules over time, except that they print things to tell you about their status. Being a module, an example automatically loads all its dependencies, and you can load it up and play around with the things defined in it at the REPL, which is occasionally useful for debugging the example itself. But it's pretty ad-hoc right now, and I don't, for instance, write applications so that they load examples as they start up, like you might do.
"Do you mean if I define 'foo and then call 'foo in the process of defining 'bar (perhaps because 'foo is a macro), then the error message I get there will be less comprehensible than if I had run a test on 'foo before trying to define 'bar?"
If bar depends on foo (foo can be a function or a macro), and some tests for foo fail, then it's mostly pointless to run the tests for bar.
---
"That file is loaded _after_ the things it depends on, right?"
Yeah well, you gotta load code before you can run the tests for it :)
My understanding of aw's point was this: if you load your code bottom-up, then you can test things incrementally as you define them, and isolate breakage faster. Defining the tests after their code is irrelevant to the argument because it's hard to imagine an alternative.
If you put your tests in a separate file and run them after all the code has been loaded, you can still order them bottom-up. So to answer my own question, no, keeping the tests in a separate file doesn't weaken aw's argument :)
There is a small difference: if you've loaded only the code up to the point of the definition which is being tested when you run the test (either by writing tests in the same source code file as the definitions, or by using some clever test infrastructure), then you prove that your definitions aren't using anything defined later.
Of course you can probably tell whether code is in prerequisite order just by looking at it, so this may not add much value.
Something I've been thinking about, though I haven't implemented anything yet, is that there's code, and then there's things related to that code such as prerequisites, documents, examples, tests, etc. The usual practice is to stick everything into the source code file: i.e., we start off with some require's or import's to list the prerequisites, doc strings inline with the function definition, and, in my case, tests following the definition because I wanted the tests to run immediately after the definition.
But perhaps it would be better to be able to have things in separate files. I could have a file of tests, and the tests for my definition of "foo" would be marked as tests for "foo".
Then, for example, if I happened to want to run my tests in strict dependency order, I could load my code up to and including my definition of foo, and then run my tests for foo.
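For instance, here's a minimal sketch of that kind of marking--every name in it is invented:

(= tests* (table))

(mac deftest (name . body)
  `(push (fn () ,@body) (tests* ',name)))

(def run-tests-for (name)
  (each test (rev (tests* name))
    (test)))

(deftest foo
  (unless (is (foo 2) 4)
    (err "foo test failed")))

Loading the test file registers everything, and then a loader that knows the dependency order could call run-tests-for right after each definition.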
"the tests for my definition of foo would be marked as tests for foo."
In Java or Rails, each class file gets a corresponding test file in a parallel directory tree. I find it too elaborate, but it does permit this sort of testing of classes in isolation.
> (def maximin (x) (check x number (apply max (map minimax x))))
#<procedure: maximin>
> (def minimax (x) (check x number (apply min (map maximin x))))
Warning: Oh no, what are you doing?
> O_O
The proposed changes are backwards-compatible with Arc 3.1, since all they attempt to do is provide sensible defaults for things that presently raise errors.
They're not compatible with Arc programmers who want to get those errors. Not all errors signify places where extensions can roam free. For instance, extending the 'err function itself would be particularly silly. Where and how clearly to draw the line is a matter of opinion, but I think waterhouse and I are both in the "we want errors" camp here. ^^;
As I said before (http://arclanguage.org/item?id=13830), I'm in the camp of "this is probably a bad idea, but let's try it anyway and see where it leads." It's my new resolution: to try harder to not be conservative in trying out design choices with wart: http://arclanguage.org/item?id=13694
I still plan to do 'production' work with wart, so I'm going to try not to go off the deep end. One experiment at a time, try to falsify new changes as fast as possible, etc.
And I'm making Penknife so I can explore things like this in their own quarantined namespaces where they can coexist with my other experiments in the same program. ^^ I'm not sure if my strategy's any good though; I fear it won't put enough pressure on certain experiments to see what their real strengths are. It kinda banks on the idea that people other than me will use it and apply their own pressure to their own experiments.
A strategy I'm not comfortable with? Is this also hedging? :-p Nah, in this case I'm not also taking any other strategy I'd prefer to succeed. Or is that necessary? Maybe I'm hedging, but not diversified. Probably I'm just in a bad place. :-p
"I fear it won't put enough pressure on certain experiments to see what their real strengths are. It kinda banks on the idea that people other than me will use it and apply their own pressure to their own experiments."
Even better if you could get others to put pressure on your experiments.
"Even better if you could get others to put pressure on your experiments."
Well, that's both sides of the issue I'm talking about. I do want people to put pressure on each other's experiments, but I expect to promote that by reducing the difficulty involved in migrating from one experiment to another. Unfortunately, I expect that'll also make it easier for people not to put any more than a certain amount of pressure on any one experiment.
Or are you talking about my experiments in particular? :)
No, you understood me right. If you make it too easy to fragment the language, the eyeballs on any subsurface get stretched mighty thin.
This might actually be one reason lisp has fragmented: it's too easy to implement and fork, and so it has forked again and again since the day it was implemented. Suddenly ruby's brittle, hacky, error-prone parser becomes an asset rather than a liability. It's too hard to change (all those cases where you can drop parens or curlies at the end of a function call), and it keeps the language from forking.
Upvoted for the new quoting style. I think double quotes + italics is a winner for news.arc forums. (I usually do a right angle bracket + italics, but I like the way yours looks better.)
...at the REPL, you'd better be careful not to give "a" a value, or it'll break your code. I think the cognitive overhead of worrying about that far outweighs the cost of using the ' operator.
Thank you so much for saying that. XD All of it. I feel the same way, but I can't help but think I'm just out of my element and resistant to change. Managing a single character here or there isn't a trivial problem, but I personally don't care much either way as long as I can also manage my code well on a large scale. Good to know I'm not the only one.
---
...warn when assigning to an already-bound variable...
I'm too much a fan of extensibility (where redefining things is normal) and of namespaces (where neglectable names are already protected from conflict) for 'safeset to seem like a positive thing, but in this case, it could just be me. ^^
> I'm sure you're right; we just need to figure it out for ourselves.
Maybe. :) These ideas aren't totally unheard of or untested. For example, PicoLisp has auto-quote for the number case, and it's been around since the 80s! XD
Instead of '(1 2 3) you can just write (1 2 3). This is possible because a number never makes sense as a function name, and has to be checked at runtime anyway.
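arc3.1 already performs the same runtime check; it just turns it into an error instead of a quote (message approximate):

arc> (1 2 3)
Error: "Function call on inappropriate object 1 (2 3)"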
PicoLisp also takes advantage of the initial bindings idea from my #3 to some extent. In PicoLisp, transient symbols are initially bound to themselves, and ordinary symbols are initially bound to nil.
I think I'd be more likely to feel positively toward completely unheard-of ideas[1]. For example, I mistrust dynamic scoping more because some variants of lisp have had it. I felt positively toward ssyntax and open defs from day 1 even though they were unheard of.
But all this is just gut instinct. I'll try anything once :)
[1] Though it was good to find out about the parentage of autoquoting, thanks. I wasn't aware of that.
I dislike default dynamic scoping because it seems like a bad idea to me. The fact that other Lisps already have it seems pretty irrelevant, except insofar that it gives us a chance to try out dynamic scope and see what it's like.
In other words: old ideas are fine, and new ideas are fine. Ideas should stand on their own merits; the only difference is that an old idea has been tested somewhat, so we can have a better sense of whether it's a good one.
the "rec" inside (fn (x) ...) refers to the global value of "rec". One might conceivably change the semantics of "let" so that this didn't happen, but then I'd be annoyed because, as rocketnia says, you wouldn't be able to say (let x (+ 1 x) ...). Also, "let" expands to a call to "fn", and one of the things about functions is that you should be able to take a function, rename one of its arguments (and all its uses in the body of the function), and get the same results:
; to use akkartik's example
(let rec (fn (x) (rec x))
  (rec 3))

; expands to
((fn (rec)
   (rec 3))
 (fn (x)
   (rec x)))

; which should be the same as
((fn (meh)
   (meh 3))
 (fn (x)
   (rec x)))
There is an alternative semantic for functions, known as "dynamic scope", in which it does matter what names you give to your arguments, and your original "rec" example would work. The thing we have instead of "dynamic scope" is called "lexical scope". I think most people agree that lexical scope is better, though sometimes they encounter particular cases where they want dynamic scope; Common Lisp allows you to declare variables "special" with forms called "defvar" and "defparameter", while Racket/Scheme provides "parameters". Some people (aw and maybe others) have worked out ways to import Racket/Scheme's parameters into Arc; search this forum for "defvar", "implicit", "dynamic", or "parameter".
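For a taste of it in plain Arc, here's a minimal sketch of dynamic-style rebinding, using assignment plus "after" to restore the old value on the way out--w/dyn is an invented name, and unlike real parameters this isn't thread-safe:

(mac w/dyn (var val . body)
  (w/uniq old
    `(let ,old ,var
       (= ,var ,val)
       (after (do ,@body)
         (= ,var ,old)))))

; usage: (w/dyn verbose* t (run-noisy-thing))
; verbose* reverts afterward, even if run-noisy-thing errors out.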
Now, it is possible to create local recursive functions without using assignment. If you want to do something like "letrec" without assignment, basically you pass the let-environment as extra arguments to your local functions:
;The following function is the same as factorial:
(fn (n)
  ((fn (fact)
     (fact fact n))
   (fn (fact k) ;the "fact" function
     (if (is k 1)
         1
         (* k (fact fact (- k 1)))))))

;The following function is the same as "even":
(fn (n)
  ((fn (even odd)
     (even even odd n))
   (fn (even odd n) ;the "even" function
     (if (is n 0)
         t
         (odd even odd (- n 1))))
   (fn (even odd n) ;the "odd" function
     (if (is n 0)
         nil
         (even even odd (- n 1))))))
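E.g., binding the first of these at the REPL (the name fact* is mine):

arc> (= fact*
       (fn (n)
         ((fn (fact)
            (fact fact n))
          (fn (fact k)
            (if (is k 1)
                1
                (* k (fact fact (- k 1))))))))
#<procedure>
arc> (fact* 5)
120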
Looking over this quickly, question: You use a variable, html-nestlev, inside a closure as a sort of hidden global variable, and use it to indicate the depth of quoting... What happens when an error or ctrl-C happens in between an (html-openq) and (html-closeq)? That might be worth handling. You might try Racket's parameters, or have a function that resets the thing.
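One cheap option is to tie the open and close together syntactically, so the depth gets restored even if the body errors out--a sketch using arc.arc's "after", with w/quote-level as an invented name:

(mac w/quote-level body
  `(do (html-openq)
       (after (do ,@body)
         (html-closeq))))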
Also, you ask in a comment: "way to warn not on stdout?" If stderr suffices for this, you might like this:
(mac to-stderr body
  `(w/stdout (stderr) ,@body))
which "warn" should probably really be defined with. And if you might like to redirect stderr somewhere (like a log file), I recommend you add these lines to ac.scm, near "xdef call-w/stdout" and "call-w/stdin":
Thanks. I started to use your to-stderr when I noticed this similar function in arc.arc:
(def ero args
  (w/stdout (stderr)
    (each a args
      (write a)
      (writec #\space))
    (writec #\newline))
  (car args))
I'll probably use it instead just to keep this lib's dependencies to the arc3.1 core. I agree it would make sense to have call-w/stderr in ac.scm, though.
> You use a variable, html-nestlev, inside a closure as a sort of hidden global variable, and use it to indicate the depth of quoting... What happens when an error or ctrl-C happens in between an (html-openq) and (html-closeq)? That might be worth handling. You might try Racket's parameters, or have a function that resets the thing.
Really good point. Need to think about this more, as I'm feeling doubtful about that whole part of the program relating to nested quotation.
Update: For those just tuning in and looking for the variable html-nestlev in the source, it's been renamed to nestlev.
[PRO TIP: Click "link" (or "parent") on this comment, so you can view it (mostly) by itself rather than squashed against the right side of the page.]
For me, when I was writing my spider solitaire program, I had a couple of global variables, the-deck and the-piles, which represented the state of the game. the-deck was a list of randomly permuted cards, which were integers from 0 to 51 (or 0-25 with two suits, or 0-12 with one suit). the-piles was a list of piles, which were lists of cards (the zeroth element was the top of the pile), and the cards were of the form (list card revealed?), where "card" = 0-51 and "revealed" = t/nil. (Yes, a card is an integer in the deck, but a list of an integer and t/nil on the board. This does not need to be changed.)
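For instance (values invented), a two-pile board might look like:

(= the-piles '(((7 t) (23 nil) (40 nil))  ; pile 0: a revealed 7 atop two face-down cards
               ((8 t) (9 t))))            ; pile 1: two revealed cards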
To flip over the top card of the nth pile, I went:
(= (cadar the-piles.n) t)
To deal a card (face up) to top of each pile from the deck, I said:
(forlen i the-piles
  (push (list pop.the-deck t) the-piles.i))
To move a chain of n cards from (the top of) one pile to another, I went:
(let (a b) (split the-piles.i n)
  (= the-piles.i b
     the-piles.j (join a the-piles.j)))
Also, I created a global variable called "prev-states"; every move the user performed would push a copy of the game state onto it (that's where "deepcopy" comes in, below), so undoing a move went:

(let (a b) pop.prev-states
  (= the-deck a the-piles b))

Now, what would you do in a "pure functional programming" language? If there were no assignment whatsoever, not even to global variables, I'd probably effectively implement global state by making every function take an extra argument representing it, and shifting code around in other ways. If you could assign to global variables but all data structures were immutable (I think Clojure is like that; correct me if I'm wrong), then I'd have to keep rebinding the entire "the-piles" list when all I wanted to do was alter one or two piles. Revealing the top card of the nth pile would look something like this:
(= the-piles
   (join (take n the-piles)
         (list:cons (list (caar the-piles.n) t) (cdr the-piles.n))
         (drop (+ n 1) the-piles)))
(Note that "push" and "pop" are legal because they just do global assignment; they don't alter any data structures). And then moving a chain of n cards from i to j would look like this:
(= the-piles
   ; ... oh god I have to know whether i>j
   ;(join (take (dec:min i j) the-piles)
   ; ... egad I may as well just say (if (< i j) ...
   ;(if (< i j)
   ;    (join (take dec.i the-piles)
   ;          (list:drop n the-piles.i)
   ;          (cut the-piles i dec.j)
   ;          (list:join (take n the-piles.i) the-piles.j)
   ;          (drop j the-piles))
   ;    (join ... egad the repetition is intolerable
   ; hmm, this'll work
   (with (new-i (list:drop n the-piles.i)
          new-j (list:join (take n the-piles.i) the-piles.j))
     (if (< i j)
         (join (take i the-piles)
               new-i
               (cut the-piles (+ i 1) j)
               new-j
               (drop (+ j 1) the-piles))
         (join (take j the-piles)
               new-j
               (cut the-piles (+ j 1) i)
               new-i
               (drop (+ i 1) the-piles)))))
On the plus side, I would no longer need to use "deepcopy" when creating a backup of the game state.
(push (list the-deck the-piles) prev-states)
That's pretty much the only benefit I'd get from outlawing modification of data structures, which I assume is what "pure functional programming" would mean. Don't even talk to me about how this would look if I had to simulate global variables...
Functional programming is a useful tactic in some cases. For example, I use a "chain-length" function, which might, say, take the list (7♥ 8♥ 9♣) and return 2, the number of revealed cards of consecutive rank of the same suit. I first thought about making it take "i" as an argument, referring to the ith pile in the-piles, but I made it take the list "xs" instead, and that allowed me to use it for other purposes: e.g. determining whether a move of n cards from pile i to j will create a longer single-suited chain. (Like, moving that 7♥ 8♥ chain onto a 9♥. I have a couple of AI-assistance procedures that look for moves satisfying conditions like that.) The expression is:
(is (+ n chain-length:the-piles.j)
    (chain-length:join (take n the-piles.i) the-piles.j))
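For reference, here's a sketch of what chain-length itself might look like; the suit*13+rank card encoding is my assumption (the post only fixes cards as integers 0-51):

(def chain-length (xs)
  ; xs: a pile, topmost card first, elements of the form (card revealed)
  (let ((c1 r1) (c2 r2)) xs
    (if (no r1)
         0
        (and r2
             (is c2 (+ c1 1))                       ; consecutive rank...
             (is (trunc:/ c1 13) (trunc:/ c2 13)))  ; ...in the same suit
         (+ 1 (chain-length cdr.xs))
         1)))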
Had I written "chain-length" to refer just to the ith pile in the global "the-piles" variable, I'd... basically have to rewrite most of it. So factoring out things into individual functions (which, btw, is, I think, in the category of things called "functional programming", but isn't usually the same as making things side-effect-free) is frequently a good thing--it makes the program easier to write, to read, to understand, and to maintain.
But the goal, the reason you'd want to do it, is not because it's called "functional programming"--it's to make the program easier to write/read/understand/maintain. And if something called "functional programming" makes it harder to do those things, then I would advocate throwing it out. I think preventing all side effects, or all side effects except those on global variables, falls firmly into the latter category; I think my example demonstrates this.
; Move a chain of n cards from i to j.
(zap [let (taken leftovers) (split _.i n)
       (copy _ i leftovers j (join taken _.j))]
     the-piles)

; Reveal the top card of the nth pile.
(def copdate (orig . kvs)
  (apply copy orig (mappend [list _.0 (_.1:orig _.0)] pair.kvs)))

(zap [copdate _ n [cons (list _.0.0 t) cdr._]] the-piles)
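Checking the reveal on a tiny invented board at the REPL:

arc> (= the-piles '(((23 nil) (7 t)) ((8 t))))
(((23 nil) (7 t)) ((8 t)))
arc> (zap [copdate _ 0 [cons (list _.0.0 t) cdr._]] the-piles)
(((23 t) (7 t)) ((8 t)))
arc> the-piles
(((23 t) (7 t)) ((8 t)))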
In principle I agree: disabling the programmer's ability to use mutation without reason is just frustrating. But I don't expect an example to help show this; most examples we could come up with would probably have easy-in-hindsight solutions like this one. I think the point is that mutation gives us more easy ways to do things, so that we can quickly experiment and such.
I see you want to push the example to its limit a bit. ^_^ Well, I assume this hypothetical language would be stocked with utilities that made up for what its pureness missed out on. How believable are these?
That's definitely what I meant, and I was going to just say so, but when I saw the ":)" I realized akkartik probably knew that and meant something like "that may not be 'oh god' complicated, but it's still not nearly as convenient as the mutating code." I don't know if that's a correct interpretation, but it made it more interesting to respond. ^_^
Yeah, I like the way you handle the-deck and the-piles. And I like this style of programming in general (i.e. a few global variables to keep state, functions corresponding to verbs in the problem space that manipulate the globals in an intuitive way). You've reminded me of how preventing all side effects could get awkward by ruling out this style of programming entirely.
To clarify, I'm not interested in eliminating side effects mainly out of an obsession with purity, but rather for performance's sake. I've been pursuing so many language-design choices, like fexprs/first-class macros, that are supposed to really cripple performance that I'm starting to worry my hobby implementations aren't even going to run acceptably. I'd heard that Haskell and Clojure get a big performance boost from enforced immutability/absence of side effects, because it guarantees that most operations can safely be run concurrently, so I've simply been considering it as an option.
I'm a real novice when it comes to optimization, and I'm certainly interested in techniques that could help me get decent performance without having to sacrifice mutability. TCO, strategic memoization, compiler hints? Any tips on this front could be really useful to me.
Egad, you are thoroughly right and I have rewritten the post. *mentally smacks myself on the head* Thank you, that is a much better implementation. I'm a little bit disappointed that it's so easy...
Uh-oh, the edit window has timed out on my post, and I've noticed that my version doesn't do any type-checking. That is el baddo. So far, it hasn't caused things to crash yet...
arc> (= x 'meh)
meh
arc> (= (car x) 2)
2
arc> x
meh
...but that's just luck and I wouldn't rely on it. [It does destroy the world when you set the car of a table.] Better version is here:
Man, I really wish I could continue editing my original post, 'cause without the type-checking, it's kind of a nuclear bomb in case someone accidentally uses it on a table. Perhaps I should just have left the post unchanged, or just added "Edit: Look in the comments for improvements" at the bottom.