2 points by rocketnia 4430 days ago | link | parent

"and avoids extra nesting merely to define a new set of variables. It's the approach of e.g. arc and python."

Arc uses an extra level of nesting, just like most Schemes and MLs do. Arc's let is different from Scheme's let, but only because it supports destructuring, does only one binding, and uses fewer parentheses.
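
For comparison (both of these are just the languages' standard forms):

  ; Scheme: an extra nesting level, any number of bindings
  (let ((x 1) (y 2))
    (+ x y))

  ; Arc: one binding per let, fewer parentheses, destructuring allowed
  (let x 1
    (+ x x))
  (let (a b) '(1 2)
    (+ a b))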

As for Python's approach where all plain variable assignments define locals except when declared otherwise[1], it rubs me the wrong way. I'm not sure why. (Side note: Now that I've looked up Python scope, it sounds an awful lot like Kernel's mutation policy, where nonlocal variables can't be rebound without using a captured environment. I'm hopeful that this kind of scoping can make fexpr code more efficient, so... it might have similar ramifications for interpreted Python...? Fexprs in Python, scary.)

[1] http://stackoverflow.com/questions/7935966/python-overwritin...

---

"I don't see the gain of the varscope concept over just having "var" and "set" operate on the scope that surround the declarations and child scopes. [...] what's wrong with just setting the variable to null [in the outer scope] until you know what its true value should be?"

My one reason for liking var is that you don't have to trek up to the top of a variable's scope to define it. If you do still have to trek outside a child varscope, that's not a full expression of the feature.

In practice we might use the same name for every varscope, having decided that shadowing is an acceptable compromise for being able to copy a variable declaration from one part of the code to another. Given that we're choosing the same names everywhere, we might even have macros that choose them for us.
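
A tiny sketch of the intent (hypothetical syntax; the varscope's label, here 'v, is the declaration form for the whole body):

  (varscope v
    (prn "setup")
    (v x 1)            ; declares x right here, no trek to the top
    (prn "more setup")
    (v y (+ x 1))      ; same varscope, so no new nesting level
    (+ x y))
Both x and y behave as though a (with (x nil y nil) ...) surrounded the whole body.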

---

"It is an error to have consecutive "var" calls in the same scope."

I assume you mean multiple vars of the same name in the same scope. Otherwise I'd expect your "module m" example to fall apart. (If func doesn't expand into var, I'd rather it did.)

Whether or not that's what you mean, I don't like that error. In JavaScript, where multiple declarations of a single variable are allowed but commonly discouraged, I occasionally find myself preferring to have two "constant" variables that just so happen to have the same name. It can help emphasize that the code that uses them is almost exactly the same. (What? JS has no macros. :-p )

One consistent example of my rebellion is with loops, where I have no problem using "for ( var i = ..." for multiple loops in a single function.

---

"The Environment functionality is reused to provide python-like modules."

I like it, but I have more extreme recommendations.

I believe it's possible to implement (varscope ...) in a Scheme that has manual access to the compiler. In that language, (varscope foo ...) would compile its body in a local environment where 'foo is a macro that expands to '= and pushes the variable name to a list, and then it would embed the compiled body in a (let ...) form that bound all the listed variables to nil. A similar technique would work to implement your 'module form. (Manual access to the compiler might not be necessary if the Scheme has a suitable concept of locally scoped macros instead.)
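
Here's a rough sketch of that mutation-based version, in Arc-flavored pseudocode ('env-w/macro, 'compile, 'mc, and 'nocompile are all assumptions here, not real Arc):

  (def compile-varscope (label body env)
    (let vars nil
      ; Compile the body in a local environment where the label is a
      ; macro that records each declared name and expands to plain =.
      (let inner-env (env-w/macro env label
                       (mc (var val)
                         (push var vars)
                         `(= ,var ,val)))
        (let compiled (compile `(do ,@body) inner-env)
          ; Embed the compiled body in a form that binds all the
          ; recorded variables to nil.
          (compile `(with ,(mappend [list _ nil] vars)
                      (nocompile ,compiled))
                   env)))))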

While this in itself is nice, it would be cleaner to be able to do this without using mutation. This could be easily accomplished if there were a special compiler utility that took a tuple (varscope-label, how-to-expand-the-varscope-label, code) and returned a tuple (compiled-code, set-of-variable-names).

Keep in mind that by "compiler" I mean whatever handles the phase that expands macros. If you don't expand macros until execution time (in your "interpreter" perhaps), whoops, you've got fexprs. :-p



1 point by seertaak 4429 days ago | link

> My one reason for liking var is that you don't have to trek up to the top of a variable's scope to define it. If you do still have to trek outside a child varscope, that's not a full expression of the feature.

I'm afraid I don't understand what you mean by this. Maybe you could give a short example? As I see it, one way or another you need to mark out the "outer" scope, where you declare the lifetime of the variable (even though you don't know its value yet). Then somewhere in two or more branches of that scope the variable is set and used. IIUC you propose to write "varscope" at the top, and "var" further down. Doesn't that mean that whenever you define a variable, you need to write "varscope" before it? I somehow can't believe that, which makes me think I'm still not grokking!

> I assume you mean multiple vars of the same name in the same scope.

Yes, otherwise you quickly run into trouble with the parent-scope scheme :)

> One consistent example of my rebellion is with loops, where I have no problem using "for ( var i = ..." for multiple loops in a single function.

The way that's handled in bullet is that when we encounter the for, we push a new environment, run the initialization, then push (and subsequently pop -- the interpreter holds a stack of environments representing runtime frames) an environment for each iteration of the loop. So it's ok to use the same variable name in subsequent for loops: the previous instance is no longer alive by the time the current one is reached.

> If func doesn't expand into var, I'd rather it did.

Here's the definition of "func" in bullet:

    macro func (name args :rest exprs)
      qquote
        set ,name
          fn ,args ,@exprs
So, yes, it's just a var binding. Note that this allows definition of "module" functions.

    func m.foo (): print "m.foo" 
    ==>
    set m.foo: fn (): print "m.foo"
By the way, even macros are defined as a macro:

    var macro
      tfm (name args :rest body)
        qquote: set ,name: tfm ,args ,@body
Macros don't have any functionality analogous to Lambdas to capture variables from enclosing scopes.

As it should be clear by now, my implementation is a sort of illegitimate child of fexprs and macros. Basically, I've introduced all the weaknesses of fexprs in return for only some of the gains :)

In bullet, macros are values represented by the Transform class. Their definition is almost identical to Lambdas: they hold their formal parameters and their body as an AST. The only difference to Lambdas is how they're treated for evaluation purposes by the interpreter. Transforms (like Primitives which are basically fsubrs) receive their operands unevaluated. They are expanded at runtime, and the result of the expansion is then evaluated in the lexical environment of the call site. That means you can use macros in higher-order functions; they truly are first class.
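
Schematically, the evaluation rule is something like this (a sketch in s-expression form, not the actual implementation; 'eval-in, 'transform?, and 'expand-transform are illustrative names):

    ; evaluating a call (op . args) in environment env
    (def eval-call (op args env)
      (let f (eval-in op env)
        (if (transform? f)
            ; a Transform gets its operands unevaluated; the expansion
            ; is then evaluated in the call site's lexical environment
            (eval-in (expand-transform f args) env)
            ; an ordinary function gets evaluated operands
            (apply f (map [eval-in _ env] args)))))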

Until now, this was an artifact of my interpreter design (emphasizing getting something up and running quickly). My intention had been to double back and fix the discrepancy with "real" lisps by doing the standard initial macroexpand traversal of the AST before evaluating. Either that or ditch the interpreter and write a compiler.

However, the material you've presented me with regarding fexprs is truly fascinating (no, I didn't know what they were before I read your post). I've just got to try these fexpr style macros; the idea of just controlling evaluation of operands, but otherwise being just like a regular function is very appealing.

In conclusion, I would appreciate it if you could explain the varscope concept further. Again, what I don't get is whether you need to write "varscope.." before you can bind a variable using "var..". I bang on about that because I'm loath to introduce a feature that imposes such a high overhead for a single variable use. Or do you also have "lets" that work as in scheme?

Also, you kind of lose me in the last two paragraphs. It would help if you could in some sense "sell" your concept to me (please!): what extra bit of power is now available that I can't express in my implementation? Maybe by seeing the extra-cool use, I will understand the tradeoff in terms of extra typing for a variable declaration.

-----

3 points by rocketnia 4429 days ago | link

Turns out 'varscope has an inconsistent corner case, the way I was originally thinking about it.

  (mac foo () "macro")
  (mac id-mac (x) x)
  (varscope v
    (id-mac (v foo (fn () "function")))
    (foo))
Should this result in "macro" or "function"? What if we change it up like this?

  (mac foo () "macro")
  (mac id-mac (x) x)
  (varscope v
    (id-mac (v id-mac (fn (x) nil)))
    (id-mac (v foo (fn () "function")))
    (foo))
I'd rather have 'varscope work in a compilation phase (no dependence on fexprs), and I'd rather not make the order of compilation matter, so I'm going to make a very hackish decision: The body of a (varscope ...) form should be compiled as though it put no variables in scope. Macros from the surrounding scope will work even if the local scope shadows them at run time.

By no coincidence, this design compromise is compatible with the hypothetical implementation below. I actually only realized this flaw once I was documenting that implementation. :-p

(By complete coincidence(?), this is similar to Arc 3.1's bug where local variables don't shadow macros. In the hypothetical language(s) I'm talking about, function parameters and let-bound variables would hopefully still shadow macros, so it wouldn't be quite the same.)

---

"IIUC you propose to write "varscope" at the top, and "var" further down. Doesn't that mean that whenever you define a variable, you need to write "varscope" before it?"

Close. You can have more than one variable per varscope. But I'm guessing you knew that. :-p

That means when you define a variable, you don't necessarily need to define a varscope if a suitable one already exists. But I don't expect even a single 'varscope to appear very often in code; instead I expect convenience macros to take care of it.

---

"The way that's handled in bullet ... it's ok to use the same variable name in subsequent for loops: the previous instance is not alive by the time current is reached."

That's a good example of when a macro could take care of establishing a varscope.

  (for <init> <condition> <step>
    <...body...>)
  ==>
  (varscope var
    <init>
    (while <condition>
      <...body...>
      <step>))
For the analogous case in JS (or rather a hypothetical JS-like language whose semantics are based on 'varscope), the only thing that establishes a varscope is the "function () {}" syntax. Sibling loops of the form "for ( var i = ..." use the same i because they don't establish a new scope for themselves.

---

"Or do you also have "lets" that work as in scheme?"

Yes, I would have them both. It's hard not to have 'let since it can just be defined as a macro over 'fn. Whether one would be emphasized over the other, I'm not sure.
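
For reference, a minimal version of that macro (ignoring destructuring):

  (mac let (var val . body)
    `((fn (,var) ,@body) ,val))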

---

"Also, you kind of lose me in the last two paragraphs."

I was talking about a compiler for varscope bodies. It'll help to back up a bit....

In a language where macros return compiled code rather than code to compile, traditional macros are simple to implement as sugar:

  (mac when (condition . body)
    `(if ,condition (do ,@body)))
  ==>
  (def-syntax when (condition . body) gensym123_static-env
    (compile `(if ,condition (do ,@body)) gensym123_static-env))
If you find this shockingly similar to Kernel-style fexprs, hey, me too. :-p

  (mac when (condition . body)
    `(if ,condition (do ,@body)))
  ==>
  (def-fexpr when (condition . body) gensym123_dynamic-env
    (eval `(if ,condition (do ,@body)) gensym123_dynamic-env))
IMO, the compile phase is just an fexpr eval phase whose result is used as code. Arc macros, which don't have access to the environment, are limited in the same way as pre-Kernel fexprs.

So what I'm suggesting is that in addition to 'compile, we have a second compilation function that lets us compile the body of a (varscope ...) or (module ...) form.

In fact, here's exactly how I'd use it. I'll call it 'compile-w/vars.

  ; I'm assuming 'compile-w/vars, 'def-syntax, and 'compile exist in
  ; Arc.
  ;
  ; I'm also assuming 'mc exists in Arc as an anonymous macro syntax (so
  ; that 'mc is to 'mac as 'fn is to 'def).
  ;
  ; Last but not least, I'm assuming (nocompile <code>) exists in Arc as
  ; a way to embed compiled code inside uncompiled code. When
  ; (nocompile <code>) is compiled in a static environment <env>, it
  ; should associate any free variables in <code> with variables bound
  ; in <env>. To make this happen, both 'compile-w/vars and 'compile
  ; should accept code even if it has free variables, and compiled code
  ; should be internally managed in a format that allows for this kind
  ; of augmentation.
  
  (def-syntax varscope (label . body) env
    ; In case you're not familiar, this is a destructuring use of 'let.
    (let (new-body vars)
           ; NOTE: We compile the body in the *outer* environment, not
           ; the local environment the varscope establishes.
           (compile-w/vars
             label (mc (var val)
                     `(= ,var ,val))
             `(do ,@body) env)
      (compile `(with ,(mappend [do `(,_ nil)] vars)
                  (nocompile ,new-body))
               env)))
  
  (def-syntax anon-module-w/var (label . body) env
    (w/uniq g-table
      (let (new-body vars)
             ; NOTE: We compile the body in the *outer* environment, not
             ; the local environment the module establishes.
             (compile-w/vars
               label (mc (var val)
                       ; Set both a variable and a table entry.
                       `(= (,g-table ',var) (= ,var ,val)))
               `(do ,@body) env)
        (compile `(with (,g-table (obj) ,@(mappend [do `(,_ nil)] vars))
                    (nocompile ,new-body)
                    ,g-table)
                 env))))
  
  ; This anaphorically binds 'var as the module's variable declaration
  ; form.
  (mac module (name . body)
    `(= ,name (anon-module-w/var var ,@body)))
As before, I release this code for anyone to use--or rather to derive actual working code from. :-p No need for credit.
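
For concreteness, here's how I'd expect the result to get used (assuming the sketch above pans out, and that tables are callable as in Arc):

  (module m
    (var x 1)
    (var double (fn (n) (* 2 n)))
    (var y (+ x 1)))   ; later declarations see x as a plain variable

  (m 'x)            ; => 1
  (m 'y)            ; => 2
  ((m 'double) 21)  ; => 42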

---

"So, yes, it's just a var binding."

That's a var binding even though it uses 'set? I'm confused.

---

"no, I didn't know what [fexprs] were before I read your post"

What post is that?

I had a half-written reply that started with "Come to think of it, you probably do have fexprs," and went on to explain why I suspected it, what they were, and what you might get if you embraced it or rejected it. Should I still post it? It sounds like you understand it already, but it wouldn't do to have a time paradox. :-p

Anyway, since you're an fexpr fan now, I would like to emphasize the other side: The translation of my (point ...) example into imperative code is straightforward to do during a compilation phase, and fexprs get in the way of compilation phases. :)

It may be possible to force one's way through fexprs during a compilation phase too, but I expect that algorithm to look like a cross between a) constant-folding and b) static type inference with dependent types (since eval's return type depends on the input values). Rather than simply using recursion to compile subexpressions, the algorithm would do something more like using recursion together with concurrency, so that some subgoals could wait for information from other subgoals. To complicate things further, if the program uses a lot of mutable variables, the algorithm might not be able to treat them as constants, and it might not get very far unless you run it at run time as a kind of JIT.

I find this pretty intimidating myself. I've made steps toward at least the constant-folding part of this (which I expect will be sufficient for almost all fexpr programs written in a reasonable style), but I've gotten bogged down in not only the difficulty but also my own apathy about fexprs.

-----

1 point by Pauan 4429 days ago | link

"The translation of my (point ...) example into imperative code is straightforward to do during a compilation phase, and fexprs get in the way of compilation phases. :)"

Why not have both? As in, have a way of saying "this should all be done at compile-time" that doesn't involve fexprs at all, or involves a special variant of fexprs. Reminds me of a discussion somewhere about micros (as opposed to macros)...

-----

3 points by rocketnia 4429 days ago | link

"Reminds me of a discussion somewhere about micros (as opposed to macros)..."

This probably isn't what you mean, but...

One of my oldest toy languages (Jisp) had syntactic abstractions I called "micros," and I discuss them here: http://arclanguage.org/item?id=10719

tl;dr: My micros are fexprs that not only leave their arguments unevaluated but also leave them unparsed. The input to a micro is a string.
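
A sketch of the shape, with a made-up 'def-micro form and a made-up 'compile-sql-string helper (the linked post has the real details):

  ; the micro receives the raw, unparsed text as a string and decides
  ; for itself how to parse it
  (def-micro sql (text)
    (compile-sql-string text))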

Much like how I just said macroexpansion in the compilation phase was like fexpr evaluation, what I've pursued with Penknife and Chops is like a compilation phase based on micro evaluation.

---

"Why not have both? As in, have a way of saying "this should all be done at compile-time" that doesn't involve fexprs at all, or involves a special variant of fexprs."

One way to have both is to have two fexpr evaluation phases, one of which we call the compile phase. This is staged programming straightforwardly applied to an fexpr language... and it's as easy as wrapping every top-level expression in (eval ... (current-environment)).

However, that means explicitly building all the code. If you want to call foo at the repl, you can't just say (foo a b c), you have to say (list foo a b c).

With quasiquote it's much easier for the code that builds the code to look readable. So suppose the REPL automatically wraps all your code in (eval `... (current-environment)). Entering (foo a b c) will do the expected thing, and we can say (foo a ,(bar q) c) if we want (bar q) to evaluate at compile time.

Now let's fall into an abyss. Say the REPL automatically detects the number of unquote levels we use in a command, and for each level, it wraps our code in (eval `... (current-environment)) to balance it. Now (foo a b c) will do the expected thing because it's wrapped 0 times, (foo a ,(bar q) c) will do the expected thing because it's wrapped once, and so on. We have as many compile phases as we need.
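
Here's a sketch of that wrapping rule ('unquote-depth is made up, and it's naive about dotted lists and about unquotes re-protected by nested quasiquotes):

  ; count the levels of unquote used in a command
  (def unquote-depth (expr)
    (if (atom expr)
          0
        (caris expr 'unquote)
          (+ 1 (unquote-depth (cadr expr)))
        (apply max 0 (map unquote-depth expr))))

  ; wrap the command once per level before handing it to eval
  (def repl-wrap (expr)
    (repeat (unquote-depth expr)
      (= expr `(eval (quasiquote ,expr) (current-environment))))
    expr)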

The price is one reserved word: unquote. This would be the one "special variant of fexprs."

-----

1 point by rocketnia 4427 days ago | link

"The price is one reserved word: unquote. This would be the one "special variant of fexprs.""

Possible correction: If any kind of 'quasiquote is ever going to be in the language, it should probably have special treatment so that its own unquotes nest properly with the REPL's meaning of unquote. An alternative is to use a different syntax for non-REPL unquotes (e.g. ~ instead of ,).

Also note that ,foo could be a built-in syntax that doesn't desugar to anything at all (not even using the "unquote" name), instead just causing phase separation in a way that's easy to explain to people who understand 'quasiquote.

-----

1 point by Pauan 4427 days ago | link

"Also note that ,foo could be a built-in syntax that doesn't desugar to anything at all (not even using the "unquote" name), instead just causing phase separation in a way that's easy to explain to people who understand 'quasiquote."

Yes, I currently think that all syntax should be at the reader level, rather than trying to use macros to define syntax. As an example of what I'm talking about, in Nu, [a b c] expands into (square-brackets a b c) letting you easily change the meaning of [...] by redefining the square-brackets macro.

Or the fact that 'foo expands into (quote foo) letting you change the meaning of the quote operator... or the fact that `(foo ,bar) expands into (quasiquote (foo (unquote bar))), etc.

I used to think that was great: hey look I can easily change the meaning of the square bracket syntax! But now I think it's bad. I have both conceptual and practical reasons for thinking this.

---

I'll start with the conceptual problems. In Lisp, there are essentially three major "phases": read-time, compile-time, and run-time. At read-time Lisp will take a stream of characters and convert it into a data structure (often a cons cell or symbol), compile-time is where macros live, and run-time is where eval happens.

Okay, so, when people try to treat macros as the same as functions, it causes problems because they operate at different phase levels, and I think the same exact thing happens when you try to mix read-time and compile-time phases.

---

To discuss those problems, let's talk about practicality. quasiquote in particular is egregiously bad, so I'll be focusing primarily on it, though quasisyntax also suffers from the exact same problems. Consider this:

  `(,foo . ,bar)
You would expect that to be the same as (cons foo bar), but instead it's equivalent to (list foo 'unquote 'bar). And here's why. The above expression is changed into the following at read-time:

  (quasiquote ((unquote foo) . (unquote bar)))
And as you should know, the . indicates a cons cell, which means that the above is equivalent to this:

  (quasiquote ((unquote foo) unquote bar))
Oops. This has caused practical problems for me when writing macros in Arc.

---

Another problem with this approach is that you're hardcoding symbols, which is inherently unhygienic and creates inconsistent situations that can trip up programmers. Consider this:

  `(,foo (unquote ,bar))
You might expect that to result in the list (list foo (list 'unquote bar)) but instead it results in the list (list foo bar), because the symbol unquote is hardcoded.

---

Yet another problem is that it requires you to memorize all the hard-coded names for all the syntax. You have to remember never to define a function/macro called quote, never to define one called unquote, never to define one called square-brackets, etc... which means this will break:

  ; oops, redefined the meaning of the quote syntax
  (let quote ...
    'foo)
When the number of syntax forms is small, that's not really a high price to pay, but it is still a price.

---

Also, this whole "read syntax expands into macros" thing is inconsistent with other syntax. For instance, (1 2 3) isn't expanded by the reader into (list 1 2 3). That is, if you redefine the list function, the meaning of the syntax (1 2 3) doesn't change. But if you redefine the quote macro, then suddenly the syntax 'foo is different.

The same goes for strings. Arc doesn't expand "foo" into (string #\f #\o #\o) either. So redefining the string function doesn't change the meaning of the string syntax. So why are we doing this for only some syntax but not others?

---

All of the above problems go away completely when you just realize that read-time is a separate phase from compile-time. So if you want to change the meaning of the syntax 'foo the solution isn't to redefine the quote macro. The solution is to use a facility designed for dealing with syntax (such as reader macros).

This is just like how we separate compile-time from run-time: you use functions to define run-time stuff, macros to define compile-time stuff, and reader macros to define read-time stuff.
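
For example, with a CL-style reader macro facility (a hypothetical 'def-reader-macro API, not an existing Arc feature):

  ; change what 'foo reads as, at read time, without touching any
  ; function or macro named quote
  (def-reader-macro #\' (port)
    (list 'my-quote (read port)))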

This also means that because the only way to change the syntax is via reader macros (or similar), the language designer is encouraged to provide a really slick, simple, easy-to-use system for extending the syntax, rather than awful kludgy reader macros.

-----

1 point by Pauan 4429 days ago | link

"then push (and subsequently pop -- the interpreter holds a stack of environments representing runtime frames)"

Uh oh, my warning bells went off. If I were you, I'd put some unit tests that verify that closures work properly. In particular, this might very well break in bullet (though I won't know without testing it):

  (def foo (x)
    (fn () x))

  ((foo 4)) -> 4
---

"They are expanded at runtime, and the result of the expansion is then evaluated in the lexical environment of the call site. That means you can use macros in higher-order functions; they truly are first class."

Ewww, runtime macros. I do not like. They combine all the awfulness of macros[1] without any of the benefits of fexprs[1], while also giving up the only benefit macros have[1]. The worst of all worlds, in my opinion.

---

"My intention had been to double back and fix the discrepancy with "real" lisps by doing the standard initial macroexpand traversal of the AST before evaluating."

Good. I think Lisps should either embrace macros (warts and non-first-classness included), or embrace fexprs and dump macros since they're not needed and just get in the way. Naturally, I'm in favor of fexprs unless speed is critical, and even then I'd prefer to just make the interpreter faster rather than dump the elegance of fexprs.

---

"I've just got to try these fexpr style macros; the idea of just controlling evaluation of operands, but otherwise being just like a regular function is very appealing."

It sure is! An example of a very beautiful Lisp that uses fexprs at its very core is Kernel (though it calls them operatives and uses the $vau form to create them):

http://web.cs.wpi.edu/~jshutt/kernel.html

http://www.wpi.edu/Pubs/ETD/Available/etd-090110-124904/unre...

ftp://ftp.cs.wpi.edu/pub/techreports/pdf/05-07.pdf

There are other Lisps that use fexprs (or at least things similar to fexprs) as well, such as Picolisp and newLISP (which erroneously calls them macros), but I'm especially fond of Kernel (for many reasons), in part due to its static (lexical) scope.

---

* [1]: I'm only slightly exaggerating... but in all seriousness, first-classness is only one of the (multiple) benefits of fexprs, and even with first-class macros, you still need to worry about hygiene, which is basically a non-issue in Kernel (hygiene there is so easy to achieve that it happens naturally, because the language is so well designed, so I consider it a mostly "solved problem").

Plus, I suspect that if you're basically macro-expanding macros at runtime, you'd actually get slightly better speed with fexprs (not that speed is a huge issue, but it can be, depending on what you want to do; I mention it for completeness and because I have a personal interest in making powerful things go fast).

As far as I can tell, the only real benefit of macros is that they're always preprocessed, so they only need to macro-expand once. That is also why they're non-first-class.

I suppose a minor benefit is that it allows you to treat macros as basically a template facility, but I find that benefit to be dubious at best, especially since it's so easy to use templating facilities in fexprs (or define your own).

Another minor benefit is that you can macro-expand a macro to do things like code walkers, but... I feel that should be part of a debugger/inspection suite or something.

---

Just to make sure you don't feel like I'm railing on you: it seems to me that you were unaware of fexprs when you designed bullet, which is why bullet has macros rather than fexprs. That's totally fine, I understand. I'm mentioning all these things not only for your benefit, but also for anybody else who might stumble along and read this post.

-----

2 points by seertaak 4429 days ago | link

> In particular, this might very well break in bullet

It works:

    func foo (x): fn () x
    print ((foo 4)) // prints 4
I explained incorrectly: the interpreter env stack is basically a stack of bindings representing both "true" locals (i.e. locals on the JVM) and environments representing lexical scopes. The latter are held explicitly by, for instance, functions, macros, and modules, and also get created implicitly as required by e.g. looping primitives.

I'll reply to your other points tomorrow morning! (basically, I agree :))

-----

1 point by Pauan 4429 days ago | link

Nice! So lexical environments do form a proper tree and persist even after the outer function has returned? If so, then that shouldn't be a problem.

-----

1 point by Pauan 4430 days ago | link

"whoops, you've got fexprs. :-p"

Not if said "fexprs" don't have access to the dynamic environment, though!

-----