Arc Forum | shader's comments

I don't like the use of JSON lists for code, but other than that this looks useful. Anyone want to work on something Arc-like built on top of Wat?

I wish he had explained why he gave up on quasiquotes again; his previous blog post made sense to me. Anyone see the issue?

His next few posts on things he's done with wat are also interesting, but I don't think they need separate posts.

-----

3 points by manuelsimoni 4570 days ago | link

Hi, the creator of Wat here. Thanks for your interest.

The real reason for switching away from quasiquotation was not quasiquotation itself, but hygiene.

The moment you start quoting identifiers, you're in trouble. The Scheme expansion process is one example: http://www.r6rs.org/final/html/r6rs/r6rs-Z-H-13.html

What has to happen in all hygienic macro systems is a fake reconstruction of lexical scope at compile time. It's really like epicycles: http://csep10.phys.utk.edu/astr161/lect/retrograde/aristotle... ('The "solution" to these problems came in the form of a mad, but clever proposal').

The only satisfying solution is the one adopted by Kernel: don't quote, but rather include first-class values in the generated source. This way, you leverage the existing lexical scope machinery to achieve hygiene - as $DEITY intended.
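To illustrate (a toy sketch in JavaScript, not Wat's actual machinery): with a minimal array-based evaluator, an expansion that references "list" by name can be captured at the call site, while one that embeds the function value itself cannot.

```javascript
// Toy sketch (not Wat's actual machinery): a tiny evaluator over
// arrays. Strings are symbols looked up in env; arrays are calls;
// everything else (including function values) is self-evaluating.
function evaluate(expr, env) {
  if (typeof expr === "string") return env[expr];  // symbol lookup
  if (Array.isArray(expr)) {
    const [f, ...args] = expr.map(e => evaluate(e, env));
    return f(...args);
  }
  return expr;                                     // literal value
}

const list = (...xs) => xs;

// Unhygienic expansion: refers to "list" by name.
const unhygienic = ["list", 1, 2];
// Hygienic expansion: embeds the function value itself.
const hygienic = [list, 1, 2];

// A call site that shadows "list":
const env = { list: (...xs) => "oops" };

evaluate(unhygienic, env); // "oops": the name was captured
evaluate(hygienic, env);   // [1, 2]: the value can't be captured
```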

-----

2 points by Pauan 4570 days ago | link

It is true that symbols are unhygienic, but if you use boxes instead, then you get perfect hygiene. Unlike function/vau values, boxes can exist entirely at compile-time, so they're suitable for compilation to JavaScript.

For a working example, check out my language Nulan, which compiles to JavaScript and has hygienic macros that use boxes:

https://github.com/Pauan/nulan

http://pauan.github.io/nulan/doc/tutorial.html

I can provide more details if desired. Though... you're exploring vau-based semantics with Wat, so I'm not sure how useful boxes would be for you.

In any case, I like what you've done with wat, though I've leaned away from vau recently.

-----

2 points by manuelsimoni 4570 days ago | link

So the idea behind boxes is that you can create boxes at compile-time for runtime stuff that doesn't exist yet?

Note that the same idea can work for Vau first-class values: at compile-time you simply create a faux value for each non-compile-time top-level (or other) definition. Macros can then insert them into generated code, even though they are not the real runtime object.

-----

2 points by Pauan 4570 days ago | link

Yes, at least in Nulan the boxes only exist at compile-time: at runtime they're normal JS variables. Of course, if you were writing a VM, the boxes could also exist at runtime, which would be faster than having to look up a symbol: you would just unbox it!

And yes, I suppose it would be possible to use boxes in some way with vaus, especially because modern vau-based languages use lexical scope. This doesn't completely solve the "vau are slow" problem, but it at least avoids a symbol lookup for every single variable.

In fact, come to think of it... I had been playing around with this idea, but I had forgotten about it. The basic idea is to create a first-class environment/namespace, but rather than mapping symbols to values, it would map symbols to boxes.

The idea is that this namespace exists both at compile-time and run-time. At compile-time, the compiler uses the namespace to replace symbols with boxes (thus gaining speed and the ability to statically verify things), while "eval" at run-time would first look up the symbol in the namespace (which returns a box) and then unbox it.

If done properly, I think such a system would appear to behave similarly to the normal environment system as used by Kernel, but it could potentially be faster, because most symbols can be replaced at compile-time with boxes. In any case, it's an implementation detail, unless the language decides to expose the boxes underneath.
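A minimal JavaScript sketch of that scheme (all names invented): one namespace of boxes, consulted once at compile-time for compiled references, or per-lookup by a run-time "eval":

```javascript
// Sketch (invented names): one namespace, alive at both "compile
// time" and "run time", mapping symbols to boxes rather than values.
class Box {
  constructor(value) { this.value = value; }
}

const namespace = new Map();   // symbol (string) -> Box

function define(name, value) {
  namespace.set(name, new Box(value));
}

// "Compile time": resolve the symbol to its box once, ahead of time.
function compileRef(name) {
  const box = namespace.get(name);
  return () => box.value;      // no lookup at run time, just unbox
}

// "Run time" eval: look the symbol up, then unbox.
function evalRef(name) {
  return namespace.get(name).value;
}

define("x", 10);
const compiled = compileRef("x");
compiled();     // 10: box resolved at compile time
evalRef("x");   // 10: symbol looked up at run time

// Mutating through the box is seen by both paths:
namespace.get("x").value = 42;
compiled();     // 42
evalRef("x");   // 42
```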

-----

3 points by manuelsimoni 4570 days ago | link

Exactly.

My plan is to use these ideas not with fexprs, but with procedural macros, thereby gaining fully static expansion at compile-time, because I want a type checker, and type checking fexprs is currently not possible to my knowledge.

The macroexpansion pass would register fake bindings (very similar in spirit to your boxes) for all runtime definitions in the compile-time environment.

Macros then include these fake objects at compile-time (just as fexprs would with the real objects at runtime) into the generated code.

Thereby, I hope to maintain the pleasurable hygiene properties of fexprs in a language with preprocessing macros.

-----

2 points by Pauan 4569 days ago | link

Well then, that sounds exactly like what Nulan does. I'm glad to see these ideas spreading, whether independently discovered or not.

Since I've already explored these ideas, I would like to mention a couple things.

---

What you call "fake bindings", Nulan calls boxes. I like this name because it's short, unique, and intuitive. A box is simply a data structure that can hold a single item and supports getting that item and setting that item.

In other words, putting something into the box and taking something out of the box. Changing the contents of the box doesn't change the box itself, obviously. And because they're mutable, boxes are equal only to themselves: even if two boxes hold the same value, they are not equal to each other.

Interestingly, boxes are exactly equivalent to gensyms:

  (define foo (gensym))  ; Create the box
  (eval foo)             ; Get the value of the box
  (eval `(set! ,foo 1))  ; Set the value of the box
And gensyms are equal only to themselves. So they satisfy all the conditions of boxes, and in Nulan, gensyms actually are boxes! So another way of thinking about it is that Nulan replaces all symbols with gensyms.
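For a concrete sketch, here is a box with exactly those properties in JavaScript (my transliteration, not Nulan's implementation): get, set, and identity-only equality:

```javascript
// A box: holds one value, supports get and set, and is equal only
// to itself, even when two boxes hold the same contents.
class Box {
  constructor(value) { this.value = value; }
  get() { return this.value; }
  set(value) { this.value = value; }
}

const a = new Box(1);
const b = new Box(1);
a.get() === b.get();   // true: same contents
a === b;               // false: distinct boxes, like distinct gensyms
a.set(2);              // changes a's contents, not its identity
a.get();               // 2, while b.get() is still 1
```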

Yet another way to look at it is that boxes are equivalent to pointers in C, but now we're straying a bit far away from Lisp...

Because I wanted Nulan to be hygienic by default, I decided to use boxes for everything. In addition, they're a first-class entity. That is, Nulan lets you run arbitrary code at compile-time, and also gives you access to boxes at compile-time.

And the way that Nulan implements macros is... basically, a "macro" is simply a box that has a &macro property. When Nulan sees a box as the first element of a list, it checks if the box has a &macro property (which should be a function), and if so, it calls it.

But boxes can have other properties as well, for instance &pattern lets you define custom pattern-matching behavior, &get is called when the box isn't a macro, and &set is called when assigning to the box:

  foo 1 2 3     # &macro
  foo           # &get
  foo <= 1      # &set
  -> (foo 1) 2  # &pattern
This means the same box can have different behavior depending on its context. And because these things are tied to the box rather than the symbol, everything is perfectly hygienic and behaves as you expect.
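A rough sketch of that dispatch in JavaScript (names like "macro" are my stand-ins for &macro): the expander checks the box at the head of a form:

```javascript
// Rough sketch: a box is a plain record; "macro" stands in for the
// &macro property. The expander checks the head of each form.
function makeBox(name) {
  return { name, macro: null };
}

// If the head of a list is a box with a macro property (a function),
// call it on the rest of the form; otherwise leave the form alone.
function expand(form) {
  if (Array.isArray(form) && form[0] && form[0].macro) {
    return form[0].macro(...form.slice(1));
  }
  return form;
}

const when = makeBox("when");
when.macro = (test, ...body) => ["if", test, ["do", ...body]];

expand([when, "ready", "launch"]);
// -> ["if", "ready", ["do", "launch"]]
```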

Boxes also have other properties as well, like whether they're local or global, and whether they were defined at compile/run-time. And the Nulan IDE (http://pauan.github.io/nulan/doc/tutorial.html) understands boxes as well, so that if you click on a box, it will highlight it. So even though two variables may have the same name, the Nulan IDE knows they're two different boxes.

By the way, in Nulan, the correct way to write the "when" macro is like this:

  $mac when -> test @body
    'if test
       | ,@body
Basically, Nulan doesn't have "quasiquote". Instead, "quote" supports "unquote" and "unquote-splicing". In addition, "quote" returns boxes rather than symbols, so it is by default hygienic.

If you want to write an unhygienic macro, you can use the "sym" function, which takes a string and converts it into a symbol:

  $mac aif -> test @rest
    w/box it = sym "it"
      'w/box it = test
         if it ,@rest
I prefer this way of doing things because it reduces the amount of syntax (no quasiquote), it means everything is hygienic by default, and it allows you to break hygiene, but you have to explicitly use the "sym" function. Because unhygienic macros are rare, I think this is a good tradeoff.

Also, you might have noticed that the above two macros don't use "unquote". As a convenience, when using "quote", if the box is local (bound inside a function), it will automatically "unquote" it, meaning that these two are equivalent:

  w/box foo = 5
    'foo

  w/box foo = 5
    ',foo
The reason for having "unquote" at all is for splicing in expressions:

  # returns (5 + 10)
  w/box foo = 5
    '(foo + 10)

  # returns 15
  w/box foo = 5
    ',(foo + 10)
---

Actually, I don't mind unhygienic macros that much, so "hygienic macros" isn't the primary reason I love boxes. The main reason I like them is because they make it really easy to implement hyper-static scope. And once your language has hyper-static scope, it's very easy to make an amazingly simple, concise, powerful, and flexible namespace system.

So boxes really kill multiple birds with one stone, in a really simple to understand and simple to implement way. They may not have the generality of vau or functions, but they're a practical data structure that does a lot for little cost.

-----

3 points by shader 4569 days ago | link

I can understand the logic for not using quotes in favor of simpler techniques for hygienic macros, but I still like having them as shorthand for really common list manipulations, or even just as an easy way to make a symbol. Of course, I've also always wanted something like Python's * and ** argument unpacking for splicing a list onto the end of an argument set without having to do something ugly like (apply f (join (list a b) c)). (f a b @c) looks so much nicer to me...

Anyway, aside from aesthetics, boxes really do seem useful. Another thought that just occurred to me is that they should enable trivial implementation of setforms and "out args".

I.e., if (car (cdr (obj key))) returned a box, you would be able to just call '= on it, with no thought for how you found the value.

And for out args, all you need to do is have double-boxes for function arguments. If you access the outer box, 'get is passed through to the inner box and you just receive the value. If you just call '= on the outer box, the value gets shadowed like normal, because 'set would not be passed through by default. If you want to modify the external variable, all you have to do is unbox it and call 'set on the original box.
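A sketch of that double-box idea in JavaScript (all names invented): reads pass through to the inner box, plain writes shadow locally, and writing through the unboxed inner box mutates the caller's variable:

```javascript
// All names invented. The caller's variable is a box; the callee
// receives an outer box wrapping it.
class Box {
  constructor(value) { this.value = value; }
}

function makeArgBox(inner) {
  let shadow = null;  // created only if the callee assigns locally
  return {
    get: () => (shadow ? shadow.value : inner.value),
    set: v => { shadow = new Box(v); },  // shadows; caller unaffected
    unbox: () => inner,                  // escape hatch: the real box
  };
}

const callerVar = new Box(1);
const arg = makeArgBox(callerVar);

arg.get();              // 1: reads pass through
arg.set(2);             // plain write: local shadow only
arg.get();              // 2
callerVar.value;        // still 1
arg.unbox().value = 3;  // explicit "out arg" write
callerVar.value;        // 3
```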

I feel like there might be a similar way to enable dynamically scoped functions with first class environments, but I'm not sure.

-----

3 points by shader 4568 days ago | link

I've been thinking about this some more, and it seems to me that boxes in the way Nulan uses them make for a completely different evaluation paradigm than the traditional metacircular evaluation.

In the traditional evaluation scheme, using environments, symbols, read, eval, and apply, the process is as follows:

  1) read turns text into code (lists of symbols and values)
  2) eval looks up a symbol in the current environment, or expands a macro, and calls apply to get the value of a function
  3) apply calls eval to get the value of the arguments, and then returns the value of the function
In a compile-time based box scenario, the environments go away and read/eval change to something like this:

  1) parse the new code into lists of appropriately boxed values, and expand macros
  2) recursively apply functions to values to get the final result.
More of a traditional compile/run separation. The main difference to note in the box formulation is that there is no such thing as an environment. The 'hyperstatic scope' that Pauan keeps mentioning is another way of saying, 'there is no scope', only variables.

Thus, there is no real way to make variations on the scoping scheme using hyperstatic scope, because that's all a read/compile time feature. Dynamic scope is impossible without tracking environments, which are otherwise unnecessary.

Now, if one were to use both first-class environments and boxes, one can theoretically have the best of both worlds, with only a little extra overhead, and most of that at compile time. Flexible scoping becomes possible by specifically referencing the variable in the environment instead of using the default boxed values. They would still be boxes, just referenced by symbol in the environment table.

Now what I'm wondering is whether there is any useful equivalence to be found between a box and an environment. Heck, just make a box an object that has a few hard coded slots (that can be accessed by array offset) and a hash table for the rest, and voila! It can replace cons cells and regular hash tables too :P All we need is a reason for making everything use the same core structure...

-----

2 points by Pauan 4568 days ago | link

Yes, that's intentional. Because I wanted to compile to fast JavaScript, I chose an evaluation model that has a strict separation between compile-time and run-time.

---

"The 'hyperstatic scope' that Pauan keeps mentioning is another way of saying, 'there is no scope', only variables."

Actually, there is still scope. After all, functions still create a new scope. It's more correct to say that, with hyper-static scope, every time you create a variable, it creates a new scope:

  box foo = 1     # set foo to 1
  def bar -> foo  # a function that returns foo
  box foo = 2     # set foo to 2
  bar;            # call the function
In Arc, the call to "bar" would return 2. In Nulan, it returns 1. The reason is that the function "bar" is still referring to the old variable "foo": the new variable "foo" shadowed the old one, rather than replacing it as it would in Arc. That's what hyper-static scope means.

From the compiler's perspective, every time you call "box", it creates a new box. Thus, even though both variables are called "foo", they're separate boxes. In Arc, the two variables "foo" would be the same box.
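The example above can be transliterated to JavaScript closures (a sketch of the semantics, not Nulan's compiled output):

```javascript
// Each call to box() builds a fresh scope with a fresh box, so a
// function defined earlier keeps referring to the box it captured.
let scope = {};  // symbol -> box
function box(name, value) {
  scope = { ...scope, [name]: { value } };  // new scope, new box
}

box("foo", 1);                   // box foo = 1
const captured = scope;          // def bar -> foo captures this scope
const bar = () => captured.foo.value;
box("foo", 2);                   // box foo = 2 (a brand-new box)

bar();            // 1: hyper-static, bar sees the old box
scope.foo.value;  // 2: new code sees the new box
```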

---

"Thus, there is no real way to make variations on the scoping scheme using hyperstatic scope, because that's all a read/compile time feature. Dynamic scope is impossible without tracking environments, which are otherwise unnecessary."

If by "dynamic scope" you mean like Arc where globals are overwritten, then that's really easy to do. As an example of that, check out Arc/Nu, which also uses boxes:

https://github.com/Pauan/ar

In the Arc/Nu compiler, it literally only takes a single line of code to switch between Arc's dynamic scope and hyper-static scope.

If by "dynamic scope" you mean like Emacs Lisp, then... actually that should be really easy as well. You would just replace the same symbol with the same box, and then use box mutation at run-time. Of course, at that point I don't think there'd be any benefit over run-time environments... In any case, I like lexical scope, and boxes work well for lexical scope.

Also, although Nulan has hyper-static scope, it does have dynamic variables. The way that it works in Nulan is, you create a box like normal...

  box foo = 1
...and then you can dynamically change that variable within a code block:

  w/box! foo = 2
    ...
Within the "w/box!" block, the variable "foo" is 2, but outside, it is 1. You can do this with any variable in Nulan.
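A plausible desugaring of "w/box!" in JavaScript (my guess at the mechanism, not Nulan's actual implementation): save the box's contents, install the temporary value, and restore on exit:

```javascript
// Invented helper: install a temporary value in the box for the
// duration of a block, then restore the old value, even on errors.
const foo = { value: 1 };  // box foo = 1

function withBox(box, temp, body) {
  const saved = box.value;
  box.value = temp;
  try { return body(); }
  finally { box.value = saved; }
}

const seen = withBox(foo, 2, () => foo.value);  // w/box! foo = 2
seen;       // 2: inside the block
foo.value;  // 1: restored outside
```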

---

"Now what I'm wondering is whether there is any useful equivalence to be found between a box and an environment."

No. And that's a good thing. I've found one of the major benefits of boxes is that they're a single stand-alone entity: each variable corresponds to a single box. In the environment model, you have a hash table which maps symbols to values, so you have a single data structure representing many variables.

The reason I prefer each variable being represented by a separate box is that it makes namespaces really really easy to design and implement. For instance, in Arc/Nu you can selectively import only certain variables:

  (w/include (foo bar qux)
    (import some-file))
This is really easy with boxes: you simply grab the "foo", "bar", and "qux" boxes and import them. But with environments, it's a lot harder, because the environment for the file "some-file" may contain all kinds of variables that you don't want to import.

You could take the values from the environment and copy them into the current environment, but now any changes made won't show up. With boxes, the changes show up. I don't think environments work well for namespace systems. But boxes work wonderfully well.
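The contrast is easy to see in a JavaScript sketch (invented module shape): sharing the box objects keeps later changes visible, while copying values leaves a stale snapshot:

```javascript
// Invented module shape: "some-file" exports two boxed variables.
const someFile = {
  foo: { value: 1 },
  bar: { value: 2 },
};

// Box import: grab the box objects themselves.
const ns1 = { foo: someFile.foo };

// Value copy: snapshot the contents instead.
const ns2 = { foo: { value: someFile.foo.value } };

someFile.foo.value = 99;  // the module later updates foo

ns1.foo.value;  // 99: shared box, the change shows up
ns2.foo.value;  // 1: stale copy
```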

---

"It can replace cons cells and regular hash tables too :P"

I've thought about ways to replace hash tables with boxes, but I don't think it'd be especially helpful. Hash tables serve a different purpose from boxes, so it makes sense for a language to have both data structures. If you want a box, just use a box. If you want a hash table, just use a hash table.

Interestingly enough, both Nulan and Arc/Nu have a giant hash table at compile-time that maps symbols to boxes. This is basically the compile-time equivalent of the run-time environments that languages like Kernel have. Creating a new scope or changing the existing scope is as easy as changing this hash table, which makes it really easy to play around with different namespace systems.

-----

2 points by rocketnia 4568 days ago | link

"The main difference to note in the box formulation is that there is no such thing as an environment."

Doesn't step 1 need to use environment(s)? How else would it turn text into a structure that contains references to previously defined values?

---

"The 'hyperstatic scope' that Pauan keeps mentioning is another way of saying, 'there is no scope', only variables."

The hyper-static global environment (http://c2.com/cgi/wiki?HyperStaticGlobalEnvironment) is essentially a chain of local scopes, each one starting at a variable declaration and continuing for the rest of the commands in the program (or just the file). I think the statement "there is no scope" does a better job of describing languages like Arc, where all global variable references using the same name refer to the same variable.

---

"Flexible scoping becomes possible by specifically referencing the variable in the environment instead of using the default boxed values. They would still be boxes, just referenced by symbol in the environment table."

I don't understand. If we're generating an s-expression and we insert a symbol, that's because we want that symbol to be looked up in the evaluation environment. If we insert a box, that's because we want to look up the box's element during evaluation. Are you suggesting a third thing we could insert here?

Perhaps if the goal is to make this as dynamic as possible, the inserted value should be an object that takes the evaluation environment as a parameter, so that it can do either of the other behaviors as special cases. I did something as generalized as this during Penknife's compilation phase (involving the compilation environment), and this kind of environment-passing is also used by Kernel-like fexprs (vau-calculus) and Christiansen grammars.

---

"Now what I'm wondering is whether there is any useful equivalence to be found between a box and an environment."

I would say yes, but I don't think this is going to be as profound as you're expecting, and it depends on what we mean by this terminology.

I call something an environment when it's commonly used with operations that look vaguely like this:

  String -> VariableName
  (Environment, VariableName) -> Value
  (Environment, Ast) -> Program
Meanwhile, I call something a box when it's primarily used with an operation that looks vaguely like this:

  Box -> Value
This might look meaningless, but it provides a clean slate so we can isolate some impurity in the notion of "operation" itself. When a mutable box is unboxed, it may return different values at different times, depending on the most recent value assigned to that box. Other kinds of boxes include Racket parameters, Racket continuation marks, Racket promises, JVM ThreadLocals, and JVM WeakReferences, each with its own impure interface.

When each entry of an environment needs to have its own self-contained impure behavior, I typically model the environment as a table which maps variable names to individual boxes. The table's get operation is pure (or at least impure in a simpler way), and the boxes' get operation is impure.

You were wondering about equivalences, and I have two in mind: For any environment, (Environment, VariableName) is a box. For any box, we can consider that box to be a degenerate environment where the notion of "variable name" includes only a single name.
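Both equivalences can be made concrete in a few lines of JavaScript (the names are mine):

```javascript
// 1. (Environment, VariableName) acts as a box.
function envBox(env, name) {
  return {
    get: () => env[name],
    set: v => { env[name] = v; },
  };
}

// 2. A box acts as a degenerate environment where every variable
//    name resolves to the single box.
function boxEnv(box) {
  return { lookup: _name => box.get() };
}

const env = { x: 1 };
const b = envBox(env, "x");
b.get();   // 1
b.set(2);
env.x;     // 2

const tiny = boxEnv(b);
tiny.lookup("anything");  // 2: only one "variable" to find
```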

-----

1 point by Pauan 4568 days ago | link

"Doesn't step 1 need to use environment(s)?"

I think he's talking about run-time environments a la Kernel, Emacs Lisp, etc.

Nulan and Arc/Nu use a compile-time environment to replace symbols with boxes at compile-time. But that feels quite a bit different in practice from run-time environments (it's faster too).

---

"Are you suggesting a third thing we could insert here?"

Once again, I think he's referring to run-time environments. Basically, what he's saying is that you would use boxes at compile-time (like Nulan), but you would also have first-class environments with vau. The benefit of this system is that it's faster than a naive Kernel implementation ('cause of boxes), but you still have the full dynamism of first-class run-time environments. I suspect there'll be all kinds of strange interactions and corner cases though.

---

Slightly off-topic, but... I would like to point out that if the run-time environments are immutable hash tables of boxes, you effectively create hyper-static scope, even if everything runs at run-time (no compile-time).

On the other hand, if you create the boxes at compile-time, then you can create hyper-static scope even if the hash table is mutable (the hash table in Nulan is mutable, for instance).

-----

2 points by Pauan 4568 days ago | link

I'm not sure I understand your argument... aside from macros, how often do you use symbols? The only other case I can think of is using symbols as the keys of hash tables:

  (= foo (obj))
  (foo 'bar)
But in Nulan, hash tables use strings as keys, so that's no problem. As for your "really common list manipulation"... my system is actually better for that. Compare:

  (cons a b)                  ; Arc 3.1
  `(,a ,@b)                   ; Arc 3.1

  (list a b c)                ; Arc 3.1
  `(,a ,b ,c)                 ; Arc 3.1

  (join (list a) b (list c))  ; Arc 3.1
  `(,a ,@b ,c)                ; Arc 3.1

  'a ,@b                      # Nulan
  {a @b}                      # Nulan

  'a b c                      # Nulan
  {a b c}                     # Nulan

  'a ,@b c                    # Nulan
  {a @b c}                    # Nulan
---

"(f a b @c) looks so much nicer to me..."

I agree. Nulan supports @ splicing for both pattern matching and creating lists:

  # a is the first argument
  # b is everything except a and c
  # c is the last argument
  -> a @b c ...

  # works for list destructuring too!
  # this function accepts a single argument, which is a list
  # a is the first element of the list
  # b is everything in the list except a and c
  # c is the last element of the list
  -> {a @b c} ...

  # in addition to working for function arguments, it also works for assignment
  # a is 1
  # b is {2 3 4}
  # c is 5
  box {a @b c} = {1 2 3 4 5}

  # and of course you can nest it as much as you like
  box {a {b c {d} @e}} = {1 {2 3 {4} 5 6 7}}

  # equivalent to (join (list a) b (list c)) in Arc
  {a @b c}

  # equivalent to (apply foo bar qux) in Arc
  foo bar @qux
  
  # equivalent to (apply foo (join bar (list qux)))
  foo @bar qux
---

"I.e., if (car (cdr (obj key))) returned a box, you would be able to just call '= on it, with no thought for how you found the value."

In Nulan, you could do what you're talking about, except it would all have to be done at compile-time, because boxes only exist at compile-time. In addition, you wouldn't be able to set the box directly, you would instead generate code that sets the box at run-time, i.e. macros.

Nulan has the restrictions that it does because I wanted to compile to really fast JavaScript. Other languages (like Arc) don't have that restriction, so it should be possible to design a language that has boxes at run-time, in which case your idea should work.

-----

3 points by shader 4568 days ago | link

I actually really like symbols; the existence of a symbol type in Lisp is one of my favorite features. Technically in Nulan a 'symbol' is replaced by a 'box', and if your box had a string property called "name" that held the name of the variable, they would probably be interchangeable.

Either way, I agree that your list tools are generally more useful than quote/unquote. The main things I use it for are 1) to get a literal symbol and 2) to get something like list splicing. It looks rather arcane and adds clutter though, so in most other cases I wouldn't use it.

If there was another way to just splice in a list in the middle of the code the way Nulan seems to, I might not feel as attached to it.

-----

2 points by Pauan 4568 days ago | link

"I actually really like symbols, and the existence of a symbol type in lisp is one of my favorite features."

I too like symbols. But I think if you examine why you like symbols, you'll realize that you like them because... most other languages don't have a first-class way to refer to variables. But in Lisp you can, using symbols.

Well, Nulan has both symbols (representing unhygienic variables), and boxes (representing hygienic variables). It's just that hygienic variables are so much better in so many situations that there's not much reason to make it easy to create symbols in Nulan.

---

"Technically in Nulan a 'symbol' is replaced by a 'box', and if your box had a string property called "name" that held the name of the variable they would probably be interchangeable."

Yes, boxes have a name property; it's currently only used when printing the box. And yes, you could convert from a box to a symbol, but Nulan doesn't, because I haven't found a reason to.

-----

1 point by shader 4568 days ago | link

Yes, that is part of the reason I like them so much. The other part is that they're a way to legally use bare words as part of the syntax.

If JSON had a symbol type, Wat would be a lot less ugly.

-----

1 point by Pauan 4568 days ago | link

Actually, there is ONE situation I've encountered where I would have liked to use symbols... in my playlist program, I have a list of file extensions which are considered "audio". Here's how it looks in Arc:

  '(flac flv mid mkv mp3 mp4 ogg ogm wav webm wma)
In Nulan, that would have to be written like this:

  {"flac" "flv" "mid" "mkv" "mp3" "mp4" "ogg" "ogm" "wav" "webm" "wma"}
Or perhaps like this:

  "flac flv mid mkv mp3 mp4 ogg ogm wav webm wma".split " "
To work around that, I wrote a short and simple "words" macro:

  $mac words -> @args
    '{,@(args.map -> x "@x")}

  words flac flv mid mkv mp3 mp4 ogg ogm wav webm wma

-----

2 points by akkartik 4568 days ago | link

Wart uses the @splice notation. I've sung its praises before: http://arclanguage.org/item?id=17281

-----

3 points by shader 4568 days ago | link

This relates to yet another idea I think I've actually mentioned before.

Namely, making the lists that form the code/AST have reverse links, so you can mutate above the macro call level instead of just inserting arbitrary code in place. This wouldn't be feasible for general lists, since a sub-list can be referenced in more than one place, but for code, each piece is generally considered unique even if it looks the same.

Anyway, this would allow for effects ranging from splicing to arithmetic and other much more evil and nefarious but possibly useful effects. I haven't thought through all of the implications, and I bet most of them are negative, but it would still be interesting to consider.

An implementation of intermediate splicing would be something like:

  (mac (list)
    (= (cdr list) (cdr (parent)))
    (= (cdr (parent)) list))
Where you replace (parent) with whatever technique would get the parent cons cell whose car is the macro call.

-----

3 points by rocketnia 4566 days ago | link

"An implementation of intermediate splicing would be something like[...]"

Which of these interpretations do you mean?

  (list 1 2 (splice 3 4) 5)
  -->
  (list 1 2 3 4 5)
  
  
  (list 1 2 (splice (reverse (list 4 3))) 5)
  -->
  (list 1 2 3 4 5)
I wrote the rest of this post thinking you were talking about the first one, but right at the end I realized I wasn't so sure. :)

---

"Namely, making the lists that form the code/ast have reverse links, so you can mutate above the macro call level, instead of just insert arbitrary code in place."

I'll make an observation so you can see if it agrees with what you're thinking of: The expression "above the macro call level" will always be a function call or a special form, never a macro call. If it were a macro call, we'd be expanding that call instead of this one.

---

For these purposes, it would be fun to have a cons-cell-like data structure with three accessors: (car x), (cdr x), and (parent x). The parent of x is the most recent cons-with-parent to have been constructed or mutated to have x as its car. If this construction or mutation has never happened, the parent is nil.

Then we can have macros take cons-with-parent values as their argument lists, and your macro would look like this:

  (mac splice list
    (= (cdr list) (cdr (parent list)))
    (= (cdr (parent list)) list))
Unfortunately, if we call (list 1 2 (splice 3 4) 5), then when the splice macro calls (parent list), it'll only see ((splice 3 4) 5). If it calls (parent (parent list)), it'll see nil.
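A JavaScript sketch of the structure shows why (my construction of the parent rule: parent is set only when a cell appears as a car):

```javascript
// My construction of the parent rule: a cell's parent is set when
// (and only when) it is placed in the car of another cell.
function cons(car, cdr) {
  const cell = { car, cdr, parent: null };
  if (car && typeof car === "object") car.parent = cell;
  return cell;
}

// Build (list 1 2 (splice 3 4) 5) inside-out, as a reader would:
const spliceCall = cons("splice", cons(3, cons(4, null)));
const tail = cons(spliceCall, cons(5, null));  // ((splice 3 4) 5)
cons("list", cons(1, cons(2, tail)));

spliceCall.parent === tail;        // true: only sees ((splice 3 4) 5)
spliceCall.parent.parent === null; // true: tail appears only as a
                                   // cdr, so it never got a parent
```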

---

Suppose we have a more comprehensive alternative that lets us manipulate the entire surrounding expression. I'll formulate it without the need to use conses-with-parents or mutation:

  ; We're defining a macro called "splice".
  ; The original code we're replacing is expr.
  ; We affect 1 level of code, and our macro call is at location (i).
  (mac-deep splice expr (i)
    (let (before ((_ . args) . after)) (cut expr i)
      (join before args after)))
If I were to implement an Arc-like language that supported this, it would have some amusingly disappointing consequences:

  (mac-deep subquote expr (i)
    `(quote ,expr))
  
  
  (list (subquote))
  -->
  (quote (list (subquote)))
  
  
  (do (subquote))
  -->
  ((fn () (subquote)))
  -->
  ((quote (fn () (subquote))))
  
  
  (fn (subquote) (+ subquote subquote))
  -->
  (fn (subquote) (+ subquote subquote))

-----

3 points by manuelsimoni 4570 days ago | link

As for using JSON source: yeah, it's a bit ugly, and you have to type more than with S-expressions. BUT: the typing overhead is constant, so I don't really worry about it. After all, the real typing savings come from the macros that are enabled by this syntax.

What I really like about it is that by using JSON as syntax, Wat is JavaScript on a fundamental level. Wat can coexist within JS files, heck even within single JS functions. The ease of deployment enabled by this is immense.

-----

2 points by shader 4570 days ago | link

Yes, that is a nice feature. I also like the fact that it results in an extremely small base and more separation of concerns. Someone else can write an sexp-to-JSON parser rather easily.

I'm just lazy enough to wish that someone else had already done so :P
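For what it's worth, such a converter really is small. A JavaScript sketch that reads S-expressions into nested arrays of strings and numbers (no string literals, quote syntax, or error handling, all of which real Wat input might need):

```javascript
// Minimal sketch: tokenize by padding parens with spaces, then
// recursively read lists; bare tokens become numbers or strings.
function readSexp(text) {
  const tokens = text.replace(/\(/g, " ( ").replace(/\)/g, " ) ")
                     .trim().split(/\s+/);
  let i = 0;
  function read() {
    const tok = tokens[i++];
    if (tok === "(") {
      const list = [];
      while (tokens[i] !== ")") list.push(read());
      i++;  // consume ")"
      return list;
    }
    const n = Number(tok);
    return Number.isNaN(n) ? tok : n;  // number, else symbol-as-string
  }
  return read();
}

readSexp("(def (fact n) (if (< n 2) 1 (* n (fact (- n 1)))))");
// -> ["def", ["fact", "n"],
//     ["if", ["<", "n", 2], 1, ["*", "n", ["fact", ["-", "n", 1]]]]]
```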

-----

3 points by rocketnia 4570 days ago | link

"I'm just lazy enough to wish that someone else had already done so :P"

It's funny, I've done exactly that recently, with the same goal of getting away from JavaScript with as little cruft as possible. My goal was to make a hackish little language (nevertheless better than JS) and then make a better language on top of it.

Awkwardly, by the time I had a nice reader, I realized I didn't have any particular semantics in mind for the little language! So I decided to work on the semantics I really cared about instead. :-p Although this semantics is still incomplete, the reader is there whenever I'm ready for it. Maybe it could come in handy for Wat.

I've put up a demo so you can see the reader in action: http://rocketnia.github.io/era/demos/reader.html

It's part of Era (https://github.com/rocketnia/era), and specifically it depends on the two files era-misc.js and era-reader.js.

It's yet another read-table-based lisp reader. Every syntax, including lists, symbols, and whitespace, is a reader macro. My implementation of the symbol syntax is a bit nontraditional, because I also intend to use it for string literals.

(For future reference, this is the most recent Era commit as of this posting: https://github.com/rocketnia/era/tree/ab4bf206c442ecbc645b38...)

-----

2 points by Pauan 4569 days ago | link

You could also use Nulan's parser:

https://github.com/Pauan/nulan/blob/javascript/src/NULAN.par...

I specifically designed it so it can run stand-alone, without any dependencies on the rest of Nulan. It takes a string and returns a JS array of strings, numbers, and Symbols. You can easily write a simple function to transform the output so that it's compatible with wat.

You're also free to modify it, though I'm not sure what license to use for Nulan. It'll probably be either the BSD/ISC license or Public Domain.

-----

3 points by shader 4574 days ago | link | parent | on: Arc DB

If arc had support for something like DBA or SQLAlchemy built in, I might just have used it with either postgres or sqlite. However, neither of those databases really fit the arc data model very well, imo, because arc is very hash table and list oriented. Objects have very little in the way of a set schema, and hash tables map pretty well to... hash tables.

Anyway, I mostly want to leave all the objects in memory and use direct references between them; my data relations aren't that complicated, and explicit relations where necessary are actually fairly efficient. In fact, that's what most ORMs a la SQLAlchemy seem to do; whenever an object is loaded, you can specify desired relations that also get loaded in memory, so you don't have to explicitly query the database each time.

Memory is cheap these days, and I was hoping for something that allowed versioning and perhaps graph-db features.

-----

2 points by akkartik 4574 days ago | link

Hmm, do you care about threading and consistency at all? If not, you could probably do everything with just arc macros over the existing flat file approach..

-----

3 points by shader 4571 days ago | link

I think that some form of scalability would be valuable, but that could easily be achieved with some sort of single threaded worker for each db 'server', and then have multiple instances running to provide the scalability. In order to make the single threaded semantics work well even in a multi-threaded application, I already have a short library for erlang-style pattern matched message passing.

Given the data volumes I've been planning on working with, I mostly want to use the permanent storage for history and fault tolerance, as opposed to live access. That could probably be handled in-memory for the most part. So maybe some form of flat file system would work without causing too many problems.

I originally started using git to effectively achieve that design without having to manage the trees, history, and diff calculation myself, but I've discovered that storing thousands of tiny objects in the git index may not be very efficient. I still think something similar is a good idea, but I would want to separate 'local' version states for each object from the 'global' version, so that it doesn't take forever to save the state of a single object. Maybe storing each object in a git 'branch' with the guid of the object as the branch name would work, since only one object would be in each index. The overhead for saving each object would be slightly higher, but it should be constant, rather than linear with the total number of objects.

Any obvious flaws with that idea that I'm missing? Have any better ideas or foundations to build off of?

-----

1 point by akkartik 4571 days ago | link

Building atop git is an interesting idea, and you clearly have more experience with it. Do you have any pointers to code?

-----

3 points by shader 4571 days ago | link

Here's the code I had written before, using the shell git interface to interact with the repo: https://github.com/shader/metagame/blob/master/git-db.arc

That code is pretty rudimentary, but allows low level access to git commands from arc, plus storage and retrieval of arc objects. After my previous comment though, I'll probably change it so that each object gets a separate branch, with 'meta branches' listing which branches to load if necessary.

-----

1 point by akkartik 4573 days ago | link

Let's build this for the LISP contest! http://arclanguage.org/item?id=17640

-----

4 points by shader 4574 days ago | link | parent | on: Regular expressions in Arc

One thing I've been thinking about recently that might be handy for implementing regex or other features would be to allow macros access to the string contents of their original invocation. I've been in favor of meta-code features like that to enable help functionality and exploratory programming features, like 'src on anarki, but it might also help users write their own "reader macros", because within the scope of the call, any macro can become a reader macro. Then, as long as you have a decent string manipulation and parser library, it would be easy to add a regex library that allows you to call (re /a?/), and add custom regex syntax that way. That, or just make a standard way to add special characters to the read macro list, like the #* set in racket.

How are you all currently supporting read macros, and what is your opinion on the best way to do so?

-----

2 points by akkartik 4574 days ago | link

Pauan has a Pratt parser baked into his language (https://github.com/Pauan/nulan) while my attitude for wart has been "if you want to change the syntax, hack the parser". Not to be flippant; my goal explicitly is to make the code dead simple for anybody* to hack on. Not there yet, but I'd love for you to take a stab at your regex idea with it :) Perhaps we could pair on it sometime?

* who knows C. Other terms and conditions may apply.

-----

2 points by Pauan 4574 days ago | link

I would like to point out that although it's a Pratt parser, it's been specifically modified to work better with Lisps, meaning it operates on lists of symbols rather than on a single expression. I have not seen another parser like it.

This makes it much more powerful while also being much easier to use. Using the Nulan parser, adding in new syntax is as easy as writing a macro!

For instance, in Nulan, the "->" syntax is used for functions:

  foo 1 2 3 -> a b c
    a + b + c
The above is equivalent to this Arc code:

  (foo 1 2 3 (fn (a b c)
    (+ a b c)))
And here's how you can implement the "->" syntax in Nulan:

  $syntax-rule "->" [
    priority 10
    order "right"
    parse -> l s {@args body}
      ',@l (s args body)
  ]
As you can see, it's very short, though it might seem cryptic if you don't understand Nulan. Translating it into Arc, it might look like this:

  (syntax-rule "->"
    priority 10
    order "right"
    parse (fn (l s r)
            (with (args (cut r 0 -1)
                   body (last r))
              `(,@l (,s ,args ,body)))))
The way that it works is, the parser starts with a flat list of symbols. It then traverses the list looking for symbols that have a "parse" function.

It then calls the "parse" function with three arguments: everything to the left of the symbol, the symbol, and everything to the right of the symbol. It then continues parsing with the list that the function returns.

So basically the parser is all about manipulating a list of symbols, which is uh... pretty much exactly what macros do.

Going back to the first example, the "parse" function for the "->" syntax would be called with these three arguments:

  1  (foo 1 2 3)
  2  ->
  3  (a b c (a + b + c))
It then destructures the 3rd argument so that everything but the last element is in the variable "args", and the last element is in the variable "body":

  args  (a b c)
  body  (a + b + c)
It then generates the list using "quote", which is then returned.

Basically, it transforms this:

  foo 1 2 3 -> a b c (a + b + c)
Into this:

  foo 1 2 3 (-> (a b c) (a + b + c))
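Putting those pieces together, the traversal plus a "->" rule might be sketched in JavaScript like this. The names here are hypothetical, and the real Nulan parser also handles priorities, delimiters, and nesting, none of which this toy version does:

```javascript
// Walk a flat list of symbols; when a symbol has a "parse" rule, call it
// with (left, symbol, right) and continue parsing the list it returns.
function parseList(list, rules) {
  for (var i = 0; i < list.length; i++) {
    var rule = typeof list[i] === "string" && rules[list[i]];
    if (rule) {
      return parseList(
        rule.parse(list.slice(0, i), list[i], list.slice(i + 1)),
        rules);
    }
  }
  return list;
}

// The "->" rule: everything after the symbol except the last element is
// the argument list; the last element is the body.
var rules = {
  "->": { parse: function (l, s, r) {
    return l.concat([[s, r.slice(0, -1), r[r.length - 1]]]);
  } }
};
```

With that, parseList(["foo", 1, 2, 3, "->", "a", "b", "c", body], rules) produces ["foo", 1, 2, 3, ["->", ["a", "b", "c"], body]], matching the transformation shown above.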
As another example, this implements Arc's ":" ssyntax, but at the parser level:

  $syntax-rule ":" [
    priority 100
    order "right"
    delimiter %t
    parse -> l _ r
      ',@l r
  ]
So now this code here:

  foo:bar:qux 1 2 3
Will get transformed into this code here:

  foo (bar (qux 1 2 3))
I've never seen another syntax system that's as easy and as powerful as this.

Oh yeah, and there's also two convenience macros:

  $syntax-unary "foo" 20
  $syntax-infix "bar" 10
The above defines "foo" to be a unary operator with priority 20, and "bar" to be an infix operator with priority 10.

Which basically means that...

  1 2 foo 3 4  =>  1 2 (foo 3) 4
  1 2 bar 3 4  =>  1 (bar 2 3) 4
Here's a more in-depth explanation of the parser:

https://github.com/Pauan/nulan/blob/780a8f46cb4ff90e849c03ea...

Nulan's parser is powerful enough that almost all of Nulan's syntax can be written in Nulan itself. The only thing that can't be is significant whitespace.

Even the string syntax (using "), whitespace ( ), and the various braces ({[]}) can be changed from within Nulan.

-----

1 point by shader 4571 days ago | link

I'm actually fairly convinced that my original idea as stated above wouldn't work, because without specific syntactical support already built in for whatever new reader you're trying to add, the original reader would have no good way of knowing when your macro ended, i.e.:

  (re /\)*/)
How does the original reader know to pass all of "/\)*/" in to the 're macro? Maybe there's some clever way to tell the difference between "escaped" special characters and normal ones, but it would limit the possibilities on the temp reader.

Maybe one option would be to have more flexibility when defining macros in the first place by taking advantage of the fact that macros can be defined and evaluated at the reader stage, so they can specify their own read semantics for their evaluation if they choose. I.e. make it so that macro definitions can specify a more generic level of control than just unevaluated, pre-parsed s-exps. That would make them more of scoped 'reading macros' that shadow the existing reader, rather than 'reader macros' that just hook into it.

-----

3 points by rocketnia 4571 days ago | link

Your train of thought is very similar to a factor in several of my designs over time: Jisp[1], Blade[2], Penknife[3], Chops[4], and now the syntax I'm planning to use with Era[5].

I've always used bracket nesting to determine where the syntax's body begins and ends. This way, most code requires no escape sequences, and the few times escape sequences are necessary, at least they stand a chance of being consistent when cutting-and-pasting across staging levels.

  (re /\)*/)       ; Broken.
  (re /(?#()\)*/)  ; Fixed with a regex comment. (Does this work?)
  (re /\>*/)       ; Fixed by redesigning the escape sequence.
Reader macros are a different approach, where the entire rest of the file/stream has unknown syntax until the reader reaches that point. That's simple in its own way, but I prefer not to make my code that lopsided in its semantics I guess. :)

(EDIT: Half an hour after I posted this, I realized I had a big mess of previous drafts tacked onto the end. I've deleted those now.)

---

[1] Jisp was one of the first toy languages I made, and it was before I programmed in Arc (or any other lisp). When it encounters (foo a b c), it resolves "foo" to an operator and sends "a b c" to that operator.

  > (if (eq "string (doublequote string)) (exit) 'whoops)
  [The REPL terminates.]
[2] Blade didn't get far off the ground, but it was meant to have a similar parser, with the explicit goal of making it easy to combine several languages into a single compiled program, with an extra emphasis on having no accidental code-order-dependent semantics at the top level. I switched to square brackets-- [foo a b c] --since these didn't require the shift key.

  [groovy
      import com.rocketnia.blade.*
      
      define( [ "out", "sample-var" ], BladeString.of( "sample-val" ) )
  ]
[3] Penknife was meant to be a REPL companion to Blade, and I developed the syntax into a more complicated, Arc-like combination of infix and prefix syntaxes (http://www.arclanguage.org/item?id=13071). Penknife was complete enough that I used it as a static site generator for a while. However, at this point I realized all the custom syntax processing I was doing was really killing the compile-time performance.

  arc.prn.q[Defining fn-iferr.]
  [fun fn-iferr [body catch]
    [defuse body [hf1:if-maybe err meta-error.it
                   catch.err
                   demeta.it]]]
  
  arc.prn.q[Defining iferr.]
  [mac iferr [var body catch]
    qq.[fn-iferr tf0.\,body [tf [\,var] \,catch]]]
[4] Chops is a JavaScript library that achieves Blade-like parsing without any goals for infix treatment or general-purpose programming. I use it as a markup language and a JavaScript preprocessor now that my static site generator runs in the browser.

  $rc.rcPage( "/", $cg.parseIn( [str RocketN[i I]A.com] ),
      "19-Nov-2012", $cg.parseIn( [str 2005[en]2010, 2012] ),
      { "title": "RocketNIA.com, Virtual Index of Ross Angle",
          "breadcrumbs": $cg.parseIn(
              [str RocketN[i I]A.com: Virtual Index of Ross Angle] ) },
      $cg.parse( [str
  
  ((This is the open source version of my site.
  [out http://www.rocketnia.com/ The online version] has a bit more
  content.))
  
  ...
  
      ] ) )
[5] Era is a module system I'm making, and I intend to compile to those modules from a lisp-like language. I've switched back to parentheses-- (foo a b c) --because smartphone keyboards tend to omit square brackets. The code (foo a b c) parses as a four-element list of symbols in the hope of more efficient processing, but the code foo( a b c) parses as a single symbol named "foo( a b c)".

-----

1 point by akkartik 4571 days ago | link

The bouncing between parens and square brackets is interesting ^_^ I weakly feel it's not worth optimizing for what's easy to type because things can change (or be changed, with keybindings, etc.) so easily. Better to optimize for how things look. But even there, parens vs brackets is in the eye of the beholder.

-----

1 point by akkartik 4571 days ago | link

"I've always used use bracket nesting to determine where the syntax's body begins and ends."

I have no idea what you mean by bracket nesting, or by staging levels. It also wasn't clear what the escape sequence is in the third example.

-----

3 points by rocketnia 4571 days ago | link

"bracket nesting"

Whoops, I can't believe I didn't use the phrase "balanced brackets" instead. ^_^

The following two pieces of text may be similar, but I'd give them significantly different behavior as code:

  (foo a b (bar c d) (baz e) f)
  (foo a b bar c d) (baz e f)
My systems don't provide any way to pass the string "a b bar c d) (baz e f" to an operator.

---

"staging levels"

Staged programming is where a program generates some code to run later on, perhaps as a second program in some sense--especially if that boundary is enforced by a need to serialize, transmit, or sandbox the second program rather than executing it here and now. Staged programming has some implications for syntax, since it's valuable to be able to see the code we're generating.

Most languages use " to denote the beginning and end of a string, so they can't also use " to represent the character " inside the string. This can make it frustrating to nest code within code. I'll use a JavaScript example.

  > eval( "eval( \"1 + 2\" )" )
  3
  > eval( "eval( \"eval( \\\"eval( \\\\\\\"1 + 2\\\\\\\" )\\\" )\" )" )
  3
While all these stages are JavaScript code, they all effectively use different syntax. It's not easy to copy and paste code from one stage to another.

Suppose we identify the end of the string by looking for a matching bracket, possibly with other pairs of matched brackets in between. I'll use ~< and >~ as example string brackets.

  > eval( ~<eval( ~<1 + 2>~ )>~ )
  3
  > eval( ~<eval( ~<eval( ~<eval( ~<1 + 2>~ )>~ )>~ )>~ )
  3
This fixes the issue. The same code is used at every level.

In JavaScript, technically we can implement delimiters like these if we're open-minded about what a delimiter looks like. We just need a function str() that turns a first-class string into a string literal.

  > str( "abc" )
  "\"abc\""

   Open string:  " + str( "
  Close string:  " ) + "

  > eval( "eval( " + str( "1 + 2" ) + " )" )
  3
  > eval( "eval( " + str( "eval( " + str( "eval( " + str( "1 + 2" ) + " )" ) + " )" ) + " )" )
  3
Now the code is consistent! Consistently infuriating. :-p
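For what it's worth, JavaScript's JSON.stringify already behaves like that str() function for ordinary strings, so the scheme above runs as written with a one-line definition:

```javascript
// str() turns a first-class string into a string literal by adding
// quotes and escaping; JSON.stringify does this for plain strings.
function str(s) { return JSON.stringify(s); }
```

So str("abc") produces the six-character literal "abc" with its quotes and escapes, and the nested eval examples above evaluate to 3.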

In Arc, we delimit code using balanced ( ). The code isn't a string this time, but the use of balanced delimiters has the same advantage.

  > (eval '(eval '(eval '(eval '(+ 1 2)))))
  3
This advantage is crucial in Arc, because any code that uses macros already runs across this issue. A macro call takes an s-expression, which contains a macro call, which takes an s-expression....

Since we're now talking about macros that take strings as input, let's see what happens if Arc syntax is based on strings instead of lists.

  > (let total 0 (each x (list 1 2 3) (++ total x)) total)
  6

  > "let total 0 \"each x \\\"list 1 2 3\\\" \\\"++ total x\\\"\" total"
  6
If we use balanced ( ) to delimit strings, we're back where we started, at least as long as we don't look behind the curtain.

  > (let total 0 (each x (list 1 2 3) (++ total x)) total)
  6
If you want working code for a language like this, look no further than Penknife. :)

---

"It also wasn't clear what the escape sequence is in the third example."

Are you talking about this one?

  (re /\>*/)
The original code would be broken in my approach because it uses ) in the middle of the regex, so the macro's input would stop at "/\". This fix addresses the issue by using a hypothetical escape sequence \> to match a right parenthesis, rather than using the standard escape sequence \).

If you're talking about my Penknife code sample, the "qq." part is quasiquote, and the \, part is unquote. Quasiquotation is relatively complicated here due to the fact that it generates soup, which is like a string with little pieces floating in it. :-p Penknife has no s-expressions, so it was either this foundational kludge or the hard-to-read use of manual AST constructors.

It's hard to count these examples with a whole number. XD Let me know if you were talking about my Blade code sample (the third code block in the post) or my Jisp code sample (third if you count the two example regex fixes separately).

-----

1 point by akkartik 4571 days ago | link

Super useful, thanks. Yes, you correctly picked the code I was referring to :)

That issue with backslashes you're referring to, I've heard it called leaning toothpick syndrome.

-----

2 points by rocketnia 4571 days ago | link

"Yes, you correctly picked the code I was referring to :) "

Which of my guesses was correct? :-p

---

"That issue with backslashes you're referring to, I've heard it called leaning toothpick syndrome."

Nice, I hadn't heard of that! http://en.wikipedia.org/wiki/Leaning_toothpick_syndrome

It might be worth pointing out that the LTS appearing in my examples is more pronounced than it needs to be. The usual escape sequence \\ for backslashes creates ridiculous results like \\\\\\\". If we use \- to escape \ we get the more reasonable \--" instead, and then we can see the nonuniform nesting problem without that other distraction:

  > eval( "eval( \"1 + 2\" )" )
  3
  > eval( "eval( \"eval( \-"eval( \--"1 + 2\--" )\-" )\" )" )
  3
Here's another time I talked about this: http://arclanguage.org/item?id=14915

-----

2 points by dido 4571 days ago | link

I ran into a similar issue with regex syntax when attempting to incorporate it into Arcueid's reader. There seems to be no easy way to parse a regular expression using Perl-like /.../ syntax, not if you also want symbols that use /'s for other things, e.g. the division function. Arcueid thus uses, for now, r/.../ for regular expressions; that syntax can be distinguished from other legitimate uses of symbols with a minimum of fuss.

-----

1 point by akkartik 4571 days ago | link

Wart's tokenizer already knows about backslashes inside strings, so "\"" becomes one token. It seems plausible to try tokenizing in a backslash-aware way everywhere and not just inside strings. Other than that you would have to treat slashes as a delimiter like double-quotes.

It might be an ugly design, but I think it would work, and it would be worth trying out.
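A minimal version of that idea — reading one delimited token with backslash awareness, so the whole /.../ literal survives as a single token — might look like this. This is a hypothetical sketch, not wart's actual tokenizer, and it does nothing clever about a trailing backslash:

```javascript
// Read one delimiter-bounded token starting at `start`, keeping
// backslash-escaped characters (including an escaped delimiter) inside
// the token. Works the same for "/" as for double quotes.
function readDelimited(src, start) {
  var delim = src[start];
  var tok = delim;
  for (var i = start + 1; i < src.length; i++) {
    tok += src[i];
    if (src[i] === "\\") { tok += src[++i]; }   // keep the escaped char
    else if (src[i] === delim) { return tok; }  // closing delimiter
  }
  throw new Error("unterminated " + delim + " token");
}
```

With this, (re /a\/b/) tokenizes the regex as the single token /a\/b/ instead of splitting at the inner slash.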

-----

2 points by shader 4633 days ago | link | parent | on: State of the arc

Hello akkartik, I'm glad to be back. I've been lurking on and off for the past few years, but it will be nice to get involved again.

I guess I should be fair and try yours out too while I'm at it, though I have to say that the whitespace sensitivity makes me somewhat cautious. For some reason, I've always liked the more traditional sexpr syntax for lisp. Does wart mind if I use parens for everything?

Even more disconcerting would probably be your infix support. Maybe it won't be as much of a problem as I'm expecting, but I like being able to use random symbols in my variable names.

Also, numbered git commits strike me as a little odd, as do the numbered source files. In the latter case it's little more than taste, and I see that Pauan is doing the same thing. Maybe you have a good reason for it and I'd start doing the same if I only understood.

Anyway, enough casting of doubt on someone else's pet project. I'm certainly interested in a clean language with the possibility of fexprs to try out and maybe even keyword args. I'm not sure what I would actually need them for, but more power is never something I will turn down.

I'm also somewhat interested in the wat/taf projects, but they seem to be a bit more experimental right now.

--

As for my project plans, I'm thinking of doing two or three web service projects on the side, as a long term investment counter-point to my current hourly contracting job.

The first one I'm thinking of focusing on is a sort of easy, data-driven unit-testing-as-a-service concept. If you've ever seen the fit or fitnesse testing frameworks, this idea was originally based off of those. Instead of writing unit test code, you would use a website to input test values and corresponding expected results in a table format. The first row of the table specifies the function or object being tested and its arguments or property names, and each row after would give the values for that test case, with the last column or set of columns specifying the return value. Fitnesse did that for c# and java, but it had a few major flaws. In the first case, it would only interact with classes that inherited from the fitnesse test classes, so you were forced to write test harness code anyway. Second, the user had to format the tables manually using a wiki format, so it required a bit too much manual formatting and there wasn't any way to provide additional metadata or add any more dimensions to the tables.

The first few incarnations would probably be something really simple that would only be useful for testing code locally, but eventually it would be expanded to a web service that supports multiple languages and has a way to point it at any vcs repo and run tests interactively in the cloud. This would hopefully be a cheapish testing and specification solution for startups or foss projects that would be easy enough to add to existing code that people would actually do it. That and an enterprise version that can be deployed internally, which would hopefully make it so that business analysts and the QA team can write, run, and review the tests without having the dev team write a separate test harness for them.

There's a bit more to the idea, but right now none of it has been written, so I probably shouldn't advertise features that I may never get to, or might not even be feasible in the first place. Of course, I like talking even more than I like coding, so I'm sure we can discuss it if you're interested. In fact, I would be very open to discussion, as I'm sure what other people actually want/need/would be willing to pay for in a test system won't match up exactly my own ideas.

-----

2 points by Pauan 4633 days ago | link

"In the latter case its little more than taste, and I see that Pauan is doing the same thing."

The reason I numbered them was just to make it easier to navigate the code. As a user, if you see "01 nu.rkt" you know it's loaded first, and that "02 arc.arc" is loaded second. And each one builds on the stuff defined previously, so you can read it in a linear order.

I only did that for the stuff that's automatically loaded when you start up Arc. You'll notice that the "lib" and "app" stuff is un-numbered. And I don't expect user-created code or libraries to use numbers! So I definitely don't take it as far as wart does.

-----

1 point by akkartik 4633 days ago | link

Ah, I was unaware of fitnesse! Thanks for the pointer, that's a really neat idea. Tests are a huge part of wart's goal of 'empowering outsiders'[1].

Sucks that fitnesse is stuck in the java eco-system. Just porting it to lisp/arc/wart would be awesome for starters..

Arguably much of the benefit of testing comes from the same person doing both programming and testing. Organizations which separate those two functions tend to gradually suck. If you accept that, inserting a web interface between oneself and one's tests seems like a bad idea. Perhaps fitnesse-like flows would be best utilized to involve the non-programmer in integration testing, testing the entire site as a whole rather than individual functions. Perhaps script interactions with a (non-ajax for starters!) app so that the CEO/QA engineer doesn't have to know about REST and PUT/GET? Hmm, that would be cool..

Don't mind me, I'm just thinking aloud :)

[1] https://github.com/akkartik/wart/blob/531622db6a/000preface. I spend a lot of time thinking about this, and at half an opening will fill your ears about all the ways in which tests, whilst awesome, aren't sufficient. Along with ideas for complementary mechanisms.

-----

3 points by shader 4633 days ago | link

Organizations which separate those two functions tend to gradually suck

Hmm... Well, that's certainly a valid opinion, and it may even be true in a lot of cases, however I think the issue is largely due to two other related issues: 1) The requirements aren't specified clearly enough, and are dissociated from the tests as well, and 2) they just don't have good enough tools.

Tests can serve many purposes. The most basic, given the fundamental concept of a test, is to tell you when something works, or when it doesn't. TDD focuses on the former; unit testing and regression testing focus more on the latter. Tests can be used at different points in the development cycle, and if you use the wrong kind at the wrong point, it's not really all that helpful.

My concern is that it's too difficult to write the correct kind of test, so most developers use the wrong kind or don't use any at all. There's nothing really wrong with that; I think it's just an unfortunate inefficiency, like trying to debug without stacktraces. >.> Hmm. Something to look forward to going back to arc I suppose. Anyway, my goal is to make testing easy enough to do, either for developers who just want a quick way to check if they broke something after making a 'minor' change, or for larger companies that want to know that all their requirements have actually been met.

So, to solve the first problem I'm hoping to utilize a lot of reflection and code inspection so that at least the outline of the test cases can be generated automatically, if not more. Then it should be really easy for the programmer to just add the missing details, either as specific test vectors or by using a more general definition of requirements using something like QuickCheck's generators.
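For reference, the generator idea can be sketched in a few lines of JavaScript. This is a toy, with none of real QuickCheck's shrinking or sized generators:

```javascript
// Toy property-based check: run a predicate against many generated
// inputs and report the first counterexample found, QuickCheck-style.
function forAll(gen, prop, runs) {
  for (var i = 0; i < (runs || 100); i++) {
    var x = gen();
    if (!prop(x)) return { ok: false, counterexample: x };
  }
  return { ok: true };
}

// A generator for small integers in [-100, 100]
function smallInt() { return Math.floor(Math.random() * 201) - 100; }
```

A test case row in the table UI would then be either a concrete vector (input, expected output) or a generator plus a property like the one above.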

In the long run the plan is for the tool to be able to support working the other direction, from requirements to tests. Hopefully with better tool support, and more intelligent interaction with the system under test, it should be possible for the architects to specify the requirements, and the tool should be able to verify that the code works.

Yes, divorcing tests from code could mean that different people do them. It doesn't have to be the case, but it becomes a possibility. And that means they will have a different perspective on the operation of the system, but not necessarily a worse one. If it's the architects or BAs writing the tests, then they might actually have more information about how the system should be working than the programmers, especially in the case that the programmers are outsourced. At which point allowing someone else to write the tests is an improvement. When developers write the tests, it doesn't help if they use the same incorrect understanding of the requirements for both the tests and the code.

Hopefully, a tool that supports rapidly filling in tests based on code analysis (which would help anyone who doesn't know much about the actual code base match it up with the requirements they have) would reduce boilerplate and barriers to testing, making it much easier to use while developing. Maybe if it gets easy enough, developers will find that testing actually saves enough time to be worth the few seconds spent specifying test vectors for each method. And if it can do a good enough job at turning requirements into tests in a way that is clear enough to double as documentation, it should save the architects and BAs enough time, as well as make implementation easier for developers, that I might actually be able to sell licenses :P

-----

2 points by akkartik 4633 days ago | link

"If it's the architects or BAs writing the tests, then they might actually have _more_ information about how the system should be working than the programmers,"

Oh, absolutely. I didn't mean to sound anti-non-programmer.

I tend to distrust labels like 'architect' and 'programmer'. Really there's only two kinds of people who can work on something: those who are paid to work on it, and those who have some other (usually richer and more nuanced) motivation to do so. When I said, "Organizations which separate those two functions tend to gradually suck", I was implicitly assuming both sides were paid employees.

If non-employees (I'll call them, oh I don't know, owners) contribute towards building a program the result is always superior. Regardless of how they contribute. They're just more engaged, more aware of details, faster to react to changes in requirements (which always change). Your idea sounds awesome because it helps them be more engaged.

But when it's all paid employees and you separate them into testers and devs, then a peculiar phenomenon occurs. The engineers throw half-baked crap over to testers because, well, it's not their job to test. And test engineers throw releases back at the first sign of problems because, well, it's not their job to actually do anything constructive. A lot of shuffling back and forth happens, and both velocity and safety suffer, because nobody cares about the big picture of the product anymore.

(Agh, this is not very clear. I spend a lot of time thinking about large organizations. Another analogous example involves the patent office: http://akkartik.name/blog/2010-12-19-18-19-59-soc. Perhaps that'll help triangulate on where I'm coming from.)

(BTW, I've always wondered: what's that cryptic string in your profile?)

-----

2 points by akkartik 4633 days ago | link

"Does wart mind if I use parens for everything?"

Nope, that will always work as expected: https://github.com/akkartik/wart/blob/531622db6a/004optional.... I threw it out there since you mentioned python :) but syntax is the least important experiment in wart.

Re numbered files: they keep me from ever needing to refer to filenames in code: https://github.com/akkartik/wart/blob/531622db6a/001organiza...

Don't feel like you have to be fair :) The world isn't fair, and I understand about differences in taste. My goal with wart and some other stuff has been to figure out how to empower outsiders to bend a codebase to their will and taste with a minimum of effort. For example, one experiment I'd love to perform on you is to measure how long it takes you to fork wart to toss out infix and get back your beloved hyphens :) But no pressure.

-----

1 point by shader 4634 days ago | link | parent | on: State of the arc

Interesting. An arc rebuild that's faster and cleaner while maintaining most compatibility certainly sounds attractive. I'll have to try it out at least a bit.

A slightly unrelated question: Have you ever tried compiling a standalone binary for Nu or anarki, at least for the compiler? I'm pretty sure racket can do that, but I don't think I've heard of anyone doing it.

As for fexprs, I think something can probably be done even in arc to some extent, using a copy of the original code and dynamic recompilation with memoization for speed. It would have the advantage of also enabling the self-documentation features that I added to anarki too, though it might be a bit slow.
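The fexpr idea mentioned above can be sketched in miniature. Here is a hypothetical Python analogy (not Arc's actual mechanism, and all names are made up): an fexpr receives its operands unevaluated, together with the caller's environment, and decides for itself what to evaluate.

```python
# Minimal fexpr-style evaluator (hypothetical sketch, not Arc internals).
# Symbols are strings, calls are lists; an fexpr gets raw operands + env.

def evaluate(expr, env):
    if isinstance(expr, str):              # symbol -> variable lookup
        return env[expr]
    if not isinstance(expr, list):         # literal
        return expr
    op = evaluate(expr[0], env)
    if getattr(op, "is_fexpr", False):     # fexpr: pass operands unevaluated
        return op(expr[1:], env)
    args = [evaluate(e, env) for e in expr[1:]]
    return op(*args)                       # ordinary call: args pre-evaluated

def fexpr(f):
    f.is_fexpr = True
    return f

@fexpr
def if_(operands, env):                    # `if` must be an fexpr: only one
    test, then, else_ = operands           # branch should ever be evaluated
    return evaluate(then if evaluate(test, env) else else_, env)

env = {"x": 5, "if": if_, "inc": lambda n: n + 1}
print(evaluate(["if", "x", ["inc", "x"], 0], env))  # -> 6
```

The cost shader and Pauan are discussing is visible here: because an fexpr can inspect its environment at call time, every call site has to stay interpretable, which is exactly what makes ahead-of-time compilation hard.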

-----

1 point by Pauan 4634 days ago | link

"Have you ever tried compiling a standalone binary for Nu or anarki, at least for the compiler? I'm pretty sure racket can do that, but I don't think I've heard of anyone doing it."

No, I haven't, though I suppose it could be done... I don't see much point to it, though. I mean, I guess if you're on Windows, but...

---

"As for fexprs, I think something can probably be done even in arc to some extent, using a copy of the original code and dynamic recompilation with memoization for speed. It would have the advantage of also enabling the self-documentation features that I added to anarki too, though it might be a bit slow."

I've experimented with using fexprs in a compiled environment like Racket, but you really really do want things to be interpreted. At the very least, using fexprs won't blend well with Arc code, because the entire Arc language has been designed around macros, to the point where adding in fexprs just makes things more clunky. If you want fexprs, I'd recommend a language that was designed from the start for fexprs, like Kernel, PicoLisp, or newLISP.

-----

2 points by shader 4633 days ago | link

Well, one of the reasons I was interested in compiling at least the standard arc library was for improving startup speed, say for use as a simple cgi script on a shared host, or a basic command line utility. It also might reduce distribution size, instead of making users install all of racket and the arc source, which can be somewhat bothersome.

I suppose it's possible that trying to add fexprs to arc would just make things worse, but I was always attracted to the idea of making environments and scope easier to manipulate, with macros fitting in more naturally. As it currently is, macros are very separate from the rest of the language and stick out a little bit. I've rarely had a really good reason to need them as values, but it just bothers me sometimes that they aren't first-class citizens.

-----

1 point by Pauan 4633 days ago | link

"Well, one of the reasons I was interested in compiling at least the standard arc library was for improving startup speed, say for use as a simple cgi script on a shared host, or a basic command line utility."

Oh! That's what you're talking about! Sure, I've experimented with that too. I think it's possible, but it would require mucking about with Racket's code loader... still, that's not a bad idea.

---

"I suppose its possible that trying to add fexprs to arc would just make things worse, but I was always attracted to the idea of making environments and scope easier to manipulate, and macros fit in better. As it currently is, macros are very separate from the rest of the language and stick out a little bit. I've rarely had a really good reason to need it, but it just bothers me sometimes that they aren't first class citizens."

Absolutely. I too like the idea of fexprs. I'm just saying that from a purely practical perspective, Arc was not designed around fexprs, so tacking them on after the fact won't work so well. It shouldn't be hard to make a variant of Arc that is built around fexprs. In fact, wart is very Arc-like but has fexprs (or something that's like fexprs anyways).

-----


Maybe we could make it so that only a few macros can't be overridden in call position? Like built-ins vs. user macros? Or would it be ok if we just made it so they could be overridden, and just warn people that overriding important macros is likely to have unexpected results?

-----


Yeah, I'm sure a lot of you have seen this already. I'm really more interested on what your current opinions are concerning the various M-expression styles vs. traditional S-exprs.

-----

1 point by akkartik 4792 days ago | link

The readable/sweetexpr project is probably the most comprehensive place to see the state of the art. I'm not actually eager to use it, though; I think you can do better if you don't care about supporting existing lisps.

(I'm actually working on some ideas inspired by http://www.arclanguage.org/item?id=16717; look for a post soon. This whacky idea also seems relevant: http://arclanguage.org/item?id=16589)

-----

2 points by shader 5209 days ago | link | parent | on: Arc, Emacs, and SLIME

What makes you say that? What features would you consider "make it worth using", and why won't they work? It's quite possible that all that needs to be done is change a few variables or replace a few functions to use arc names instead of CL names. Case in point being progn -> do, etc.

-----

1 point by HankR 5196 days ago | link

swank is the lisp package that does all of the slime heavy lifting on the lisp runtime. It manages threads, executes code, interfaces with the debugger, does various reference/definition lookups, etc. It consists of a large CL package plus a smaller, environment-specific set of functions defined for each lisp environment (SBCL, Clozure, etc.). In addition, slime and swank are tightly bound, and changes to one require changes to the other; one or the other changes fairly frequently. Porting swank to work with something like Arc would be a major undertaking, and maintaining it would probably be worse. It's probably not worth the effort; it would probably be just as fast to create a new arc editing interface.

-----

1 point by shader 5220 days ago | link | parent | on: Possible bug in coerce with quoted nils?

For the most part, I don't mind this behavior. There's only one case where I wish it were otherwise: I was writing a compiler in arc which attempted to represent the ast as symbols in lists. Silly me, thinking that was the right way to represent an ast in arc.

The problem is that if someone were to use a variable in that language named nil, it would be represented in my ast as 'nil, which would then be treated in all of the rest of the code (including any parts which printed it out) as if it weren't there. I guess I should have used strings in all of those cases, but I was taking advantage of the fact that I could actually bind values to the symbols to store some metadata. It turned out to be a really handy way to work with the language, unless someone used 'nil as a variable. Doing anything else would have worked better, but been more verbose and hackish.

This is the only reason I wish that nil -> '() instead of nil = '(). Because |()| != '(), while |nil| = 'nil.
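The collision can be illustrated outside of arc. Here is a hypothetical Python analogy (not arc code, names invented for the example): when the "nothing here" marker is the very symbol a user can type, any user identifier named nil silently vanishes from the AST, whereas a unique sentinel cannot collide.

```python
# Python analogy of the nil collision (hypothetical, not Arc code).

NIL_AS_SYMBOL = "nil"        # Arc-style: nil IS the symbol 'nil
NIL_AS_SENTINEL = object()   # alternative: a unique, unforgeable marker

def prune(ast, empty):
    # drop "empty" nodes from a flat, list-shaped AST
    return [node for node in ast if node != empty]

user_ast = ["def", "nil", 42]   # a user variable unluckily named nil

print(prune(user_ast, NIL_AS_SYMBOL))    # the variable is lost
print(prune(user_ast, NIL_AS_SENTINEL))  # the variable survives
```

With the symbol as the marker, `prune` throws away the user's variable along with genuinely empty nodes; with the sentinel, nothing a user can write compares equal to it.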

-----

1 point by Pauan 5220 days ago | link

I think this should be configurable in ar, with the default being to follow Arc/3.1, but still allow the programmer to make nil different from 'nil.

Hey awwx, how hard would it be to make this change in ar right now? From what I can see, "nil" is just a global variable bound to the symbol 'nil. So, what if I did this...?

  (ail-code (racket-set! nil (uniq)))

It seems to work okay in the couple tests I did, but I wonder if it would break anything...

-----

1 point by Pauan 5220 days ago | link

Hm... I just realized ar allows for rebinding nil and t! So you can just do this:

  (= nil (uniq))

Neat.

-----

1 point by Pauan 5220 days ago | link

By the way, if you ever decide to rebind nil, and it breaks stuff, you can fix things by doing this:

  (assign nil 'nil)

You need to use `assign` rather than `=` because `=` breaks if nil isn't 'nil.

-----

1 point by rocketnia 5220 days ago | link

"I guess I should have used strings in all of those cases, but I was taking advantage of the fact that I could actually bind values to the symbols to store some metadata."

If you don't mind, how does/did it seem easier to bind values to symbols than to bind them to strings?

-----

2 points by Pauan 5220 days ago | link

I'm not shader, but I'm guessing that it's because... you can't.... bind values to strings...?

I mean, yeah, you could have a table where the keys are strings, and treat that as the metadata, but uuugh it's clunky and I've found myself disliking the whole "global table to hold data" thing more and more, recently.

Then again, I don't know how their compiler works, so I might be completely off-base here...
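The contrast being guessed at can be sketched in Python (a hypothetical illustration, not anyone's actual compiler): metadata attached to a symbol object travels with the symbol wherever it appears in the AST, while string keys force exactly the kind of separate global table described above.

```python
# Hypothetical contrast: symbol objects vs. strings as AST identifiers.

class Symbol:
    def __init__(self, name):
        self.name = name
        self.meta = {}           # metadata rides along with the symbol

# symbol style: the AST node itself holds its annotations
x = Symbol("x")
x.meta["type"] = "int"

# string style: a separate global side table keyed by name
metadata = {}
metadata["x"] = {"type": "int"}

print(x.meta["type"])            # no global state needed
print(metadata["x"]["type"])     # same answer, but via the side table
```

The string version also has the name-collision problem: two distinct variables both named "x" share one table entry, whereas two `Symbol("x")` objects stay distinct.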

-----

1 point by shader 5220 days ago | link | parent | on: Emacs+arc on windows

What version of emacs are you using? Maybe you'll need to upgrade or install cl-macs.el.

-----

1 point by ly47 5218 days ago | link

Hi, I'm using Emacs 23.3.1 and cl-macs.el is installed. I type arc-mode and I get Arc in the menu, but when I choose Run inferior-arc I get an error message: "error: Required feature `cl-macs' was not provided". Thanks

-----

2 points by zck 5218 days ago | link

What is arc.el? There's no arc-mode for Emacs that I know of. What are you expecting to happen when you type M-x arc ? Where did you get arc.el?

I wrote up instructions on how to get SLIME working with Arc, but I don't have a Windows box to test on. Can you translate the directions to Windows, and then see if it works for you? http://arclanguage.org/item?id=14998 Notably, you don't start SLIME by typing M-x arc-mode; you start it by typing M-x slime .

-----

1 point by ly47 5217 days ago | link

Hi https://github.com/nex3/arc/tree/master/extras ly

-----

1 point by zck 5217 days ago | link

Ah. This isn't SLIME. As akkartik said, it's likely that no one here uses it. If you want to try SLIME, follow my instructions here: http://arclanguage.org/item?id=14998

-----

1 point by akkartik 5217 days ago | link

Ah, I'd forgotten about that. It hasn't been touched in over a year (http://github.com/nex3/arc/commits/master/extras/arc.el); I'm not sure if anyone here has experience with it.

-----

2 points by shader 5211 days ago | link

I definitely use arc.el on a regular basis, though I won't claim to be familiar with the finer points of its capabilities. As far as I'm concerned, it offers reasonable highlighting and indentation for arc code; I haven't tried to get it to do much else. For most of my lisp editing needs, I rely on paredit.

-----

More