Arc Forum | Pattern Matching in Arc: now with the pat-match: modifier

Arc Forum

Pattern Matching in Arc: now with the pat-match: modifier

13 points by almkglor 6345 days ago | 28 comments

fn, rfn, afn, def, mac, etc. now have a pat-match: modifier.

To use:

   (require "lib/defpat.arc")
   (pat-match:afn
                ((x . xs))      (cons (+ 1 x) (self xs))
                (())            nil)

Get it off nex-3's arc-wiki.git ^^

2 points by drcode 6345 days ago | link

I don't know what it is, but pattern matching like this just doesn't feel very "arcy" to me (and I'm not claiming my ideas are superior or anything...)

I think it has to do with the fact that Haskell uses pattern matching as a comprehensive way for defining all functions. Using a pattern matching library in arc means that for every function I write I would have to decide whether to use the pattern matching definition, or a regular definition, which feels inelegant...

I think an argument could be made to make pattern matching (very much like this) the default behavior for all arc functions... but short of that, I, personally, would probably not use a library like this very often...

However, it looks like an elegant implementation of the concept, so if I had to ever write some code with extremely hairy destructuring I definitely might change my mind an use it in a limited section of code...

-----

7 points by almkglor 6345 days ago | link

> if I had to ever write some code with extremely hairy destructuring I definitely might change my mind an use it in a limited section of code...

Which is the point of every library for every language, anyway. I'd be hard pressed trying to use a matrix library in an IRC bot.

Personally, I find pattern matching to be a powerful way of not having to fiddle with 'caris and what not. I just say "if the user puts this and that and this, then do it this way". I'd argue it removes some amount of bugs, too - I caught a bug in pg's 'varline function because I was trying to convert it to pattern-matching form. Mucking about with car/cdr and friends feels rather low-level to me; and if Arc is attempting to be the highest of high level languages, it certainly shouldn't force the user to think in terms of list nodes.

-----

2 points by nex3 6345 days ago | link

That is incredibly cool. Totally the right way to balance the usefulness of pattern-matching with the more general applicability of the normal semantics.

I'd suggest a shorter name, though; "pat-match" is pretty long. What about just "pat"? Or even "pm"?

-----

1 point by almkglor 6345 days ago | link

Try shortening it to "pat", then do the following:

  (let pat (table)
     (pat 1))

"pat" was my original choice, until I realized that some of my own code was using (pat ...) as a function (the defpat code, of all things). Because of this I actually lengthened the name to pat-match. Try seeing the diff in the git http://git.nex-3.com/arc-wiki.git?a=blobdiff;f=lib/defpat.ar... you'll see that I changed pat -> patn, then realized that it would be stupid to expect everyone else's code to conform. pat-match, I thought, would be less likely to collide.

-----

2 points by nlavine 6345 days ago | link

Sorry if I completely misunderstand things, but shouldn't this work? Let rebinds the name pat to the table, shadowing the old macro binding, and (pat 1) then refers to the element of the table indexed by 1 (which in this example doesn't exist, but I don't think that's the point of the example).

If there's a bug in ac, let's fix that right now, rather than hacking around it and making the rest of arc that much less elegant and simple. (Of course, you might say you want people to be able to use pattern-matching and be able to have variables named pat. That's a good criticism, but it's not how I read your post.)

-----

1 point by almkglor 6345 days ago | link

Macro bindings are never shadowed by let. Enter this into your Arc for proof:

  (let obj (table)
      (obj 1))

Then show me why it's different from:

  (let unlikely-name-for-a-macro (table)
      (unlikely-name-for-a-macro 1))

-----

1 point by nlavine 6344 days ago | link

Okay. But should they be shadowed?

(My instinct is yes.)

-----

1 point by almkglor 6344 days ago | link

So does mine. I'm currently conceptualizing a method for overriding macro definitions within (fn (macro-name) ...) forms.

-----

1 point by nlavine 6344 days ago | link

Hmm. Well, just glancing at the arc1 source code, we have these three lines in ac.scm:

  (define (ac-call fn args env)
    (let ((macfn (ac-macro? fn)))
      (cond (macfn
         ....)))

which could be replaced by

  (define (ac-call fn args env)
    (let ((macfn (and (not (memq fn env)) (ac-macro? fn))))
      (cond (macfn
         ......))),

preventing the cars of applications being automatically macroexpanded. I don't understand the other problem you brought up. Why would (fn (macroname ...) ...) not work?

-----

3 points by almkglor 6344 days ago | link

Actually, that was the point I was planning to do the hack: (let ...) and friends deduce down to (fn ...) forms, so logically the place to do replacement of variable names should be in (fn ...) forms. ac-call is involved in function calling, which is really related.

However I've since changed my mind, because the problem then becomes:

  (withs (do 1 re 2 mi 3)
    (+ do re mi))

Bonus points if you figure out why your solution will also fail the above ^^

-----

3 points by nlavine 6344 days ago | link

It fails because the outermost let form in the expansion redefines do, and the innermost let body uses do. Nice : )

I think, however, that I have made an exciting discovery: we have a hygienic macro system! Or at least half of one!

Here's the current implementation of withs:

  (mac withs (parms . body)
    (if (no parms) 
        `(do ,@body)
        `(let ,(car parms) ,(cadr parms) 
           (withs ,(cddr parms) ,@body))))

And here's an alternate implementation:

  (mac withs (parms . body)
    (if (no parms)
        (apply (rep do) body)
        (apply (rep let) (car parms) (cadr parms)
          `(withs ,(cddr parms) ,@body)))))

Because we know that macros are really just tagged procedures, we can apply them ourselves and let lexical scoping guarantee that they aren't shadowed!

Granted, it's ugly, and it only solves part of the problem, but I think it's a reasonable place to start hacking from.

-----

1 point by almkglor 6343 days ago | link

>Granted, it's ugly, and it only solves part of the problem, but I think it's a reasonable place to start hacking from.

Oh, no. Because it means you'll need to (apply (rep macro) body) whenever you need to use a macro within a different macro. That's just too much bug-prone work and is completely unreasonable.

-----

1 point by nlavine 6343 days ago | link

Yeah.

Also, on further reflection, I think perhaps the original hack is better - even given the weird behavior of (withs (do 1 re 2 mi 3) ...). If we're really making a language for quick hacking and prototyping, and giving the programmer all the tools they need, then maybe letting them redefine do is exactly the right thing to do. In your example it doesn't make sense, sure, but what if it was redefined as a function? Or a new macro that did something interesting with its body (inserted debug statements)? Maybe we should deliberately let the programmer redefine everything they want (as pg says) - and make sure not to write anything in a safer, more idiot-proof style. That's not the point of arc.

(Namespace collision is another issue. But we need a way to deal with that that only affects things when you want it to, and never when you don't.)

-----

3 points by almkglor 6342 days ago | link

Oh no, heck no squared. Because if you do, that means that every macro has to publish any symbols-in-functional-position it actually creates. And every programmer has to check every macro he or she uses to make sure that any local variables he or she has do not conflict with the symbols-in-functional-position published by the macros. Hades breaking loose and all that. Funny action at a distance.

As for redefining a macro - then let's implement 'macrolet, that way we don't have capture.

-----

1 point by nex3 6345 days ago | link

I see your point... "pat" does seem to be a pretty common variable name. What about "pm" then? Or "ptrn"? Nine characters just seems so long...

-----

3 points by almkglor 6345 days ago | link

I personally have a tendency to use short variable names; I'm pretty sure I've used "pm" as "port mapping" (this is in electronic development automation) at least once. As for ptrn - well, all it takes is one Arc programmer to stumble on that name.

Heck, maybe Arc should be a Lisp-2. with values (including functions) on one namespace and macros on another. Then select macros by, I don't know, putting some syntactic marker for "use macro for this symbol" or somesuch.

-----

3 points by nex3 6345 days ago | link

Okay, sure, but the same could be said for other short names like "cut," or "list", or "fn", "all", "mem", etc. Given that Arc is a Lisp-1, you have to make compromises between terseness and commonality. And in this case it seems pretty clear to me that this function is useful enough and "pm" rare enough as an identifier (never used once in arc-wiki) that the compromise should lean towards a shorter name.

-----

1 point by almkglor 6345 days ago | link

Nitpick: local variables named "cut", "list", "all", and "mem" will not clash with Arc built-ins, because function names which clash with local variables will simply be overridden. It's only with macros such as "fn" which will fail.

That said, I've since added lib/pm.arc which defines the "pm" macro and shadows pat-match.

-----

1 point by nex3 6344 days ago | link

Good point on the nitpick, although that still means macros that expand to use those functions and any uses within the scope of the variable will fail.

Adding an extra lib for just in case you want to use this function seems like a bad compromise to me. It seems like one of those cases where you could do a language feature one way to make one camp happy, another way to make the other camp happy, and you end up doing both which just makes the language, larger, more redundant, and confusing as a result.

We should either make the name shorter or not, not provide a shorter alias that will just confuse people.

I've just pushed a better compromise to the wiki, renaming pat-match p-m which is shorter but unobtrusive.

As a separate note, now that we have p-m, wouldn't it be better not to use defpat at all? In fact, if we use "pm" instead, pm:def is just as short.

-----

1 point by almkglor 6344 days ago | link

> As a separate note, now that we have p-m, wouldn't it be better not to use defpat at all? In fact, if we use "pm" instead, pm:def is just as short.

That's correct, although p-m:def does have a symbol on the top row of the keyboard (-) and shifted symbol on the home row (:). I'd suppose that using pattern-matching in def is significantly more common than all the other cases combined. Although in terms of readability, certainly p-m:def would raise the question of "wtf is this?" only once and introduce the pattern-matching modifier completely, whereas "p-m:afn" and "defpat" will introduce them separately.

-----

1 point by bramsundar 6345 days ago | link

I'm still a noob, and my idea could be completely off base, but I think that it could be possible to devise a system that allows something like optional namespaces. The idea would be to have each symbol in the environment map to a table rather than directly to a function or variable. The table could have one entry per type (for example, pat's table could have one entry that's a macro, one that's a function, and one that's a variable). In standard cases, the value lookup function could return the entry that was last added to the hash table and thus mimic a Lisp-1 for all purposes. However, if the programmer specifies the type of a symbol, then the table would return the corresponding entry. There could be an fairly unobtrusive syntax for this. In one such syntax, the symbol pat would be:

pat (standard usage) #pat (function pat) `pat (macro pat) @pat (variable pat)

Then, pat-match:def would become `pm:def.

-----

1 point by almkglor 6345 days ago | link

No, my friend, the problem is the order of evaluation.

  (let pm (table)
      (pm 1))
  (mac pm ...)

versus:

  (mac pm ...)
  (let pm (table)
     (pm 1))

Unless you qualifiy everything with syntactic modifiers, order of evaluation will matter, and matter horribly.

-----

1 point by bramsundar 6345 days ago | link

I'm sorry, but could you explain your example a bit more? I don't think that I understand the issue here. If pm is redefined as a table through the let, why wouldn't we be able to access the macro version through the modifier? the table would clobber the variable version (@pm), but not the macro version right?

-----

2 points by almkglor 6345 days ago | link

Local variables never override macros. In the first case above, the let will return the expected value correctly, because pm at that time hasn't been defined yet. In the second case, it will not return the expected value, because pm is defined as a macro, and (pm 1) gets replaced with whatever (pm 1) expands to, let notwithstanding.

-----

1 point by raymyers 6343 days ago | link

I like that you added gaurd syntax.

    ;; foo [a,b] | a < b = True
    (p-m:def foo (,((a b) (< a b))) t)
    (foo '(1 2))  =>  t

But is there any way to test across parameters?

    ;; foo a b | a < b = True
    (p-m:def foo (a ,(b (< a b))) t)
    (foo 1 2)  =>  Error

-----

2 points by almkglor 6342 days ago | link

Not yet, although I'm working on it. The main problem is that currently, destructuring splits on the (car ...) and (cdr ...) of each list node in the pattern, something like this:

  ;a is the variable that contains the current list node of the function input
  ;p is the pattern
  `(and (acons ,a)
        ; this creates the test
        (let ,a (car ,a) ,(self a (car p)))
        (let ,a (cdr ,a) ,(self a (cdr p))))

Obviously, I'll need to resequence them - the cdr branch has to be physically located within the car branch, so that I can capture any variables in the created guards.

-----

2 points by almkglor 6342 days ago | link

  ,((a b) (< a b)))

LOL. I didn't even realize this was legal code. But mind you, Arc will be the one destructuring the pattern within the first element of ,(... ...), meaning it won't count the number of elements and all that. Hmm. Note sure if going recursive here is a good idea. Hmm.

-----

1 point by shader 6228 days ago | link

Um, how does that work?

-----