- For the profiler to be useful, functions need to be labeled in some way so that the profiler will show which functions are which.
- To identify functions, I need to propagate source location information from an Arc source code file through to Racket, which includes propagating the source location information through Arc macros.
One option would be to create a macro system designed to propagate source location information through macros. This of course is what Racket does.
In Arc, macros are defined with list primitives (car, cdr, cons), function calls (for example, `(mac let (var val . body) ...)`), and operations that can be built out of those (such as quasiquotation).
My hypothesis is that by allowing list primitives to operate on forms labeled with source location information, we can continue to use Arc macros.
This leads to an interesting issue however...
Consider
(apply (fn args args) xs)
This is an identity operation: for any list `xs`, it returns the same list.
In Arc 3.2, and in my implementation, an Arc function compiles into a Racket function. E.g., `(fn args args)` becomes a Racket `(lambda args args)`.
Of course we don't have to do that. If we were writing an interpreter, for example, an Arc function would compile into some function object that'd be interpreted by the host language... an Arc function wouldn't turn into something that could be called as a Racket function directly.
But if `(fn args args)` is implemented as a Racket function `(lambda args args)`, then to call the function with some list `xs` we need to use Racket's `apply`, and Racket's `apply` takes a Racket list. So in Arc 3.2, Arc's `apply` calls Racket's `apply` after translating the Arc list into a Racket list.
Leaving out a couple of steps, what in essence we end up with in Racket is the equivalent of:
(apply (lambda args args) (ar-nil-terminate xs))
where `ar-nil-terminate` converts an Arc list to a Racket list.
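As a rough sketch of that step in plain Racket: the example below models an Arc list as pairs terminated by the symbol `nil`. That modelling is an assumption made for illustration, not a copy of the Arc 3.2 source.

```racket
#lang racket/base

;; Sketch only: an Arc-style list is modelled here as pairs ending in the
;; symbol 'nil, while a Racket list ends in '().
(define (ar-nil-terminate l)
  (if (or (null? l) (eq? l 'nil))
      '()
      (cons (car l) (ar-nil-terminate (cdr l)))))

;; Arc-style (a b c)
(define xs (cons 'a (cons 'b (cons 'c 'nil))))

;; Racket's apply needs a proper Racket list, so convert first.
(define result (apply (lambda args args) (ar-nil-terminate xs)))
;; result is the Racket list '(a b c)
```

The result comes back as a Racket list; turning it back into an Arc list is among the steps left out here.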
Now, for Amacx, I've invented my own representation for Arc lists. In my version, lists (that is, cons cells) can be labeled with where in a source code file they originated. For example, if I read "(a b c)" from a file, I can inspect that list for source location information:
> (prn x)
(a b c)
> (dump-srcloc x)
foo.arc:1.0 (span 7) (a b c)
foo.arc:1.1 (span 1) a
foo.arc:1.3 (span 1) b
foo.arc:1.5 (span 1) c
which shows me that the list in `x` came from a source file "foo.arc" at line 1, column 0 with a span of 7 characters; that the first element "a" was at column 1, "b" was at column 3, and so on.
This is entirely internally consistent. E.g. (cdr x) returns a value which contains both the tail of the list `(b c)` and the source location information for the sublist.
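To make the idea concrete, here is one hypothetical way such labeled lists could be modelled in Racket. The `located` struct, `lcons`, `x-car`, and `x-cdr` are names invented for this sketch and are not Amacx's actual representation. The labels on the tails are placeholders, since the dump above only shows positions for the whole list and its elements.

```racket
#lang racket/base

;; Hypothetical wrapper: a value plus a label saying where it was read from.
(struct located (value loc) #:transparent)

;; Look through the label, if there is one.
(define (unwrap x) (if (located? x) (located-value x) x))

;; A labeled cons cell.
(define (lcons a d loc) (located (cons a d) loc))

;; List primitives that accept labeled or plain pairs.
(define (x-car x) (car (unwrap x)))
(define (x-cdr x) (cdr (unwrap x)))

;; "(a b c)" as read from foo.arc; tail labels are placeholders.
(define x
  (lcons (located 'a "foo.arc:1.1")
         (lcons (located 'b "foo.arc:1.3")
                (lcons (located 'c "foo.arc:1.5") 'nil "tail (c)")
                "tail (b c)")
         "foo.arc:1.0"))

;; The tail returned by x-cdr is still a labeled value.
(define tail (x-cdr x))
(located? tail)               ; #t
(located-loc tail)            ; "tail (b c)"
(located-value (x-car tail))  ; 'b
```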
But.
In a macro,
(mac let (var val . body)
  `(with (,var ,val) ,@body))
I'm not seeing the source location get through the rest args.
(mac let (var val . body)
  (dump-srcloc body)
  `(with (,var ,val) ,@body))
Zilch. Nothing. Nada. `body` is a plain list, no source location information.
Why?
Because I stripped it.
(apply (lambda args args) (ar-nil-terminate xs))
The argument to Racket's `apply` has to be a Racket list. Not my own made-up representation for lists.
Thus my version of `ar-nil-terminate` removes source location information and returns a plain Racket list. I did this early on, because loading Arc fails quite quickly when `apply` doesn't work. I didn't realize it would mean that macros wouldn't get source location information passed to them.
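A small sketch of how that conversion loses the labels. The `located` wrapper, `lcons`, and `strip` are stand-ins invented for this example, and `strip` only removes labels from the spine of the list, which may differ from what the real `ar-nil-terminate` does.

```racket
#lang racket/base

;; Hypothetical labeled-value wrapper (not the real Amacx representation).
(struct located (value loc) #:transparent)
(define (unwrap x) (if (located? x) (located-value x) x))
(define (lcons a d loc) (located (cons a d) loc))

;; Stand-in for a label-aware ar-nil-terminate: walk the labeled list and
;; build a plain Racket list, dropping the label on every cons cell.
(define (strip xs)
  (let ([xs (unwrap xs)])
    (if (or (null? xs) (eq? xs 'nil))
        '()
        (cons (car xs) (strip (cdr xs))))))

;; Arguments to a `let`-like macro: (x 1 (prn x)), with placeholder labels.
(define args
  (lcons 'x (lcons 1 (lcons '(prn x) 'nil "label-3") "label-2") "label-1"))

;; Racket's rest parameter receives part of the freshly built plain list.
(define body
  (apply (lambda (var val . body) body) (strip args)))

(located? body) ; #f, the label on the tail is gone
body            ; '((prn x))
```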
So, a macro like `let` turns into the equivalent of
(annotate 'mac
  (lambda (var val . body)
    ...))
The macro is invoked with `apply`... and there goes the source location information in `body`.
Of course, like I said, I don't have to implement an Arc rest argument with a Racket rest argument. An Arc function that took a rest argument could turn into some other kind of object where I'd pass in the rest argument myself.
But that would be slower. Probably.
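One way that alternative could look, as a sketch: the `rest-fn` struct and `call-rest-fn` are hypothetical names, and plain pairs ending in `'nil` stand in for whatever list representation is in use. The point is only that the rest argument is handed over as-is, at the cost of an extra walk and an extra layer around every call.

```racket
#lang racket/base

;; Hypothetical: an Arc function with a rest parameter, compiled to a Racket
;; procedure that takes the rest list as one ordinary final argument.
(struct rest-fn (required proc))

;; Peel off the required arguments with car/cdr and hand the remaining tail
;; to the procedure untouched, with no list conversion.
(define (call-rest-fn f xs)
  (let loop ([n (rest-fn-required f)] [xs xs] [fixed '()])
    (if (zero? n)
        (apply (rest-fn-proc f) (reverse (cons xs fixed)))
        (loop (sub1 n) (cdr xs) (cons (car xs) fixed)))))

;; A `let`-like parameter list: (var val . body)
(define let-like
  (rest-fn 2 (lambda (var val body) body)))

;; Arc-style list (x 1 (prn x)), modelled as pairs ending in 'nil.
(define tail (cons '(prn x) 'nil))
(define xs (cons 'x (cons 1 tail)))

;; The rest argument is the very same object as the original tail.
(eq? (call-rest-fn let-like xs) tail) ; #t
```

With label-aware versions of `car` and `cdr` in the loop, the same shape would let a labeled tail reach the macro body intact.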
I can get the profiler to work (I think), but then I'd be profiling the slower version of the code.
Though the runtime that implements the extended form of lists with source location information is slower anyway because all of Arc's builtins need to unwrap their arguments.
Where do you actually need source location information in order to get Arc function names to show up in the profiler?
Would it be okay to track it just on symbols, bypassing all this list conversion and almost all of the Arc built-ins' unwrapping steps (since not many operations have to look "inside" a symbol)?
If you do need it on cons cells, do you really need it directly on the tail cons cells of a macro body? I'd expect it to be most useful on the cons cells in functional position. If you don't need it on the tails, then it's no problem when the `apply` strips it.
Oh, you know what? How about this: In `load`, use `read-syntax`, extract the line number from that syntax value, and then use `syntax->datum` and expand like usual. While compiling that expression, turn `fn` into `(let ([fn-150 (lambda ...)]) fn-150)` or `(procedure-rename (lambda ...) 'fn-150)`, replacing "150" here with whatever the source line number is. Then the `object-name` for the function will be `fn-150`, and I bet it'll appear in the profiling data that way, which would at least give you the line number to work with.
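A minimal sketch of the pieces involved, using only standard Racket calls (`read-syntax`, `syntax-line`, `syntax->datum`, `procedure-rename`, `object-name`). The helper `read-with-line` and the sample text are made up for the example, and whether the profiler reports these names is untested here.

```racket
#lang racket/base

;; Read one form from a string; return its starting line and its datum.
(define (read-with-line source-name text)
  (define in (open-input-string text))
  (port-count-lines! in)  ; without this, syntax-line returns #f
  (define stx (read-syntax source-name in))
  (values (syntax-line stx) (syntax->datum stx)))

;; Two blank lines first, so the form starts on line 3.
(define-values (line datum)
  (read-with-line "foo.arc" "\n\n(fn args args)"))

;; Build a name such as fn-3 from the line number.
(define name (string->symbol (format "fn-~a" line)))

;; Option 1: rename an existing procedure.
(define renamed (procedure-rename (lambda args args) name))
(object-name renamed) ; 'fn-3

;; Option 2: bind with let; Racket infers the name from the variable.
(define via-let (let ([fn-150 (lambda args args)]) fn-150))
(object-name via-let) ; 'fn-150
```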
If you want, and if that works, you can probably have `load` do a little bit of inspection to see if the expression is of the form `(mac foo ...)` or `(def foo ...)`, which could let you create a more informative function name like `foo-150`.
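That inspection could be as small as the following sketch. `profile-name` is a hypothetical helper, and it assumes the datum is the plain list produced by `syntax->datum`.

```racket
#lang racket/base

;; Hypothetical helper: pick a profiler-friendly name for a top-level form.
;; Uses the defined name for (mac foo ...) and (def foo ...), else "fn".
(define (profile-name datum line)
  (define base
    (if (and (pair? datum)
             (memq (car datum) '(mac def))
             (pair? (cdr datum))
             (symbol? (cadr datum)))
        (cadr datum)
        'fn))
  (string->symbol (format "~a-~a" base line)))

(profile-name '(def foo (x) x) 150) ; 'foo-150
(profile-name '(prn "hi") 150)      ; 'fn-150
```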
There's something related to this in `ac-set1`, which generates `(let ([zz ...]) zz)` so that at least certain things in Arc are treated as being named "zz". Next to it is the comment "name is to cause fns to have their arc names while debugging," so "zz" was probably the Arc variable name at some point.
mmm, not sure. It'd probably be easier to start with a working version (even if slow) and then remove source information from lists and see if anything breaks.
> In `load`, use `read-syntax`, extract the line number from that syntax value
erm, so all function forms compiled during the eval of that expression would get named "fn-150"?
> erm, so all function forms compiled during the eval of that expression would get named "fn-150"?
That's what I mean, yeah. Maybe you could name them with their source code if you need to know which one it is, if it'll print names that wide. :-p This isn't any kind of long-term aspiration, just an idea to get you the information you need.