Arc Forum | akkartik's comments

This was my first thought as well. But why does `readc` (Racket's `read-char`) silently accept invalid UTF-8?

reply


The files seem to diverge after a 0 byte. Can you check if it's the first 0 byte in the file?

Edit: never mind, I was hallucinating.

reply

2 points by zck 2 days ago | link

I just checked; it's not the first 0 byte.

Just to provide a little extra information, the image I'm testing with is my favicon: http://zck.me/favicon.ico
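
In case it's useful, here's roughly how one can check this kind of thing; a minimal sketch in Python (the file names are placeholders, not what I actually ran):

  # Compare the original image with the copy that came back through readc;
  # report the first divergence and the position of the first 0 byte.
  a = open("favicon-original.ico", "rb").read()
  b = open("favicon-roundtripped.ico", "rb").read()
  diff = next((i for i in range(min(len(a), len(b))) if a[i] != b[i]), -1)
  print("first divergence at byte", diff)
  print("first 0 byte at", a.index(0))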

reply

1 point by akkartik 38 days ago | link | parent | on: 3-Dimensional Source Code

"There's a mirage when you first look at it, like with a lake. What you see is just the surface of the lake, and that's a two dimensional view. But when you put your hand in, you find that there's depth to it. It's not just one dimension or two, but a three dimensional design." -- Carl Sassenrath on Rebol, 1996 (https://web.archive.org/web/20000830133509fw_/http://www.lin...)

reply

3 points by akkartik 41 days ago | link | parent | on: 3-Dimensional Source Code

Wait, you're writing about "3-dimensional source code" and the dimensions aren't settled yet? That just makes me glad I didn't read your slides, and even less likely to put in the effort next time. I'll repeat my earlier comment: your MVPs are too M and insufficiently V.

How are you so sure that you won't settle on 2 or 4 dimensions? (Let us stipulate that 5 is right out.)

> empirically 99%+ of bugs occur in extraneous parts of the code

You'll need to show me these empirical studies.

I haven't actually ever heard a story that accounts for 99% of bugs. Pretty much every software engineering study ends up with a much flatter profile than that. You have to do many things right to eradicate 99% of bugs.

> ETNs start bringing us closer to the absolute minimum, perfect program necessary to solve a problem.

From what I can tell, ETNs are mostly about eliminating punctuation and replacing it with indentation. Is that right? If so, is your claim that "99% of bugs" are hiding in the punctuation?

Is upgrading the syntax to ETNs all that's needed to eliminate 99% of bugs? What about DRY? The value of good interfaces? Parnas's theory of information hiding? SOLID?

I'll trade my pulled-out-of-my-ass theory for yours. I think bugs arise because our representation of algorithms ("code") over-emphasizes the rules the algorithm performs, and under-emphasizes the input space that the rules are meant to operate on. Bugs arise when people modifying the code forget about rare areas of the input space, and the scaffolding around the project is unable to remind them. Nail down the input space, and bugs go down because your tests fail more often. You won't fix this problem no matter how much you tweak the superficial syntax with which you write code. (I work on this, so it was not pulled out of my ass just now: http://akkartik.name/about; https://github.com/akkartik/mu)
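
As a toy illustration of what I mean (sketched in Python with the hypothesis property-testing library; parse_age and its gaps are invented for the example):

  # Declaring the input space explicitly makes the tests fail on the
  # rare inputs a maintainer would otherwise forget about.
  from hypothesis import given, strategies as st

  def parse_age(s):
      return int(s)  # silently assumes s is a clean decimal string

  @given(st.text())  # the declared input space: any string at all
  def test_parse_age_is_total(s):
      assert isinstance(parse_age(s), int)  # fails fast on "", "4.5", "x"

  if __name__ == "__main__":
      test_parse_age_is_total()  # raises on the first forgotten input

The point isn't the specific tool; it's that once the input space is written down, the scaffolding can push back when code forgets a corner of it.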

reply

3 points by shader 5 days ago | link

> ... trading of programming theories ...

Clear and simple syntax / representation is important; combined with matching editing tools it enables us to communicate ideas easily and fluently.

I also like the idea of well defined input spaces. Many theorems or algorithms only work under certain conditions, and much damage has been done by applying them outside of their intended domains. But I think that's only part of the problem.

My own theory is that programs are specifications, and the more clearly and precisely they specify the better. Programs can fit into a matrix of good/bad ideas and good/bad specifications. Of these, two kinds are interesting bugs:

  1) Incorrectly specified good ideas
  2) Correctly specified bad ideas
Well specified good ideas are correct programs, and incorrectly specified bad ideas are just hopelessly confused.

Improving the languages and tools will never fix bad ideas, but it can make them more obvious. Now the goal is to make programming as close as possible to 'saying what you mean'. In other words, making the semantics as explicit as possible.

Basically my goal is 'declarative programming', which turns out to be a very vague concept to most people. They all agree that it's better, but nobody seems to have a good explanation for why. I think the difference is that declarative programs specify only the relationships which are important, leaving the rest up to the platform to optimize or interpret as it sees fit. This leads to powerful and concise languages such as SQL, but at the cost of placing the burden on the platform rather than the programmer. Good for communication and clarity, bad for development and adoption.

Basically, declarative languages can be more concise because they rely more on shared knowledge: predefined vocabulary. If the language doesn't already have a way to express the concept you want, however, it is much more work to add. Imperative / procedural programs are more flexible because they rely on implicit semantics. You just tell the computer what to do—you don't have to explain what it is doing or why. Everything the program "accomplishes" is imaginary and external to the specification. This leaves very little room for the computer to optimize your selection of operations, and leaves a lot of room for you to accidentally provide an incorrect sequence of steps.
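
A toy contrast in Python terms (my own example, not specific to any declarative language):

  people = [{"name": "Ada", "age": 36}, {"name": "Bob", "age": 17}]

  # Declarative-ish: state *what* you want; the platform owns the
  # iteration strategy and is free to optimize or reorder it.
  adults = [p["name"] for p in people if p["age"] >= 18]

  # Imperative: state *how*, step by step; the goal stays implicit,
  # and every step is a chance to get the sequence wrong.
  adults2 = []
  i = 0
  while i < len(people):
      if people[i]["age"] >= 18:
          adults2.append(people[i]["name"])
      i += 1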

It's like the difference between giving directions by saying "Go to the grocery store at 5th and Main" vs. "Take a left, go three blocks, take a right, go two more blocks, park on the right side of the street and enter the blue building." The first is much clearer, but places much higher expectations on the navigation abilities of the recipient, while the second can be followed by anyone even though they have no idea where they're going - and mistakes are correspondingly harder to notice.

Sadly, the nature of declarative languages makes them fairly domain specific, which may explain part of why they're so rare and hard to make. Creating a declarative language for solving a class of problems is much harder than solving a single problem imperatively; you actually have to think of how and why you're solving those problems. But I think we could probably create some general patterns and guidelines for defining them, and maybe even start building up some tools to reduce the effort required.

reply

1 point by breck 41 days ago | link

> Wait, you're writing about "3-dimensional source code" and the dimensions aren't settled yet? That just makes me glad I didn't read your slides, and even less likely to put in the effort next time. I'll repeat my earlier comment: your MVPs are too M and insufficiently V.

> How are you so sure that you won't settle on 2 or 4 dimensions? (Let us stipulate that 5 is right out.)

Sorry, the language itself is fully settled; the only open question is with our hardware prototypes. We've found a way to compute with ETN programs mapped to 2 dimensions, and a machine structure where we can compute answers with a source program mapped to 3 dimensions. But really both are 3-dimensional; in the former, the Z-axis just doesn't vary.

In the 3-D version, the first word of a node (aka the head/base/instruction/type/command) is at z = 1, and subsequent words go up the z-stack. In the 2-D version, subsequent words just go along the x-dimension. They actually both offer advantages, and I'm sure we'll figure out which is better in the next year or so.
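
A quick sketch of the two mappings (toy Python; my own formulation of what I just described, not our actual prototype code):

  # A node sits at row y with indent x; its words get coordinates.
  def coords_3d(y, x, words):
      # first word at z = 1, later words stacked up the z-axis
      return [(x, y, z + 1) for z in range(len(words))]

  def coords_2d(y, x, words):
      # same thing flattened: words run along x, z pinned at 1
      return [(x + i, y, 1) for i in range(len(words))]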

Again, this stuff is at the cutting edge of the hardware research. We're talking about a whole new type of machine architecture without registers.

> You'll need to show me these empirical studies.

Totally agree. We will.

> I haven't actually ever heard a story that accounts for 99% of bugs. Pretty much every software engineering study ends up with a much flatter profile than that. You have to do many things right to eradicate 99% of bugs.

Agreed. And to hit that 99%, we're going to need the new hardware, so that is quite far off (3 - 20 years, hard to predict). But we can hit 90% fewer with ETN software alone.

> From what I can tell, ETNs are mostly about eliminating punctuation and replacing it with indentation. Is that right?

No. Forget about the punctuation of newlines and spaces. Think about it as Cartesian coordinates. ETNs are about giving source code physical dimensions. About making sure that source code could directly be built out of circuitry. Think of ETN programs like something you could build in a voxel editor like MagicaVoxel. Each block holds a word, which is just a number from 0 to infinity, and programs are trees of these numbers connected in physical space. Sorry if that's not clear. I think the more code and tools we build the easier it will be to understand.

> Is upgrading the syntax to ETNs all that's needed to eliminate 99% of bugs?

No. To reduce bugs by 90% (99% won't be possible until we have ETN machines), you also need well designed ETNs, which have good FPL things like no side effects, prefix notation, DRY, good naming, good interfaces, et cetera. Great question. Working on a release shortly with a lot more tools and help on building great ETNs.

> Nail down the input space, and bugs go down because your tests fail more often.

I like that! I'm a big fan of strongly typed languages and the idea that, if you think more about your types, they basically prove your program correct at compile time.

Thanks for the feedback! I hope the next wave of ETN stuff will help start to demonstrate the benefits better.

reply

3 points by shader 5 days ago | link

I think the TN/ETN parsing model is somewhat neat in its simplicity, which means it will probably have some longevity.

However, most of the work you have done is just a simplification of the syntax; it has no relation to the semantics whatsoever, and as such is unlikely to cause a major paradigm shift.

Perhaps the coolest part of your notation is the concept of constant validity, which in this case you achieved by simplifying the notation until it matched the medium. Every atomic operation on the text (add a character, new line, or space) is also a valid atomic operation on the tree. Especially because it works with any text editor, instead of fancy semantically (or at least syntactically) aware editors. However, I think any true advances in programming will require improvements in the semantics.
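
To show just how small the parsing model is, here's a toy sketch of it in Python (my own reading of the published rules, not the reference implementation). Note that there is no code path that can reject input, which is exactly the constant-validity property:

  def parse(text):
      top = []                 # children of the implicit root
      stack = [(-1, top)]      # (indent, children-list) pairs
      for line in text.split("\n"):
          indent = len(line) - len(line.lstrip(" "))
          words = line.lstrip(" ").split(" ")
          node = {"words": words, "children": []}
          while stack[-1][0] >= indent:   # pop back to this node's parent
              stack.pop()
          stack[-1][1].append(node)
          stack.append((indent, node["children"]))
      return top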

reply

1 point by breck 5 days ago | link

Thanks for the feedback!

> However, most of the work you have done is just a simplification of the syntax; it has no relation to the semantics whatsoever,

Agreed. However, I think one thing that is starting to emerge from our data (17 useful ETNs now compiling to Javascript, Rust, TypeScript, Logo, Haskell, C++, LLVM IR, SQL, HTML, CSS, JSON, and Regular Expressions) is how well this Tree Notation syntax can work for every programming paradigm (functional, imperative, declarative, dataflow, oo, logic, stack ...). Perhaps it is best explained as a universal syntax. The neat thing is that once you learn the TN syntax, you know the complete syntax for languages with very different semantics. So while I agree we aren't changing semantics here yet (we're just leveraging the semantics and VMs of existing languages), this universal syntax could be big: it can lead to better cross-language static tools and let developers who generally stick to one or two paradigms make use of more.

> Perhaps the coolest part of your notation is the concept of constant validity

Agreed! The elimination of parse errors is one of my favorite features. Of course, the user can still make errors at the ETN level, like mistyping a word or providing invalid parameters to a node. To help catch and fix these kinds of errors, I just launched version 5.0 of Ohayo (Ohayo is still shitty, but the core is getting really solid), which includes a revamped compiler-compiler that supports 100% type checking of every word in your program. It makes it easy to create, as you say above, "well defined input spaces".

reply

2 points by shader 1 day ago | link

> this universal syntax could be big in that it can lead to better cross language static tools and enable developers who generally stick to one or two paradigms to make use of more.

An alternate syntax will not allow you to use any additional paradigms unless you also provide alternate semantics. It might enable more powerful editing tools or effective macros and metaprogramming though.

reply

2 points by breck 1 day ago | link

> An alternate syntax will not allow you to use any additional paradigms unless you also provide alternate semantics.

Right. The syntax for ETNs is the same, but the semantics are different. For example, I have a language called "Flow" that is a data flow language, passing a matrix through a series of nodes. I also have a logic language called "Project", that can solve relational issues among nodes. Different semantics, identical syntax.

Right now to use different paradigms, a user generally has to learn different semantics and different syntaxes. This eliminates the latter.
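
For a flavor of what that looks like (the node names below are invented for illustration, not the real Flow/Project vocabularies), these two snippets have different semantics but are parsed by the exact same Tree Notation rules:

  flow
   load data.csv
   transpose
   save out.csv

  project
   task sandWalls
   task paintWalls
    dependsOn sandWalls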

reply

1 point by akkartik 1 day ago | link

Is that a good thing, though? A classic design principle is that similar things should look similar and different things should look different. Imagine a project with both Flow and Project files. Wouldn't it be nice to be able to tell them apart at a glance?

reply

3 points by shader 5 days ago | link

While the concept of a cartesian program space is interesting, it seems largely unrelated to TNs. This is probably a good thing though, as programs require semantic relationships ("lines" between nodes) that are lacking in cartesian spaces. If there were semantic significance to adjacency or distance between points, or along each axis, that might be reasonable. Otherwise the "dimensions" are just an irrelevant and cumbersome alternative to line numbers.

Additionally, a third dimension is meaningless as long as your fundamental representation is two-dimensional. Unless you use an editor that is natively 3-dimensional, mapping a two-dimensional representation onto three dimensions will leave a lot of redundancy or sparseness, as demonstrated by your conflation of x and z.

reply

1 point by breck 5 days ago | link

> If there was semantic significance to adjacency or distance between points, or along each axis, that might be reasonable

Yes, there is semantic significance to adjacency & distance from the y-axis (which indicates an edge that connects parent and child nodes).

We are approaching everything simultaneously from the highest abstract level and lowest logical level. We have some more stuff coming out soon that shows off the benefits of the dimensionality more. One of the cooler experiments is a new type of processor with a graph-paper-esque 2D grid that can load a high level tree program and then execute it directly (no cumbersome series of transformations to a bunch of 64 bit registers). AFAIK this is original, though I wouldn't be surprised if Lisp Machines, Thinking Machines, Alteryx, Nvidia, Intel, et cetera have dabbled in this space a bit (though to date I haven't been able to find anything on machines that execute trees directly).

reply

2 points by shader 1 day ago | link

> Yes, there is semantic significance to adjacency & distance from the y-axis (which indicates an edge that connects parent and child nodes).

Actually, it seems like your tree relationships have a very confusing relationship to the coordinates. Adding a newline increments Y, and a space increments X, but children are those nodes such that

  1) child.Y > parent.Y
  2) child.X == parent.X + 1
With the additional complication that only the node with the lowest X value for a given Y becomes the child; all others on the same line become part of the content of that node.

This means that the relationship between two elements depends not just on their coordinates, but also on the coordinates of nearby nodes. (6, 4) may or may not be a direct child of (5, 3); it depends on whether (5, 3) is a full node, or just a content element that's actually part of (5, 2) or (5, 1).

So the coordinates do not actually define the relationships between nodes; they do not clearly relate to the tree structure at all.

reply

2 points by breck 1 day ago | link

You are right, this is great feedback thanks.

> have a very confusing relationship to the coordinates

Agreed. I sometimes get confused too.

One rule that always holds is this:

  1) One line === One node
So every node has an absolute Y coordinate (just the line number), but also relative coordinates, relative to its ancestors.

Both are useful at various times. There's probably a better way to eliminate confusion here.

> So the coordinates do not actually define the relationships between nodes

Given an array of node coordinates {y,x} [{1,1}, {2,2}, {3,1}, {4,2}], one has enough information to define the whole tree structure of the program. But you are right: you need the full set of coordinates of a node's ancestors to properly know its position, and given a line that begins with 1 or more spaces, it is impossible to deduce how many nodes deep it is without also having access to the previous line(s).
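
Here's a toy sketch of that reconstruction in Python (my own code; the rule is just "a node's parent is the nearest earlier node one column to the left"):

  coords = [(1, 1), (2, 2), (3, 1), (4, 2)]   # the {y,x} example above
  parent = {}
  for i, (y, x) in enumerate(coords):
      parent[(y, x)] = next(((py, px) for py, px in reversed(coords[:i])
                             if px == x - 1), None)   # None means root
  print(parent)
  # {(1, 1): None, (2, 2): (1, 1), (3, 1): None, (4, 2): (3, 1)}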

reply

4 points by akkartik 47 days ago | link | parent | on: Seeking new host for Try Arc

Is the REPL itself down at the moment?

reply

3 points by evanrmurphy 47 days ago | link

Ah yes it was! It's up again now.

Strange, I had just restarted it early in the day. (And that's what I had to do just now.) It used to stay up for weeks or months on average before conking out. Perhaps it was a fluke today, but we'll see if there's something that's repeatedly interfering with it being allowed to run.

reply

4 points by akkartik 47 days ago | link | parent | on: Seeking new host for Try Arc

Hi Evan! This thread reminds me that we had an email discussion back in 2014 about how tryarc is hosted, and whether we can provide separate sites for Arc 3.1 and Anarki. You even gave me access to the repo, but I never did anything with it :/

I'm looking at the repo now. I wonder if we could host it on Github Pages. That would be the easiest and most future-proof approach. Lately I try to host new repos away from Github, but for now it may be best to have all these related projects in the github.com/arclanguage org, for improved discoverability.

As a first step: how do you feel about making the repo public? ^_^

reply

4 points by evanrmurphy 46 days ago | link

Yes, I recall our email thread and I saw your email on there today, which I'm copying here in case other people are interested:

> I'm curious: how do you run things on your Linode? For example, I can't find the top-level html page in the repo. It seems like the repo runs inside an iframe of the "REPL" tab? Could you provide some instructions and peripheral config files (Apache/Nginx, etc.) to help make it turn-key? I don't want to make it onerous, but I think just a couple of lines and copy-paste will go a long way.

So responding to your forum comment above and this, I would love for Try Arc to run purely client-side on a static site host like GitHub Pages. Unfortunately, as you mention here, it runs on a VPS (Linode) instead. The reason is that it's not all client-side: it actually communicates with an Arc 3.1 system hosted on the server.

As for making the repo public, I think that is the logical (almost) next step. The only thing preventing me from doing that right now is a security concern. Try Arc isn't the only project I have hosted on that VPS. Currently there's a measure of "security through obscurity" that helps protect the other stuff on that server. I think the next step is to move it to a server where it's the only thing running. Then as soon as that's done I'll make the repo public.

I'll respond about the iframe and other configuration a bit later.

reply

4 points by zck 43 days ago | link

> So responding to your forum comment above and this, I would love for Try Arc to run purely client-side on a static site host like GitHub Pages. Unfortunately, as you mention here, it runs on a VPS (Linode) instead. The reason is that it's not all client-side: it actually communicates with an Arc 3.1 system hosted on the server.

I looked around a little bit for solutions. There's a project called Whalesong, but the most up-to-date fork only runs on Racket 6.2: https://github.com/soegaard/whalesong .

In trying to find the github link for Whalesong just now, I came across Racketscript, a Racket -> Javascript compiler: https://github.com/vishesh/racketscript . I'll see if I can make it work later, but it looks promising.

reply


Yes, this happened in the course of http://arclanguage.org/item?id=20070 3 months ago. I realized it 3 weeks later and upgraded the docs: https://github.com/arclanguage/arclanguage.github.io/commit/.... Doesn't really help us veterans, though; sorry about that! And thanks also for the factoid that v6.8 works. I'd only tried it with v6.9.

reply


I use Vim as you know, but hopefully somebody else here uses Emacs.

Do you have any Arc-specific configuration in your .emacs? It may be helpful to share that in case somebody spots something wrong with it.

-----

3 points by gruseom 84 days ago | link

Pretty simple:

  (push '("\\.arc$" . lisp-mode) auto-mode-alist)
  (modify-coding-system-alist 'file "\\.arc$" 'utf-8)
There's some other stuff related to running a REPL but I'm pretty sure it's unrelated.

-----

1 point by akkartik 84 days ago | link

Thanks! Sorry I can't be more help, but I can indeed reproduce your issue.

-----


This later post by the author may be relevant: http://breckyunits.com/the-flaw-in-lisp.html

Though it's hard once again to understand. When are two nodes coincident? By definition you can only have one character in one place on the screen. Is he talking about indentation, that the same level of indentation can mean different things? That seems true of ETN as well, from what I can tell.

The pencil scratchings on the screenshots don't help either.

-----

1 point by breck 86 days ago | link

Sorry, the lines in the geometric mapping of the source code are coincident. In drawing B) in the visual proof, the edges which connect the child nodes to their parents intersect and/or are coincident. So when you put Lisp source code onto graph paper, and draw boxes around the nodes, and line segments for edges, it shows why Lisp source is not a geometric language (I define a geometric language as one where there are no intersecting or coincident line segments).

Now, figure A) shows the same Lisp code, formatted differently, in a way that is a geometric language. But as you can see, that code is standard TN/ETN. Or perhaps another way to put it is that ETNs are just Lisps with a whitespace syntax and no parentheses. Another reader on HN pointed me to I-expressions (https://srfi.schemers.org/srfi-49/srfi-49.html), which I hadn't seen before and which are 90% of the way there to TN and ETNs. The creator of I-expressions and I have communicated briefly over email now and are going to be talking soon.

Anyway, perhaps another term for Tree Notation/ETNs is "Geometric Lisp", or "2-Dimensional Lisp". I'm not wedded to the terms TN/ETNs, although I do think it's better to have new terms, because I think these will come to dominate the usage of Lisp.

Still working on more updates and evidence on why I think these will be so big.

-----


HN discussion on this post: https://news.ycombinator.com/item?id=14604269

I thought all the emphasis on how great it was made it harder for people to understand _what_ "it" was. Probably a good idea to keep an initial post like this matter-of-fact. Motivate it with just one simple strength that's easiest to communicate. Describe the broader context and implications in a separate post, later, after people understand what it is.

-----

2 points by breck 86 days ago | link

I totally agree. Although perhaps I wouldn't have gotten as much feedback had I taken a more modest approach? Hard to say. I still haven't done a good job communicating the benefits of ETNs, stemming from their 2D/geometric nature. Almost got version 1.1 of Ohayo done, which makes another step toward that.

-----

5 points by akkartik 86 days ago | link

That motivation makes sense. Bear in mind, though, that the "less modest" approach has a limited amount of gas. It will stop working at some point.

I can relate with having these questions and considering the different strategies as well. If you really think that this is going to be your life's work, it's reasonable to burn some 'reputation' to get the word out. However, I've often been wrong before. Now I tend to err on the side of playing a long game.

Over time I've gained respect for the essential wisdom of this quote:

"We knew that Google was going to get better every single day as we worked on it, and we knew that sooner or later everyone was going to try it. So our feeling was that the later you tried it, the better it was for us because we’d make a better impression with better technology. So we were never in a big hurry to get you to use it today. Tomorrow would be better." -- Sergey Brin, as retold by Seth Godin in "The Dip"

Applicable to ideas like this one just as much as to products.

-----

2 points by breck 86 days ago | link

Wow, what a quote! That's exactly how I feel about Ohayo. That seems like a better approach--not to be in a big hurry to get people to use it today. Thank you akkartik. Really appreciate that advice!

My only concern pre-launch and announcement was that I was going to get hit by a bus and the world would have to wait longer for someone else to stumble upon (and popularize) TN and ETNs. Now that it's out there and a few thousand people have seen it, I can take this more sensible approach. Fantastic advice.

Btw, just pushed version 1.1.0 if anyone's interested.

UX still needs work, but I rushed adding a "3D block" to the flow language (using the vis.js library), so you can start to see what "3D" code looks like.

http://breckyunits.com/files/untransformed-source-etn-code-i...

I need to rev that a bit (I see some immediate bugs), but it gives the basic idea, and I have to run out for a little while.

-----
