Arc Forumnew | comments | leaders | submit | shader's commentslogin
2 points by shader 8 days ago | link | parent | on: The cargo cult of versioning

Interesting discussion. I do appreciate your observations that only the major version matters, and that could be equivalent to renaming the project, since it doesn't really communicate anything useful other than "differences exist".

However, I don't really agree that preventing version pinning will somehow encourage better coding standards. Hypothetically it makes sense; without version pinning, breaking changes would be more noticeable and painful, and users would have more incentives to switch to other libraries that don't do that. Unfortunately, software is not yet and possibly never will be the kind of competitive market required for that to be effective.

The problem is that the pain is entirely felt by the users; the developers may not even realize when one of their changes is "breaking", because they aren't running the users' code. Furthermore, since libraries are usually very different even from those with similar functionality, updating to accommodate the breaking changes will almost always be easier than switching to a different one with a better reputation (assuming it even exists). Since the user can't control the behavior of the developer, the best they can do to overcome breaking changes on their end is version pinning. It's not a good solution, but it's effectively the only solution.

Maybe in the long run people would read horror stories and avoid those libraries, but that still doesn't help much, because the developers usually don't have much incentive to actually meet the needs of their users--they are not really customers, merely beneficiaries. This would be different for commercial software, but then you won't be getting the resource through a package manager anyway.

One interesting solution would be a package manager with integrated CI functionality, so any minor update to a package is automatically tested against every other package that has it as a dependency. This wouldn't catch all of the errors, since most code wouldn't be published, but it would make the developers much more aware of and sensitive to breaking changes. If they still want to make the change anyway, they can change the name.


2 points by shader 2 days ago | link

I've been thinking more about the "renaming" vs "numbered versions" for packages, and I'm now leaning more strongly towards a distinct version number—or at least a separate version field, number or otherwise.

It's true that the major version number doesn't convey much, except the vague idea that the authors believe it to be better, or they wouldn't have written it. However, as long as they are unique and you have some easy way to find the "latest" version, it doesn't really matter if they're numbers or not. Commit hashes or code names should work just as well.

However, I think there's a simple security argument in favor of making the version a subfield, to prevent spoofing by malicious or mischievous third parties. Otherwise anyone could claim that their fork was "python-4".

An alternate solution might be to add a "successor" field, so that a package could identify another package as the rightful heir, even if it wasn't developed by the same team. That should make the open-source fork-based community development a little easier. You'd still have to know what the root package was to follow the chain though.


1 point by akkartik 8 days ago | link

I updated my original post based on conversations I had about it, and my updated recommendation is towards the bottom:

"Package managers should by default never upgrade dependencies past a major version."

I now understand the value of version pinning. But it seems strictly superior to minimize the places where you need pinning. Use it only when a library misbehaves, rather than always adding it just to protect against major version changes.


CI integrated with a package manager would indeed be interesting. Hmm, it may disincentivize people from writing tests though.


2 points by shader 7 days ago | link

A reduced need for tests may not be a bad thing.

We don't write tests just to have tests, but to verify that certain important functionality is preserved across versions. The trouble with many tests, is that they are written to fill coverage quotas, and don't actually check important use cases. By using actual dependents and _their_ tests as a test for upstream libraries, we might actually get a better idea of what functionality is considered important or necessary.

Anything that nobody's using can change; anything that they rely on should not, even if it's counter intuitive.

The problem remains that most user code will still not exist in the package manager. It might be more effective if integrated with the source control services (github, gitlab, etc.), which already provide CI and host software even if it's not intended to be used as a dependency. The "smart package" system could then use the latest head that meets a specified testing requirement, instead of an arbitrary checkpoint chosen by the developers.


2 points by akkartik 7 days ago | link

Oh, I just realized what you meant. Yes, I've often wanted an open-source app store of some sort where you had confidence you had found every user of a library. That would be the perfect place for CI if we could get it.

Perhaps you're also suggesting a package manager that is able to phone home and send back some sort of analytics about function calls and arguments they were called with? That's compelling as well, though there's the political question of whether people would be creeped out by it. I wonder if there's some way to give confidence by showing exactly what it's tracking, making all the data available to the world, etc. If we could convince users to enter such a bargain it would be a definite improvement.


2 points by shader 5 days ago | link

I wasn't really considering that level of integration testing. It would certainly be cool to get detailed error reports from the CI tool. I don't see why you couldn't, since the code is published on the public package system, and you would be getting error listings anyway.

I don't think it would be creepy as long as it's not extracting runtime information from actual users. If it's just CI test results, it shouldn't matter.

Live user data would be a huge security risk. Your CI tests could send you passwords, etc., if they happen to pass through your code during an error case.


2 points by akkartik 7 days ago | link

I wasn't saying there'd be a reduced need for tests. It's hard for me to see how adding CI would reduce the need for tests. I'm worried that people will say this stupid CI keeps failing, so "best practice" is to not write tests. (The way that package managers don't enforce major versions, so "best practice" has evolved to be always pinning.)

Unnecessary tests are an absolutely valid problem, but independent :)


2 points by shader 5 days ago | link

By "reduced need for tests" I didn't mean that the absolute number of tests would decline, but rather the need and incentives for the development team to write the tests themselves. Since they have the ecosystem providing tests for them, they don't need to make as many themselves. At least, that's how I understood the discussion.

So yes, if the package manager only enforced the tests you include in your package it would incorrectly discourage including tests. But if it enforces tests that _other_ people provide, you have no way around it. The only problem is how to handle bad tests submitted by other people. Maybe only enforce tests that passed on a previous version but fail on the current candidate?


1 point by akkartik 4 days ago | link

Ooh, that's another novel idea I hadn't considered. I don't know how I feel about others being able to add to my automated test suite, though. Would one of my users be able to delete tests that another wrote? If they only have create privileges, not update, how would they modify tests? Who has the modification privileges?

These are whole new vistas, and it seems fun to think through various scenarios.


2 points by shader 2 days ago | link

It's not really the same as others being able to add tests to your automated suite. Rather, they add tests to their own package, and then the CI tool collects all tests indirectly dependent on your library into a virtual suite. Those tests are written to test their code, and only indirectly test yours. If a version of their package passes all of their tests with a previous version of your code, but the atomic change to the latest version of your code causes their test to fail, the failure was presumably caused by that change. The tests will probably have to be run multiple times to eliminate non-determinism.

It's still possible that someone writes code that depends on "features" that you consider to be bugs, or a pathologically sensitive test, so there may need to be some ability as the maintainer to flag tests as poor or unhelpful so they can be ignored in the future. Hopefully the requirement that the test pass the previous version to be considered is sufficient to cover most faulty tests though.


1 point by akkartik 1 day ago | link

Yes, this was kinda what I had in mind when I talked about an open source app store. Make it easy for libraries to be aware of how they're used.


I'm sure many of you have already seen this (it's from 2012...), but it seems relevant to the discussion we were having about the future of programming.

He makes the great point that we should be enabling creation, and tightening the feedback loop between thought and product. I've been mostly focusing on better was to represent thoughts and communicate them to the computer, but this draws attention to the purpose of programming itself.


3 points by akkartik 112 days ago | link

I was just thinking about it yesterday, after reading Very inspiring talk.


2 points by shader 106 days ago | link

That's precisely where I found out about it :D


2 points by breck 109 days ago | link

Such a great talk, like his others.

I think it would be great to live in a world where not only could you use your finger to create a sprite animation, but if curious, you could also more easily delve into all the black boxes that make that experience happen (down to the physical level).

I like the NOMODES license plate. If you all had to pick a license plate to describe your work, what would it be? I might go with NOPARENS or NOSYNTAX.


2 points by akkartik 109 days ago | link

Mine would be COPYMORE. Or NODEPS.

I think I share your vision:


2 points by shader 106 days ago | link

Do you have any references for those terms? Or a short explanation for them?


1 point by akkartik 106 days ago | link


I'm what I like to call a 'copyista': I think DRY is overrated, and abstraction is overrated, and people are too quick to create abstractions to compress code rather than for conceptual clarity. Some links:;;;;;; You don't have to read them all, but hopefully this gives you as much flavor as you want :)


This is short for "no dependencies". I think a lot of software's ills stem from people's short-sighted tendency to promiscuously add dependencies. In fact, our fundamental metaphor of libraries is wrong. Adding a library to your program isn't like plugging a new block into your Lego set. It's like hiring a new plumber. You're not just adding a few lines of code to a file somewhere, you're creating a relationship. Everytime I see someone talk about "code smells", I wait to see if they'll bring up having too many dependencies. Usually they don't, and I tune them out. And the solution is easy. When you find a library that does something useful, consider copying it into your project. That insulates you from breaking changes upstream, and frees up upstream to try incompatible changes. As a further salubrious effect, it encourages you to hack on the library and tune it to your purposes. (Without giving up the options of merging further changes from them, or submitting patches upstream.)

As it happens, this worldview of mine was actually catalyzed by conversations here in the Arc Forum, most proximally That thread led to me writing and (a little clearer)

I consider an example of exemplary library use to be how I copied the termbox library into Mu (, periodically merged commits from upstream (, gradually cleaned it up to fit better with my project (, and gradually stripped out code from it that Mu does not require (;; In the process I made some wrong turns, deleting features that I later decided I wanted ( and created bugs for myself (; But when it did things I didn't want, I was now empowered to change them ( One of my patches was useful upstream, so I submitted it: I would be in no position to submit that patch if I hadn't taken the trouble to understand termbox's internals. That's another benefit of copying and privately forking libraries: it makes you a better citizen of the open source world, because open source depends on eyeballs, and using a library blindly helps nobody except your (extremely short-term) self.

More broadly, Mu is suffused with this ethos. My goal is that if you have a supported platform you should be able to run it with three commands:

  $ git clone
  $ cd mu
  $ ./mu
(That highlights another benefit: your software becomes easier for others to try out. Without giving out binaries, because what's the point of being open-source if you do that?)

Mu's also geared to spread this idea. I want to build an entire software stack in which any part is comprehensible to any programmer with an afternoon to spare ( Which requires having as little code as possible, because every new dependency is a source of complexity if you're building for readers rather than users. In chasing this goal I'm very inspired by OpenBSD for this purpose. It's the only OS I know that allows me to recompile the entire kernel and userland in 2 commands ( People should be doing this more often! I think I'm going to give up Mu and build my next project atop OpenBSD. But that's been slow going.


Ah, here's an old HN thread where I managed to combine both these ideas:

I'd have preferred to more directly call out my hatred for compatibility constraints, but I couldn't figure out how to fit it on a license plate :)


2 points by breck 93 days ago | link

The Pike maxim "A little copying is better than a little dependency" comes to mind. I think the overhead of dependencies is underrated ("it's just a 1 line import statement!"), and often a little repetition is a good thing.


3 points by shader 125 days ago | link | parent | on: 3-Dimensional Source Code

> ... trading of programming theories ...

Clear and simple syntax / representation is important; combined with matching editing tools it enables us to communicate ideas easily and fluently.

I also like the idea of well defined input spaces. Many theorems or algorithms only work under certain conditions, and much damage has been done by applying them outside of their intended domains. But I think that's only part of the problem.

My own theory is that programs are specifications, and the more clearly and precisely they specify the better. Programs can fit into a matrix of good/bad ideas and good/bad specifications. Of these, two kinds are interesting bugs:

  1) Incorrectly specified good ideas
  2) Correctly specified bad ideas
Well specified good ideas are correct programs, and incorrectly specified bad ideas are just hopelessly confused.

Improving the languages and tools will never fix bad ideas, but they can make them more obvious. Now the goal is to make programming as close as possible to 'saying what you mean'. In other words, making the semantics as explicit as possible.

Basically my goal is 'declarative programming', which turns out to be a very vague concept to most people. They all agree that it's better, but nobody seems to have a good explanation for why. I think the difference is that declarative programs specify the only the relationships which are important, leaving the rest up to the platform to optimize or interpret as it sees fit. This leads to powerful and concise languages such as SQL, but at the cost of placing the burden on the platform rather than the programmer. Good for communication and clarity, bad for development and adoption.

Basically, declarative languages can be more concise because they rely more on shared knowledge; predefined vocabulary. If the language doesn't already have a way to express the concept you want, however, it is much more work to add. Imperative / procedural programs are more flexible because they rely on implicit semantics. You just tell the computer what to do—you don't have to explain what it is doing or why. Everything the program "accomplishes" is imaginary and external to the specification. This leaves very little room for the computer to optimize your selection of operations, and leaves a lot of room for you to accidentally provide an incorrect sequence of steps.

It's like the difference between giving directions by saying "Go to the grocery store at 5th and Main" vs. "Take a left, go three blocks, take a right, go two more blocks, park on the right side of the street and enter the blue building." The first is much clearer, but places much higher expectations on the navigation abilities of the recipient, while the second can be followed by anyone even though they have no idea where they're going - and mistakes are correspondingly harder to notice.

Sadly, the nature of declarative languages makes them fairly domain specific, which may explain part of why they're so rare and hard to make. Creating a declarative language for solving a class of problems is much harder than solving a single problem imperatively; you actually have to think of how and why you're solving those problems. But I think we could probably create some general patterns and guidelines for defining them, and maybe even start building up some tools to reduce the effort required.


4 points by shader 125 days ago | link | parent | on: 3-Dimensional Source Code

While the concept of a cartesian program space is interesting, it seems largely unrelated to TNs. This is probably a good thing though, as programs require semantic relationships ("lines" between nodes) that are lacking in cartesian spaces. If there was semantic significance to adjacency or distance between points, or along each axis, that might be reasonable. Otherwise the "dimensions" are just an irrelevant and cumbersome alternative to line numbers.

Additionally, a third dimension is meaningless as long as your fundamental representation is two-dimensional. Unless you use an editor that is natively 3-dimensional, mapping a two-dimensional representation onto three-dimensions will leave a lot of redundancy or sparseness, as demonstrated by your conflation of x and z.


1 point by breck 125 days ago | link

> If there was semantic significance to adjacency or distance between points, or along each axis, that might be reasonable

Yes, there is semantic significance to adjacency & distance from the y-axis (which indicates an edge that connects parent and child nodes).

We are approaching everything simultaneously from the highest abstract level and lowest logical level. We have some more stuff coming out soon that shows off the benefits of the dimensionality more. One of the cooler experiments is a new type of processor with a graph-paper-esque 2D grid that can load a high level tree program and then execute it directly (no cumbersome series of transformations to a bunch of 64 bit registers). AFAIK this is original, though I wouldn't be surprised if Lisp Machines, Thinking Machines, Alteryx, Nvidia, Intel, et cetera have dabbled in this space a bit (though to date haven't been able to find anything on machines that execute trees directly).


2 points by shader 121 days ago | link

> Yes, there is semantic significance to adjacency & distance from the y-axis (which indicates an edge that connects parent and child nodes).

Actually, it seems like your tree relationships have a very confusing relationship to the coordinates. Adding a newline increments Y, and a space increments X, but children are those nodes such that that

  1) child.Y > parent.Y
  2) child.X == parent.X + 1
With additional complications that only the node with the lowest X value for a given Y becomes the child; all others on the same line become part of the content of that node.

This means that the relationships between two elements depends not just on their coordinates, but also the coordinates of nearby nodes. (6, 4) may or may not be a direct child of (5, 3); it depends on if (5, 3) is a full node, or just a content element that's actually part of (5, 2) or (5, 1).

So the coordinates do not actually define the relationships between nodes; they do not clearly relate to the tree structure at all.


2 points by breck 121 days ago | link

You are right, this is great feedback thanks.

> have a very confusing relationship to the coordinates

Agreed. I sometimes get confused too.

One rule that always holds is this:

  1) One line === One node
So every node has an absolute Y coordinate (just the line number), but also a relative coordinate(s), relative to its ancestor(s).

Both are useful at various times. There's probably a better way to eliminate confusion here.

> So the coordinates do not actually define the relationships between nodes

Given an array of node coordinates {y,x} [{1,1}, {2,2}, {3,1}, {4,2}], one has enough information to define the whole tree structure of the program. But you are right, you need the full set of coordinates of a certain node's ancestors to properly know its coordinates, and having a line that begins with 1 or more spaces, it is impossible to deduce how many nodes deep it is without also having access to the previous line(s).


3 points by shader 125 days ago | link | parent | on: 3-Dimensional Source Code

I think the TN/ETN parsing model is somewhat neat in its simplicity, which means it will probably have some longevity.

However, most of the work you have done is just a simplification of the syntax; it has no relation to the semantics whatsoever, and as such is unlikely to cause a major paradigm shift.

Perhaps the coolest part of your notation is the concept of constant validity, which in this case you achieved by simplifying the notation until it matched the medium. Every atomic operation on the text (add a character, new line, or space) is also a valid atomic operation on the tree. Especially because it works with any text editor, instead of fancy semantically (or at least syntactically) aware editors. However, I think any true advances in programming will require improvements in the semantics.


1 point by breck 125 days ago | link

Thanks for the feedback!

> However, most of the work you have done is just a simplification of the syntax; it has no relation to the semantics whatsoever,

Agreed. However, I think one thing that is starting to emerge from our data (17 useful ETNs now compiling to Javascript, Rust, TypeScript, Logo, Haskell, C++, LLVM IR, SQL, HTML, CSS, JSON, and Regular Expressions) is how well this Tree Notation syntax can work for every programming paradigm (functional, imperative, declarative, dataflow, oo, logic, stack ...). Perhaps it is best explained as a universal syntax. The neat thing about this is that once you learn the TN syntax, you now know the complete syntax for languages with very different semantics. So while I agree we aren't changing semantics here yet, instead just leveraging the semantics and VMs of existing languages, this universal syntax could be big in that it can lead to better cross language static tools and enable developers who generally stick to one or two paradigms to make use of more.

> Perhaps the coolest part of your notation is the concept of constant validity

Agreed! The elimination of parse errors is one of my favorite features. Of course, the user can still make errors at the ETN level like mistyping a word or providing invalid parameters to a node. To help catch and fix these kinds of errors, I just launched version 5.0 of Ohayo (Ohayo still shitty, but the core is getting really solid) which includes a revamped compiler-compiler that supports 100% type checking of every word in your program. It makes it easy to create, as you say above "well defined input spaces".


2 points by shader 121 days ago | link

> this universal syntax could be big in that it can lead to better cross language static tools and enable developers who generally stick to one or two paradigms to make use of more.

An alternate syntax will not allow you to use any additional paradigms unless you also provide alternate semantics. It might enable more powerful editing tools or effective macros and metaprogramming though.


2 points by breck 121 days ago | link

> An alternate syntax will not allow you to use any additional paradigms unless you also provide alternate semantics.

Right. The syntax for ETNs is the same, but the semantics are different. For example, I have a language called "Flow" that is a data flow language, passing a matrix through a series of nodes. I also have a logic language called "Project", that can solve relational issues among nodes. Different semantics, identical syntax.

Right now to use different paradigms, a user generally has to learn different semantics and different syntaxes. This eliminates the latter.


3 points by akkartik 121 days ago | link

Is that a good thing, though? A classic design principle is that similar things should look similar and different things should look different. Imagine a project with both Flow and Project files. Wouldn't it be nice to be able to tell them apart at a glance?


2 points by breck 80 days ago | link

> A classic design principle is that similar things should look similar and different things should look different.

Agreed, but I think context eliminates such a need. If this comment were about cooking it would look the same. We reuse one writing system.

> Imagine a project with both Flow and Project files. Wouldn't it be nice to be able to tell them apart at a glance?

Ah, good point! So far it hasn't been a problem, but I imagine there may be issues as the number of Tree Languages (note--I took the feedback and dropped the "ETN" acronym) and combinations increase. It might emerge that there are some universal best practices so semantics won't change too markedly from one language to the next. But I think it could be that semantics vary a lot. Right now I have some languages where flow goes forwards (top down) backwards (children up), stack based, parallel, synchronous, et cetera. I personally haven't had trouble keeping them straight just knowing the context, but that is not necessarily a predictor of how it will go for other people (or even me), in the future. We shall see.

Another similar problem is when you have a file with both Flow and Project code (something that actually comes up a lot).

What happens when 2 languages use the same keyword but with different semantics and it happens that a 3rd language embeds them both? It might cause some confusion. Or even just the basic problem of doing color highlighting for one language in a node of another--how do you ensure the color schemes don't conflict? Perhaps a border or something would do the trick. Problems to solve in the future.


3 points by akkartik 80 days ago | link

> If this comment were about cooking it would look the same. We reuse one writing system.

That's true, but the fact that we're both able to make analogies just suggests that analogies aren't a good defense for your system. It isn't self-evident that "eliminating different syntaxes" is always a good thing. You need to actually take the trouble to motivate it.

In my experience the hard part of dealing with polyglot systems is juggling the different semantics. Syntax is in the noise. Should it be the same or different? It just doesn't seem worth thinking about.

Don't get me wrong, I find Lisp's uniform syntax very helpful. But Lisp is helpful also because of its (relatively) uniform semantics. While adding Lisp syntax atop say Erlang seems useful, mixing LFE and regular Scheme would be a nightmare.

> What happens when 2 languages use the same keyword but with different semantics and it happens that a 3rd language embeds them both?

Yes, I can relate to this question. For example, here's a fragment from the Mu codebase where I embed tests containing Mu programs in my C++ implementation: The Mu instruction is `return-if`, but because it's in a C++ file, just the `return` is highlighted. Super ugly.

My take-away from all this: polyglot systems are a bad idea. Mu's implementation being in C++ is hopefully a temporary state of affairs. We shouldn't be picking "the right tool for the job". Software is more malleable than past tools. We should be tweaking our one language to do everything the job needs.

So rather than try to come up with solutions for polyglot programming, I'd just discourage it altogether.

Edit: Heh, see slide 9 of


3 points by shader 127 days ago | link | parent | on: Seeking new host for Try Arc

Any updates on this?

I have a vps I'm not using for much, and wouldn't mind running tryarc on if we got it running inside a docker or rkt container for minimal isolation.


2 points by shader 127 days ago | link

I guess another advantage of the container model is that we could spin up a new instance for each client.

Really bad resource efficiency, but I don't expect to have many customers :P


This is a spec and reference implementation in the really early stages for an assembly language intended to replace javascript as the underlying 'machine code' of the internet, supported by Microsoft, Google, and Mozilla.

It's not much, but they have an sexp. based parser, so it should be at least as easy to build a lisp on top of it as vanilla js.

Here's an example:


2 points by shader 1195 days ago | link | parent | on: After i changed something

I have noticed that the server seems to go down after extended amounts of time, but I haven't figured out the source of the problem.

If anyone comes up with a permanent solution, I would be happy to include it though.


2 points by tvvocold 1192 days ago | link

Openshift gear will goes idle after 24 hours of inactivity,so i use to ping the server every 5 min,it will make other apps (i created) goes well ,but the always goes down every 2 or 3 days.

btw, here is akkartik's way:


1 point by shader 1232 days ago | link | parent | on: Shader:plz help

Force-stop shouldn't be what drops the www. The repo reset should only happen during the git push process. Otherwise I don't think it touches what's in the repo dir.

In any case, I think all of the stuff in the www folder should probably be moved to the data directory, so it doesn't get wiped regularly. I just haven't done it yet; partly because I didn't know exactly where I would put it.

Feel free to make a change to put it where you want it, and then send a pull request.


2 points by shader 1233 days ago | link | parent | on: Shader:plz help

The arc git repo is cloned under app-root/data, but it's not used as the root directory for the web application in case the user has other files they want to load.

I'll admit, the readme isn't the best. I mostly copied it from the original and changed the relevant bits.

Unfortunately, certain files like adminfile* should probably not be relative to the repo directory, because it's likely to get wiped. I'm not sure where the right place to put it would be though. Maybe data/config?