The actual dispatch happens at runtime, but the cool thing is that it verifies that multimethods cannot overlap, and it does this verification at compile-time (specifically, at macro-expansion time).
As complex as Racket is (especially its macro system), this is a good example of how that complexity can sometimes enable very cool things as a library (like compile-time verification of non-overlapping multimethods).
Actually, because Node.js has a proper module/namespace system, people are instead encouraged to create a lot of micro libraries (e.g. fewer than 200 lines of code).
So rather than importing a single big library in your application, you would instead import a lot of micro libraries (each of which is versioned separately and might have its own dependencies).
There are pros and cons to big libraries, and pros and cons to little libraries. I think languages should have good support for both.
Yes, I love the trend towards micro libraries because they encourage people to pull in precisely the functionality they need. That aspect of Node is not part of the problem.
I've thought about this problem for several years now.
The following properties are desirable:
- Convenience (auto-imports, imports not needing to specify the variables that are used, etc.)
- Correctness (ensuring that your code uses the variables you expect it to use, and that it doesn't easily break by changing some random unrelated file)
- Performance (only loading the functions/variables that you actually use)
Unfortunately, it is probably impossible to get all three properties at the same time.
By making code more convenient (e.g. implicit imports), you make it harder to reason about code ("I'm using the variable foo, but where does it come from?"), and you make it more likely for there to be variable conflicts (two different files defining a variable with the same name).
You also make it more likely for your program to break when upgrading a library to a new version, or even just when making a change to a seemingly unrelated file.
And if there is a variable conflict (which will happen), then you need some way of disambiguating, which generally requires you to specify which file should take precedence. So in the situation where there are variable conflicts, you lose all of the convenience benefits for that variable.
(And if you choose to not disambiguate, and instead ignore the variable conflicts, then you lose correctness, so that only works for very small projects)
Convenience also conflicts with performance. The language doesn't know which variables are defined in which file, so if you use the variable "foo", it has to search through all of the files to determine which one defines "foo".
That also means your language needs to define which directories it will automatically import files from (you don't want it searching through your entire hard drive!)
Various languages (including Python[1] and Node.js[2]) have defined their own rules for determining which folders will be searched through for files, so this is a solvable problem.
But unlike Python and Node.js, implicit imports require the language to also search through all of the files inside of those folders... this is obviously an expensive operation.
You could make it work. If your language allows you to pre-compile a binary, then it only needs to look up the files once (at compile-time). And with tree shaking, the final binary would only contain the variables that you actually use.
You would lose some correctness, and you would lose compile-time performance, but some people would consider the gains in convenience to be worth it.
Arc, however, does not have a separate compile-time, so it would need to do the folder/file lookup every single time you run any Arc program. I think the performance loss would be too great for any medium or large programs.
----
By the way, there's another interesting issue... mutually recursive functions:
  (def my-even (a)
    (if (is a 0)
        t
        (my-odd (- a 1))))

  (def my-odd (a)
    (if (is a 0)
        nil
        (my-even (- a 1))))
When the compiler is compiling the "my-even" function, it encounters the "my-odd" variable, but the "my-odd" variable hasn't been defined yet, so the compiler treats it as a global variable and tries to automatically import it.
If there is a "my-odd" variable defined in another file, then the "my-even" function will call that definition, rather than the "my-odd" defined in the current file.
There are various ways to fix this problem, probably the easiest of which is to define the "my-odd" variable before it is used:
  (= my-odd nil)

  (def my-even (a)
    (if (is a 0)
        t
        (my-odd (- a 1))))

  (def my-odd (a)
    (if (is a 0)
        nil
        (my-even (- a 1))))
This means that all variables need to be defined before use, which is contrary to the way Arc works. Also, if you forget to define a variable before use, then your program is buggy. And it is quite easy to forget.
So I think an auto-import system could work, but your language has to be designed around it. You can't just add an auto-import system to an existing language, because of these (and other) issues.
Another interesting point: if you move away from our typical text-based ways of editing code, it may be possible to achieve both convenience and performance:
In Unison, you do not need to import code. That's because every variable is actually the hash of its definition, and when you use that variable in your code, it automatically "imports" the hash. That also means that your code contains only the variables that you actually use.
You do still lose some correctness, though. As an example, if you have a function and you change the definition of that function, it will create a new hash (for the new definition), but the old hash is still being used in your code.
Which means that you have both the old version of the function and the new version of the function existing at the same time. So you need to manually update all of your code to use the new hash.
This problem hasn't been solved yet, but there's been discussions about providing some tools that can automate certain tasks (such as replacing an old hash with a new hash, in your entire code base).
Assuming the above problem is solved, it may actually be possible to have convenience, correctness, and performance with an auto-import system.
But Unison's approach doesn't work with existing languages, so my point that "the language needs to be designed specifically for auto-imports" still stands.
That's really cool! I'm a fan of all three of those things (Clojure syntax, ML types, and Rust memory management).
Although it's intended to be low level, one concern I have is that it's missing some kind of extension system (such as typeclasses, first-class ML modules, Rust traits, etc.).
Even static function overloading (like in C++) could work.
That sounds super strange. From what I understand, Racket automatically caches modules, so they are only loaded once.
But then again, Racket does have multiple phases, so it's possible that it's loading the "compiler" file in two different phases, causing duplicate variable definitions.
In addition, Racket doesn't allow for mutually recursive modules, but Arc/Nu is using "namespace-require", which apparently bypasses that restriction. It's possible that there is a race condition that is causing the "arc/3.1/main" file to be loaded before the "compiler" file is finished loading, which then triggers a second load of the "compiler" file.
Just to clarify: have you tried inserting a "displayln" in the "compiler" file to verify that it is being loaded twice?
In my opinion, returning t/nil makes more sense than returning the file path, but Arc/Nu is supposed to be backwards compatible with Arc 3.1, so I fixed this. I also fixed "dir-exists", which has the same issue.
I just pushed out the fix to the Arc/Nu GitHub repository, please try it out and let me know if it solves the problem for you.
I would like to point out that type systems (including dynamic and static type systems) are another way to carve out the territory.
Especially in languages like Haskell, static types are used all over the place to define intentions and boundaries. Languages like Idris and Agda go even further, with dependent types and proof systems.
Unit tests are great, but there are many useful properties that unit tests cannot verify but static type systems can. And vice versa: there are things that unit tests can test that static types cannot.
Also, things like QuickCheck[1] allow us to encode our intentions in very concise ways (much more concise than traditional unit testing).
Rather than saying this:
(reverse (reverse [])) is the same as []
(reverse (reverse [1])) is the same as [1]
(reverse (reverse [1 2])) is the same as [1 2]
(reverse (reverse [1 2 3])) is the same as [1 2 3]
(reverse (reverse [3 2 1])) is the same as [3 2 1]
(reverse (reverse [1 2 3 4 5])) is the same as [1 2 3 4 5]
...
You instead say this:
for every list, (reverse (reverse list)) is the same as list
And it will then generate the unit tests for you.
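To make that concrete, here's roughly what it looks like in Haskell with QuickCheck (a sketch; the property name is mine):

  import Test.QuickCheck

  -- The property: reversing a list twice gives back the original list.
  prop_reverse :: [Int] -> Bool
  prop_reverse xs = reverse (reverse xs) == xs

  -- QuickCheck generates random lists (100 by default) and checks
  -- the property against each one.
  main :: IO ()
  main = quickCheck prop_reverse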
So, in general I agree with you. I just disagree with the idea that unit testing is the only way we have to specify our intentions. There are many ways, and we should use every useful tool which is available to us.
I think that the more we can specify our intentions (in a testable way), the better our programs will be.
Yes, you're right. I'm not very familiar with Haskell, but I was thinking about it when I chose the title "An alternative.."
I think of Haskell as being to tests what up-front design is to Agile. It's great if you're trying to carve out right and wrong behaviors along the dimensions it's been designed to help with (and my vague sense is that linear types and other such advances are trying to expand that set of dimensions). But how would you ensure that your program meets its performance guarantees? Or is guaranteed not to block between these two memory barriers? Tests are less rigorous and more work, but if you're willing to put in the elbow grease they'll work for anything you throw at them.
That sounds about right. Static types can only guarantee a certain subset of behavior. Unit tests work for a huge amount of behavior, because they run at run-time. They are very versatile. But there are still some things that unit tests cannot test.
In particular, unit tests cannot guarantee the absence of behavior. As an example, Haskell can use its static type system to say, "this function (and any functions it uses) is guaranteed to be pure". Or you can say, "this function (and any functions it uses) is guaranteed to only use certain side effects, but other side effects do not happen".
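A small sketch of what that looks like in Haskell (the functions are made up for illustration):

  -- The type guarantees purity: this function cannot read files,
  -- print, mutate state, or perform any other side effect.
  double :: Int -> Int
  double x = x * 2

  -- The IO in the type is the only thing that permits side effects.
  greet :: String -> IO ()
  greet name = putStrLn ("Hello, " ++ name)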
Languages like Idris go even further and can guarantee that a function terminates (never goes into an infinite loop)!
As another example, Haskell has Software Transactional Memory, and it uses the static type system to guarantee that arbitrary I/O cannot occur, which is important because the transaction might be retried multiple times.
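A rough sketch of that (the account setup is invented for illustration): a transaction runs in the STM monad, and the types simply don't allow arbitrary IO actions inside it:

  import Control.Concurrent.STM

  -- Transfer between two accounts atomically. Because this runs in
  -- the STM monad, the type system rejects arbitrary I/O inside the
  -- transaction, which is what makes it safe for the runtime to
  -- retry the transaction.
  transfer :: TVar Int -> TVar Int -> Int -> STM ()
  transfer from to amount = do
    modifyTVar' from (subtract amount)
    modifyTVar' to (+ amount)

  main :: IO ()
  main = do
    a <- newTVarIO 100
    b <- newTVarIO 0
    atomically (transfer a b 25)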
I don't remember which language it was in, but I remember reading about a language that used static types to guarantee that exceptions are always handled. In other words, you will never have an uncaught exception.
Nulan uses its static type system to make a distinction between blocking impure functions and non-blocking impure functions. In JavaScript, there's a big difference between blocking and non-blocking, both in terms of behavior and performance. By using the static type system, it's possible to say "this function (and any functions it uses) is always blocking, and never non-blocking".
As you are aware, Rust uses its static type system to guarantee that pointers are always used correctly, and that data races cannot occur, even with concurrency.
But the biggest reason I like static types is because they make refactoring easier. Let's say you have a type that represents a person. So you have a name property, age property, etc.
Now let's say you want to add in a new property. Some of the existing functions will need to change to accommodate this new property. Without static types, you need to carefully go through your entire code base and find the functions which use the type, and then change their behavior to work with the new property.
Unit tests won't help you with that, because the unit tests are all testing the old properties, not the new property. So in the end, you have no choice but to just bite the bullet and spend a significant amount of time searching through the code and making the changes.
But with static types, it's as simple as adding the new property, compiling, fixing any type errors, compiling, fixing any type errors, etc.
Each type error tells you the file and line where the error occurs, making it easy to fix. You don't need to search through your code to find the places which need fixing: the type system does that for you. And after all the type errors are fixed, it's basically guaranteed that your code is correct.
Another example: a function used to always return a result, but now you need to change it so that it can potentially fail. So anybody that uses that function now has to account for the possibility of failure. With a static type system, you simply make the change and fix the type errors.
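Here's a sketch of that in Haskell (the user-lookup scenario is invented for illustration):

  data User = User { userName :: String }

  type UserId = Int

  -- Before the refactor, the type promised this always succeeds:
  --   findUser :: UserId -> User
  -- After the refactor, the type says the lookup can fail, so every
  -- caller that doesn't handle Nothing becomes a compile error.
  findUser :: UserId -> Maybe User
  findUser 1 = Just (User "alice")
  findUser _ = Nothing

  greetUser :: UserId -> String
  greetUser uid =
    case findUser uid of
      Just user -> "Hello, " ++ userName user
      Nothing   -> "no such user"

  main :: IO ()
  main = putStrLn (greetUser 2)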
With unit tests, you better hope that you have 100% code coverage, because you're going to need it to find all of the functions which need to be changed. And don't forget that the unit tests need to be updated, and you probably need to add in new tests as well. That takes time, in addition to the time spent fixing the code.
So the end result is that it's faster and easier to refactor with a static type system, and it's more likely to be correct as well.
So I think static types and unit tests complement each other: they can both do things that the other cannot. Static types can make guarantees that unit tests cannot, and static types also remove some of the burden from unit tests, which speeds up development without sacrificing correctness.
On the other hand, unit tests can make certain guarantees that static types cannot. So in the end, I think you need both.
P.S. Static typing also helps tremendously with things like smart IDEs. I think Unison is a really cool example of that:
Typeclasses are amazing. They're similar to multimethods, but more powerful, and (since the compiler resolves instances statically) they have the same performance as regular functions.
You can use them to solve the expression problem, giving you a huge amount of flexibility, power, and conciseness. Or you can use them to create dynamically scoped variables which actually work correctly (unlike dynamic variables in most languages).
The thing is, typeclasses are only possible with static types. They're the main reason I changed Nulan to be statically typed.
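For anybody unfamiliar with them, here's a minimal sketch in Haskell (the Pretty class is made up for illustration):

  -- A typeclass is an interface that existing types can be given
  -- instances of after the fact, without modifying those types.
  class Pretty a where
    pretty :: a -> String

  instance Pretty Bool where
    pretty True  = "yes"
    pretty False = "no"

  -- Instances can build on other instances: any list of Pretty
  -- things is itself Pretty.
  instance Pretty a => Pretty [a] where
    pretty xs = "[" ++ concatMap pretty xs ++ "]"

  main :: IO ()
  main = putStrLn (pretty [True, False])

Being able to add instances to existing types after the fact is what lets typeclasses solve the expression problem, and the compiler figures out which instance to use at compile-time.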
----
By the way, I would like to point out that the static type system in a language like Haskell is very different from the static type systems in languages like C, C++, or Java. So even if you like/dislike one of them, you might end up liking the other. I used to lump all of the static type systems together, but I think that's a mistake.
There are bugs which are not caught by C, C++, or Java, but which are caught by Haskell. And there are programs which do not type check in C, C++, or Java, but which do type check in Haskell.
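As a small example of the second point: Haskell can type check functions that abstract over the container itself (Maybe, lists, etc.), which Java's generics cannot express, since Java cannot parameterize over a type constructor. A sketch:

  -- 'f' is a type constructor, not a concrete type. Java generics
  -- cannot abstract over type constructors, so this signature has no
  -- direct Java equivalent, yet Haskell checks it.
  describe :: Functor f => f Int -> f String
  describe = fmap (\n -> "value: " ++ show n)

  main :: IO ()
  main = do
    print (describe (Just 1))   -- works for Maybe
    print (describe [1, 2, 3])  -- works for lists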
Yes, I wanted to say something similar in response to your phrasing that "type systems (including dynamic and static type systems) are another way to carve out the territory". When you say "dynamic type systems", in particular, I think of Python and Ruby, and the modularity guarantees there seem much inferior to tests. But Haskell and the more advanced type systems might well disprove my thesis.
You're right: dynamic type systems tend to provide very weak guarantees. But I'll still take those guarantees over silent failures.
In my opinion, static guarantees (including static types) are the best, followed by unit tests, followed by dynamic guarantees (including dynamic types).
Yes, that's exactly the trade-off I've been thinking of. When I program, I like to start out writing tests, since at the start I don't yet know what properties I should try to enforce. Over time I start finding places where tests aren't good enough, and start feeling the need for more rigorous checking. But by then I've painted myself into a corner with my choice of language, and it's too hard to switch to a more rigorous platform.
In Mu, my plan to break this dichotomy between coverage and soundness is to allow arbitrary metadata to be attached to instruction operands. As a result you aren't stuck with the type annotations and checking that I build into the system. You can have arbitrary metadata, and you will be able to write programs that can deduce and check this metadata to enforce all sorts of properties. Arbitrary checkers feel to me like a generalization of Lisp macros.
Arbitrary metadata is cool. Clojure has arbitrary metadata, which the compiler uses to optimize function performance, but you can use it for other purposes as well:
13 types + 5 special forms + 1 global variable + 92 built-in functions.
And that's not including things such as first-class functions, lexical scope, first-class continuations, tail call elimination, or macros. I don't even know how to count those.
The fact is that practical programming languages need a lot of axioms. If you look at the list of built-in functions, they are all useful, and most of them are for I/O, which cannot be written in an axiomatic way.
I think it would be interesting to write an "eval" function in Arc, and see how few functions it needs in order to accomplish it. But I don't see much practical benefit to it.
I thought that was the whole thing that pg was arguing/exploring: "What's the least number of axioms necessary for a practical programming language?" And that Arc was the product of that exercise.
"Of course, as soon as McCarthy's spec fell into the hands of hackers, all this theorizing was cut short. In Lisp 1.5, read and print were not written in Lisp. Given the hardware available at the time, there is no way they could have been. But things are different now. With present-day hardware you can continue till you have a runnable spec for a complete programming language. So that's what I've been doing.
The question I'm trying to answer at the moment is, what operators do you have to add to the original seven in order to be able to write an eval for a complete programming language?
I'm not finished yet with this exercise, but so far I've been surprised by how few primitives you need to add to the core in order to make these things work. I think all you need to define new types is three new primitives (plus assignment and lexical scope). One of the new primitives replaces the original atom, so you still only end up with nine total."
There are two different questions here:
1) What is the least number of axioms needed for a practical language?
2) What is the least number of axioms needed to write an evaluator for the language in the language itself?
As I demonstrated, you need a lot of axioms to support practical programming, because practical programming involves I/O, threads, sockets, exceptions, etc.
Trying to find the smallest set of axioms necessary for I/O is a cool idea. But any I/O axioms will be intimately tied to the hardware, and the hardware is currently more C-based than Lisp-based. So the result won't be very elegant.
If you ignore practical programming and I/O, and only care about mathematical elegance, then McCarthy's original Lisp is already a quite good answer for question number 2.
Arc is quite a bit more elegant than most other programming languages, but at the end of the day it is still a practical language.
So my question to you is: what are you looking for?
I'm wondering, first, exactly what axioms pg settled on. And I'm also curious, as you've described it: if the hardware were built for the language, rather than the language being built for the hardware, what is the least number of axioms needed for a practical language?
I can't really answer that. Somebody would need to write an "eval" function in Arc. That would give you a pretty good starting point for figuring out how many axioms you need.