1 point by almkglor 5887 days ago | link | parent

Hmm. Would you mind giving a more overviewish summary of the library-packing system? You seem to have stuff like "defproject" in pack.arc ; what are the relationships between projects, libraries, and packages in your scheme?

Currently the (using ...) scheme in Arc-F assumes a package is a file is a library. If the library is too large to fit in a single file, it has various "support packages" related to the main library package: i.e. the lib.arc source contains '(using <lib/part1>v1), and there's a file lib/part1.arc which contains that part of the library. I'm interested in your take on multi-file libraries.

Also, my scheme assumes one context object per source file. A context object is equivalent to the "current-package" variable ( s/"/* ) in CL, except you can arbitrarily create such objects, and they work monadically. What is your expectation on how the packaged libraries work?



1 point by stefano 5885 days ago | link

> if the library is too large to fit in a single file, to have various "support packages" which are related to the main library package

No more. For simplicity, everything (even a single file) is now packaged in a directory. A "library" (or "package") is a way to distribute some piece of code: everything is placed within a directory named libname.pack together with the information needed to load dependencies. You can load such a package and all its dependencies using a single command ('use, 'require or 'using). A "project" is a way to manage the development of such a library and is roughly a "make" for Arc. To define a project you would use the macro 'defproject, specifying a list of dependencies (single files or libraries) and a list of files composing the project. When loading a project through 'proj-load, only modified files are loaded. When you have loaded a project you can build a packaged library using 'deliver-library. A proper directory will be created and populated with the relevant files. Projects and libraries are independent of the namespace system. I've modified 'using to load a library using 'use if nothing else can be done, with the assumption that if such a library exists, it also defines its own namespace (e.g. (using <http-get>v1) loads the library named 'http-get, hoping that that library defines a namespace <http-get> with an interface v1).
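To make the "only modified files are loaded" behaviour of 'proj-load concrete, here is a rough sketch in Python rather than Arc. It is not pack.arc's code: the Project class, its load method, and the mtime comparison are all assumptions made for illustration, and pack.arc may detect changes differently.

```python
import os


class Project:
    """Hypothetical stand-in for a 'defproject declaration: a name and an ordered file list."""

    def __init__(self, name, files):
        self.name = name
        self.files = list(files)
        self._loaded_mtime = {}  # path -> modification time (ns) seen at the last load

    def load(self, loader):
        """Call loader(path) for every file changed since the last load.

        Returns the list of paths that were (re)loaded, in project order.
        """
        reloaded = []
        for path in self.files:
            mtime = os.stat(path).st_mtime_ns
            if self._loaded_mtime.get(path) != mtime:
                loader(path)  # e.g. evaluate the file
                self._loaded_mtime[path] = mtime
                reloaded.append(path)
        return reloaded
```

The first call loads every file; later calls load only files whose modification time has changed since then.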

-----

1 point by almkglor 5885 days ago | link

In the Arc-F packages/namespaces, dependencies are specified by the (using ...) metaform. Each package is conceptually a point where a library could be; if a library needs several files, each file is a package and there is a unifying file which depends on the other packages and specifies the interface.

I suppose my view of namespace-as-package is from the point of view that namespaces are the be-all and end-all of organizing libraries though. Hmm. I'll try a look-see at your pack.arc, although I think it would be nice if you could provide a few simple examples.

-----

1 point by stefano 5883 days ago | link

> dependencies are specified by the (using ...) metaform

Mmmm... I've been thinking about this. There seems to be some overlap between your system and mine. I think pack.arc is better suited to extending a pure namespace system with a packaging/dependency system, whereas the Arc-F namespace system is meant to manage dependencies as well.

To see real usage of pack.arc you can look at my ftp library: http://github.com/stefano/ftp-client/tree/master It's not updated to work with Arc-F (yet) but it shows how to use pack.arc: a proj.arc file for development and an automatically generated directory ftp-client.pack intended to be downloaded and copied into the search path by the end user.

-----

1 point by almkglor 5883 days ago | link

Yes. In fact I kind of rushed Arc-F a bit because I suspected that your pack.arc would overlap with the packages system in Arc-F, so I wanted to see what we could work out in order to reduce the overlap in this case.

Hmm. Project management? The Arc-F namespace system doesn't handle thinking in terms of "projects". And how about versioning? The "version" that is used in the Arc-F namespaces is more about the interface version, not necessarily the version of the actual library.

I was also thinking that a particular library may have multiple implementations which share the same interface. For example, there is a reference implementation for vectors in lib/vector.arc, and (using <vector>v1) will acquire that vector interface. However, a different implementation of Arc-F - for example, arc2c - might provide a different version - in the case of arc2c, it might be a thin wrapper around a C array.

-----

1 point by stefano 5883 days ago | link

> how about versioning?

I will probably add it to pack.arc in the future. For the moment it wouldn't be very useful, since there are so few libraries and the ones that do exist are in early development stage. I'll put in pack.arc what I need now to help me develop and distribute libraries. Suggestions are always welcome of course.

The main overlap between the two systems seems to be the fact that 'using tries to load a file before importing its interface. I don't think this will create any conflict with pack.arc. I should also change its name: the term "package" is used both for "collection of files" (pack.arc) and "namespace" (Arc-F).

Your example about arc2c raises a problem that Arc, in its original conception, wanted to solve: multiple implementations. Small differences between implementations always end up hurting: for example ftp-client works with Anarki and doesn't work with Arc2 or Arc-F. There are too many implementations of Arc right now. I have nothing against arc-f, snap, arc2c, rainbow,... I like their existence and what they added (in particular arc-f), but a canonical implementation used by 99% of Arc developers should exist, and it should be "fast enough" (2x slower than python is my personal limit).

-----

1 point by almkglor 5882 days ago | link

How about project-based development then? Basically to keep related files together.

Personally I prefer a variety of smaller libraries whose components would then be composed by other libraries (which would end up being small too, because the functionality exists in other libraries).

My main design goal in Arc-F is to make the use of libraries - and in particular, the use of different libraries from different people with different design philosophies - as smooth as possible. Many of the additions in Arc-F (the ones that aren't packages) are actually subtly biased towards that main goal.

> (2x slower than python is my personal limit).

Hehehe. Looks like I'll need to start doing some teh lutimate leet hackage in the function dispatching code... Or alternatively start considering how to write an interpreter from scratch (which is a subgoal of SNAP, too) ^^

-----

1 point by stefano 5882 days ago | link

> How about project-based development then?

I don't quite understand that. With pack.arc development is project-based. Maybe we have different opinions of what a "project" is. To me, it is a directory with a file proj.arc describing the structure of the project (a 'defproject declaration).

> write an interpreter from scratch

Really difficult but really needed. The mzscheme dependency is quite big compared to how small Arc is as a language. The main efficiency problem, as you said, is function dispatch, because we have to check if it is a function, a list, etc. One thing I don't like very much about SNAP is the dependency on the boost libraries: it is a huge dependency. Is it really needed?

Another problem with an interpreter from scratch is the GC: it is very difficult and time-consuming to write an efficient, concurrent and stable GC. A good solution would be to use the Boehm-Weiser GC: it is easy to integrate in any interpreter (I don't know if it works with SNAP's process model, though) and it is a really good GC. Even the mono project and gcj use it.

-----

2 points by almkglor 5881 days ago | link

> With pack.arc development is project-based.

Ah, right. Of course, that's why there's 'defproject, right?

> because we have to check if it is a function, a list, etc

As an idea: generally writes to global variables are much rarer than reads from global variables; in fact, practically speaking nearly every global variable is going to be a constant. We could move the cost of checking if a call is a function, a fake arc-f function, or a data structure to the writing of global variables rather than the read.

Basically calls where the expression in function position is a reference to a global variable are transformed to callsites which monitor that global. The callsite initially determines the type of the value in the global (or creates an error-throwing lambda if the global is still unbound) and determines the proper function to perform for that call (normal function call, or a list lookup, or a table lookup, etc). The callsite also registers its presence to the global.

If the global is written, the global also notifies all living callsites (we thus need weak references for this), which will then update themselves with the new value.
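As a rough illustration of the callsite idea described above, here is a small Python sketch. It is not Arc-F or SNAP code: the Global and CallSite classes, their method names, and the choice of Python's weakref.WeakSet for the weak references are all assumptions made for the example.

```python
import weakref


class Global:
    """A global variable cell that knows which call sites depend on it."""

    _UNBOUND = object()

    def __init__(self, name):
        self.name = name
        self.value = Global._UNBOUND
        # Weak references: a call site that dies simply drops out of the set.
        self._callsites = weakref.WeakSet()

    def register(self, callsite):
        self._callsites.add(callsite)

    def write(self, value):
        """The (rare) write pays the cost: every living call site is notified."""
        self.value = value
        for site in list(self._callsites):
            site.refresh()


class CallSite:
    """Caches the dispatch decision for calls whose function position is one global."""

    def __init__(self, glob):
        self.glob = glob
        glob.register(self)
        self.refresh()

    def refresh(self):
        value = self.glob.value
        name = self.glob.name
        if value is Global._UNBOUND:
            def unbound(*args):
                raise NameError("unbound global: " + name)
            self.invoke = unbound
        elif callable(value):
            self.invoke = value              # normal function call
        elif isinstance(value, dict):
            self.invoke = value.get          # table lookup, None when the key is missing
        elif isinstance(value, list):
            self.invoke = value.__getitem__  # list lookup by index
        else:
            def not_callable(*args):
                raise TypeError("cannot call value of global: " + name)
            self.invoke = not_callable

    def __call__(self, *args):
        # The (frequent) read path does no type checks at all.
        return self.invoke(*args)
```

The type test runs once per write to the global instead of once per call, which is the trade described above.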

This is actually "for-free" in SNAP, because there's an overhead in reading globals (copying from the global memory-space to the process memory-space), and SNAP thus needs to monitor writes to globals so it can cache reads.

> One thing I don't like very much about SNAP is the dependency on the boost libraries: it is a huge dependency. Is it really needed?

The bits of boost I've used so far are mostly the really, really good smart pointers; while I've built toy smart pointer classes I'm not sure I'd want those toys in a serious project. Also, I intend to use boost for portable mutexes. Now if only boost had decent portable asynchronous I/O...

Alternatively we could wait a bit for C++0x, which will have decent smart pointers which I believe are based on boost.

> Boehm-Weiser GC: it is easy to integrate in any interpreter (I don't know if it works with SNAP's process model, though)

Well, one advantage of the process-local model is that process-local memory allocations won't get any additional overhead when the interpreter is multithreaded; AFAIK any malloc() drop-in replacement either needs to be protected by locks in a multithreaded environment, or will do some sort of internal synchronization anyway. In effect we have one memory pool per process, allocating large amounts of memory from the system and splitting it up according to the needs of the process.

Since processes aren't supposed to refer to other processes' memory, the Boehm-Weiser GC won't have anything to actually trace across allocated memory areas anyway.

And I probably should start using tagged pointers instead of ordinary pointers ^^. They're even implementable as a C++ class wrapping a union.

In any case a copying algorithm already exists because we need to copy messages across processes anyway: minor changes are necessary to extend it to a copying GC.
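To show why a message copier is close to a copying GC, here is a toy Python sketch (not SNAP's code). The heap model is an assumption for illustration: a heap is a list of cells, a cell is a list of fields, and a field is either an immediate value or a Ref holding a cell index. Copying everything reachable from a message root gives message passing; copying everything reachable from a process's roots and discarding the old heap gives a Cheney-style collection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """A pointer into a heap: the index of a cell."""
    index: int


def copy_graph(from_heap, roots):
    """Copy every cell reachable from roots into a fresh heap.

    Returns (to_heap, new_roots). Shared cells and cycles are preserved
    through the forwarding table; unreachable cells are simply not copied.
    """
    to_heap = []
    forward = {}  # old cell index -> new cell index (forwarding pointers)

    def relocate(value):
        if not isinstance(value, Ref):
            return value  # immediates are copied as-is
        if value.index not in forward:
            forward[value.index] = len(to_heap)
            # Shallow copy: the fields still point into from_heap for now.
            to_heap.append(list(from_heap[value.index]))
        return Ref(forward[value.index])

    new_roots = [relocate(root) for root in roots]

    # Scan pointer: cells before it are fully fixed up, cells after it
    # have been copied but still hold old references.
    scan = 0
    while scan < len(to_heap):
        cell = to_heap[scan]
        for i, field in enumerate(cell):
            cell[i] = relocate(field)
        scan += 1
    return to_heap, new_roots
```

Called with a message as the only root, the result is a self-contained copy that can be handed to another process. Called with all of a process's roots, the returned heap is the compacted survivor space.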

-----