Agenda (Jérémie Dimino):

- today's main topic is extensibility
  + how do we make custom rules more first-class in the language?
  + can we support more complex systems like custom type generators (ctypes, atd-gen) and other languages/backends (bucklescript, ocsigen...)?
- first we'll see a first intro of the design
- and possibly some other design questions and priority discussions

Jbuilder architecture and design:

(Slides: slides-dimino-jbuilder-design.pdf (same basename))

Q (Anil): at the start of the day, does it parse the .opam file?
A: I'm talking about the local libraries/binaries here, in src/foo/* and bin/*

Q (Gabriel): when you say "rule", do you mean a template (.mli -> .cmi, ..) or an instance (foo.mli -> foo.cmi, bar.mli -> bar.cmi)?
A: the instances, it is a full graph of the build. Some things are dynamic, like ocamldep.

Dynamic dependencies work by declaring static dependencies, and then returning an action to perform and a list of (dynamic) dependencies.

We (Jérémie) have implemented a prototype with an intern; we have a working "include" command where the included rule file may itself be built as part of the build.

Q (Anil): Right now we have a static graph, but this becomes a static dataflow graph; what kind of analysis would we be able to do? Would it be possible to have something less powerful that restores some analysis capability?
A (Jérémie): we built some limitations into the framework: included files cannot declare new libraries and binaries. So we regain some analysis power by building some limitations into the mechanism.

Q (Jérémie): what is the kind of analysis we have in mind?
Q (Yaron): if I have to compile some code to run the analysis, is it ok?
A (Anil): for example, we can collect the .cmti in opam today in a single pass.

Jérémie: but then with -ppx you still need some preprocessing.

Rudi: it may be good to list all the things that we can list statically.

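
Sketch: the dynamic-dependency mechanism described above (declare static dependencies, then return an action plus a list of dynamic dependencies) could be modelled roughly as follows. All types and names here (`rule`, `produce`, `compile_rule`, the `.ml.d` file holding ocamldep output) are hypothetical illustrations, not jbuilder's actual API.

```ocaml
(* Hypothetical model of a rule with dynamic dependencies.
   None of these names come from jbuilder itself. *)

(* An action: a program and its arguments. *)
type action = Run of string * string list

type rule = {
  targets : string list;
  static_deps : string list;
  (* Called once the static dependencies exist; [read] returns the contents
     of a file. Returns the action and the dynamic dependencies. *)
  produce : read:(string -> string) -> action * string list;
}

(* Assumed example: foo.ml.d holds the ocamldep output for foo.ml as a
   space-separated list of module names. *)
let compile_rule name = {
  targets = [ name ^ ".cmo" ];
  static_deps = [ name ^ ".ml"; name ^ ".ml.d" ];
  produce = (fun ~read ->
    let dyn =
      read (name ^ ".ml.d")
      |> String.split_on_char ' '
      |> List.filter (fun s -> s <> "")
      |> List.map (fun m -> String.uncapitalize_ascii m ^ ".cmi")
    in
    (Run ("ocamlc", [ "-c"; name ^ ".ml" ]), dyn));
}

(* The full dependency set is only known after running [produce]. *)
let all_deps ~read r =
  let action, dyn = r.produce ~read in
  (action, r.static_deps @ dyn)
```

The point of the split is that `targets` and `static_deps` can be inspected without running anything, which is the part that stays available to static analysis.
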
Anil: it would be nice, if I had 100 packages, to be able to run all the configuration scripts at once. For example if I want to design a new kind of cross-compilation, we would have to go through all jbuild files to find which have rules that are affected. It would be really nice if we could have a central logical place to do configuration.

Jérémie: what kind of configuration?

Anil: ctypes is running C code on the fly, some packages need to call pkg-config.

Yaron: could we have a central place that collects this?

Anil: config rules
- manual command-line thing where you want to set or unset facts
- package managers that want to set things
- environment variables that packages consult
- quirks like "gmp is broken on MacOSX and this is what you need to do in that case"

Gabriel: for static analysis, it may be possible to have the build system generate a description of what it did, like .install files today, so that analysis can be done after-the-fact -- we could store this output for all packages in the opam repository or in a separate place.

Rudi: we want to avoid generating jbuild files. The Facebook people do that now for Infer, I think.

David: we should report to projects when they do things in unpalatable ways.

Rudi: but right now there is no other good way to do it.

Anil: another important thing is supporting cross-compilation. Currently none of the build systems do cross-compilation well; if we want jbuilder to become a standard we need good cross-compilation support.

Jérémie: I think we can do this with the notion of build contexts. We need some notion of cross-compilation built in to know about resolving binaries in the host context rather than the target context, but it's just a few days of work.

Anil: even things like ctypes rules that are probing the host system rather than the target system?

Jérémie: it needs to get its configuration dynamically.

Anil: in GNU-land you supply the m.h, s.h. We could do this with ctypes to eliminate the config step entirely.

Yaron: we need to be able to discover properties of the machine on which the artifacts are going to run.

Anil: I think we need central rules for configuration because this is really hard.

Gabriel: there are different kinds of extensibility. Letting Coq users use jbuilder, maybe that's too much; you could solve this by making the backend of jbuilder a library to let them do their own thing. But one kind of extensibility that I think should be present in the tool is being open to new OCaml tools (parser generators, ctypes-like stuff..) that don't exist yet, without making them feel like second-class citizens.

Jérémie: I think we can separate these two problems indeed. For ctypes I think it should be possible to just add a couple of rules; we could allow doing this within the existing jbuilder syntax.

Yaron: what about stability? If packages add new jbuilder stanzas, then when you update jbuilder they break. If they depend on jbuilder, it's on them to follow the changes to the API.

Jérémie: indeed, I don't want to commit to a stable jbuilder API.

Gabriel: in ocamlbuild we have the Menhir rules built into the main codebase; I regret this choice because François Pottier keeps adding new features to Menhir and I don't know how to adapt the rules or add new rules. It's better if the generator maintainer also distributes and maintains the builder rules.

Anil: we have a problem about compatibility right now.

Yaron: we have versioning in jbuilder files, but we don't use it.

Q (Yaron): by "plugin" you mean "extra rules that are distributed with a package"?
A: yes, except that this plugin needs to be usable from the same package, without staging install/use

Q (Anil): what does ctypes.stubs.exe do, does it just return rules?
A: yes.

Q (François): perhaps you could do stream communication with jbuilder, you keep watching stdin/stdout to keep adding stuff.

Q (Anil): why not just dynlink the library?
A: I'm worried about people peeking into the internals or misunderstanding the execution model -- if they start using global mutable state, it will often work, until it doesn't.

Q: can the .exe plugin look at the filesystem?
A: that's an issue, it needs to tell jbuilder in that case.

Q (Yaron): do we sandbox execution?
A: we have to sandbox, because some things don't work without sandboxing.

Q (Anil): what kind of sandboxing?
A: I mean making a copy of the build tree with only build-relevant files, in which building is performed. Not OS-level sandboxing.

Q (François): could the output of the .exe tell after-the-fact what it is looking at? The .exe may need to look at different files depending on the options it is passed.

Spiros: it depends on when things are run.

Yaron: maybe we could pass the list of files in the directory to the plugin, and just rerun/memoize when this changes.

Gabriel: I think we want to rely on the jbuilder API for that. That does not mean dynlink necessarily, you could instead provide an *alternative* implementation of `a Build.t` that collects the dependencies and includes them in the output s-exps in the right me. Me, the plugin author, I don't want to know about it.

Yaron: but we don't want to expose this API, we want to be able to keep changing it.

Anil: we kind of know what the first five plugins are, atd, ctypes (Menhir), they don't need to evaluate blobs.

François: all rules start like that, you look at the configuration of the rule and then you access files, etc. You propose to forbid the plugin from doing that, and I think that is an issue.

NOTE: it is hard to decide for rule generators how to publish their dependencies.

François: if you just add globbing in the s-exp syntax, you need to add foreach, conditionals; you are designing a new language to avoid compatibility problems with an existing language.

Jérémie: let's build an example.
```
(latex ((main blah) ...))
->
foo.tex bar.tex blah.tex
->
(rule ((targets (foo.pdf)))
(rule ((targets (bar.pdf)))
...
```

Yeah, I suppose for this if you try to fit it in just the DSL it is going to be painful. For this it is likely that whatever we do is going to be hard.

Note (Gabriel): have a look again at
A Sound and Optimal Incremental Build System with Dynamic Dependencies
Sebastian Erdweg, Moritz Lichter, and Manuel Weiel. In OOPSLA 2015
http://erdweg.org/publications/pluto-incremental-build.pdf

Jérémie: do we want the default plugin to do the most complicated/expressive things?

Yaron: maybe one way to go at this is to sit down and look at several examples of what rules you would like to write, what they should look like, and then Jérémie could look at this and tell us whether the current system can do this, and how it should be extended.

Anil: ideally we would default to the least expressive system that fits the current needs.

Rudi: jbuilder will be extensible in more than one way eventually and that's not a bad thing. Like `make`!

Yaron: we want an example not only of what the rules output by the plugin look like, but also what code you write to get these rules.

When you use a ppx in your library, jbuilder does caching at a global level. If you ask for foo.ppx, it implicitly looks for ./auto/ppx/foo/ppx.exe, builds it or reuses it if it exists. I think this global cache could be made extensible, using ./auto///, where determines the rules and determines the arguments to the rule. Ppx is one use-case. Another is js_of_ocaml: for things that are already installed in the global environment it compiles them to .js on the fly, and you can cache these outputs here.

François: what is the difference from defining a js_of_ocaml rule where you predefine all the paths?
A: You generate those rules on-demand, and you get caching.

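
Sketch: the "build it or reuse it if it exists" behaviour of the global cache can be pictured as a memo table keyed by the kind of rule and its argument. This is only an illustration under assumptions: the function name `request`, the in-memory table, and the generalised path layout (modelled on the `./auto/ppx/foo/ppx.exe` example above) are hypothetical, not how jbuilder implements it.

```ocaml
(* Hypothetical sketch of an on-demand, cached rule family.
   [builds] counts real builds so that cache hits are observable. *)
let builds = ref 0

(* Maps (kind, arg) to the path of the artifact built for it. *)
let cache : (string * string, string) Hashtbl.t = Hashtbl.create 16

(* [kind] selects the rule family (e.g. "ppx"); [arg] is its argument.
   The path layout below is an assumption made for illustration. *)
let request ~kind ~arg =
  match Hashtbl.find_opt cache (kind, arg) with
  | Some path -> path (* already built: reuse it *)
  | None ->
    incr builds; (* stands in for generating and running the rule *)
    let path = String.concat "/" [ "auto"; kind; arg; kind ^ ".exe" ] in
    Hashtbl.add cache (kind, arg) path;
    path
```

Asking twice for the same `(kind, arg)` pair triggers a single build; a different argument gets its own entry.
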
mk-jbuilder:
- make it easy to create custom jbuilder executables
- full access to internal APIs

Gabriel: I think we should go further than just mk-jbuilder, and make the underlying blocks that are independent and reusable available as separate libraries. The portable shell part, the build backend, I think those should be available. You may think it is counter-productive to let people build new build systems, but I think it is important: they can bring new ideas in.

Anil: mk-jbuilder, can't we do this with plugin support, and not lose composability?

Yaron: I think it's about letting people write their own rule universe if they are inside monoliths: Facebook, Jane Street internally.

Yaron: I think the plugin stuff feels more urgent than mk-jbuilder.

Rudi: there are only 25 jbuilder files (out of 1000) that use the OCaml syntax today.

Anil: in RWO, we have 20 chapters; we just moved to use the OCaml syntax, we need to collate rules from the 20 chapters.

Jérémie: some people say "for ocamlbuild I never took the time to learn the internal API". I think this goes in the direction of adding this very simple language. It is more important than having a powerful but completely different language.

Gabriel: who is going to be writing plugins?

Jérémie: two classes, the people that want to get their things done, and the power users.

Yaron: I've written a ton of Jenga but almost never contributed rules. But for jbuilder I would write plugins. Pandoc; a dynamic set of things I want. Maybe we can cut off those cases by having them more directly supported (Rudi: patterns); in the current world we want plugins.

Anil: with go and cargo, out of the box, the customization is incredibly good. Out of the box you can build static-linked binaries, you can cross-compile to incredibly different architectures, Windows executables.

Yaron: does the plugin stuff as described now compromise cross-compilation?

Anil: I feel it makes the project move slower / harder to move.

Jérémie: I can try to get started with cross-compilation first.

François Bobot is giving a small talk about the Frama-C build system and porting from Makefile to jbuilder.

Slides: slides-bobot-frama-C.pdf (same basename)

François: opam-installer doesn't have the right defaults, I think. If you just use `opam-install` you can't install inside Debian.

Jérémie (headache): we do something like this inside Jane Street. If you use -ppx, you can just have a ppx plugin to check that; otherwise we allow interposing an executable on each source file.

There are many instances where people want to decide which of the several implementations of a module interface to use at build time. This is an important feature missing from jbuilder right now.

---

Gabriel: what about "changing the name of the project"?

...

Yaron: I will ask our lawyers to know whether there is a trademark issue. If there is, we will change the name, otherwise not.

(Jérémie seems like he would actually mind changing the name, or at least he does not like jbuilder.)

(Gabriel: personally I think `obuilder` is fine)

"`obi` is not terrible"

---

Jérémie: summarize the discussion of this morning
- I think "include" can solve a lot of needs
- the plugin.exe mechanism is popular, but we don't know what's a good way to track dependencies and it requires more work
- Gabriel keeps mentioning the idea of reusing the build API to describe rich plugins, because it's a very nice API.

---

Jérémie talks about the architecture of the tool.

jbuilder is inspired by jenga. The opam files are automatically generated from the jbuild files. I took some of the dynamism out of the jenga design (which is optimized for polling), so that we have enough static information to generate opam files.

The `('a, 'b) Build.t` type is an arrow, which is a more static effect/transformation type than a monad, where you cannot depend on the value of previous results, a bit like applicative functors.
Out of this arrow you can get a list of targets and dependencies for the action.

Yaron: why `('a, string)` instead of `(unit, string)` for Build.contents?

A rule is a triple (dependencies, action, targets). An Action.t in jbuilder is a small DSL for shell actions.

Rudi: what about dynamic dependencies, how are they represented?

Jérémie: a rule could have the type `(unit, Action.t * Targets.t) arr`, but we write it `(unit, Action.t) Build.t`; the Build.t accumulates those targets in state-passing style.

When you call `Build.path : Path.t -> ('a, 'a) t`, the dependency is implicitly added to the dependency set. But when you have to be more dynamic, you can use `dyn_paths : ('a, Path.t list) t -> ('a, 'a) t`.

Gabriel: I'm a bit confused by the type of `val path : Path.t -> ('a, 'a) t`.

`path` does not return anything useful; it is used as a way to record dependencies on the side when you know that the command (the action you are returning) has implicit dependencies that don't come from you. For example the compiler will read `.cmi` files when you call it on a `.ml` to produce a `.cmo`.

Rudi: the vspec/vstore part was a bit confusing to me as I started.

Jérémie: so the idea is to represent (on the build-system side) a file which is used to store a value. It will be cached and not re-looked-up if you ask for it several times (from the `Vspec.t` value).

Contexts: a context is stored in _build/, and in the code we carry extra information about contexts, such as the set of rules that are active there.

Rudi: what is a scope?

Jérémie: if you have a package, it defines a scope. If you have a package within a package, you have two scopes. These scopes are important. When you have a library that doesn't have a public name, you have no guarantee that the module names don't clash. With scopes you can deal with libraries with clashing names; only the public names must be unique.
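
Sketch: to make the `path` / `dyn_paths` discussion above concrete, here is a small self-contained model of an arrow whose static dependencies can be read off without running it. It is a simplification under assumptions: the constructors, `static_deps`, `exec`, and the use of plain strings for paths and actions are hypothetical and are not jbuilder's real `Build` module.

```ocaml
(* Hypothetical, simplified arrow in the spirit of ('a, 'b) Build.t. *)
type path = string

type ('a, 'b) t =
  | Arr : ('a -> 'b) -> ('a, 'b) t
  | Compose : ('a, 'b) t * ('b, 'c) t -> ('a, 'c) t
  | Path : path -> ('a, 'a) t                    (* static dependency *)
  | Dyn_paths : ('a, path list) t -> ('a, 'a) t  (* dependencies computed at run time *)

let arr f = Arr f
let ( >>> ) l r = Compose (l, r)
let path p = Path p
let dyn_paths t = Dyn_paths t

(* Static dependencies are collected by inspecting the structure only. *)
let rec static_deps : type a b. (a, b) t -> path list = function
  | Arr _ -> []
  | Compose (l, r) -> static_deps l @ static_deps r
  | Path p -> [ p ]
  | Dyn_paths t -> static_deps t

(* Running the arrow yields its result plus the dynamic dependencies. *)
let rec exec : type a b. (a, b) t -> a -> b * path list = fun t x ->
  match t with
  | Arr f -> (f x, [])
  | Compose (l, r) ->
    let y, d1 = exec l x in
    let z, d2 = exec r y in
    (z, d1 @ d2)
  | Path _ -> (x, []) (* passes its input through unchanged *)
  | Dyn_paths t ->
    let paths, d = exec t x in
    (x, d @ paths)

(* A rule producing an action (here just a string) from unit. *)
let rule : (unit, string) t =
  path "foo.ml"
  >>> dyn_paths (arr (fun () -> [ "bar.cmi" ]))
  >>> arr (fun () -> "ocamlc -c foo.ml")
```

In this model `path` has type `path -> ('a, 'a) t` because it only records a dependency and hands its input on untouched, which matches the explanation given to Gabriel's question.
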
A library that is not in the public interface will never be installed, but it may be used as part of the build.

Gabriel: so right now the contexts are completely independent from each other, but the idea for cross-compilation is to have a pair of contexts that are linked: you'd build binaries used as part of the build in the host context, and the final build artifacts in the target context.

François: what's the difference between context and super_context?

Jérémie: the super_context is the context plus the internal state: the rule database, the names, the source tree...

Rudi: aliases are a bit weird, can you talk about them?

Jérémie: aliases are sort of the equivalent of phony targets in make. They are commands that don't produce anything, but you want to run them for the side effects. For example `@install` builds all the files that you want to install. It's called `alias` because it is mostly used to give a name to a bundle of files.

Gabriel: ok, that sounds like the stamps / itarget+otarget in ocamlbuild.

Jérémie: recursive aliases are defined in any subdirectory/subtree, but there is no way for users to define those.

Yaron: if all aliases were recursive by default, users could indicate their intent by explicitly indicating the subtree in which to build the alias.

Jérémie: for example in my version of Camomile that uses jbuilder, I have aliases to describe all the .mar dependencies. Aliases are scoped not only within a jbuild file, but also across the other jbuild files of your project.

Discussions on scoping of recursive vs. non-recursive aliases.

Jérémie: you can already explicitly use a path to refer to an alias from another file. If it is recursive, it means not just this jbuild file, but also all its children in the subtree. If you have `alias foo` at the root that is recursive, and you clone a project in a subdirectory and it also has `alias foo`, suddenly the name of the source `foo` alias changed.

François: maybe "alias file", because non-recursive aliases behave like a file of the same name.

Gabriel: this is a delicate discussion, maybe we should stay aware that this is an issue and move on.

Jérémie: this was the backend. Regarding the frontend: jbuild.ml parses jbuild files, it is independent from the context. main.ml has the "main logic" that does the local setup, interprets the command and continues.

description of Main.setup:

Jbuild.load gives you:
- all the jbuild files around (the OCaml ones are not yet evaluated)
- all the external packages, etc.

Then we read the context to set up all workspace contexts.

Then we call Gen_rules to generate all the rules in each context. That's when we evaluate the jbuild files in OCaml syntax.

François: Gen_rules iterates from the leaves to the root, so that you can refer to a file that is inside a leaf but not the contrary.

Jérémie: with dynamic rule generation we won't need this property anymore, as the rule arrow will keep the dependencies in order and detect cycles etc.

description of Main.build:
- calling Main.setup
- resolving targets; there is a bit of logic, if you ask for foo.exe it actually builds _build/default/foo.exe
- then just do_build which calls the actual builder

For things that are part of the workspace, jbuilder creates its own internal database. For things that are outside/external, jbuilder reads META files like ocamlfind. We have our own implementation of findlib.

Spiros: the bootstrap function in main, does it have to be called before everything?

Jérémie: that's only used to build jbuilder itself.

Bootstrap: I (Jérémie) vendor some libraries that I use because I cannot have dependencies. But these dependencies are built with jbuild files themselves, and when I want to build them I don't have jbuilder yet. In bootstrap.ml there is a micro-build-system that knows about the vendored files. Basically it collates all the jbuilder source files into one big OCaml file, we build it and get a .exe.

Gabriel: one thing I wanted to do in ocamlbuild for a long time is, on successful build, generate a log that a very simple interpreter can replay. jbuilder has this. Can't you use this for the bootstrap?

Jérémie: but in the log, the commands are specialized to your system. The ocamlopt.exe path contains my home path, other people can't reuse it. The file extensions as well: .dll vs .so, etc. It is very difficult to get something portable out of this process.

François: about the bootstrap, sometimes it fails in the wrong way; you have to remember that if you see a weird error you should just remove boot.exe.

Jérémie: for me this log/makefile generation would be interesting if we wanted to use jbuilder in the compiler. Well another thing is, if you change the rules, you can diff the two generated makefiles.

Spiros: why the dependencies?
- re: there is no glob
- opam: ..
- cmdliner: to get the nice cli (but in the bootstrap we don't use it)

Gabriel: you said you reimplemented findlib, was it just to avoid the extra dependency?

Jérémie: findlib is mainly done for one-shot invocation, and I wanted to get better error messages. Plus the META format is very well documented, it was easy and fun to reimplement it; I only did the minimal parts I needed.

Rudi: what about the ppx drivers?

Jérémie: when you have the `preprocess` command in a jbuild file, the rules for this are in super_context, pped_module. All the ppx you put in `preprocess` are libraries; jbuilder will collate all these libraries and link them together in a single executable that becomes a specialized driver.

François: but not all files have the same ppx extensions?

Jérémie: for every set of ppx, you generate a driver.

Rudi: couldn't you toggle which rewriters you are running from the driver?

Jérémie: to do that you would need to link all ppxes at once, it's not very efficient.

Yaron: so you require that all those ppx preprocessors use ppx_driver?
Jérémie: it's not ppx_driver, it's the reimplementation of the driver that was put in ocaml-migrate-parsetree.

Drup: but you need to put your ppx in driverified mode, so you are incompatible with legacy ppx extensions.

Jérémie: there are three motivations for requiring drivers:
- improving portability
- when you use ocaml-migrate-parsetree,

Jérémie: I think in practice that wasn't really a problem, people migrate their ppx.

Gabriel: is that an issue?

Drup: having to use the driver parts of ocaml-migrate-parsetree (OMP) forces you to use jbuilder, because it's very hard to use from oasis/ocamlbuild. Frédéric tried very hard to make the driver work with oasis, and it never worked.

Jérémie: I don't want to encourage people to use non-driverified ppx extensions. Let's agree to disagree and move on.

François: what is the future for opam-installer?

Jérémie: I suppose we could handle it on the jbuilder side indeed, that's an interesting thing to do.

Gabriel: two questions
- what should be changed to make jbuilder more maintainable?
- are you planning to de-hardcode some of the hardcoded things to use the plugin mechanism instead?

Jérémie: right now the way we list source files is a bit fragile. We could also split things in sub-libraries.

Jérémie: One thing you lose when you split libraries is that you break parallelism, because ocamldep is just per-library.

François: could you make the build of libraries more fine-grained to avoid that issue?

Jérémie: we looked at this for Jenga at some point, and I think it's quite complicated.

octachron: codep can compute dependencies, resolving namespaces correctly, and I think it would be possible to solve this problem.

Jérémie: let's keep in mind that this would be a lot of work for a maybe 1.1, 1.2 efficiency boost.

François: I think you don't gain much on the first build, but you gain on recompilation.

octachron: I think codep could do this.

Yaron: if you do this fine-grained computation on the whole thing, you may run into memory usage issues, as our Jane Street experience shows.

Gabriel: it may be possible to be explicit, for the user to say "these dependencies are in the same 'project', I develop in sync, please be fine-grained there".

Rudi: you could also go back to approximation if you see the graph becomes too large.

De-hardcoding?

Jérémie: our targets for de-hardcoding would be
- Menhir/ocamlyacc
- js_of_ocaml

Jérémie: for example Menhir is interesting, or even ocamlyacc. If we had this small DSL, it would be easier to describe what to do than what we do in the source.

Rudi: for proper Menhir support you need to interact with ocamldep in some way.

Jérémie: I think Menhir could work. For js_of_ocaml, I think we should think of it as cross-compilation, so I hope that it would work better when we have this.

For Merlin, I think we discussed this with Fred: it could parse a sort of log that just describes how each compilation unit was built.

Rudi: can we use something better than just a log?

François: I think there are formats used by clang that could be interesting to reuse, as they come with tooling.
https://clang.llvm.org/docs/JSONCompilationDatabase.html

---

other features, smaller priorities, things that Jérémie cares about

- alternative implementations (variants)

In many cases, people want to have libraries with several implementations. When they link the executable they specify which implementation they want. For example with js_of_ocaml you want your own implementation of Unix. Or the way bisect works.

Usually people do this with ocamlfind predicates. In Leo's namespace proposal you have a notion of variants that includes this. You would be able to define a library "virtual module" with only the .mli, and somewhere else you would provide the implementation .ml, with a variant name.

(Drup: that looks a bit like Backpack)

Rudi: what is wrong with predicates, why are they out of fashion?
Jérémie: the full power of predicates is hard to transcribe in jbuilder; with predicates you could put predicates on anything. In jbuilder I want things to be the same whether they are installed or not, so I would need to expose all this at the build level; it makes the thing fairly complicated. I think that `variants` correspond to a majority of real-world usage of predicates.

François: would variant names be global or local?

François: will we have the problem that the same variant may mean different things in two libraries? Maybe we could have, as with ppx, `ocaml.warning` and `warning`. Maybe we could have a way to define/scope them.

Gabriel: there are cases where it's clear that variants should be global, and cases where it's clear they have to be local (François' example of having a GUI or not).

Drup: variants are inside namespaces, right? And namespaces can take arbitrary paths. So we could use this.

Jérémie: variants are in a different namespace than namespaces.

Yaron: maybe these are two different features.

Jérémie: one thing is, we want to stay close to the simple thing that has a chance of entering the language directly.

Rudi: we can choose the longer variant name for now.

- multi_directories: currently François' copy hack is a bit cumbersome, but I think we could have a more natural import_dirs directive that would just work.

François: what do you import, only the .ml and .mli, or the .c as well? I would propose to include a glob.

Gabriel: with this (copy_files# foo/*.ml) solution, is the jbuild file in foo/ consulted to decide preprocessing, etc.?

Jérémie: no, no.

- Setting default release/dev flags.

jbuilder has a hardcoded list of flags. People want to change that. I would suggest having something like:

    (set_defaults ((flags (:standard -O3))))

and this changes the defaults for you and also all your subtrees, *for this scope*. (If you have a vendored package, not a sub-library, it's a different scope and isn't affected.)

You can also use include:

    ((flags (:include flags.sexp)))

Rudi: yeah but that's so much work, I only want to set one debug flag for a bit!

- Metaprogramming of jbuild files

Gabriel: is it easy to build a C++ sub-project using make/cmake?

Jérémie: we have an example of that with libre2 in
https://github.com/janestreet/re2/blob/master/src/re2_c/jbuild

François: in the other direction, I could add support for the make server protocol to benefit from parallelism when called from make.

Jérémie: but in fact supporting the server protocol is more difficult than that...

François: just something about the design of jbuild-ignore: at first I thought this was like .gitignore, "no don't look at this", and in fact jbuild-ignore still crawls the files but ignores jbuild files.

Jérémie: maybe we could have a syntax to say this in jbuild-ignore, something like !path for "don't ever look in there".