The Zen Computational Linguistics Toolkit

Version 4.1

Welcome to the Zen site. It presents a computational linguistics toolkit developed by Gérard Huet at the INRIA Paris-Rocquencourt center.

Zen is implemented in Pidgin ML, which is a core subset of the Objective Caml programming language under the so-called revised syntax.

The Zen toolkit has been used to implement the Sanskrit Heritage Engine, a set of Web services for Sanskrit computational linguistics that is under development.

An article describing its application to the analysis of Sanskrit euphony (sandhi) is available in PDF format.

Sylvain Pogodalla and Nicolas Barth have applied this toolkit to the morphological analysis of French verbs (300 000 inflected forms for 6500 verbs), based on the Bescherelle data. It has also served as inspiration for morphological computations in the Grammatical Framework (GF) project.

Documentation is available in literate programming style as a PDF document. Background articles for using the toolkit include an article on its use for Sanskrit tagging, an article describing the mixed automata Aum technology, and an article on modular transducers.

A compressed tar file is available. Under Unix/Linux/MacOSX, untarring this file produces a directory ZEN_4.1, whose README file provides installation information. This version works with OCaml 4.02.2 or later. Enjoy!
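Before installing, it may help to confirm that the local compiler meets the version requirement stated above. The following is a minimal sketch, not part of the Zen distribution: the function names parse_version and is_recent_enough are illustrative assumptions, and it relies only on the standard library values Sys.ocaml_version and Scanf.sscanf.

```ocaml
(* Hypothetical helper, not part of Zen: read the leading
   "major.minor.patch" numbers of a version string such as "4.02.2".
   Any suffix after the patch number is ignored. *)
let parse_version s =
  Scanf.sscanf s "%d.%d.%d" (fun major minor patch -> (major, minor, patch))

(* The minimum version stated on this page: OCaml 4.02.2. *)
let required = (4, 2, 2)

(* Tuples compare lexicographically, so this orders versions as expected. *)
let is_recent_enough version_string =
  compare (parse_version version_string) required >= 0

let () =
  if is_recent_enough Sys.ocaml_version then
    Printf.printf "OCaml %s meets the 4.02.2 requirement\n" Sys.ocaml_version
  else
    Printf.printf "OCaml %s is older than 4.02.2\n" Sys.ocaml_version
```

Saved under a name of your choice (for example check_version.ml, a hypothetical file name), the sketch can be run with the ocaml toplevel. The README in ZEN_4.1 remains the authoritative source for installation steps.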

This library, copyright INRIA 2002-2015, is distributed as open source software under the LGPL license.



Gerard.Huet@inria.fr
Last update: September 7th, 2015