Please note that this newsitem has been archived, and may contain outdated information or links.
27 February 2007, HAI-Tea, Pieter Adriaans
In this lecture I will present some recent work I did with Paul Vitanyi and Ceriel Jacobs on the application of the MDL (Minimum Description Length) principle to grammar induction. We have studied MDL in terms of two-part code optimization and randomness deficiency; these notions will be explained in the lecture. In this framework we showed that: 1) shorter code does not necessarily lead to better theories, i.e. the randomness deficiency does not decrease monotonically with the MDL code length; 2) contrary to what is suggested by the results of Gold (1967), there is no fundamental difference between positive and negative data from an MDL perspective; 3) MDL is extremely sensitive to the correct calculation of code length. Using these ideas we have implemented an MDL variant of the EDSM algorithm. The results show that although MDL works well as a global optimization criterion, it falls short of the performance of algorithms that evaluate local features of the problem space. MDL can thus be described as a global strategy for featureless learning.
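To make the two-part code idea concrete, here is a minimal, illustrative sketch in Python (not code from the paper; the toy coin-flip hypotheses and all names are assumptions chosen for illustration). It scores a hypothesis H on data D by L(H) + L(D|H): the bits needed to describe the model itself plus the bits needed to encode the data given the model.

    import math

    def two_part_mdl_score(model_bits, data, model_prob):
        """Two-part MDL code length: L(H) + L(D|H).

        model_bits : bits needed to describe the hypothesis H itself
        model_prob : function giving P(x | H) for each data item x
        (illustrative names, not from the paper)
        """
        # L(D|H): ideal code length of the data under the model
        data_bits = sum(-math.log2(model_prob(x)) for x in data)
        return model_bits + data_bits

    # Toy sample: 100 coin flips, 70 heads and 30 tails.
    data = ['H'] * 70 + ['T'] * 30

    # H1: fair coin -- very short to describe, fits the skewed data poorly.
    h1 = (2.0, lambda x: 0.5)
    # H2: biased coin with P(H)=0.7 -- longer description, better fit.
    h2 = (10.0, lambda x: 0.7 if x == 'H' else 0.3)

    for name, (bits, prob) in [('fair', h1), ('biased', h2)]:
        print(name, round(two_part_mdl_score(bits, data, prob), 2))

On this toy sample the biased hypothesis wins on total code length despite its longer model description; note, however, that as point 1) above shows, the hypothesis with the shorter total code is not guaranteed to be the better theory.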
For more information, see
http://homepages.cwi.nl/~paulv/papers/perils.pdf and
http://staff.science.uva.nl/~pietera/ALS/background/lncs_icgi-mdl.pdf.
For more information on HAI-Tea lectures, see
http://www.science.uva.nl/onderwijs/studieprogramma/haitea/.