Loughborough University
Leicestershire, UK
LE11 3TU
+44 (0)1509 263171

Loughborough University Institutional Repository

Please use this identifier to cite or link to this item: https://dspace.lboro.ac.uk/2134/26547

Title: Fast learning of restricted regular expressions and DTDs
Authors: Freydenberger, Dominik D.; Kotzing, Timo
Issue Date: 2015
Publisher: © Springer Science+Business Media New York
Citation: FREYDENBERGER, D.D. and KOTZING, T., 2015. Fast Learning of Restricted Regular Expressions and DTDs. Theory of Computing Systems, 57 (4), pp. 1114-1158.
Abstract: © 2014, Springer Science+Business Media New York. We study the problem of generalizing from a finite sample to a language taken from a predefined language class. The two language classes we consider are subsets of the regular languages and have significance in the specification of XML documents (the classes corresponding to so-called chain regular expressions, Chares, and to single-occurrence regular expressions, Sores). The previous literature gives a number of algorithms for generalizing to Sores providing a trade-off between quality of the solution and speed. Furthermore, a fast but non-optimal algorithm for generalizing to Chares is known. For each of the two language classes we give an efficient algorithm returning a minimal generalization from the given finite sample to an element of the fixed language class; such generalizations are called descriptive. In this sense of descriptivity, both our algorithms are optimal.
Description: The final publication is available at Springer via http://dx.doi.org/10.1007/s00224-014-9559-3
Version: Accepted for publication
DOI: 10.1007/s00224-014-9559-3
URI: https://dspace.lboro.ac.uk/2134/26547
Publisher Link: http://dx.doi.org/10.1007/s00224-014-9559-3
ISSN: 1432-4350
Appears in Collections: Published Articles (Computer Science)

Files associated with this item:

File                    Description       Size       Format
SubregularLearning.pdf  Accepted version  724.57 kB  Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.