Linguistically–driven Selection of Correct Arcs for Dependency Parsing

Felice Dell’Orletta, Giulia Venturi, Simonetta Montemagni

Abstract


LISCA is an unsupervised algorithm that assigns a
quality score to each arc produced by a dependency
parser, so that arcs can be ranked in decreasing order
of score, from most likely correct to most likely
incorrect. LISCA collects statistics about a set of
linguistically-motivated, dependency-based features
from a large corpus of automatically parsed sentences
and uses them to score each arc of a parsed sentence
belonging to the same domain as that corpus. LISCA
was tested on two datasets from two different domains;
in all experiments it outperformed the baselines
considered, showing that it can reliably detect correct
arcs, including those that reflect domain-specific
peculiarities.
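
The general idea of scoring arcs with corpus statistics can be illustrated with a minimal Python sketch. It is not the LISCA implementation: the abstract does not specify the features or how they are combined, so the `Arc` type, the three features in `arc_features`, the product of smoothed relative frequencies in `score_arc`, and the `smoothing` constant are all assumptions made for illustration only.

```python
from collections import Counter
from typing import NamedTuple


class Arc(NamedTuple):
    """Hypothetical arc representation (an assumption, not from the paper)."""
    head_pos: str   # part of speech of the head
    dep_pos: str    # part of speech of the dependent
    label: str      # dependency label
    distance: int   # dependent index minus head index (signed)


def arc_features(arc):
    """Illustrative feature set; NOT the features actually used by LISCA."""
    direction = "left" if arc.distance < 0 else "right"
    return [
        ("label_pos", arc.label, arc.head_pos, arc.dep_pos),
        ("direction", arc.label, direction),
        ("length", arc.label, min(abs(arc.distance), 5)),
    ]


def collect_statistics(parsed_corpus):
    """Count feature occurrences over an automatically parsed corpus."""
    counts = Counter()   # occurrences of each feature value
    totals = Counter()   # occurrences of each feature type
    for sentence in parsed_corpus:
        for arc in sentence:
            for feat in arc_features(arc):
                counts[feat] += 1
                totals[feat[0]] += 1
    return counts, totals


def score_arc(arc, counts, totals, smoothing=1e-6):
    """Assumed scoring rule: product of smoothed relative frequencies."""
    score = 1.0
    for feat in arc_features(arc):
        total = totals[feat[0]]
        rel_freq = counts[feat] / total if total else 0.0
        score *= rel_freq + smoothing
    return score


def rank_arcs(sentence, counts, totals):
    """Return the arcs of a sentence sorted by decreasing score."""
    return sorted(
        sentence,
        key=lambda arc: score_arc(arc, counts, totals),
        reverse=True,
    )
```

In this sketch, the statistics would be collected once from a large parsed corpus of the target domain and then reused to rank the arcs of each new parsed sentence from that domain, with arcs that match frequent patterns placed at the top of the ranking.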

Keywords


Dependency parsing, correct arcs.
