Algorithm for Processing Queries that Involve Boolean Columns for a Natural Language Interface to Databases
Abstract
In recent decades, the use of natural language interfaces to databases (NLIDBs) has increased exponentially; unfortunately, the complexity of natural language has limited their effectiveness. The presence of Boolean columns in databases increases the difficulty of translating natural language queries to SQL. A Boolean column is a column that can store only one of two possible values, such as true/false, yes/no, or 1/0. The problem in processing queries that involve Boolean columns is that the search value for these columns (true/false, yes/no, 1/0) is not explicit in the queries. As experimental tests show, this problem causes NLIDBs to generate erroneous translations. A survey of the literature on NLIDBs shows that this problem has not been identified, much less addressed. In this article, a new algorithm for processing queries that involve Boolean columns is presented. The algorithm uses syntactic and semantic information that facilitates detecting Boolean columns and their implicit values in a query. The experimental tests show that it is highly effective for translating this type of query.
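
To make the implicit-value problem concrete, the following minimal Python sketch shows how a query word can imply both a Boolean column and its search value even though the value never appears in the query. It is an illustration only, not the algorithm presented in this article: the lexicon, the column names (flight.nonstop, fare.refundable), the negation list, and the function name are all hypothetical assumptions.

```python
import re
from typing import List

# Hypothetical lexicon: maps a descriptive word to a Boolean column.
# Both the words and the column names are assumptions for illustration.
BOOLEAN_LEXICON = {
    "nonstop": "flight.nonstop",
    "refundable": "fare.refundable",
}

# Words that flip the implied value when they directly precede a lexicon word.
NEGATIONS = {"not", "non", "no"}


def implied_boolean_conditions(query: str) -> List[str]:
    """Return SQL conditions for Boolean columns implied by the query."""
    tokens = re.findall(r"[a-z]+", query.lower())
    conditions = []
    for i, token in enumerate(tokens):
        column = BOOLEAN_LEXICON.get(token)
        if column is None:
            continue  # the word does not refer to a known Boolean column
        negated = i > 0 and tokens[i - 1] in NEGATIONS
        conditions.append(f"{column} = {'FALSE' if negated else 'TRUE'}")
    return conditions
```

Under these assumptions, the query "Show me the refundable fares" yields the condition fare.refundable = TRUE, while "Show me the non-refundable fares" yields fare.refundable = FALSE. In neither query does the user state the search value explicitly.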