Feature-Based Grammar for Command Recognition

This project implements a rule-based natural language command recognizer using a feature-based context-free grammar (FCFG) in Python with the NLTK library.

The goal is to parse simple Spanish commands directed to a robot and extract their semantic representation, which can then be used for execution.

Project Overview

The system takes textual commands in Spanish as input and produces:

A syntactic parse tree
A semantic structure describing the intended action

The core of the project is the design and implementation of the grammar gramatica_base.fcfg.

Syntactic and Semantic Representation

The grammar was designed after manually analyzing a set of written commands from a training text. These commands were first analyzed syntactically, and the resulting structures were then enriched with semantic information encoded as feature attributes. This analysis made it possible to define a feature structure capable of representing actions and their parameters.

The resulting feature structure is shown in the following figure:

[Figure: feature structure representing actions and their parameters]

Technical Implementation

The syntactic and semantic representations were implemented as a set of feature-based grammar rules that can be loaded with NLTK’s load_parser function. These rules combine syntactic structure with semantic attributes through feature unification, allowing the parser to build a structured representation of each command.

Each rule propagates and constrains features such as ACCION and DISTANCIA so that, when combined, they yield a semantic interpretation of the input sentence. The full set of these rules constitutes the grammar.
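To make the idea of unification concrete, the following is a minimal sketch that uses plain Python dictionaries in place of NLTK feature structures. It is a simplification for illustration only: NLTK's own unification also handles variables such as ?d and shared (reentrant) values, which this sketch ignores. The function name unify and the verb 'girar' are assumptions introduced for the example and do not come from the project.

```python
from typing import Any, Dict, Optional

Features = Dict[str, Any]


def unify(a: Features, b: Features) -> Optional[Features]:
    """Merge two feature dictionaries; return None if any feature clashes."""
    result = dict(a)
    for key, b_val in b.items():
        if key not in result:
            # Feature only present in b: simply add it.
            result[key] = b_val
            continue
        a_val = result[key]
        if isinstance(a_val, dict) and isinstance(b_val, dict):
            # Nested feature structures are unified recursively.
            merged = unify(a_val, b_val)
            if merged is None:
                return None
            result[key] = merged
        elif a_val != b_val:
            # Two different atomic values for the same feature: failure.
            return None
    return result


verb = {"ACCION": "avanzar"}
distance = {"DISTANCIA": {"CANTIDAD": "diez", "UNIDAD": "metro"}}

print(unify(verb, distance))
# {'ACCION': 'avanzar', 'DISTANCIA': {'CANTIDAD': 'diez', 'UNIDAD': 'metro'}}

# 'girar' is a hypothetical second verb, used only to show a clash.
print(unify({"ACCION": "avanzar"}, {"ACCION": "girar"}))
# None
```

Compatible features are merged into one larger structure, while conflicting values make the whole unification fail, which is how a rule can both propagate and constrain features.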

Example

The following input:

avanzar diez metros

is parsed into the semantic representation:

SEM = [ACCION='avanzar',
       DISTANCIA=[CANTIDAD='diez', UNIDAD='metro'],
       VELOCIDAD=?v]

This interpretation is produced by combining grammar rules such as:

A sentence rule mapping the verb phrase to the global semantic structure:

S[SEM=[ACCION=?a, DISTANCIA=?d, VELOCIDAD=?v]] ->
    SV[ACCION=?a, DISTANCIA=?d, VELOCIDAD=?v]

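A verb phrase rule combining the verb and the distance phrase, and packaging the quantity and unit of the noun phrase into the DISTANCIA feature:

SV[ACCION='avanzar', DISTANCIA=[CANTIDAD=?c, UNIDAD=?u], VELOCIDAD=?v] ->
    V[ACCION='avanzar'] SN[OBJETO='distancia', CANTIDAD=?c, UNIDAD=?u]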

A noun phrase rule for distance expressions:

SN[CANTIDAD=?c, UNIDAD=?u, TIPO='distancia'] ->
    Det[CANTIDAD=?c] N[TIPO='distancia', UNIDAD=?u]

Lexical entries such as:

V[ACCION='avanzar'] -> 'avanzar'
Det[CANTIDAD='diez'] -> 'diez'
N[TIPO='distancia', UNIDAD='metro'] -> 'metros'
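The rules and lexical entries listed here can be tried out directly with NLTK. The sketch below assembles them into a small inline grammar instead of loading gramatica_base.fcfg, so it is a reduced stand-in and not the project's full grammar. Two assumptions are made: each rule is written on a single line, and the verb phrase rule spells out DISTANCIA as [CANTIDAD=?c, UNIDAD=?u] so that its value is bound by the noun phrase. The names GRAMMAR_TEXT and parse_command are introduced only for this example.

```python
from nltk.grammar import FeatureGrammar
from nltk.parse import FeatureChartParser

# Reduced grammar built from the example rules; not the full gramatica_base.fcfg.
GRAMMAR_TEXT = """
% start S
S[SEM=[ACCION=?a, DISTANCIA=?d, VELOCIDAD=?v]] -> SV[ACCION=?a, DISTANCIA=?d, VELOCIDAD=?v]
SV[ACCION='avanzar', DISTANCIA=[CANTIDAD=?c, UNIDAD=?u], VELOCIDAD=?v] -> V[ACCION='avanzar'] SN[OBJETO='distancia', CANTIDAD=?c, UNIDAD=?u]
SN[CANTIDAD=?c, UNIDAD=?u, TIPO='distancia'] -> Det[CANTIDAD=?c] N[TIPO='distancia', UNIDAD=?u]
V[ACCION='avanzar'] -> 'avanzar'
Det[CANTIDAD='diez'] -> 'diez'
N[TIPO='distancia', UNIDAD='metro'] -> 'metros'
"""


def parse_command(sentence):
    """Return all parse trees for a whitespace-tokenized command."""
    grammar = FeatureGrammar.fromstring(GRAMMAR_TEXT)
    parser = FeatureChartParser(grammar)
    return list(parser.parse(sentence.split()))


trees = parse_command("avanzar diez metros")
sem = trees[0].label()["SEM"]  # semantic feature structure on the S node
print(trees[0])
print(sem["ACCION"], sem["DISTANCIA"]["CANTIDAD"], sem["DISTANCIA"]["UNIDAD"])
```

The SEM value is read from the label of the root node, which is where the sentence rule collects the features passed up from the verb phrase.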

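Although the grammar rules are written in top-down form, the resulting parse can be understood bottom-up. The parser first matches the input words with lexical entries and then combines them into progressively larger constituents through feature unification. The steps below show the categories as they appear in the parser output, which carry additional features (such as COMP and OBJETO) and an intermediate Nominal constituent that the simplified rules above omit: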

1. Lexical matching

avanzar -> V[ACCION='avanzar', COMP='si']
diez    -> Det[CANTIDAD='diez']
metros  -> N[TIPO='distancia', UNIDAD='metro']

2. Building the nominal constituent

N[TIPO='distancia', UNIDAD='metro']
→ Nominal[OBJETO='distancia', ..., TIPO='distancia', UNIDAD='metro']

Det[CANTIDAD='diez'] + Nominal[...]
→ SN[CANTIDAD='diez', OBJETO='distancia', ..., TIPO='distancia', UNIDAD='metro']

3. Building the verb phrase

V[ACCION='avanzar', COMP='si'] + SN[...]
→ SV[ACCION='avanzar', COMP='si', DISTANCIA=[CANTIDAD='diez', UNIDAD='metro']]

4. Building the sentence

SV[ACCION='avanzar', COMP='si', DISTANCIA=[CANTIDAD='diez', UNIDAD='metro']]
→ S[SEM=[ACCION='avanzar', DISTANCIA=[CANTIDAD='diez', UNIDAD='metro'], VELOCIDAD=?v]]
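Once the sentence-level SEM structure is available, it can be mapped to something a robot controller could execute. The following is a minimal sketch of that last step, not part of the project: the lookup tables, the RobotCommand type, and the sem_to_command function are all hypothetical, and the SEM structure is represented as a plain dictionary with an unbound VELOCIDAD shown as None.

```python
from typing import NamedTuple, Optional

# Hypothetical lookup tables; the project does not specify how number words
# or units are converted, so only the values from the example are listed.
NUMBER_WORDS = {"diez": 10}
UNIT_IN_METERS = {"metro": 1.0}


class RobotCommand(NamedTuple):
    action: str
    distance_m: Optional[float]


def sem_to_command(sem: dict) -> RobotCommand:
    """Convert a SEM dictionary into a simple command record."""
    distance = sem.get("DISTANCIA")
    distance_m = None
    if isinstance(distance, dict):
        quantity = NUMBER_WORDS[distance["CANTIDAD"]]
        distance_m = quantity * UNIT_IN_METERS[distance["UNIDAD"]]
    return RobotCommand(action=sem["ACCION"], distance_m=distance_m)


sem = {
    "ACCION": "avanzar",
    "DISTANCIA": {"CANTIDAD": "diez", "UNIDAD": "metro"},
    "VELOCIDAD": None,  # unbound in the example, so left unspecified here
}
print(sem_to_command(sem))
# RobotCommand(action='avanzar', distance_m=10.0)
```

Keeping this conversion outside the grammar means the grammar only has to produce symbolic values such as 'diez', while numeric interpretation is handled in ordinary Python.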

Project Structure

├── gramatica_base.fcfg           # Feature-based grammar defining the accepted command structures
├── programa_base.py              # Base Python script that loads the grammar and parses the input sentences
├── text-input-train.txt          # Training sentences used during grammar development
├── text-input-test.txt           # Test sentences used to evaluate the grammar on unseen examples
├── text-output-train.txt         # Parser output obtained on the training set
├── text-output-test.txt          # Parser output obtained on the test set
└── Procedimiento y discusión.pdf # Report describing the methodology, analysis, and results

Results

When applied to the test set (text-input-test.txt), the system correctly analyzes seven of the nine input commands, partially analyzes one, and fails to recognize the remaining one. See Procedimiento y discusión.pdf for a more detailed discussion.

Technologies Used

Python 3.6
NLTK (Natural Language Toolkit)
Feature-based grammars (FCFG)
Unification-based parsing

Conclusions

This project shows that feature-based grammars with unification are a powerful approach for mapping natural language commands into structured semantic representations.

However:

Grammar rules can become complex and repetitive
Handling implicit meaning and lexical variation requires further refinement
