ariadnafruits/spanish-command-recognizer

Feature-Based Grammar for Command Recognition

This project implements a rule-based natural language command recognizer using a feature-based context-free grammar (FCFG) in Python with the NLTK library.

The goal is to parse simple Spanish commands directed to a robot and extract their semantic representation, which can then be used for execution.


Project Overview

The system takes textual commands in Spanish as input and produces:

  • A syntactic parse tree
  • A semantic structure describing the intended action

The core of the project is the design and implementation of the grammar gramatica_base.fcfg.


Syntactic and Semantic Representation

The grammar was designed after manually analyzing a set of written commands from a training text. These commands were first analyzed syntactically, and the resulting structures were then enriched with semantic information encoded as feature attributes. This analysis made it possible to define a feature structure capable of representing actions and their parameters.

The resulting feature structure is shown in the following figure:

[Figure: feature structure]

Technical Implementation

The syntactic and semantic representations were implemented as a set of feature-based grammar rules that can be loaded with NLTK’s load_parser function. These rules combine syntactic structure with semantic attributes through feature unification, allowing the parser to build a structured representation of each command.

Each rule propagates and constrains features such as ACCION and DISTANCIA so that, when combined, they yield a semantic interpretation of the input sentence. The full set of these rules constitutes the grammar.
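To make the idea of feature unification more concrete, here is a minimal sketch in plain Python. It is not NLTK's implementation: feature structures are modelled as nested dictionaries, variables are left out entirely, and the function name `unify` and the value 'girar' are assumptions made only for this illustration. Two structures unify when they agree on every feature they share, and the result carries the features of both.

```python
def unify(left, right):
    """Unify two feature structures given as nested dicts.

    Returns the merged structure, or None when two atomic values clash.
    Simplified sketch: no variables and no shared (reentrant) values.
    """
    result = dict(left)
    for feature, value in right.items():
        if feature not in result:
            # Only one side constrains this feature: keep its value.
            result[feature] = value
        elif isinstance(result[feature], dict) and isinstance(value, dict):
            # Both sides hold a nested structure: unify recursively.
            merged = unify(result[feature], value)
            if merged is None:
                return None
            result[feature] = merged
        elif result[feature] != value:
            # Both sides constrain the feature with different values.
            return None
    return result


# Compatible structures merge into a single, richer structure.
sv = unify(
    {"ACCION": "avanzar"},
    {"DISTANCIA": {"CANTIDAD": "diez", "UNIDAD": "metro"}},
)

# A clash on an atomic value makes unification fail
# ('girar' is a made-up value used only for this example).
clash = unify({"ACCION": "avanzar"}, {"ACCION": "girar"})
```

In the real grammar, variables such as ?a and ?d additionally let a rule copy a value from one constituent to another, which this sketch does not attempt to model.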

Example

The following input:

avanzar diez metros

is parsed into the semantic representation:

SEM = [ACCION='avanzar',
       DISTANCIA=[CANTIDAD='diez', UNIDAD='metro'],
       VELOCIDAD=?v]

This interpretation is produced by combining grammar rules such as:

  • A sentence rule mapping the verb phrase to the global semantic structure:

    S[SEM=[ACCION=?a, DISTANCIA=?d, VELOCIDAD=?v]] ->
        SV[ACCION=?a, DISTANCIA=?d, VELOCIDAD=?v]
    
  • A verb phrase rule combining the verb and the distance phrase:

    SV[ACCION='avanzar', DISTANCIA=[CANTIDAD=?c, UNIDAD=?u], VELOCIDAD=?v] ->
        V[ACCION='avanzar'] SN[OBJETO='distancia', CANTIDAD=?c, UNIDAD=?u]
    
  • A noun phrase rule for distance expressions:

    SN[CANTIDAD=?c, UNIDAD=?u, TIPO='distancia'] ->
        Det[CANTIDAD=?c] N[TIPO='distancia', UNIDAD=?u]
    
  • Lexical entries such as:

    V[ACCION='avanzar'] -> 'avanzar'
    Det[CANTIDAD='diez'] -> 'diez'
    N[TIPO='distancia', UNIDAD='metro'] -> 'metros'
    

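Although each grammar rule is written top-down, from a constituent to its parts, the resulting parse is easiest to follow bottom-up. The parser first matches the input words with lexical entries and then combines them into progressively larger constituents through feature unification. The trace includes features and categories, such as COMP and Nominal, that do not appear in the rules listed above: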

1. Lexical matching

avanzar -> V[ACCION='avanzar', COMP='si']
diez    -> Det[CANTIDAD='diez']
metros  -> N[TIPO='distancia', UNIDAD='metro']

2. Building the nominal constituent

N[TIPO='distancia', UNIDAD='metro']
→ Nominal[OBJETO='distancia', ..., TIPO='distancia', UNIDAD='metro']

Det[CANTIDAD='diez'] + Nominal[...]
→ SN[CANTIDAD='diez', OBJETO='distancia', ..., TIPO='distancia', UNIDAD='metro']

3. Building the verb phrase

V[ACCION='avanzar', COMP='si'] + SN[...]
→ SV[ACCION='avanzar', COMP='si', DISTANCIA=[CANTIDAD='diez', UNIDAD='metro']]

4. Building the sentence

SV[ACCION='avanzar', COMP='si', DISTANCIA=[CANTIDAD='diez', UNIDAD='metro']]
→ S[SEM=[ACCION='avanzar', DISTANCIA=[CANTIDAD='diez', UNIDAD='metro'], VELOCIDAD=?v]]
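Once the sentence-level structure has been built, a program still has to read it before a robot can act on it. The following is a minimal sketch of that last step, assuming the semantic structure has been converted into a plain nested dictionary in which an unbound variable such as ?v is represented as None. The helper `to_robot_command`, the `NUMBER_WORDS` table, and the output keys are hypothetical and are not part of the project.

```python
# Hypothetical lookup table: the grammar keeps quantities as words,
# so a consumer has to turn them into numbers itself.
NUMBER_WORDS = {"uno": 1, "dos": 2, "cinco": 5, "diez": 10}


def to_robot_command(sem, default_speed=None):
    """Flatten a semantic structure into a simple command dictionary."""
    distance = sem.get("DISTANCIA") or {}
    quantity = distance.get("CANTIDAD")
    speed = sem.get("VELOCIDAD")
    return {
        "action": sem["ACCION"],
        # Unknown number words map to None instead of raising an error.
        "distance": NUMBER_WORDS.get(quantity),
        "unit": distance.get("UNIDAD"),
        # Fall back to the default when the command gave no speed.
        "speed": speed if speed is not None else default_speed,
    }


# The structure obtained for "avanzar diez metros", with ?v shown as None.
sem = {
    "ACCION": "avanzar",
    "DISTANCIA": {"CANTIDAD": "diez", "UNIDAD": "metro"},
    "VELOCIDAD": None,
}
command = to_robot_command(sem)
```

Keeping this conversion outside the grammar leaves the FCFG responsible only for recognizing the command, while choices such as a default speed stay in ordinary Python code.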

Project Structure

├── gramatica_base.fcfg           # Feature-based grammar defining the accepted command structures
├── programa_base.py              # Base Python script that loads the grammar and parses the input sentences
├── text-input-train.txt          # Training sentences used during grammar development
├── text-input-test.txt           # Test sentences used to evaluate the grammar on unseen examples
├── text-output-train.txt         # Parser output obtained on the training set
├── text-output-test.txt          # Parser output obtained on the test set
└── Procedimiento y discusión.pdf # Report describing the methodology, analysis, and results

Results

When applied to the test set (text-input-test.txt), the system correctly analyzes seven of the nine input commands, partially analyzes one, and fails to recognize the remaining one. See Procedimiento y discusión.pdf for a more detailed discussion.

Technologies Used

  • Python 3.6
  • NLTK (Natural Language Toolkit)
  • Feature-based grammars (FCFG)
  • Unification-based parsing

Conclusions

This project shows that feature-based grammars with unification are an effective approach for mapping simple natural language commands into structured semantic representations.

However:

  • Grammar rules can become complex and repetitive

  • Handling implicit meaning and lexical variation requires further refinement

About

Rule-based recognition of Spanish robot commands using a feature-based grammar and semantic parsing
