Issue with multinomial model in pygformula when using categorical covariates with more than 6 levels #21

@jgonzhijon

I'm encountering an issue when using categorical covariates with more than 6 levels in a multinomial model with pygformula. Model fitting produces NaN values, and the bootstrap step then fails with a TypeError ('NoneType' object is not subscriptable) when the bootstrap results are collected.

This does not happen if I:

  • Reduce the number of levels in the categorical variable to 6 or fewer.
  • Use one-hot encoding and fit a one-vs-rest model instead of multinomial.

I would prefer to use the multinomial option directly with categorical covariates if possible.
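
For reference, the data preparation behind the second workaround can be sketched as follows. This is a minimal sketch with NumPy only; the arrays `covariate` and `outcome` and the level names are hypothetical, and how pygformula would consume these columns is not shown in this report.

```python
import numpy as np

# Hypothetical data: a covariate with 8 levels and a 3-class outcome.
levels = np.array(["a", "b", "c", "d", "e", "f", "g", "h"])
covariate = np.array(["a", "c", "h", "b", "c", "g", "d", "e", "f", "a"])
outcome = np.array([0, 1, 2, 1, 0, 2, 1, 0, 2, 1])

# One-hot encode, dropping the first level ("a") as the reference category
# so the indicator columns are not collinear with an intercept.
one_hot = (covariate[:, None] == levels[None, 1:]).astype(int)

# One-vs-rest: build one binary target per outcome class.
classes = np.unique(outcome)
ovr_targets = {int(k): (outcome == k).astype(int) for k in classes}

print(one_hot.shape)        # 10 rows, 7 indicator columns
print(sorted(ovr_targets))  # one binary target per class
```

Each binary target would then be fitted with its own logistic model. Note that one-vs-rest probabilities do not sum to 1 across classes in general, so they usually need renormalizing before being used as a joint distribution.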

My questions are:

  • What could be causing the multinomial model to fail when categorical variables have many levels?
  • Is there a known limitation or workaround in pygformula or statsmodels?
  • If this can't be fixed cleanly, would using one-vs-rest with one-hot encoding be a valid alternative?
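
One possible explanation, not confirmed anywhere in this report, is sparse or empty cells: with many levels, some level/outcome combinations may never occur (especially within a bootstrap resample), in which case the maximum likelihood estimate may not exist and the coefficients can grow until `exp` overflows. A quick check is to tabulate the covariate against the outcome. The sketch below uses only the standard library; `empty_cells` and the sample data are hypothetical.

```python
from collections import Counter

def empty_cells(covariate, outcome):
    """Return (level, outcome class) pairs that never occur together."""
    counts = Counter(zip(covariate, outcome))
    levels = sorted(set(covariate))
    classes = sorted(set(outcome))
    return [(lv, k) for lv in levels for k in classes if counts[(lv, k)] == 0]

# Hypothetical data: level "g" only ever appears with outcome class 2.
covariate = ["a", "a", "a", "b", "b", "b", "g", "g"]
outcome = [0, 1, 2, 0, 1, 2, 2, 2]

missing = empty_cells(covariate, outcome)
print(missing)  # [('g', 0), ('g', 1)]
```

If the real data show empty cells like this, merging rare levels would be consistent with the observation that the problem disappears at 6 or fewer levels, though that remains an assumption to verify.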

Here is the relevant part of the error output; the warnings repeat throughout the execution:

 
/opt/anaconda3/lib/python3.12/site-packages/statsmodels/discrete/discrete_model.py:3028: RuntimeWarning: invalid value encountered in divide
  return eXB/eXB.sum(1)[:,None]
Optimization terminated successfully.
Current function value: nan
Iterations 4
Optimization terminated successfully.
Current function value: 0.214036
Iterations 12
Optimization terminated successfully.
Current function value: 0.458476
Iterations 8
/opt/anaconda3/lib/python3.12/site-packages/statsmodels/discrete/discrete_model.py:3027: RuntimeWarning: overflow encountered in exp
  eXB = np.column_stack((np.ones(len(X)), np.exp(X)))
/opt/anaconda3/lib/python3.12/site-packages/statsmodels/discrete/discrete_model.py:3028: RuntimeWarning: invalid value encountered in divide
  return eXB/eXB.sum(1)[:,None]
Optimization terminated successfully.
Current function value: nan
Iterations 5
/opt/anaconda3/lib/python3.12/site-packages/statsmodels/discrete/discrete_model.py:3027: RuntimeWarning: overflow encountered in exp
  eXB = np.column_stack((np.ones(len(X)), np.exp(X)))
/opt/anaconda3/lib/python3.12/site-packages/statsmodels/discrete/discrete_model.py:3028: RuntimeWarning: invalid value encountered in divide
  return eXB/eXB.sum(1)[:,None]
Optimization terminated successfully.
Current function value: 0.471235
Iterations 8
Optimization terminated successfully.
Current function value: nan
Iterations 4
Optimization terminated successfully.
Current function value: 0.205321
Iterations 10
Optimization terminated successfully.
Current function value: 0.458697
Iterations 8
/opt/anaconda3/lib/python3.12/site-packages/statsmodels/discrete/discrete_model.py:3027: RuntimeWarning: overflow encountered in exp
  eXB = np.column_stack((np.ones(len(X)), np.exp(X)))
/opt/anaconda3/lib/python3.12/site-packages/statsmodels/discrete/discrete_model.py:3028: RuntimeWarning: invalid value encountered in divide
  return eXB/eXB.sum(1)[:,None]
Optimization terminated successfully.
Current function value: nan
Iterations 4
/opt/anaconda3/lib/python3.12/site-packages/statsmodels/discrete/discrete_model.py:3027: RuntimeWarning: overflow encountered in exp
  eXB = np.column_stack((np.ones(len(X)), np.exp(X)))
/opt/anaconda3/lib/python3.12/site-packages/statsmodels/discrete/discrete_model.py:3028: RuntimeWarning: invalid value encountered in divide
  return eXB/eXB.sum(1)[:,None]
Optimization terminated successfully.
Current function value: nan
Iterations 4
Traceback (most recent call last):
  File "/Users/juanito/Documents/Paper1-HSU/HSU-PSID-Data/Main_Analysis/prueba_error.py", line 147, in <module>
    g.fit()
  File "/opt/anaconda3/lib/python3.12/site-packages/pygformula/parametric_gformula/parametric_gformula.py", line 742, in fit
    boot_results_dicts[i]['boot_results'][j] for j in range(len(self.int_descript)))

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
TypeError: 'NoneType' object is not subscriptable
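
The two statsmodels lines quoted in the warnings are enough to reproduce the NaN in isolation: once a linear predictor is large, `np.exp` returns `inf` and the normalization becomes `inf/inf`. The sketch below mirrors those two lines and compares them with a max-shifted version. The function names and the input values are hypothetical, and this only illustrates the arithmetic; it is not a patch for statsmodels or pygformula, and a stable softmax alone would not fix coefficients that have already diverged.

```python
import numpy as np

def naive_mnlogit_probs(xb):
    """Mirror the two lines from the warnings: exponentiate, then normalize."""
    exb = np.column_stack((np.ones(len(xb)), np.exp(xb)))
    return exb / exb.sum(1)[:, None]

def stable_mnlogit_probs(xb):
    """Same probabilities, but subtract each row's maximum before exp."""
    # The reference category has a linear predictor of 0.
    full = np.column_stack((np.zeros(len(xb)), xb))
    shifted = full - full.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical linear predictors; the second row has diverged.
xb = np.array([[1.0, 2.0], [800.0, -3.0]])

# Silence the overflow/invalid warnings that the log above shows.
with np.errstate(over="ignore", invalid="ignore"):
    naive = naive_mnlogit_probs(xb)

stable = stable_mnlogit_probs(xb)

print(np.isnan(naive[1]).any())     # True: inf / inf in the diverged row
print(np.isfinite(stable).all())    # True
```

For moderate values the two functions agree, so the NaN is a symptom of very large linear predictors rather than of the formula itself.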

 

Thanks for all the work on this package!
