Incompatible Pandas column selection syntax in HourlyStats (breaks with recent Pandas versions) #127

@ufuk-cakir

Bug Description

Running the current version of Cell2Fire (0.2) with Python 3.11.10 and pandas 2.2.3 on macOS 15.4 raises a ValueError during statistics generation. The error is caused by outdated column selection syntax in cell2fire/utils/Stats.py.

Error message

ValueError: Cannot subset columns with a tuple with more than one element. Use a list instead.

Location in the Code

Every place in Stats.py where groupby is followed by a multi-column selection, for example:

SummaryDF = Ah[["NonBurned", "Burned", "Harvested", "Hour"]].groupby('Hour')["NonBurned", "Burned", "Harvested"].mean()

Cause of Bug

The line uses a legacy style of selecting multiple columns after groupby, passing a tuple instead of a list:

.groupby('Hour')["NonBurned", "Burned", "Harvested"]

This syntax is no longer supported in newer versions of pandas: it was deprecated in earlier releases and raises the ValueError above as of pandas 2.0. See, for example, this StackOverflow post.

pandas now requires passing a list of column names:

.groupby('Hour')[["NonBurned", "Burned", "Harvested"]]
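To make the difference concrete, here is a minimal reproduction sketch. The `Ah` frame below is a made-up stand-in for the one built in Stats.py (the values are invented, only the column names come from the line quoted above), so treat it as an illustration rather than the actual Cell2Fire data flow.

```python
import pandas as pd

# Toy stand-in for the Ah frame in Stats.py; the values are invented.
Ah = pd.DataFrame({
    "Hour": [1, 1, 2, 2],
    "NonBurned": [10, 12, 8, 6],
    "Burned": [0, 2, 4, 6],
    "Harvested": [0, 0, 0, 0],
})

cols = ["NonBurned", "Burned", "Harvested"]

# List selection (double brackets): accepted by both old and new pandas.
SummaryDF = Ah[cols + ["Hour"]].groupby("Hour")[cols].mean()
print(SummaryDF)

# Tuple selection (single brackets): pandas >= 2.0 raises ValueError,
# older releases only emit a deprecation warning.
try:
    Ah.groupby("Hour")["NonBurned", "Burned", "Harvested"].mean()
except ValueError as err:
    print(f"legacy syntax rejected: {err}")
```

The list form gives the same per-hour means regardless of the installed pandas version, which is why it is the safer of the two spellings.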

Suggested Fix

  • Either change every line where groupby() is followed by a multi-column selection to the double-bracket syntax, e.g.
.groupby('Hour')[["NonBurned", "Burned", "Harvested"]]
  • Or pin pandas to pandas<2.0.0 in requirements.txt.
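For the first option, a rough way to locate the affected lines is a regex scan for `.groupby(...)` followed by single brackets that contain a comma. The helper below is a hypothetical sketch, not part of Cell2Fire; `find_tuple_selections` and `TUPLE_SELECT` are names invented here, and the pattern is a heuristic that will miss calls split across lines or `groupby()` arguments containing nested parentheses.

```python
import re

# Heuristic: .groupby(...) followed by single brackets that start with a
# quoted name and contain a comma, i.e. a tuple-style column selection.
TUPLE_SELECT = re.compile(r"""\.groupby\([^()]*\)\[\s*["'][^\[\]]*,[^\[\]]*\]""")


def find_tuple_selections(source: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for each suspicious line."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if TUPLE_SELECT.search(line):
            hits.append((lineno, line.strip()))
    return hits


# Small in-memory sample; in practice, read the text of Stats.py instead.
sample = '''\
SummaryDF = Ah.groupby('Hour')["NonBurned", "Burned", "Harvested"].mean()
FixedDF = Ah.groupby('Hour')[["NonBurned", "Burned", "Harvested"]].mean()
BurnedDF = Ah.groupby('Hour')["Burned"].mean()
'''

for lineno, line in find_tuple_selections(sample):
    print(f"line {lineno}: {line}")
```

On this sample only the first line is reported: the double-bracket form and the single-column selection are both left alone.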
