Commit aa65a77

Document num_sample=None in natural_breaks docstring (#3505)
1 parent 256de04 commit aa65a77

1 file changed

Lines changed: 6 additions & 1 deletion

xrspatial/classify.py

@@ -842,12 +842,17 @@ def natural_breaks(agg: xr.DataArray,
         of values to be reclassified.
     k : int, default=5
         Number of classes to be produced.
-    num_sample : int, default=20000
+    num_sample : int or None, default=20000
         Number of sample data points used to fit the model.
         Natural Breaks (Jenks) classification is indeed O(n²) complexity,
         where n is the total number of data points, i.e: `agg.size`
         When n is large, we should fit the model on a small sub-sample
         of the data instead of using the whole dataset.
+        ``None`` means fit on all data instead of a sub-sample. That is
+        the full O(n²) case described above, so it may be slow and raises
+        ``MemoryError`` if the Jenks matrices would exceed half of the
+        available RAM. For dask the full sample is drawn lazily via
+        indexed access.
     name : str, default='natural_breaks'
         Name of output aggregate.
 
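
To make the new ``MemoryError`` note concrete, here is a rough sketch of the kind of guard the docstring describes. It is not taken from `xrspatial/classify.py`: the helper names `jenks_matrix_bytes` and `check_full_fit` are hypothetical, and the sizing assumes the textbook Jenks layout of two float64 matrices of shape `(n + 1, k + 1)`, which may differ from what the library actually allocates.

```python
def jenks_matrix_bytes(n, k, itemsize=8, n_matrices=2):
    """Estimate the bytes needed for the Jenks working matrices.

    Assumption (not confirmed by the commit): n_matrices arrays of
    shape (n + 1, k + 1) holding itemsize-byte floats.
    """
    return n_matrices * (n + 1) * (k + 1) * itemsize


def check_full_fit(n, k, available_bytes):
    """Raise MemoryError if the estimate exceeds half of available_bytes.

    Hypothetical stand-in for the "half of the available RAM" rule in
    the docstring; available_bytes is supplied by the caller.
    """
    needed = jenks_matrix_bytes(n, k)
    if needed > available_bytes / 2:
        raise MemoryError(
            f"Jenks matrices need about {needed} bytes, more than half "
            f"of the {available_bytes} bytes available"
        )
    return needed


if __name__ == "__main__":
    # Default sub-sample size from the docstring: 20000 points, 5 classes.
    print(jenks_matrix_bytes(20000, 5))
    # A full fit on one billion points with 8 GiB available is rejected.
    try:
        check_full_fit(10**9, 5, 8 * 2**30)
    except MemoryError as exc:
        print("rejected:", exc)
```

Under these assumptions the default sub-sample needs only a couple of megabytes, while a full fit on a very large raster is rejected up front, which is the situation a caller passing ``num_sample=None`` should expect.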
