Conversation
… information, and more
LeonidElkin
left a comment
There was a problem hiding this comment.
- Universal options (calculations) of the characteristic should be passed to
query_methodand stick out explicitly for the user as a public API instead of the current**options: Any. @Desiment will describe the use cases in more detail in the PR or personally - The strategy assumes responsibility for passing options to
ComputationMethod. If no changes have been made by the user, we leave the defaults.
There was a problem hiding this comment.
Firstly, add a docstring with details of this submodule in the top of the file. Secondly, in order not to stretch the init files, I recommend making a scheme with importing all from the files of this module. For details, see how it's done here
There was a problem hiding this comment.
Actually, you need to add docstrings everywhere. I won't mention it farther
| @@ -0,0 +1,70 @@ | |||
| from __future__ import annotations | |||
|
|
|||
| __author__ = "Leonid Elkin, Mikhail Mikhailov, Irina Sergeeva" | |||
There was a problem hiding this comment.
I think you can add only yourself to author
| tags : frozenset[str] | ||
| Constraint tags used for matching (e.g. ``{"continuous", "univariate"}``). |
There was a problem hiding this comment.
Naming is not very telling. At least constraint_tags or smth like that
| name: str | ||
| target: GenericCharacteristicName | ||
| sources: Sequence[GenericCharacteristicName] | ||
| fitter: FitterFunc |
There was a problem hiding this comment.
We've already have such type in computations.py. Move it to types.py and use in both places
| from pysatl_core.distributions.computation import ComputationMethod | ||
|
|
||
| if self.cacheable: | ||
| return ComputationMethod( | ||
| target=self.target, | ||
| sources=list(self.sources), | ||
| fitter=self.fitter, | ||
| ) | ||
| return ComputationMethod( | ||
| target=self.target, | ||
| sources=list(self.sources), | ||
| evaluator=self.fitter, # type: ignore[arg-type] | ||
| ) |
There was a problem hiding this comment.
It is a bit incorrect. You provide FitterFunc type to evaluator. I think it's worth doing the following: if the fitter is not cacheable, then we don't name it a fitter, but an evaluator. And we return from it not the FittedComputationMethod, but the already calculated value. This will help us maintain the overall consistency of the API.
I think it's possible to rename submodule fitters to computations and handle both evaluator and fitters cases. You probably could move computation.py there
There was a problem hiding this comment.
After it's done I think the cacheable field will be gone
| raw = kwargs.pop(self.name, self.default) | ||
| try: | ||
| value = self.type(raw) | ||
| except (TypeError, ValueError) as exc: | ||
| raise TypeError( | ||
| f"Option '{self.name}': cannot convert {raw!r} to {self.type.__name__}" | ||
| ) from exc | ||
| if self.validate is not None and not self.validate(value): | ||
| raise ValueError(f"Option '{self.name}': value {value!r} failed validation.") | ||
| return value |
There was a problem hiding this comment.
Based on our dialogue in an online call. The kwargs will have to leave, and we will use the descriptors to form kwargs for specific fitters and evaluators.
| cacheable: bool = True | ||
| options: tuple[FitterOption, ...] = () | ||
| tags: frozenset[str] = field(default_factory=frozenset) | ||
| priority: int = 0 |
There was a problem hiding this comment.
It doesn't seem that priority is used for smth. I think we should either remove it or change the strategy to consider it while finding the path in the graph
| FittedComputationMethod[NumericArray, NumericArray] | ||
| Array-semantic ``cdf`` callable. | ||
| """ | ||
| opts = FITTER_PDF_TO_CDF_1C.resolve_options(kwargs) |
There was a problem hiding this comment.
This is also should be gone after moving descriptors to strategy
| FITTER_PDF_TO_CDF_1C = FitterDescriptor( | ||
| name="pdf_to_cdf_1C", | ||
| target=CharacteristicName.CDF, | ||
| sources=[CharacteristicName.PDF], | ||
| fitter=fit_pdf_to_cdf_1C, | ||
| options=( | ||
| FitterOption( | ||
| name="limit", | ||
| type=int, | ||
| default=200, | ||
| description="Maximum number of quad subdivisions per integral.", | ||
| validate=lambda v: v > 0, | ||
| ), | ||
| ), | ||
| tags=frozenset({"continuous", "univariate"}), | ||
| priority=0, | ||
| description="PDF -> CDF via segment-wise scipy.integrate.quad with cumsum.", | ||
| ) |
There was a problem hiding this comment.
I think we should fill in the descriptors lazily when the strategy does not find an analytically defined characteristic in the graph and goes looking for paths in the graph, or when the user wants to get/change options in the descriptors.
This is necessary so that when working with our builtins distributions in a standard way, you do not need to keep extra information in memory.
There was a problem hiding this comment.
So I think it's worth keeping the descriptor padding and calculations in different places.
No description provided.