Description
A static typechecker can't evaluate is_remote_only(), so it infers the type of the affected DataFrame attributes as a Union of the property's type and Column (the latter because of __getattr__).
spark/python/pyspark/sql/dataframe.py, lines 173 to 191 in 4a61083:

    if not is_remote_only():

        @property
        def rdd(self) -> "RDD[Row]":
            """Returns the content as an :class:`pyspark.RDD` of :class:`Row`.

            .. versionadded:: 1.3.0

            Returns
            -------
            :class:`RDD`

            Examples
            --------
            >>> df = spark.range(1)
            >>> type(df.rdd)
            <class 'pyspark.core.rdd.RDD'>
            """
            ...

I read the reasoning in PR #45053. In addition, I think it makes sense to add logic for static typecheckers.
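One possible shape for that logic, as a minimal sketch rather than a proposed patch: guard the definition with typing.TYPE_CHECKING so the checker always sees the property, while the runtime behavior still depends on is_remote_only(). Everything below (the Column, RDD and DataFrame classes and the is_remote_only() body) is a simplified stand-in, not the pyspark code, and I am assuming the checker treats `TYPE_CHECKING or ...` as statically true.

```python
from typing import TYPE_CHECKING, Callable, Iterable, List


def is_remote_only() -> bool:
    # Hypothetical stand-in: the real function inspects the installed package.
    return False


class Column:
    # Stand-in for pyspark.sql.Column.
    def __init__(self, name: str) -> None:
        self.name = name


class RDD:
    # Stand-in for pyspark.RDD, holding plain ints instead of Row objects.
    def __init__(self, rows: List[int]) -> None:
        self._rows = rows

    def flatMap(self, fn: Callable[[int], Iterable[int]]) -> List[int]:
        return [item for row in self._rows for item in fn(row)]


class DataFrame:
    def __init__(self, rows: List[int]) -> None:
        self._rows = rows

    # TYPE_CHECKING is True during static analysis and False at runtime,
    # so the checker sees an unconditional property while the runtime
    # definition is still skipped when is_remote_only() returns True.
    if TYPE_CHECKING or not is_remote_only():

        @property
        def rdd(self) -> RDD:
            return RDD(self._rows)

    def __getattr__(self, name: str) -> Column:
        # Only reached for attributes that normal lookup does not find.
        return Column(name)


df = DataFrame([1, 2])
print(df.rdd.flatMap(lambda row: [row, row * 10]))  # [1, 10, 2, 20]
```

In the real module the names used in the annotation would presumably also need to be importable under TYPE_CHECKING, since they may not exist in a remote-only install.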
Example
_ = df.rdd.flatMap(some_fn)
The typechecker doesn't know whether rdd resolves to the rdd property or to a Column returned by __getattr__.
Calling flatMap() on the result then makes the typechecker report an error, because it can't tell whether the property's value or a Column was returned.
Tested with https://github.com/astral-sh/ty and pyspark==4.1.2.
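For anyone hitting this in the meantime, a caller-side workaround is to narrow the union explicitly before calling flatMap(). This is a rough sketch with stand-in classes: RDD and Column here are simplified placeholders and flat_map_rows is a hypothetical helper, not a pyspark API. It assumes the checker narrows unions on isinstance().

```python
from typing import Callable, Iterable, List, Union


class Column:
    # Stand-in for pyspark.sql.Column.
    def __init__(self, name: str) -> None:
        self.name = name


class RDD:
    # Stand-in for pyspark.RDD, holding plain ints instead of Row objects.
    def __init__(self, rows: List[int]) -> None:
        self._rows = rows

    def flatMap(self, fn: Callable[[int], Iterable[int]]) -> List[int]:
        return [item for row in self._rows for item in fn(row)]


def flat_map_rows(
    rdd_or_column: Union[RDD, Column],
    fn: Callable[[int], Iterable[int]],
) -> List[int]:
    # Hypothetical helper: the isinstance() check lets the checker narrow
    # Union[RDD, Column] to RDD, and fails loudly if a Column slipped through.
    if not isinstance(rdd_or_column, RDD):
        raise TypeError("expected an RDD, got a Column")
    return rdd_or_column.flatMap(fn)


print(flat_map_rows(RDD([1, 2]), lambda row: [row, row + 1]))  # [1, 2, 2, 3]
```

With pyspark itself the same narrowing would be applied to df.rdd, at the cost of one runtime check per call site.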