Skip to content

PKL Config Scala#1076

Open
andyglow wants to merge 1 commit intoapple:mainfrom
andyglow:add-pkl-config-scala
Open

PKL Config Scala#1076
andyglow wants to merge 1 commit intoapple:mainfrom
andyglow:add-pkl-config-scala

Conversation

@andyglow
Copy link
Copy Markdown

@andyglow andyglow commented May 23, 2025

Purpose

Opens the road to first-class Scala support in Pkl. This PR adds pkl-config-scala — the binding/runtime layer that lets hand-authored Scala case classes consume Pkl configs. A follow-up PR will add pkl-codegen-scala (the source generator that pairs with the existing pkl-codegen-java / pkl-codegen-kotlin), closing the loop so users can either hand-write or generate their Scala types.

How it looks:

import org.pkl.config.java.ConfigEvaluator
import org.pkl.config.scala.syntax._

case class AppConfig(name: String, port: Int, tags: Seq[String])

val config: AppConfig = ConfigEvaluator
  .preconfigured()
  .forScala()
  .evaluate(ModuleSource.text("""
    |name = "orders"
    |port = 8080
    |tags = List("beta", "team-a")
    |""".stripMargin))
  .to[AppConfig]

What's covered

Pkl type Scala target
class Foo { x: Int } case class Foo(x: Int)
String-literal union (what Pkl codegen emits as an enum elsewhere) scala.Enumeration — both plain (val A, B = Value) and nested-Val forms
String, Int, Boolean, Float String, Int/Long, Boolean, Double (via built-in JVM mapping)
Duration scala.concurrent.duration.Duration, FiniteDuration
Int (epoch) / String (ISO-8601) java.time.Instant
String / Regex scala.util.matching.Regex
Pair<A, B> (A, B) / Tuple2[A, B]
List<T>, Set<T>, Listing<T>, Collection<T> Seq[T]
Map<K, V>, Mapping<K, V> immutable.Map[K, V]
Nullable T? Option[T]

Intentionally not covered

These route through a ConversionException; users with hand-authored types register their own converter factories:

  • Scala Set[T], Vector[T], mutable collections — matches the minimal default set in pkl-config-java / pkl-config-kotlin.
  • Scala 3 enum Foo { case A, B, C } — deferred; the current resolver targets Scala 2 scala.Enumeration. A follow-up can add a parallel path.

Architectural decisions

Two coupled decisions drive the bulk of the non-obvious code. Calling them out because they break from the common infrastructure pattern in pkl-config-java.

1. Narrow use of scala-reflect for enum resolution

Problem. Pkl codegen compiles a string-literal union into a language-native enum in every target. For Scala 2, the natural target is scala.Enumeration, and the user writes case class Palette(color: SimpleEnum.Value). But SimpleEnum.Value is a path-dependent type: Java reflection erases it to scala.Enumeration$Value, a class shared across every Enumeration subclass in the JVM. At the point where a ConverterFactory is asked for a converter, we've lost the information needed to know which Enumeration to look up.

Approaches considered and rejected.

  1. Pre-substitute resolved Value instances into the PObject's property map and delegate to stock PObjectToDataObject. Dead — PClassInfo.forValue(...) throws on any non-Pkl value.
  2. Swap enum-param types for a sentinel ParameterizedType that carries the Enumeration reference. Dead — ValueMapperImpl.getConverter normalises custom Type subclasses back to their raw Class via Reflection.getExactSubtype(...) before dispatching to factories, stripping any embedded information.
  3. Upstream patch in pkl-config-java (making create non-final or adding a per-param hook). Out of scope for this PR.

What we landed on. A narrow use of Scala runtime reflection, surfaced in ConstructorParamResolver:

  • Walks a case class's primary constructor parameter list once per Class, using scala.reflect.runtime.universe to recover path-dependent Enumeration#Value types and their originating Enumeration module.
  • Emits a Seq[Param] where each param is pure data:
    case class Param(index: Int, name: String, tpe: Param.Type)
    sealed trait Param.Type { def jvmType: java.lang.reflect.Type }
    case class Jvm(jvmType) extends Param.Type
    case class ScalaEnum(jvmType, members: List[Enumeration#Value]) extends Param.Type
    A Param describes everything statically known about the slot; no converters, no runtime state — inspectable in a debugger or a log line.
  • scala-reflect mirror results are cached in a TrieMap[Class[?], Map[Int, Enumeration]]; mirror calls are wrapped in universe.synchronized because scala-reflect is not thread-safe by default.
  • Scala-reflect is used only for enum detection. Nothing else in the pipeline touches it.

2. Fork of PObjectToDataObject.ConverterImpl

Problem. Even with ConstructorParamResolver telling us "param 3 is a ScalaEnum of {Aaa, Bbb, Ccc}," the stock PObjectToDataObject.ConverterImpl.convert dispatches per-param through ValueMapper.getConverter(PClassInfo.String, paramType). The paramType arriving at the factory is still the erased scala.Enumeration$Value — no generic factory can recover the specific enumeration from it. And the stock create(...) is final, the inner ConverterImpl is private, so we can't interpose.

What we landed on. ScalaPObjectToCaseClass re-implements the ctor-mapping loop in Scala. The dispatch is a direct pattern match on Param.Type:

  • Param.Type.Jvm → delegates to a per-param CachedSourceTypeInfo (already present in the module for collection-factory caching). This memoises the expensive ValueMapper.getConverter lookup atomically.
  • Param.Type.ScalaEnum → direct linear scan over pre-resolved members via a small helper function. No external dispatch means no cache is needed.

Trade-offs.

  • Maintenance burden. ~150 lines of logic duplicated from pkl-config-java/PObjectToDataObject.ConverterImpl.convert. If upstream gains features (new annotation support, etc.), we'll need to port them. This is the biggest structural cost of the PR.
  • Alternative considered: upstream patch. Making create non-final or exposing a hook would let this fork collapse back into a small subclass override. Open to doing that if maintainers prefer — would shrink this PR considerably.
  • Scala 3. The ConstructorParamResolver uses the Scala 2 reflection API (scala.reflect.runtime.universe). A Scala 3 migration will need a parallel resolver. Deferred.

Supporting decisions (lower-stakes)

  • Thread-safe CachedSourceTypeInfo — cache entries are AtomicReference[Entry(classInfo, converter)] so the paired fields update coherently, without the tearing risk of separate vars.
  • Default converter set trimmed to match pkl-config-java / pkl-config-kotlin (Seq, Map, Option, Pair, case class, enum). Users register their own for Set / Vector / mutable collections.
  • Scala-native types preferred internally (TrieMap over ConcurrentHashMap, Scala Option + .toJava at Java boundaries, Seq[Param] at API surfaces with Vector at construction time). Java types kept only where the API demands them (java.lang.reflect.*, MethodHandle, Optional return).
  • Scala 3 syntax adopted via -Xsource:3 + scalafmt's runner.dialect = scala3. The module compiles on Scala 2.13 today; the syntax migration reduces churn whenever the rest of the codebase moves to Scala 3.

@andyglow andyglow force-pushed the add-pkl-config-scala branch from c101308 to bb90a58 Compare October 15, 2025 17:52
Copy link
Copy Markdown
Member

@bioball bioball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made a first pass. Thanks for the contribution, this looks like a great start!

rootProject.file("buildSrc/src/main/resources/license-header.star-block.txt"),
"package ",
)
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this to pklScalaLibrary.gradle.kts

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment thread build-logic/src/main/kotlin/pklScalaLibrary.gradle.kts
val libs = the<LibrariesForLibs>()

dependencies {
api(libs.scalaLibrary)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove (this is redundant when setting scala.scalaVersion)

Suggested change
api(libs.scalaLibrary)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed api(libs.scalaLibrary) — redundant now that the plugin wires it in.

Comment on lines +34 to +38
private var classInfo: PClassInfo[Any] =
PClassInfo.Unavailable.asInstanceOf[PClassInfo[Any]]

// Holds an optional converter, cached upon first retrieval.
private var converter: Option[Converter[Any, Any]] = None
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be thread-safe, because ValueMappers can possibly be shared across threads.

Maybe put inside AtomicReference?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Merged the two vars into a single AtomicReference[Entry(classInfo, converter)] so the paired state updates coherently — no risk of a reader seeing a new classInfo with a stale converter. Under contention, threads may each compute a fresh converter (wasted getConverter lookup), but never see a torn state.

override def create(
sourceType: PClassInfo[_],
targetType: Type
): Optional[Converter[_, _]] = {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two comments:

  1. This should return Optional.empty if sourceType is not PClassInfo.String
  2. The return type should be Optional[Converter[String, Enumeration]] right? Is there any case where the mapped value wouldn't be an Enumeration?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both points addressed, but the final shape diverged from the initial refactor. Enum params no longer go through the generic ConverterFactory.create path at all — they're dispatched directly via pattern match on Param.Type.ScalaEnum inside ScalaPObjectToCaseClass (see #12). So the question of whether to tighten the Optional[Converter[_, _]] signature is moot: the wildcard never participates in enum resolution anymore. The inner lookup is strongly typed: String => Enumeration#Value.

// .forParametrizedType1[java.util.Collection[_], Array[_]](
// x => x == PClassInfo.Collection | x == PClassInfo.Set | x == PClassInfo.List,
// pCollectionToMutableCollectionConv[Array](_.iterator.toArray[Any])
// )
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove unused code

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed — both the commented-out block and its entry in all.

Comment thread pkl-config-scala/NOTE.md Outdated
* Some(list of `Enumeration#Value`) if the enumeration can be resolved,
* None otherwise
*/
def asCustomEnum: Option[List[Enumeration#Value]] = {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not totally following the code here.

Why do we need to care about case classes here? Can we expect enums to simply look like this instead?

object SimpleEnum extends Enumeration {
  val Aaa, Bbb, Ccc = Value
}

And we can just use .values to access enum members during conversion.

Pkl only turns string literal unions into enums in every other language; this means that we don't need our enum values to be parameterized.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — and the annotation is deleted, not just optional. Both forms work now:

// plain — previously unsupported
object Color extends Enumeration { val Red, Green, Blue = Value }
case class Palette(primary: Color.Value)

// nested-Val form, still supported, no annotation needed
object Shape extends Enumeration {
  case class V() extends Val(nextId)
  val Circle, Square = V()
}
case class Drawing(shape: Shape.V)

Getting there required more than .getDeclaringClass. For the plain form the runtime class is scala.Enumeration$Value, shared across every Enumeration in the JVM — Java reflection can't distinguish owners. I explored three workarounds before settling on the current approach:

  1. Pre-substitute resolved Value instances into the PObject's property map and delegate. Dead — PClassInfo.forValue throws on non-Pkl values.
  2. Swap enum-param Types for a sentinel ParameterizedType carrying the Enumeration reference. Dead — ValueMapperImpl.getConverter calls Reflection.getExactSubtype(...) before factory dispatch, which normalises custom Types back to their raw class.
  3. Upstream patch (make create non-final / add a per-param hook). Out of scope for this PR.

What we landed on:

  • ConstructorParamResolver uses scala.reflect.runtime.universe only to walk ctor params and recover the path-dependent Enumeration#Value prefix. It emits pure-data Params with a sealed Param.Type { Jvm | ScalaEnum }.
  • ScalaPObjectToCaseClass forks PObjectToDataObject.ConverterImpl and dispatches on Param.Type directly. For ScalaEnum it bypasses ValueMapper.getConverter (which can't see past the erasure) and does a direct member lookup.

The fork is the main cost — roughly 150 lines duplicated from PObjectToDataObject.ConverterImpl.convert. Happy to collapse it if you'd accept an upstream patch exposing a per-param hook; that would reduce this module to a slim subclass. Architectural details are in the updated PR description.

@andyglow andyglow force-pushed the add-pkl-config-scala branch 3 times, most recently from 5a85162 to 023c5be Compare May 2, 2026 20:38
@andyglow andyglow force-pushed the add-pkl-config-scala branch from 023c5be to c8a7a01 Compare May 2, 2026 21:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants