feat: setup Baseline profiles#20702
Conversation
57a4fee to
bc00bd3
Compare
|
As I mentioned in the issue on my device (Pixel 10) this is around 5% but if there would be a low spec device then it should have a better result so we can expect around 5-10% improvement imo |
|
Please add more comments to the docs in the commit messages |
bc00bd3 to
bc0149c
Compare
5fcd050 to
c915ac5
Compare
david-allison
left a comment
There was a problem hiding this comment.
LGTM! Confirmed DeckPicker:onResume is called.
Maybe add a TODO to add a test for this?
SPLcom/ichi2/anki/DeckPicker;->onResume()V
c915ac5 to
2e2ce73
Compare
|
This is likely going to have major conflicts with:
Please feel free to assign this to me, there was a lot of nastiness in both PRs |
|
I tried to rebase but it seems it would need more resolving, thankyou @david-allison |
2e2ce73 to
303b044
Compare
david-allison
left a comment
There was a problem hiding this comment.
Please add a readme to the module
303b044 to
b9c87cf
Compare
|
|
||
| - Physical Android device, API 28+ (emulators aren't trustworthy) | ||
| - USB debugging on, device plugged in | ||
| - `adb uninstall com.ichi2.anki` first if you hit a signing-key error |
There was a problem hiding this comment.
Do we /need/ to use com.ichi2.anki, or could we get away with .debug so the maintainer doesn't need to uninstall the app
There was a problem hiding this comment.
Release only. Baseline profiles have no effect on debug builds.
There was a problem hiding this comment.
Debug APKs aren't minified, aren't processed by R8, and the runtime doesn't apply AOT compilation to debuggable processes. The profile would just sit in the APK as dead weight.
There was a problem hiding this comment.
This isn't about a debug app (.A is just as relevant, sorry for the confusion).
Using com.ichi2.anki is the production app identifier, and would mean that I'm having to uninstall the app which runs on my live collection on my main phone for a development task.
There was a problem hiding this comment.
This should be changed if possible to have the interpolated app name - Brayan set up infra to make that possible in the past, I'll try to drop a comment on the specific affected line in the gradle files
There was a problem hiding this comment.
Ok, tthe hardcoded fallback in targetAppId is gone (it now errors instead of silently defaulting), and the README documents the parallel-install workflow
| ## When to regenerate | ||
|
|
||
| When the startup path changes meaningfully — new init work in | ||
| `AnkiDroidApp.onCreate`, a refactor to `IntentHandler` or `DeckPicker`, | ||
| a new dependency wired into `Application`. Otherwise, leave it alone; | ||
| the profile is device- and OS-agnostic, one commit covers everyone. |
There was a problem hiding this comment.
Is it worth adding an annotation for this case, to flag to developers that changes may be necessary
Is it worth enforcing that this is run before a public release?
There was a problem hiding this comment.
Not worth it IMO since developer will be concerned with their change, so a change to DeckPicker rarely touches the cold-start path meaningfully.
There was a problem hiding this comment.
👍
Is it worth enforcing that this is run before a public release?
I suspect this is a more interesting question
There was a problem hiding this comment.
very interesting -
- the onCreate "you should regenerate baselines" thing is interesting, note that we just touched onCreate in order to set up widget reminders on boot, and I'd be shocked if in PRs similar to that one the person remembered. Bears some thought 🤔
- when to run / run before release builds - would be interesting to perhaps check for a change in the version code of format "switching from alpha to beta" or "beta to public release" - if that digit in the versionCode changed, the android test release build run could re-run baselines and...do I don't know what with the information - similar to APK size check I suppose it could post a comment on the commit line with the information and a link to git blame where you could click and see the last commit with the last baseline commit message from it with baseline timing ? Just a brainstorm
Anyway, will research this a bit, and as a dev / CI feature will likely at least get it merged in some form even if imperfect now so we can start "feeling" it and see how it works, what we really want out of it
There was a problem hiding this comment.
Having read the documentation and this PR and these comments I think the ideal path forward is to connect to a test lab that is CI accessible but has real hardware devices and a free tier or is free for open source.
I have not looked extensively but Firebase Test Lab has some free runs per day on their free tier and I could create an AnkiDroid project there to take advantage of that, in the absence of some better idea. If it goes away, no big deal.
But assuming it stays we could select our idea of an "old phone" and a "new phone* (over time new phone becomes old phone and you pick new phone but you'll maintain comparables between each pair for a while to see if anything goes seriously wrong), and start collecting metrics over time (TBD: somehow? that was CI available in order to graph over time?) and we could run it on a schedule or on push to main but only if versionCode changed to capture change-over-time-for-releases ?
Not sure how to hook Firebase Test Lab up to CI but they sell that service so docs should be commercial-grade
Not sure how to persist state (performance over time each versionCode change) in a CI-accessible way but there must be some way ? Maybe even a wiki page with entries and a script-generated graph of performance over time, and the CI script edits it or something
mikehardy
left a comment
There was a problem hiding this comment.
I think the app name thing is a worthy change in this same PR, I think it's table stakes for this to work
I think the thoughts about generating it over time in CI with a test lab is a splittable item for later
| * | ||
| * **This benchmark is for local, manual use only.** It requires a connected | ||
| * physical device, has no pass/fail assertions, and should not be included | ||
| * in CI test suites. Run with: |
There was a problem hiding this comment.
| configure<TestExtension> { | ||
| namespace = "com.ichi2.anki.baselineprofile" | ||
|
|
||
| compileSdk = |
There was a problem hiding this comment.
cross-linking with related PR for boilerplate-ness:
There was a problem hiding this comment.
I'll rebase and see what boilerplate in baselineprofile/build.gradle.kts can be consolidated. Not blocking this PR. Or David can if this gets in first, all good for now
| // `play` flavor via `missingDimensionStrategy`, which keeps this module | ||
| // flavor-free but still able to resolve `:AnkiDroid` variants. |
There was a problem hiding this comment.
what a coincidence, the documentation for macrobenchmark was also important, but flavorless
| - **`StartupBenchmark`** measures cold-start time with vs without the | ||
| profile so you can see the impact. | ||
|
|
||
| Neither runs in CI. Both are manual tools for maintainers. |
There was a problem hiding this comment.
related to my firebase test labs comment I think this is an assertion based on actual information and exposing the actual information is useful
specifically I would alter this perhaps like
| Neither runs in CI. Both are manual tools for maintainers. | |
| For results to be valid for comparisons and real-world impact, they must be run on real hardware devices - ideally the same device over time and representative of typical user hardware. So this is currently not intended to be used in CI unless CI is connected to real devices (e.g. Firebase Test Lab or similar) |
|
|
||
| - Physical Android device, API 28+ (emulators aren't trustworthy) | ||
| - USB debugging on, device plugged in | ||
| - `adb uninstall com.ichi2.anki` first if you hit a signing-key error |
There was a problem hiding this comment.
This should be changed if possible to have the interpolated app name - Brayan set up infra to make that possible in the past, I'll try to drop a comment on the specific affected line in the gradle files
| ## When to regenerate | ||
|
|
||
| When the startup path changes meaningfully — new init work in | ||
| `AnkiDroidApp.onCreate`, a refactor to `IntentHandler` or `DeckPicker`, | ||
| a new dependency wired into `Application`. Otherwise, leave it alone; | ||
| the profile is device- and OS-agnostic, one commit covers everyone. |
There was a problem hiding this comment.
Having read the documentation and this PR and these comments I think the ideal path forward is to connect to a test lab that is CI accessible but has real hardware devices and a free tier or is free for open source.
I have not looked extensively but Firebase Test Lab has some free runs per day on their free tier and I could create an AnkiDroid project there to take advantage of that, in the absence of some better idea. If it goes away, no big deal.
But assuming it stays we could select our idea of an "old phone" and a "new phone* (over time new phone becomes old phone and you pick new phone but you'll maintain comparables between each pair for a while to see if anything goes seriously wrong), and start collecting metrics over time (TBD: somehow? that was CI available in order to graph over time?) and we could run it on a schedule or on push to main but only if versionCode changed to capture change-over-time-for-releases ?
Not sure how to hook Firebase Test Lab up to CI but they sell that service so docs should be commercial-grade
Not sure how to persist state (performance over time each versionCode change) in a CI-accessible way but there must be some way ? Maybe even a wiki page with entries and a script-generated graph of performance over time, and the CI script edits it or something
| val artifactsLoader = v.artifacts.getBuiltArtifactsLoader() | ||
| v.instrumentationRunnerArguments.put( | ||
| "targetAppId", | ||
| v.testedApks.map { artifactsLoader.load(it)?.applicationId ?: "com.ichi2.anki" }, |
There was a problem hiding this comment.
this...may work but should really be tested with -PcustomSuffix=".A" -PcustomName="AnkiDroid ParallelA" (or similar) to make sure, and if not it should use the style here to make sure it does work
Otherwise developers will never want to do this as it'll nuke their actual app install
|
For the avoidance of doubt I want to say I love this though - I read about baseline profiles when they came out, and the macrobenchmark (and microbenchmark...) libraries when they came out and thought "man, if I had time I'd love to do this" you did it! |
|
I'll push back with a question: should we split out the CI runner into a separate issue/PR? It feels like scope creep As far as I understand it (once we get the version name/docs sorted out), this is a win on its own, a somewhat stale profile is better than no profile, and setting up CI to run this properly on Firebase might take a while to get right. |
I agree and indeed said so explicitly:
...trying to do that here would make this PR kind of ridiculous IMHO |
- BaselineProfileGenerator: Automates the critical user journey from home screen to DeckPicker to capture startup traces. - StartupBenchmark: Uses Macrobenchmark to compare "None" vs. "Baseline Profile" compilation modes. This lets us isolate and measure the speed boost we get from install-time AOT.
Assisted-by: Claude Opus 4.6
b9c87cf to
97e925d
Compare
|
Verified parallel build it works fine |

Note
Assisted-by: Claude Opus 4.6 - writing readme
Purpose / Description
This PR setsups Baseline Profiles for the app i.e. macrobenchmark producer module that generates AnkiDroid's baseline profile and measures cold-start time.
Fixes
Approach
See commits
How Has This Been Tested?
Learning (optional, can help others)
Note: I think these are the casis where we want to refenerate the profiles
baseline~none(profile has drifted).Checklist
Please, go through these checks before submitting the PR.