Closed
248 commits
dc9afbb
chore: add expect_stat, expect_single_stat in GetStat trait (#3126)
broccoliSpicy Nov 26, 2024
39222ec
feat: support write multi fragments or empty fragment in one spark ta…
SaintBacchus Dec 2, 2024
dc8f0f6
fix: full text search may produce dup results when search over multip…
BubbleCal Dec 3, 2024
3d3ebf2
fix: fix typing for _write_fragment (#3171)
chenkovsky Dec 3, 2024
c5a1382
ci(java): introduce spotless-maven-plugin (#3193)
yanghua Dec 3, 2024
574b7d0
fix: fix storage options for dataset builder (#3156)
chenkovsky Dec 4, 2024
e0bf62a
chore: add .java-version to .gitignore for java module (#3197)
yanghua Dec 4, 2024
0c2b70a
feat: add drop to dataset (#3184)
chenkovsky Dec 4, 2024
6edb1b8
fix: fix storage options for ray (#3164)
chenkovsky Dec 4, 2024
75d526e
chore: fix warnings on rust 1.83 (#3202)
eddyxu Dec 4, 2024
955749e
feat: upgrade arrow (to 53) & datafusion (to 42) (#3201)
westonpace Dec 4, 2024
6e84834
Bump version
Dec 4, 2024
6c7b9fd
perf: in-register lookup table & SIMD for 4bit PQ (#3178)
BubbleCal Dec 5, 2024
f21397d
feat: enhance repdef utilities to handle empty / null lists (#3200)
westonpace Dec 5, 2024
1e349cd
fix!: correctly handle nulls in btree and bitmap indices (#3211)
westonpace Dec 6, 2024
970e7d5
docs: add the documentation about how to install packages for tests (…
yanghua Dec 6, 2024
276a284
feat: let pylance use sub-level logger of logging (#3206)
yanghua Dec 6, 2024
e4ab9a8
feat: support _rowid meta column for spark connector in java (#3194)
SaintBacchus Dec 6, 2024
0e35ef6
ci: update python/Cargo.log on version bump (#3207)
westonpace Dec 6, 2024
84c6fc0
chore: remove cuvs and pylibraft (#3214)
eddyxu Dec 7, 2024
4444c60
feat!: support hamming distance & binary vector (#3198)
BubbleCal Dec 7, 2024
f1c6c3e
feat: support blob api in pytorch loader (#3217)
eddyxu Dec 8, 2024
df640c4
chore: configure the spotless maven plugin to format Scala code (#3219)
yanghua Dec 9, 2024
5ff966d
feat(python): add experimental parameter `enable_move_stable_row_ids`…
SaintBacchus Dec 9, 2024
10c31b3
feat: add the repetition index to the miniblock write path (#3208)
westonpace Dec 10, 2024
ef9d0c2
docs: add doc and test for 4bit PQ (#3212)
BubbleCal Dec 10, 2024
c4cb87a
feat: packed struct encoding (#3186)
broccoliSpicy Dec 10, 2024
faf776d
fix: test failure in `test_fsl_packed_struct` (#3227)
broccoliSpicy Dec 10, 2024
7ec23f0
feat: support between sql clauses (#3225)
connellPortrait Dec 10, 2024
1c8d406
feat(java): support drop columns for dataset (#3237)
yanghua Dec 12, 2024
d5afc0a
feat(java): expose uri method for Dataset instance (#3231)
yanghua Dec 12, 2024
00d1e84
fix: remove overzealous warning (#3239)
westonpace Dec 12, 2024
d3a4bc1
chore: remove unused import (#3242)
westonpace Dec 13, 2024
99ae761
fix: correctly copy null buffer when making deep copy (#3238)
westonpace Dec 13, 2024
679b93c
feat: add file statistics (#3232)
broccoliSpicy Dec 13, 2024
c310aee
feat: enable tracing for object storage (#3244)
wjones127 Dec 13, 2024
6203435
chore: remove legacy C plugin integration (#3243)
westonpace Dec 14, 2024
83b8efd
docs: blob api documents (#3247)
eddyxu Dec 14, 2024
1a12c21
feat(java): support limit and offset interface for spark connector (#…
SaintBacchus Dec 16, 2024
7fe14ea
fix: allow LANCE_LOG to be set to trace (#3246)
westonpace Dec 16, 2024
64fcfcc
feat: adds list decode support for mini-block encoded data (#3241)
westonpace Dec 17, 2024
f2906cf
fix: list indices always shows vector index type is IVF_PQ even it's …
BubbleCal Dec 17, 2024
8a16e2e
feat(java): support topn pushdown in spark connector (#3261)
SaintBacchus Dec 17, 2024
b1ab748
feat: add replace_schema_metadata and replace_field_metadata (#3263)
westonpace Dec 17, 2024
d038e34
feat: merge-insert supports inserting subset of columns (#3100)
wjones127 Dec 18, 2024
ae36abe
fix: panic when get stats from index over binary vectors (#3267)
BubbleCal Dec 18, 2024
95f98b3
feat: support merge by row_id, row_addr (#3254)
chenkovsky Dec 18, 2024
a07717a
fix(rust): adjust scan range to avoid unnecessary warnings (#3248)
takaebato Dec 18, 2024
6cd6ae8
feat: add the s3 retry config options for storage option (#3268)
SaintBacchus Dec 18, 2024
70f246e
ci(java/scala): make spotless maven plugin auto-format in validate ph…
yanghua Dec 19, 2024
5cbb59d
docs: add java module into directory structure (#3273)
yanghua Dec 19, 2024
2b29487
feat(java): support alter columns for dataset (#3259)
yanghua Dec 20, 2024
72ae355
feat: support remapping for IVF_FLAT, IVF_PQ and IVF_SQ (#2708)
BubbleCal Dec 20, 2024
10e6454
refactor(python)!: simplify marshalling of `Fragment`, `DataFile`, `O…
wjones127 Dec 20, 2024
022135b
feat: change MSRV from 1.78 to 1.80.1 (#3279)
westonpace Dec 20, 2024
805438f
fix: when taking struct fields they should be merged into the output …
westonpace Dec 20, 2024
efdea24
fix: full text search with limit may return an incorrect results (#3284)
BubbleCal Dec 23, 2024
c40164b
fix: refine type annotation (#3278)
chenkovsky Dec 23, 2024
ae70478
feat: support merge fragment with dataset (#3256)
chenkovsky Dec 23, 2024
d06488e
Bump version
Dec 23, 2024
877b018
ci(python): type checking with pyright (#3286)
wjones127 Dec 24, 2024
c6fcb31
chore(java): remove some supported TODO and add allow_http for storag…
SaintBacchus Dec 24, 2024
bcb040e
fix: fix pyproject.toml (#3299)
chenkovsky Dec 26, 2024
3a47444
ci: allow bencher benchmarks to be executed with workflow_dispatch (#…
westonpace Dec 27, 2024
38a0a92
feat: cache btree sub-index pages (#3309)
westonpace Dec 27, 2024
11f6e26
feat(java): support spark in predict push down to lance scan (#3314)
SaintBacchus Dec 30, 2024
7363a53
fix: is not false crash (#3298)
chenkovsky Dec 31, 2024
2092808
fix: default value is overwritten (#3319)
chenkovsky Dec 31, 2024
898396d
feat(py): support count rows with filter in a fragment (#3318)
eddyxu Dec 31, 2024
8767c10
feat(java): support take api for java module (#3316)
yanghua Dec 31, 2024
783bc12
fix: lance ray sink crash when fields contain none (#3322)
Jay-ju Jan 1, 2025
33c45c8
feat(java): support overwrite for spark connector (#3313)
SaintBacchus Jan 1, 2025
6e7010a
ci(python): add typecheck for lance/debug.py,tracing.py,dependencies.…
yanghua Jan 2, 2025
c0c1b53
feat: add global counters for bytes_read & iops for benchmarking util…
westonpace Jan 2, 2025
397dc27
fix: allow empty scalar indices and don't drop nulls on update (#3329)
westonpace Jan 3, 2025
8585207
perf: parallelize indexing partitions (#3303)
BubbleCal Jan 3, 2025
39f12dc
feat: vector search with distance range (#3326)
BubbleCal Jan 3, 2025
aad48df
feat: add utility for reporting data stats (#3328)
westonpace Jan 3, 2025
8fe7147
feat: cache miniblock metadata (#3323)
westonpace Jan 3, 2025
8a23d50
feat(java): support statistics row num for lance scan (#3304)
SaintBacchus Jan 3, 2025
f730f75
fix: coerce scalar for between (#3327)
chenkovsky Jan 3, 2025
45fde4c
ci(java/scala): auto check and insert unified license header (#3296)
yanghua Jan 3, 2025
dbf9139
feat: support with_rowaddr for spark (#3336)
chenkovsky Jan 4, 2025
1d40479
feat(java): support get real data size for lance spark statistics int…
SaintBacchus Jan 5, 2025
f621115
feat(java): support add columns via sql expressions (#3287)
yanghua Jan 7, 2025
a6aadaf
feat: move fsl handling to structural encodings and add support for m…
westonpace Jan 7, 2025
3823d75
fix(java): replace org.json with gson to resolve the jar conflict wit…
SaintBacchus Jan 7, 2025
ed8e76f
fix: avoid double-take in some scenarios (#3357)
westonpace Jan 7, 2025
5a18b14
feat: support lindera for japanese and korea tokenization (#3218)
chenkovsky Jan 7, 2025
fc74654
feat: add support for repetition index to the full zip structural enc…
westonpace Jan 7, 2025
c9bb25d
feat: support IVF_FLAT and hamming in pylance (#3301)
BubbleCal Jan 8, 2025
94e7bf9
feat!: support multivector type (#3190)
BubbleCal Jan 8, 2025
397edeb
feat: allow blob in `write_fragments` (#3235)
fecet Jan 8, 2025
64adfea
fix: handle deletions in take (#3360)
wjones127 Jan 9, 2025
837ac24
refactor(java): simpilfy fragment (#3307)
chenkovsky Jan 9, 2025
8db5943
fix: fix ray lance sink error (#3230)
Jay-ju Jan 10, 2025
226d86f
ci(java/scala): introduce auto code style check and fix exists issues…
yanghua Jan 10, 2025
f478c46
feat: make it possible to build lance without protoc (except on Windo…
westonpace Jan 10, 2025
29db3bb
fix: scan out of range (#3339)
chenkovsky Jan 10, 2025
69d4610
feat: log the number of rows we were able to sample (#3367)
westonpace Jan 10, 2025
2142594
fix: cast null arrays to the appropriate type when coercing to a tabl…
andrijazz Jan 10, 2025
167494c
ci: add cargo-deny (#3370)
kemingy Jan 13, 2025
47b0b6c
chore: better error message for unsupported data type (#3371)
BubbleCal Jan 13, 2025
cfeece4
fix(python): correct type hint for `write_fragments()` (#3373)
chenkovsky Jan 13, 2025
cf49205
feat: upgrade datafusion to 44.0 (#3341)
westonpace Jan 13, 2025
6e76529
feat: `execute_uncommitted` for merge insert (#3233)
wjones127 Jan 13, 2025
c58cfed
Bump version
Jan 13, 2025
26eb471
ci: additional disk space for rust release (#3375)
wjones127 Jan 13, 2025
4d77d7b
fix: json schema serializes field metadata (#3379)
Jan 14, 2025
5258717
chore(java): enlarge release timeout (#3380)
LuQQiu Jan 14, 2025
b572905
feat: enable all datafusion functions (#3381)
westonpace Jan 14, 2025
bd04392
fix: flat FTS would return all unindexed rows (#3386)
BubbleCal Jan 15, 2025
7cc43a7
feat: support float16/float64 for multivector (#3387)
BubbleCal Jan 15, 2025
9f7e012
fix: updating schema/field metadata now retains fragments (#3384)
Jan 15, 2025
8b8b8c8
feat: add drop_index (#3382)
westonpace Jan 15, 2025
4149457
chore: expose utils for infering vector dim and element type (#3385)
BubbleCal Jan 16, 2025
62a2256
fix: full text search index may be corrupted after remapping (#3388)
BubbleCal Jan 16, 2025
bae235d
feat: add an all null column as a metadata-only operation (#3391)
Jan 17, 2025
4cb37c1
fix: handle the possibility that serialize_expressions returns a memo…
westonpace Jan 20, 2025
2b784b3
fix!: delta index fragment bitmaps contained previous index coverage …
wjones127 Jan 20, 2025
7f60aa0
perf: avoid re-alloc on assigning PQ (#3399)
BubbleCal Jan 21, 2025
3cb54c6
fix: merge_insert with subcols sometimes outputs unexpected nulls (#3…
wjones127 Jan 22, 2025
3c82243
ci: fix Python Arm build (#3409)
wjones127 Jan 23, 2025
aae351b
perf: skip shuffling if there is only 1 partition (#3405)
BubbleCal Jan 23, 2025
3f26e60
fix: ensure that 'block_size' parameter is properly propagated in the…
vjc578db Jan 23, 2025
82464b3
ci: use ARM runner for Python ARM release builds (#3411)
wjones127 Jan 23, 2025
a3434ca
fix(rust): loosen bytemuck pin (#3413)
wjones127 Jan 23, 2025
43cd830
chore: fix clippy lints (#3414)
wjones127 Jan 23, 2025
6432a6b
fix: don't compare metadata in merge insert to detect if partial sche…
westonpace Jan 23, 2025
5a92d31
feat: finish up variable-length encodings in the full-zip path (#3344)
westonpace Jan 25, 2025
58c5e27
fix: support fp16 type in SQ (#3417)
chebbyChefNEQ Jan 26, 2025
6d77d14
fix: move IO tasks off of CPU runtime in merge_insert (#3420)
wjones127 Jan 28, 2025
bfacd7c
fix: filter out null values when sampling for index training (#3404)
wjones127 Jan 28, 2025
66b99fb
feat: add testing of string/binary to 2.1 full-zip encoding and fix b…
westonpace Jan 28, 2025
7c34f14
ci(java): timeout test jobs after an hour (#3421)
wjones127 Jan 28, 2025
7aa7d94
fix: handle null vectors in flat search (#3422)
wjones127 Jan 28, 2025
a7c5216
chore: downgrade backpressure warning to a debug log message (#3392)
westonpace Jan 28, 2025
c58814a
fix: avoid divide-by-zero when training an index with a large dimensi…
westonpace Jan 31, 2025
d34fa95
chore: add binary array generator that generates different sized bina…
westonpace Jan 31, 2025
c73d717
feat: auto-migrate old index metadata (#3428)
wjones127 Feb 1, 2025
1ea5909
fix: bump openssl for CVE (#3431)
chebbyChefNEQ Feb 3, 2025
42722fb
feat: allow replacement of entire datafile when the schema lines up c…
chebbyChefNEQ Feb 3, 2025
2295324
chore: clean up reader coerce in fragment.py (#3432)
chebbyChefNEQ Feb 4, 2025
d62ddb0
Bump version
Feb 6, 2025
fbea65b
fix: remove extraneous padding in plain encoder (#3434)
wkalt Feb 7, 2025
8a61b69
test: assert the indexed/unindexed rows for optimizing tests (#3436)
BubbleCal Feb 11, 2025
2e2bf1a
fix: implement with_new_children for FTS (#3441)
BubbleCal Feb 11, 2025
c70d1d2
fix: don't eagerly materialize fields that the user hasn't asked for …
westonpace Feb 11, 2025
c054697
perf: make miniblock decoding cheaper (#3438)
westonpace Feb 12, 2025
7687558
feat(rust): upgrade object_store, dashmap, snafu (#3429)
wjones127 Feb 12, 2025
a6101e5
fix: allocate much memory for residual vectors than needed (#3446)
BubbleCal Feb 13, 2025
6b58bc1
fix: flat KNN column stats order doesn't match schema (#3451)
BubbleCal Feb 15, 2025
4a0fb90
feat: expose specifying scanner filters via datafusion (#3458)
westonpace Feb 18, 2025
9ea6b7e
feat(python): add files lance/schema.py, lance/file.py, lance/util.py…
renato2099 Feb 18, 2025
cca98fc
feat(java): support add columns via reader (#3456)
yanghua Feb 19, 2025
59b414b
feat: support to read IVF partitions (#3462)
BubbleCal Feb 21, 2025
f508cbd
Bump version
Feb 21, 2025
c69a5a2
fix: flat FTS panic with prefilter (#3470)
BubbleCal Feb 24, 2025
db5281c
fix: temporarily disable spilling when training indices on string col…
westonpace Feb 24, 2025
b185a27
feat: add with_new_children implementations for several nodes (#3471)
westonpace Feb 24, 2025
59d6596
feat: add support for ngram indices (#3468)
westonpace Feb 25, 2025
8f8b630
Bump version
Feb 26, 2025
f69480e
fix: scalar quantization can't work with NaNs (#3476)
BubbleCal Feb 26, 2025
f73398a
docs: fix typo in read_and_write.rst (#3479)
ascillitoe Feb 26, 2025
7f91eb0
docs: add README.md for java module (#3302)
yanghua Feb 28, 2025
6756a12
chore(api)!: remove unused param in take call (#3453)
lyang24 Feb 28, 2025
9e614b1
fix: ngram bench target not correct (#3490)
BubbleCal Feb 28, 2025
949c6e7
feat: add support for explain analyze (#3484)
wkalt Feb 28, 2025
356d132
chore: update clippy suggestions (#3495)
westonpace Mar 1, 2025
9211330
feat(java): support delete rows from the dataset (#3498)
yanghua Mar 1, 2025
33ae43b
feat: add support for empty structs to the 2.0 format (#3499)
westonpace Mar 3, 2025
89a33b7
feat: cache v3 index partitions in dataset session (#3467)
BubbleCal Mar 3, 2025
6194619
feat: add support for pickling fragment metadata (#3497)
westonpace Mar 3, 2025
a144028
perf: parallelize ngram indexing (#3501)
BubbleCal Mar 3, 2025
dca745b
feat: support add all null column as metadata-only operation via sql …
Mar 3, 2025
eb16635
fix: bypass the arrow take for struct array (#3500)
BubbleCal Mar 4, 2025
87f055f
perf: implement XTR for retrieving multivector (#3437)
BubbleCal Mar 4, 2025
5d1c84f
fix: prevent despecialization of object store methods (#3506)
wjones127 Mar 4, 2025
9888678
fix: the IVF/PQ centroids/codebook is with wrong data type if trainin…
BubbleCal Mar 5, 2025
74f0aa6
docs: include create scalar index and drop index to the top level of …
eddyxu Mar 5, 2025
9a1fdaf
feat: `ConditionalPutCommitHandler` for concurrency on S3, faster com…
wjones127 Mar 5, 2025
644213b
feat: add gcp token-based auth support (#3511)
alex766 Mar 6, 2025
3b9d546
feat!: update DataFusion to 45.0 and Arrow to 54.1 (#3503)
timsaucer Mar 7, 2025
3e3bdb9
feat: emit a trace event when a significant user file is created or d…
westonpace Mar 8, 2025
e603372
fix: pass down correct types when creating indices and items schedule…
westonpace Mar 8, 2025
b8a74ce
Bump version
Mar 8, 2025
bcf9e09
fix: the distance for multivector query is not correct (#3522)
BubbleCal Mar 10, 2025
49b67f9
Bump version
Mar 10, 2025
e12bb9e
chore: add RUSTSEC-2024-0436 to ignore list for cargo deny (#3526)
westonpace Mar 10, 2025
9175ff7
feat: write_dataset from pylist and pydict (#3527)
eddyxu Mar 10, 2025
0487ff5
docs: fix read_and_write example (#3521)
eddyxu Mar 11, 2025
f85787d
feat!: create index in v3 version by default (#3477)
BubbleCal Mar 11, 2025
4358df1
docs: organize contents into sections (#3528)
eddyxu Mar 11, 2025
c12fc3b
feat: rework how we train ngram indices for better performance (#3518)
westonpace Mar 11, 2025
eddb670
docs: update ray integration and move schema evolution doc to a separ…
eddyxu Mar 11, 2025
422c38d
docs: fix checklinks (#3532)
eddyxu Mar 11, 2025
b66f34e
docs: add example of `Dataset.insert` (#3534)
eddyxu Mar 11, 2025
53bd796
chore: suppress humantime advisory (#3529)
westonpace Mar 12, 2025
8643409
docs: update README to include new table format and format v2 blogs (…
eddyxu Mar 12, 2025
80cb78c
feat: expose make_deletions_null to python as include_deleted_rows (#…
westonpace Mar 12, 2025
9203377
perf: coalesce continuous indices into ranges if possible (#3513)
niyue Mar 12, 2025
15420d5
perf: improve v3 indexing perf (#3525)
BubbleCal Mar 13, 2025
b026158
feat: add project transaction operation for pylance sdk (#3538)
SaintBacchus Mar 13, 2025
20bda34
feat: set object store retry via environment variables (#3536)
LuQQiu Mar 13, 2025
b1e737b
docs: enable merge insert doctest (#3542)
eddyxu Mar 14, 2025
674cbf8
fix(java): java version is out of sync with rust and python (#3546)
yanghua Mar 15, 2025
f6edd3a
docs: raw distributed write (#3548)
eddyxu Mar 17, 2025
f49982c
ci: fix cross compilation of fp16 kernels (#3559)
wjones127 Mar 17, 2025
be0df4b
chore: emit warning if unstable version is used (#3558)
westonpace Mar 18, 2025
ab169e3
feat: don't log span info (#3547)
westonpace Mar 18, 2025
9719a7c
refactor: rework how take handles parallelism (#3543)
westonpace Mar 18, 2025
f7457be
docs: how to use tags (#3562)
eddyxu Mar 18, 2025
f5f8c14
fix: indexing time in unit tests is much slower than before (#3561)
BubbleCal Mar 19, 2025
ff2ab10
feat: support retrain index and incremental kmeans (#3489)
BubbleCal Mar 20, 2025
9cc68f8
fix: the PQ codes corrupted after remapping (#3573)
BubbleCal Mar 20, 2025
ddcf1f2
fix: remove some expensive debug impls (#3576)
westonpace Mar 20, 2025
2f9fcad
feat: add tracing events for I/O, index loading, and plan execution (…
westonpace Mar 20, 2025
f02095d
fix: reintroduce TakeExec.dataset method (#3577)
wkalt Mar 20, 2025
18f20c3
feat: make it possible to get the field ids from a lance_schema (#3568)
westonpace Mar 21, 2025
d74bdb2
fix(android): compilation error on android (#3555)
TD-Sky Mar 21, 2025
5cc092e
refactor(rust): fix build_predicate misleading row_ids replace to row…
yanghua Mar 21, 2025
8d163e4
chore: collect all related jars for lance spark connector when buildi…
SaintBacchus Mar 21, 2025
e8f4d98
feat(python): add warning about fork (#3584)
wjones127 Mar 22, 2025
ddb3b86
perf: improve 4bit PQ performance (#3557)
BubbleCal Mar 22, 2025
a49913f
ci: support python310 tomli (#3590)
Jay-ju Mar 24, 2025
db72d25
feat: add tracing to cleanup (#3585)
wjones127 Mar 24, 2025
32c99af
fix: work around deranged breaking change not labeled as such (#3591)
westonpace Mar 24, 2025
9dbb06a
feat: add JNI bindings for the file reader/writer (#3588)
westonpace Mar 24, 2025
babb5ab
Bump version
Mar 24, 2025
eb4680e
fix: divide by 0 error if remapping PQ storage to empty (#3596)
BubbleCal Mar 25, 2025
852b155
perf(java): cache the fragments to avoid parse the fragment json for …
SaintBacchus Mar 25, 2025
c44b74f
feat(python): support adding null columns with pyarrow field or schem…
eddyxu Mar 26, 2025
20cde3b
feat: pull gcp token from env variables (#3583)
alex766 Mar 26, 2025
33634d3
fix: schema isn't expected for IVF_PQ (#3606)
BubbleCal Mar 26, 2025
2c4fe13
fix: propagate parent span to spawned ObjectWriter tasks (#3609)
Mar 26, 2025
5891b1a
chore: remove SNAPSHOT from version (#3600)
westonpace Mar 26, 2025
d75c45c
Bump version
Mar 26, 2025
cd44f55
fix: set maximan 8 target partitions for merge insert update fragment…
LuQQiu Mar 26, 2025
47026e0
docs: add example of adding new columns with only pyarrow Field or Sc…
eddyxu Mar 26, 2025
40142fb
fix: avoid creating empty encoding task and part for PrimitiveFieldEn…
niyue Mar 27, 2025
9c9c0ad
feat: add support for fixed size binary to btree (#3613)
westonpace Mar 27, 2025
7a49e5d
docs: add spark r/w lance demo (#3574)
yanghua Mar 28, 2025
82f6560
fix: fix python format (#3608)
Jay-ju Mar 28, 2025
245a745
perf: migrate to `ManifestLocation`, add e_tag (#3592)
wjones127 Mar 28, 2025
1b6ed1a
feat: upgrade to datafusion 46 (#3618)
wjones127 Mar 29, 2025
1aa9d5a
feat: support fuzzy query and boost query (#3610)
BubbleCal Mar 31, 2025
f936f84
Update chrono and arrow
emilk Apr 1, 2025
42ea0c9
rebase
zehiko Apr 29, 2025
9 changes: 2 additions & 7 deletions .github/workflows/build_linux_wheel/action.yml
@@ -69,12 +69,7 @@ runs:
  args: ${{ inputs.args }}
  before-script-linux: |
  set -e
- apt install -y unzip
- if [ $(uname -m) = "x86_64" ]; then
- PROTOC_ARCH="x86_64"
- else
- PROTOC_ARCH="aarch_64"
- fi
- curl -L https://github.com/protocolbuffers/protobuf/releases/download/v24.4/protoc-24.4-linux-$PROTOC_ARCH.zip > /tmp/protoc.zip \
+ yum install -y openssl-devel clang \
+ && curl -L https://github.com/protocolbuffers/protobuf/releases/download/v24.4/protoc-24.4-linux-aarch_64.zip > /tmp/protoc.zip \
  && unzip /tmp/protoc.zip -d /usr/local \
  && rm /tmp/protoc.zip
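
The hunk above contains both an architecture check on `uname -m` and a download URL with `aarch_64` hardcoded. As a rough illustration of the architecture-aware variant, the sketch below maps a machine name to the matching protoc archive URL. The helper names `protoc_arch` and `protoc_url` are hypothetical and not part of the workflow. Unlike the `if`/`else` in the diff, which treats every non-x86_64 machine as `aarch_64`, this sketch rejects unknown architectures.

```shell
# Hypothetical helper (not in the workflow): map a machine name, as printed
# by `uname -m`, to the architecture token used in protoc release assets.
protoc_arch() {
  case "$1" in
    x86_64) echo "x86_64" ;;
    aarch64|arm64) echo "aarch_64" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: build the download URL using the same v24.4 release
# path that appears in the diff.
protoc_url() {
  arch=$(protoc_arch "$1") || return 1
  echo "https://github.com/protocolbuffers/protobuf/releases/download/v24.4/protoc-24.4-linux-${arch}.zip"
}
```

In a build script this would be called as `protoc_url "$(uname -m)"` and the result passed to `curl -L`.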
21 changes: 15 additions & 6 deletions .github/workflows/bump-version/action.yml
@@ -24,19 +24,28 @@ runs:
  run: |
  cargo install cargo-workspaces --version 0.2.44
  cargo ws version --no-git-commit -y --exact --force 'lance*' ${{ inputs.part }}
+ - name: Update python lockfile
+ working-directory: python
+ shell: bash
+ run: |
+ cargo update -p lance
  - name: Bump java version
  working-directory: java
  shell: bash
  run: |
  # Get current version
  current_version=$(mvn help:evaluate -Dexpression=project.version -q -DforceStdout)
- current_version=${current_version%\%}
+ current_version=${current_version%%}

+ base_version="${current_version%-*}"
+ if [[ "$current_version" == *-* ]]; then
+ suffix="-${current_version#*-}"
+ else
+ suffix=""
+ fi
+
  # Split the version into components using parameter expansion
- major=${current_version%%.*}
- minor=${current_version#*.}
- minor=${minor%%.*}
- patch=${current_version##*.}
+ IFS=. read major minor patch <<<"$base_version"

  case "${{ inputs.part }}" in
  patch)
@@ -57,6 +66,6 @@ runs:
  ;;
  esac

- new_version="${major}.${minor}.${patch}"
+ new_version="${major}.${minor}.${patch}${suffix}"

  mvn versions:set versions:commit -DnewVersion=$new_version
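
The version-handling lines in this hunk can be tried outside CI. The sketch below is a hypothetical helper, `bump_version`, written for plain `sh`. It uses the same `%-*` and `#*-` expansions as the diff and substitutes a here-document for bash's `<<<`. The bump rules are an assumption (conventional semantic versioning), because the `case` arms are elided in the diff. One caveat with these expansions: a suffix containing a second hyphen, such as `-beta-1`, would leave part of the suffix in `base_version`.

```shell
# Hypothetical helper, not part of the workflow: bump one component of a
# MAJOR.MINOR.PATCH[-SUFFIX] string while keeping the suffix.
bump_version() (
  current_version=$1
  part=$2

  # Same expansions as the diff: drop the shortest trailing "-*" to get the
  # numeric base, and keep whatever follows the first "-" as the suffix.
  base_version="${current_version%-*}"
  case "$current_version" in
    *-*) suffix="-${current_version#*-}" ;;
    *)   suffix="" ;;
  esac

  # Split the base on "." (a here-document stands in for bash's <<<).
  IFS=. read -r major minor patch <<EOF
$base_version
EOF

  # Assumed bump rules; the real case arms are not visible in the diff.
  case "$part" in
    patch) patch=$((patch + 1)) ;;
    minor) minor=$((minor + 1)); patch=0 ;;
    major) major=$((major + 1)); minor=0; patch=0 ;;
    *) echo "unknown part: $part" >&2; exit 1 ;;
  esac

  echo "${major}.${minor}.${patch}${suffix}"
)
```

For example, `bump_version 0.20.0-beta.1 patch` yields `0.20.1-beta.1`, whereas the pre-change `new_version` line had no suffix handling.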
9 changes: 5 additions & 4 deletions .github/workflows/cargo-publish.yml
@@ -8,7 +8,7 @@ on:
  workflow_dispatch:
  inputs:
  tag:
- description: 'Tag to publish (e.g., v1.0.0)'
+ description: "Tag to publish (e.g., v1.0.0)"
  required: true
  type: string

@@ -19,12 +19,13 @@ env:

  jobs:
  build:
- runs-on: ubuntu-24.04
+ # Needs additional disk space for the full build.
+ runs-on: ubuntu-2404-4x-x64
  timeout-minutes: 60
  env:
  # Need up-to-date compilers for kernels
  CC: clang-18
- CXX: clang-18
+ CXX: clang++-18
  defaults:
  run:
  working-directory: .
@@ -53,5 +54,5 @@ jobs:
  - uses: albertlockett/publish-crates@v2.2
  with:
  registry-token: ${{ secrets.CARGO_REGISTRY_TOKEN }}
- args: '--all-features'
+ args: "--all-features"
  path: .
3 changes: 2 additions & 1 deletion .github/workflows/ci-benchmarks.yml
@@ -1,6 +1,7 @@
  name: Run Regression Benchmarks

  on:
+ workflow_dispatch:
  push:
  branches:
  - main
@@ -12,7 +13,7 @@ jobs:
  env:
  # Need up-to-date compilers for kernels
  CC: clang-18
- CXX: clang-18
+ CXX: clang++-18
  defaults:
  run:
  shell: bash
23 changes: 16 additions & 7 deletions .github/workflows/docs-check.yml
@@ -6,12 +6,11 @@ on:
  pull_request:
  paths:
  - docs/**
+ - python/python/**
  - .github/workflows/docs-check.yml

  env:
- # Disable full debug symbol generation to speed up CI build and keep memory down
- # "1" means line tables only, which is useful for panic tracebacks.
- RUSTFLAGS: "-C debuginfo=1"
+ RUSTFLAGS: "-C debuginfo=0"
  # according to: https://matklad.github.io/2021/09/04/fast-rust-builds.html
  # CI builds are faster with incremental disabled.
  CARGO_INCREMENTAL: "0"
@@ -26,19 +25,29 @@ jobs:
  - name: Set up Python
  uses: actions/setup-python@v5
  with:
- python-version: "3.11"
+ python-version: "3.12"
  cache: 'pip'
  cache-dependency-path: "docs/requirements.txt"
  - name: Install dependencies
  run: |
  sudo apt install -y -qq doxygen pandoc
  - name: Build python wheel
  uses: ./.github/workflows/build_linux_wheel
- - name: Build Python
+ - name: Free disk space
  working-directory: python
  run: |
- python -m pip install $(ls target/wheels/*.whl)
- python -m pip install -r ../docs/requirements.txt
+ sudo chown 1001:118 -R target
+ mv target/wheels/*.whl ./
+ cargo clean
+ - name: Build Python
+ working-directory: docs
+ run: |
+ python -m pip install ../python/*.whl
+ python -m pip install -r requirements.txt
+ - name: Run test
+ working-directory: docs
+ run: |
+ make doctest
  - name: Build docs
  working-directory: docs
  run: |
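
The "Free disk space" step in this hunk moves the built wheel out of `target` before cleaning, so the later `pip install ../python/*.whl` still finds it. A minimal sketch of that ordering follows. The function `keep_wheels_and_clean` is hypothetical, `rm -rf` stands in for `cargo clean` so the sketch runs without a Rust toolchain, and the `chown` from the workflow is left out.

```shell
# Hypothetical stand-in for the "Free disk space" step: keep the built
# wheel(s), then discard the rest of the build directory.
keep_wheels_and_clean() {
  project_dir=$1

  # Move the wheels next to the project root first; cleaning before this
  # point would delete them along with everything else under target/.
  mv "$project_dir"/target/wheels/*.whl "$project_dir"/ || return 1

  # `cargo clean` removes the target directory; rm -rf imitates that here.
  rm -rf "$project_dir/target"
}
```

After the call, the wheel sits directly in the project directory and `target` is gone, which matches the relative path the following install step uses.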
12 changes: 8 additions & 4 deletions .github/workflows/docs-deploy.yml
@@ -19,9 +19,7 @@ concurrency:
  cancel-in-progress: true

  env:
- # Disable full debug symbol generation to speed up CI build and keep memory down
- # "1" means line tables only, which is useful for panic tracebacks.
- RUSTFLAGS: "-C debuginfo=1"
+ RUSTFLAGS: "-C debuginfo=0"
  # according to: https://matklad.github.io/2021/09/04/fast-rust-builds.html
  # CI builds are faster with incremental disabled.
  CARGO_INCREMENTAL: "0"
@@ -47,10 +45,16 @@ jobs:
  sudo apt install -y -qq doxygen pandoc
  - name: Build python wheel
  uses: ./.github/workflows/build_linux_wheel
+ - name: Free disk space
+ working-directory: python
+ run: |
+ sudo chown 1001:118 -R target
+ mv target/wheels/*.whl ./
+ cargo clean
  - name: Build Python
  working-directory: python
  run: |
- python -m pip install $(ls target/wheels/*.whl)
+ python -m pip install ../python/*.whl
  python -m pip install -r ../docs/requirements.txt
  - name: Build docs
  working-directory: docs
64 changes: 0 additions & 64 deletions .github/workflows/duckdb.yml

This file was deleted.

2 changes: 1 addition & 1 deletion .github/workflows/file_verification/test_write_read.py
@@ -48,5 +48,5 @@
  assert tab_lance == parquet_table
  print(f"Table read from Lance is the same as table read from Parquet for file: {file_path}")

- except Exception as e:
+ except Exception:
  raise AssertionError(f"Table read from Lance is not the same as table read from Parquet for file: {file_path}")