Conversation

@florian-jobs

Summary

This PR introduces a new column group ColGroupDDCLZW that stores the mapping vector in LZW-compressed form.

Key design points

  • MapToData is not stored explicitly; only the compressed LZW representation is kept.
  • Operations that allow sequential access operate directly on _dataLZW without full decompression (a sketch follows this list).
  • For complex or random-access patterns, the implementation falls back to DDC (uncompressed).
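
To make the sequential-access design point concrete, below is a minimal, hypothetical sketch of decoding the LZW code stream one phrase at a time, so callers can stream over the mapping without materializing a full MapToData; the class and field names and the int[] code layout are illustrative assumptions, not the PR's actual iterator.

import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: standard LZW decoding over a code stream whose initial
// dictionary holds one phrase per dictionary index of the DDC mapping.
class LZWSequentialDecoder {
	private final int[] _codes;                           // compressed mapping codes
	private final List<int[]> _dict = new ArrayList<>();  // code -> decoded phrase
	private int _pos = 0;                                 // next code to consume
	private int[] _prev = null;                           // previously emitted phrase

	LZWSequentialDecoder(int[] codes, int nUnique) {
		_codes = codes;
		for(int i = 0; i < nUnique; i++)
			_dict.add(new int[] {i});
	}

	boolean hasNext() {
		return _pos < _codes.length;
	}

	// Returns the next run of original mapping values without decoding the rest.
	int[] next() {
		int code = _codes[_pos++];
		int[] phrase = (code < _dict.size()) ? _dict.get(code)
			: concat(_prev, _prev[0]); // special case: code refers to the phrase being built
		if(_prev != null)
			_dict.add(concat(_prev, phrase[0]));
		_prev = phrase;
		return phrase;
	}

	private static int[] concat(int[] a, int b) {
		int[] r = new int[a.length + 1];
		System.arraycopy(a, 0, r, 0, a.length);
		r[a.length] = b;
		return r;
	}
}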

Current status

  • Core data structure and compression/decompression are in place.
  • Work in progress on operations that can be implemented via sequential decoding without full materialization.
  • Performance evaluation is work in progress.

Feedback on design and integration is very welcome.

florian-jobs and others added 14 commits January 7, 2026 13:39
…extending on APreAgg like ColGroupDDC for easier implementation. Idea: store only the compressed version of the _data vector and important metadata. If decompression is needed, we reconstruct the _data vector using the metadata and the compressed _data vector. Decompression takes place at most once. This is just an idea and there are other ways of implementing it.
 * - DDCLZW stores the mapping vector exclusively in compressed form.
 * - No persistent MapToData cache is maintained.
 * - Sequential operations decode on-the-fly, while operations requiring random access explicitly materialize and fall back to DDC.
 */
…and decompress and its used data structures compatible.
…DC test for ColGroupDDCTest. Improved compress/decompress methods in LZW class.
…mapping

This commit adds an initial implementation of ColGroupDDCLZW, a new column
group that stores the mapping vector in LZW-compressed form instead of
materializing MapToData explicitly.

The design focuses on enabling sequential access directly on the compressed
representation, while complex access patterns are intended to fall back to
DDC. No cache or lazy decompression mechanism is introduced at this stage.
@github-project-automation bot moved this to In Progress in SystemDS PR Queue Jan 13, 2026
@florian-jobs changed the title Add ColGroupDDCLZW with LZW-compressed MapToData [SYSTEMDS-3779] Add ColGroupDDCLZW with LZW-compressed MapToData Jan 13, 2026
@janniklinde self-requested a review January 16, 2026 08:26
…press(). Decompress will now return an empty map if the index is zero.
Contributor

@janniklinde left a comment


Thank you for the PR. I left some comments in the code.

In general, please use tabs instead of spaces to make the diff more readable (can be done by importing the codestyle xml). It would be good if we are able to create the column group similar to this:

CompressionSettingsBuilder csb = new CompressionSettingsBuilder().setSamplingRatio(1.0)
	.setValidCompressions(EnumSet.of(AColGroup.CompressionType.DDCLZW))
		.setTransposeInput("false");
CompressionSettings cs = csb.create();

final CompressedSizeInfoColGroup cgi = new ComEstExact(mbt, cs).getColGroupInfo(colIndexes);
CompressedSizeInfo csi = new CompressedSizeInfo(cgi);
AColGroup cg = ColGroupFactory.compressColGroups(mbt, csi, cs, 1).get(0);

So corresponding features / methods to support this should be implemented.

Contributor

All implemented methods must be covered by tests

@github-project-automation bot moved this from In Progress to In Review in SystemDS PR Queue Jan 16, 2026
@janniklinde
Contributor

Please add some more tests to really verify correctness. For example, you should do a full compression and then decompress it again. The result should then be compared to the original data.

florian-jobs and others added 4 commits January 16, 2026 16:26
…GroupDDCTest back to correct formatting. Added LZWMappingIterator to decompress values on the fly without having to allocate full compression map [WIP]. Added Test class ColGroupDDCLZWTest.
@LukaDeka

Added new unit tests for ColGroupDDCLZW (they're subject to change and only an initial draft).

They might include redundant/unnecessary checks.

The rest of the methods are also untested. I'll do it later and possibly refactor the helper functions for the tests.

…ded decompressToDenseBlockDenseDictionary [WIP], needs to be tested further. Added fallbacks to DDC for various functions. Added scalar and unary ops and various other simple methods from DDC.
…erns. Added append and appendNInternal, recompress and various other functions that needed to be implemented. No tests yet.
Contributor

@Baunsgaard left a comment


Good progress, I have left some comments.

I would love to see some performance numbers.

return (((long) prefixCode) << 32) | (nextSymbol & 0xffffffffL);
}

// Compresses a mapping (AMapToData) into an LZW-compressed byte/integer/? array.
Contributor

You probably want to compress into a byte[] array, or, if you want to bit-shift a bit, pack into a long[] array.
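
For illustration, a hedged sketch of the long[] variant: fixed-width codes packed via bit shifting. The method name and the assumption that every code fits in bitsPerCode (<= 32) bits are mine, not part of this PR.

// Illustrative sketch only: pack fixed-width codes into a long[] to avoid per-element overhead.
static long[] packCodes(int[] codes, int bitsPerCode) {
	long[] packed = new long[(int) (((long) codes.length * bitsPerCode + 63) / 64)];
	long mask = (1L << bitsPerCode) - 1;
	for(int i = 0; i < codes.length; i++) {
		long c = codes[i] & mask;
		long bitPos = (long) i * bitsPerCode;
		int word = (int) (bitPos >>> 6);
		int off = (int) (bitPos & 63);
		packed[word] |= c << off;
		if(off + bitsPerCode > 64)       // code straddles two 64-bit words
			packed[word + 1] |= c >>> (64 - off);
	}
	return packed;
}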

}

@Override
public void leftMultByMatrixNoPreAgg(MatrixBlock matrix, MatrixBlock result, int rl, int ru, int cl, int cu) {
Contributor

This is the cool one to support! It is a bit hard, but will probably pay off with LZW.

You can keep a soft reference to a hashmap mapping different rl values to offsets into your data structure. That would make it possible to skip the initial scan until rl. Furthermore, the hashmap's growth would be limited, since the callers of these rl interfaces are typically bounded by the number of CPU cores. You can use the same trick in some other functions where you scan until rl.
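
A minimal sketch of that idea, assuming hypothetical names (RowOffsetCache and the scan callback are illustrative, not existing SystemDS APIs):

import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntUnaryOperator;

// Illustrative sketch only: cache the offset into the compressed stream reached for each
// requested rl, so repeated calls with the same rl skip the initial sequential scan.
// The SoftReference lets the JVM drop the cache under memory pressure, and the map stays
// small because distinct rl values are bounded by the number of parallel row ranges.
class RowOffsetCache {
	private SoftReference<Map<Integer, Integer>> _cache =
		new SoftReference<>(new ConcurrentHashMap<>());

	int getOrScan(int rl, IntUnaryOperator scanToRow) {
		Map<Integer, Integer> m = _cache.get();
		if(m == null) {
			m = new ConcurrentHashMap<>();
			_cache = new SoftReference<>(m);
		}
		return m.computeIfAbsent(rl, scanToRow::applyAsInt);
	}
}

A caller such as leftMultByMatrixNoPreAgg could then resolve its start offset with getOrScan(rl, this::scanCompressedStreamTo) before decoding only the [rl, ru) range; the scan helper name is likewise hypothetical.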

florian-jobs and others added 2 commits January 21, 2026 10:51
…mapping sequentially. Reverted ColGroupDDC formatting again. Reverted CompressedSizeInfoColGroup formatting and added the DDCLZW part for testing. Added various tests for which functionality in the testing pipeline still needs to be added in order to work.
@LukaDeka

Added a few benchmarks that mostly compare memory as well as operation times for methods (so far, only for getIdx).

Right now, the comparison is only done for DDCLZW with DDC.

There are sizable memory savings for datasets with repeating patterns or large datasets:

================================================================================
Benchmark: benchmarkRandomData
================================================================================

Size:       1 | DDC:       61 bytes | DDCLZW:       67 bytes | Memory reduction:  -9.84% | De-/Compression speedup: 0.09/0.00 times
Size:      10 | DDC:       70 bytes | DDCLZW:       95 bytes | Memory reduction: -35.71% | De-/Compression speedup: 0.04/0.00 times
Size:     100 | DDC:      160 bytes | DDCLZW:      299 bytes | Memory reduction: -86.87% | De-/Compression speedup: 0.01/0.00 times
Size:    1000 | DDC:     1060 bytes | DDCLZW:     1551 bytes | Memory reduction: -46.32% | De-/Compression speedup: 0.00/0.00 times
Size:   10000 | DDC:    10060 bytes | DDCLZW:    10487 bytes | Memory reduction:  -4.24% | De-/Compression speedup: 0.00/0.00 times
Size:  100000 | DDC:   100060 bytes | DDCLZW:    78783 bytes | Memory reduction:  21.26% | De-/Compression speedup: 0.00/0.00 times

I also added the De-/Compression speedup field to compare other compression types with each other as well.

I also added a benchmark for the slides, but it doesn't look too useful at the moment:

================================================================================
Benchmark: benchmarkSlice
================================================================================

Size:       1 | Slice[    0:    0] | DDC:      0 ms | DDCLZW:      1 ms | Slowdown: 37.09 times
Size:      10 | Slice[    2:    7] | DDC:      0 ms | DDCLZW:     20 ms | Slowdown: 1141.72 times
Size:     100 | Slice[   25:   75] | DDC:      0 ms | DDCLZW:      3 ms | Slowdown: 169.34 times
Size:    1000 | Slice[  250:  750] | DDC:      0 ms | DDCLZW:      3 ms | Slowdown: 348.98 times
Size:   10000 | Slice[ 2500: 7500] | DDC:      0 ms | DDCLZW:      6 ms | Slowdown: 483.40 times
Size:  100000 | Slice[25000:75000] | DDC:      0 ms | DDCLZW:     24 ms | Slowdown: 325.22 times

The file might also be in the wrong directory and wrongly labeled as a "test". We wouldn't want benchmarks running on every GitHub Actions trigger, etc.

Would it make more sense to refactor it into a main function?

@Baunsgaard
Contributor

@LukaDeka
Good to see some numbers. However, the ones you have reported are a bit unfortunate. I have a few points you should consider:

  1. Random data is not very compressible, and in actuality, truly random data would tend to make DDC superior for your use case. What you are looking for is to control the entropy of your data. If the entropy is low, you should get more benefits from LZW; if it is high, then your compression ratio should tend towards DDC.

  2. As an additional experiment, you can generate data that has exploitable patterns specific to LZW. Try to generate some data that is in the "best" possible structure. This should ideally show scaling close to O(sqrt(n)) in the input size with standard LZW, while DDC, being a dense format, is always O(n). (A minimal sketch of such a generator follows this list.)

  3. Do not worry about input data that is smaller than 100 elements for these experiments. For instance, experiments with 1 input row trivially show that other encodings can perform better than DDC. It starts getting interesting at larger sizes.

  4. Control and explicitly mention the number of distinct items you have as a parameter for your experiment. Additionally, calculate the entropy and use that as an additional measure of compressibility of the data. These two changes will improve the experiments.
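
To make points 1 and 2 concrete, here is a minimal sketch of generators for such experiments; the names, the Zipf-like skew parameter, and the fixed pattern length are illustrative assumptions, not benchmark code from this PR.

import java.util.Random;

// Illustrative sketch only: a skewed (tunable-entropy) mapping and an LZW-friendly pattern.
class MappingGenerators {
	// Zipf-like draw over nUnique symbols: skew near 0 is close to uniform (high entropy),
	// larger skew concentrates mass on few symbols (low entropy, more LZW-friendly).
	static int[] genSkewed(int n, int nUnique, double skew, long seed) {
		double[] cdf = new double[nUnique];
		double sum = 0;
		for(int k = 0; k < nUnique; k++)
			cdf[k] = (sum += 1.0 / Math.pow(k + 1, skew));
		Random rnd = new Random(seed);
		int[] map = new int[n];
		for(int i = 0; i < n; i++) {
			double u = rnd.nextDouble() * sum;
			int k = 0;
			while(cdf[k] < u)
				k++;
			map[i] = k;
		}
		return map;
	}

	// Near-best case for LZW: one short pattern repeated across the whole mapping, so the
	// dictionary quickly covers long runs while DDC still stores every entry densely.
	static int[] genRepeatingPattern(int n, int patternLength, int nUnique) {
		int[] map = new int[n];
		for(int i = 0; i < n; i++)
			map[i] = (i % patternLength) % nUnique;
		return map;
	}
}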

@florian-jobs
Author

florian-jobs commented Jan 22, 2026

Status update:

Many methods that operate sequentially on the original mapping have been implemented using partial, on-the-fly decoding of the compressed LZW mapping via an iterator.

Methods with more complex or non-sequential access patterns are not yet handled in this way (for example leftMultByMatrixNoPreAgg) and currently fall back to DDC. These will be addressed in follow-up work.

Most decompression paths now rely on partial decoding of the LZW mapping rather than full materialization. Scalar and unary operations have also been implemented.

Several previously reported issues have been fixed. I have reverted the unintended formatting changes in the affected files and ensured alignment with the existing code style.

I will continue working on the remaining improvements suggested by @Baunsgaard and @janniklinde.

What is still missing at this point are more dedicated tests for the individual methods to ensure correctness, which @LukaDeka is working on.

Thanks for the detailed feedback and reviews, they were very helpful!

@Baunsgaard
Contributor

When you process some of the comments feel free to mark them as resolved!

@LukaDeka

> When you process some of the comments feel free to mark them as resolved!

I wanted to before, but I think I don't have the permission in GitHub to do that. Not sure if Florian has it.

@Baunsgaard
Contributor

> When you process some of the comments feel free to mark them as resolved!
>
> I wanted to before, but I think I don't have the permission in GitHub to do that. Not sure if Florian has it.

Alternatively, if you do not have permissions, make a comment saying "resolved". Then when we go through the PR, it is cleaner.

… it into the compression pipeline and serialization framework.
@florian-jobs
Author

florian-jobs commented Jan 24, 2026

I have marked some comments as resolved.

florian-jobs and others added 8 commits January 25, 2026 12:21
… some documentation for non-native DDC methods in the DDCLZW class.
… by IDE. Removed unnecessary comments from classes DDCLZW and DDCLZWTest. Optimized some tests to use the compression framework.
…al decompression and adjusting the function decompress to become decompressFull
@LukaDeka

Update for benchmarks

Addressing the feedback

  1. What you are looking for is to control the entropy of your data.

I wasn't able to "generate" data that matched a given entropy (percentage), but I added a helper function to calculate the Shannon entropy of the given arrays (a minimal sketch of such a helper follows this list). It is now displayed in the benchmarks.

  2. You can generate data that has exploitable patterns specific to LZW.

I added genPatternLZWOptimal which features "repeating patterns". Right now, it just repeats the same pattern (length 10) twice, but based on my observations, any repeating pattern is compressed very well.

  3. Do not worry about input data that is smaller than 100 elements for these experiments.

I adjusted the sizes to 100, 1000, 10,000, and 40,000.

  4. ...explicitly mention the number of distinct items you have...

nUnique is now displayed with the benchmarks.
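
For reference, a minimal sketch of such an entropy helper, normalized against the maximum log2(nUnique) bits (which appears to match how the percentages below are reported); the actual helper in the benchmark may differ.

// Illustrative sketch only: normalized Shannon entropy of a mapping, as a percentage
// of the maximum possible entropy log2(nUnique) for that number of distinct values.
static double entropyPercent(int[] map, int nUnique) {
	if(nUnique <= 1 || map.length == 0)
		return 0.0;
	int[] counts = new int[nUnique];
	for(int v : map)
		counts[v]++;
	double h = 0.0;
	for(int c : counts) {
		if(c == 0)
			continue;
		double p = (double) c / map.length;
		h -= p * (Math.log(p) / Math.log(2));
	}
	return 100.0 * h / (Math.log(nUnique) / Math.log(2));
}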

I also added another for loop so that both nUnique and size are incremented:

================================================================================
Benchmark: benchmarkUniquesLZWOptimal
================================================================================

................................... Size: 100 ...................................
Size:     100 | nUnique:    2 | Entropy:  99.88% | DDC:      52 bytes | DDCLZW:     123 bytes | Memory reduction: -136.54% | De-/Compression speedup: 0.02/0.00 times
Size:     100 | nUnique:    3 | Entropy:  99.66% | DDC:     144 bytes | DDCLZW:     151 bytes | Memory reduction:   -4.86% | De-/Compression speedup: 0.01/0.00 times
Size:     100 | nUnique:    5 | Entropy:  99.41% | DDC:     160 bytes | DDCLZW:     187 bytes | Memory reduction:  -16.87% | De-/Compression speedup: 0.01/0.00 times
Size:     100 | nUnique:   10 | Entropy:  99.03% | DDC:     200 bytes | DDCLZW:     263 bytes | Memory reduction:  -31.50% | De-/Compression speedup: 0.01/0.00 times
Size:     100 | nUnique:   20 | Entropy:  83.91% | DDC:     280 bytes | DDCLZW:     367 bytes | Memory reduction:  -31.07% | De-/Compression speedup: 0.01/0.00 times
Size:     100 | nUnique:   50 | Entropy:  64.25% | DDC:     520 bytes | DDCLZW:     607 bytes | Memory reduction:  -16.73% | De-/Compression speedup: 0.01/0.00 times
Size:     100 | nUnique:  100 | Entropy:  54.58% | DDC:     920 bytes | DDCLZW:    1007 bytes | Memory reduction:   -9.46% | De-/Compression speedup: 0.01/0.00 times
................................... Size: 1000 ...................................
Size:    1000 | nUnique:    2 | Entropy:  99.96% | DDC:     164 bytes | DDCLZW:     355 bytes | Memory reduction: -116.46% | De-/Compression speedup: 0.00/0.00 times
Size:    1000 | nUnique:    3 | Entropy:  99.93% | DDC:    1044 bytes | DDCLZW:     439 bytes | Memory reduction:   57.95% | De-/Compression speedup: 0.00/0.00 times
Size:    1000 | nUnique:    5 | Entropy:  99.86% | DDC:    1060 bytes | DDCLZW:     527 bytes | Memory reduction:   50.28% | De-/Compression speedup: 0.00/0.00 times
Size:    1000 | nUnique:   10 | Entropy:  99.64% | DDC:    1100 bytes | DDCLZW:     659 bytes | Memory reduction:   40.09% | De-/Compression speedup: 0.00/0.00 times
Size:    1000 | nUnique:   20 | Entropy:  98.53% | DDC:    1180 bytes | DDCLZW:     911 bytes | Memory reduction:   22.80% | De-/Compression speedup: 0.00/0.00 times
Size:    1000 | nUnique:   50 | Entropy:  85.20% | DDC:    1420 bytes | DDCLZW:    1291 bytes | Memory reduction:    9.08% | De-/Compression speedup: 0.00/0.00 times
Size:    1000 | nUnique:  100 | Entropy:  72.37% | DDC:    1820 bytes | DDCLZW:    1691 bytes | Memory reduction:    7.09% | De-/Compression speedup: 0.00/0.00 times
Size:    1000 | nUnique:  200 | Entropy:  62.91% | DDC:    2620 bytes | DDCLZW:    2491 bytes | Memory reduction:    4.92% | De-/Compression speedup: 0.00/0.00 times
Size:    1000 | nUnique:  500 | Entropy:  53.63% | DDC:    6020 bytes | DDCLZW:    4891 bytes | Memory reduction:   18.75% | De-/Compression speedup: 0.00/0.00 times
Size:    1000 | nUnique: 1000 | Entropy:  48.25% | DDC:   10020 bytes | DDCLZW:    8891 bytes | Memory reduction:   11.27% | De-/Compression speedup: 0.00/0.00 times
................................... Size: 10000 ...................................
Size:   10000 | nUnique:    2 | Entropy:  99.99% | DDC:    1292 bytes | DDCLZW:    1147 bytes | Memory reduction:   11.22% | De-/Compression speedup: 0.00/0.00 times
Size:   10000 | nUnique:    3 | Entropy:  99.99% | DDC:   10044 bytes | DDCLZW:    1379 bytes | Memory reduction:   86.27% | De-/Compression speedup: 0.00/0.00 times
Size:   10000 | nUnique:    5 | Entropy:  99.98% | DDC:   10060 bytes | DDCLZW:    1719 bytes | Memory reduction:   82.91% | De-/Compression speedup: 0.00/0.00 times
Size:   10000 | nUnique:   10 | Entropy:  99.94% | DDC:   10100 bytes | DDCLZW:    2143 bytes | Memory reduction:   78.78% | De-/Compression speedup: 0.00/0.00 times
Size:   10000 | nUnique:   20 | Entropy:  99.81% | DDC:   10180 bytes | DDCLZW:    2619 bytes | Memory reduction:   74.27% | De-/Compression speedup: 0.00/0.00 times
Size:   10000 | nUnique:   50 | Entropy:  98.98% | DDC:   10420 bytes | DDCLZW:    3671 bytes | Memory reduction:   64.77% | De-/Compression speedup: 0.00/0.00 times
Size:   10000 | nUnique:  100 | Entropy:  95.94% | DDC:   10820 bytes | DDCLZW:    4047 bytes | Memory reduction:   62.60% | De-/Compression speedup: 0.00/0.00 times
Size:   10000 | nUnique:  200 | Entropy:  83.39% | DDC:   11620 bytes | DDCLZW:    4847 bytes | Memory reduction:   58.29% | De-/Compression speedup: 0.00/0.00 times
Size:   10000 | nUnique:  500 | Entropy:  71.09% | DDC:   24020 bytes | DDCLZW:    7247 bytes | Memory reduction:   69.83% | De-/Compression speedup: 0.00/0.00 times
Size:   10000 | nUnique: 1000 | Entropy:  63.96% | DDC:   28020 bytes | DDCLZW:   11247 bytes | Memory reduction:   59.86% | De-/Compression speedup: 0.00/0.00 times
................................... Size: 40000 ...................................
Size:   40000 | nUnique:    2 | Entropy: 100.00% | DDC:    5044 bytes | DDCLZW:    2319 bytes | Memory reduction:   54.02% | De-/Compression speedup: 0.00/0.00 times
Size:   40000 | nUnique:    3 | Entropy: 100.00% | DDC:   40044 bytes | DDCLZW:    2811 bytes | Memory reduction:   92.98% | De-/Compression speedup: 0.00/0.00 times
Size:   40000 | nUnique:    5 | Entropy:  99.99% | DDC:   40060 bytes | DDCLZW:    3463 bytes | Memory reduction:   91.36% | De-/Compression speedup: 0.00/0.00 times
Size:   40000 | nUnique:   10 | Entropy:  99.98% | DDC:   40100 bytes | DDCLZW:    4227 bytes | Memory reduction:   89.46% | De-/Compression speedup: 0.00/0.00 times
Size:   40000 | nUnique:   20 | Entropy:  99.95% | DDC:   40180 bytes | DDCLZW:    5319 bytes | Memory reduction:   86.76% | De-/Compression speedup: 0.00/0.00 times
Size:   40000 | nUnique:   50 | Entropy:  99.74% | DDC:   40420 bytes | DDCLZW:    7307 bytes | Memory reduction:   81.92% | De-/Compression speedup: 0.00/0.00 times
Size:   40000 | nUnique:  100 | Entropy:  99.09% | DDC:   40820 bytes | DDCLZW:    8927 bytes | Memory reduction:   78.13% | De-/Compression speedup: 0.00/0.00 times
Size:   40000 | nUnique:  200 | Entropy:  96.36% | DDC:   41620 bytes | DDCLZW:    8367 bytes | Memory reduction:   79.90% | De-/Compression speedup: 0.00/0.00 times
Size:   40000 | nUnique:  500 | Entropy:  82.16% | DDC:   84020 bytes | DDCLZW:   10767 bytes | Memory reduction:   87.19% | De-/Compression speedup: 0.00/0.00 times
Size:   40000 | nUnique: 1000 | Entropy:  73.91% | DDC:   88020 bytes | DDCLZW:   14767 bytes | Memory reduction:   83.22% | De-/Compression speedup: 0.00/0.00 times

Remarks

The main difficulty was judging which benchmarks are useful, since most of my entropy values were close to the maximum.

Also, benchmarkGetIdx doesn't make sense right now, since the timing characteristics of DDC and DDCLZW don't match because of the on-the-fly sequential decompression, but the method could be swapped out trivially (so I kept it).

I also commented out the benchmarkSlice since it didn't look useful.
