feat: crackle segmentation decoder #1013
51aa997 to 75bd9eb
TypeScript code looks good.
chrisj left a comment:
I created a separate draft PR to get a working client preview. In that PR I applied the fixes that I added as comments on this one.
import { asyncComputation } from "#src/async_computation/index.js";

export const decodeCrackle =
  asyncComputation<(data: Uint8Array) => Uint8Array>("decodeCrackle");
change both Uint8Array to Uint8Array<ArrayBuffer>
Thanks, these are great catches. I'm having trouble installing tsgo for local type checking at the moment, so I'll just upload what I have rn. I'll see if I can get it working.
I can't run a Linux Docker container on my local Mac, so I compiled this libcrackle.wasm on a Google Cloud instance running ubuntu-minimal-2604-resolute-amd64-v20260529 with the latest Emscripten. I find that the binary differs between Linux and macOS.
@william-silversmith if you take the one from my branch, it should work; it passed all the GitHub Actions checks.
@william-silversmith it looks like the wasm file needs to be given executable permission.
Ok, I got tsgo working locally and fixed the issues!
Looks like a Docker image failed to pull; it might be worth rerunning the tests.
It is getting stuck during the Playwright browser install. This PR will fix it: #1020
@william-silversmith rebase off master and your tests should pass now.
perf: smaller binary
refactor: remove extraneous code
refactor: drop robin_hood from this decoder
This reverts commit 64831a6.
17ef375 to 7d3f5cf
Thank you guys, rebased and force pushed.
I updated the wasm here if you can't build it yourself; just remember to make sure it is executable.
It looks like your formatter is not formatting the same way as mine. For example, it is adding empty lines to the README.md whereas mine is removing them.
Can you try an npm install in case your oxlint/prettier are out of date?
Thank you Chris! That helped a lot.
perf: smaller binary
refactor: remove extraneous code
refactor: drop robin_hood from this decoder
chore: fix formatting
fix: cover more code with exception handling
fix: remove sw from return type
fix: add <ArrayBuffer> to Uint8Array type
fix: incorrect import
fix: change permisssions of build.sh to add u+x
build: added libcrackle.wasm compiled on an ubuntu machine
fix: change type to Uint8Array<ArrayBuffer>
fix: change to Uint8Array<ArrayBuffer>
Revert "fix: change to Uint8Array<ArrayBuffer>"
This reverts commit 64831a6.
build: use Chris' version of libcrackle.wasm
chore: cleanup some dead encoder code
chore: add permission u+x to libcrackle.wasm
fix: generic ArrayBuffer types
chore: updated libcrackle.wasm
chore: ran npm run lint:fix
…n_from_buffer errors, so bad header CRCs return 71 instead of being ignored. index.ts: JS wrapper now rejects short headers, validates the v1 header CRC before reading dimensions, handles subarray byte offsets correctly, and rejects unsafe decoded sizes before WASM allocation
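The byte-offset and decoded-size checks named in this commit message can be illustrated with a minimal sketch. This is not the code in index.ts: `headerView`, `checkedDecodedBytes`, and the `maxBytes` limit are hypothetical names chosen for illustration, and no crackle header layout is assumed.

```typescript
// Hypothetical helper: build a DataView over exactly the bytes the caller
// passed in. A Uint8Array created with subarray() shares its parent's
// buffer, so byteOffset and byteLength must be forwarded explicitly.
function headerView(data: Uint8Array, minLength: number): DataView {
  if (data.byteLength < minLength) {
    throw new Error(`header too short: ${data.byteLength} < ${minLength} bytes`);
  }
  return new DataView(data.buffer, data.byteOffset, data.byteLength);
}

// Hypothetical helper: compute the decoded size in bytes and reject values
// that are not safe positive integers or that exceed a caller-chosen limit.
function checkedDecodedBytes(
  sx: number,
  sy: number,
  sz: number,
  bytesPerVoxel: number,
  maxBytes: number,
): number {
  const total = sx * sy * sz * bytesPerVoxel;
  if (!Number.isSafeInteger(total) || total <= 0 || total > maxBytes) {
    throw new Error(`unsafe decoded size: ${total}`);
  }
  return total;
}

const parent = new Uint8Array([0xff, 0xff, 0x01, 0x02, 0x03, 0x04]);
const view = parent.subarray(2); // starts at byte 2 of the parent buffer

// Correct: reads the first byte of the subarray.
const first = headerView(view, 4).getUint8(0);
// Naive: ignores byteOffset and reads the first byte of the parent buffer.
const naive = new DataView(view.buffer).getUint8(0);
```

Here `first` comes from the subarray while `naive` comes from the start of the shared buffer, which is the kind of mismatch a wrapper has to avoid when it is handed a view into a larger download.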
@william-silversmith thanks! All passing now.
I found the C++ code hard to review, so I had AI review the PR.
It suggested these documentation additions:
and these crackle header related changes:
It also made a note about the reference to COPYING.LESSER in the licenses for crackle and compresso and its lack of existence in the repo.
Lastly, my own thought is: should we have tests for compresso and crackle in the neuroglancer repo?
Thanks, the documentation changes don't really give much info other than that the codec exists, which is still a positive thing. The documentation doesn't mention compresso (!). I think both should be linked to their respective GitHub repos at least.
The codec changes are useful, though they are kind of superficial. It would be good to catch header defects in a clear way.
Good catch on the COPYING.LESSER. Initially I was going to license the limited versions of cc3d as LGPL but changed my mind and licensed them as Apache. I guess I forgot to strip that language out.
As for testing, it's tbh a good idea. I haven't really thought through how to test WASM though.
Ok I integrated the non-testing changes I think. I'll need a new crackle binary though :D
That's strange... this time my locally compiled one matched.
We definitely want testing. Currently we test precomputed decoding in python/tests/precomputed_test.py. However, those tests make use of tensorstore for encoding, which won't work for crackle, so I suppose an alternative approach will be needed. Maybe we need to store some tiny golden files in the repo itself, ideally ones that exercise all of the different code paths.
Never mind, the wasm did differ... I'll add some tests. The main things to test are:
One other thing that this codec supports, and that the Neuroglancer version might want to handle, is downloading a large crackle-encoded file (e.g. 1024x1024x1024 voxels, which is easily supported). compressed_segmentation doesn't support files that large, but it would support transcoding in chunks. Maybe something for future work... you'd need a synthetic chunk size for use internally to neuroglancer vs. fetching chunks.
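To make the "synthetic chunk size" idea concrete, here is a minimal sketch of how a large decoded volume could be cut into internal chunks for transcoding. Nothing here is Neuroglancer or crackle API: `chunkGrid`, `ChunkBounds`, and the 64x64x64 chunk size are assumptions made purely for illustration.

```typescript
type Vec3 = [number, number, number];

interface ChunkBounds {
  min: Vec3; // inclusive start voxel
  max: Vec3; // exclusive end voxel, clamped to the volume
}

// Hypothetical helper: list the bounds of every chunk needed to cover a
// volume of the given shape with chunks of the given (synthetic) size.
function chunkGrid(volume: Vec3, chunk: Vec3): ChunkBounds[] {
  if (chunk[0] <= 0 || chunk[1] <= 0 || chunk[2] <= 0) {
    throw new Error("chunk dimensions must be positive");
  }
  const out: ChunkBounds[] = [];
  for (let z = 0; z < volume[2]; z += chunk[2]) {
    for (let y = 0; y < volume[1]; y += chunk[1]) {
      for (let x = 0; x < volume[0]; x += chunk[0]) {
        out.push({
          min: [x, y, z],
          max: [
            Math.min(x + chunk[0], volume[0]),
            Math.min(y + chunk[1], volume[1]),
            Math.min(z + chunk[2], volume[2]),
          ],
        });
      }
    }
  }
  return out;
}

// A 1024x1024x1024 volume with an assumed 64x64x64 internal chunk size
// splits into 16 * 16 * 16 = 4096 chunks.
const chunks = chunkGrid([1024, 1024, 1024], [64, 64, 64]);
console.log(chunks.length);
```

Each entry could then be transcoded independently, which keeps any single compressed_segmentation block well below the size of the whole download.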
Okay, I've uploaded some tests. Let's see if they pass! I may need an updated libcrackle.wasm @chrisj

Hi Jeremy,
The crackle codec is high speed and, so far as I know, the highest-compression codec available for segmentation. The format has been stable for a while, so I am contributing this decoder to make it practically usable for more people.
You can find the main repo for the format here: https://github.com/seung-lab/crackle/
The advantage of the format is that it is speedy (single-threaded decode up to 550 MVx/sec) and compact (often 2-3x better when layered with gzip). In an "uncompressed state" (i.e. not layered with gzip), it is often smaller than a gzipped raw. In that state, it is possible to query for what labels are present without decompressing and to extract random z-slices. However, that additional functionality is not included in this implementation for the most part.
These properties make it a great in-memory storage format too (I use it that way all the time), which could be a way for neuroglancer to cache even more enormous volumes in the future by dynamically converting between compressed_segmentation and crackle.
I tried to pare this implementation down to just the decompressor, and it is (with one exception) my own code. The main exception is a portable MIT-licensed implementation of crc32c and a crc8 implementation inspired by Stack Overflow, so if that is an issue, I'm sure we can find an alternative (I know Google has its own implementation).
In the nearish future, I may make some small adjustments to the format, but it won't be a huge change. This v1 format has been stable for almost a year and a half.
I hope you find this a useful addition to Neuroglancer!
Will Silversmith
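
For readers unfamiliar with the crc32c checksum mentioned in the description, the following is a minimal bitwise sketch of standard CRC-32C (Castagnoli). It is not the MIT-licensed implementation bundled with this PR and says nothing about how crackle lays out or applies its checksums; the function name `crc32c` and the test input are chosen for illustration only.

```typescript
// Bitwise CRC-32C (Castagnoli) using the reflected polynomial 0x82F63B78.
// Slow but compact; production code typically uses a lookup table.
function crc32c(data: Uint8Array): number {
  let crc = 0xffffffff; // standard initial value
  for (let i = 0; i < data.length; i++) {
    crc ^= data[i];
    for (let bit = 0; bit < 8; bit++) {
      // Shift right one bit; apply the polynomial if the low bit was set.
      crc = (crc & 1) !== 0 ? (crc >>> 1) ^ 0x82f63b78 : crc >>> 1;
    }
  }
  // Final inversion, then force an unsigned 32-bit result.
  return (crc ^ 0xffffffff) >>> 0;
}

// ASCII bytes of "123456789", the conventional CRC check input.
const checkInput = new Uint8Array([0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, 0x39]);
const checkValue = crc32c(checkInput); // 0xe3069283, the published CRC-32C check value
```

A decoder can compare a stored checksum against a value computed this way and refuse to proceed on mismatch, which is the general pattern behind the header validation discussed in the thread.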