Compiling Isn't Running: Functionally Testing DuckDB-WASM Extensions

A DuckDB extension that compiles for WebAssembly has only proven that it compiles. Whether it loads, and whether it actually runs, are separate questions. I built a Node harness to ask them across 124 community extensions. Here's what it found and the fixes that came out of it.

When the WebAssembly build of a DuckDB extension passes CI, it’s tempting to read that as “the extension works.” It doesn’t mean that. It means the code compiled: emcc accepted the source, produced a .duckdb_extension.wasm, and the file got uploaded to the catalog. Whether that file loads into a running engine, and whether it does anything useful once loaded, are questions the build never asks.

For the WASM extensions in Haybarn, Query.Farm’s DuckDB distribution, nobody was asking them either. So I went looking, and the first extension I checked had been broken for weeks while its badge stayed green.

I built a harness to run all of them. Here’s the short version of this whole post, with every WASM-enabled community extension run against the published engine and graded on its own test suite:

[Chart: census results by status. pass 58 · fail 43 · crash 1 · skip 9 · no tests 5 · not deployed 8]
124 WASM-enabled community extensions, run on June 13, 2026. Green passed their own sqllogictest suites; red failed them (darkest red crashed the engine); gray never produced a verdict: no tests, unsupported test directives, or no artifact on the catalog. Of the 102 that actually ran tests, 58 passed.

That gap matters for two kinds of people. If you publish a DuckDB extension, the WASM build is the one target you probably can’t test by hand, and a green badge can hide a binary that throws the instant a browser user calls a function. If you build on DuckDB-WASM, an extension you rely on may be quietly missing in the browser while working everywhere else. Either way, the failure surfaces in front of your users instead of in your CI, which is the worst place to find it.

The canary

The jsonata extension lets you run JSONata expressions over JSON inside SQL. Its WASM build compiled fine, shipped to the catalog, and installed without complaint:

INSTALL jsonata FROM community;  -- ok
LOAD jsonata;                    -- ok
SELECT jsonata('Account', '{"Account": 5}');
-- TypeError: n is not a function

INSTALL and LOAD both succeed. The first real function call throws an opaque TypeError from inside the worker. A test that stops at “does it load” passes this and ships it. But nobody installs an extension just to load it; they call its functions, which is where this one fell over.
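To make the distinction concrete, here is a minimal sketch of a probe that records which stage failed instead of stopping at “does it load.” The names probeExtension and run are hypothetical, and run is assumed to be a synchronous callback that executes one SQL string and throws on error; a real engine API would most likely be asynchronous.

```javascript
// Sketch only: `probeExtension` and `run` are hypothetical names.
// `run(sql)` is assumed to execute one SQL string and throw on failure.
function probeExtension(run, name, callSql) {
  const stages = [
    ["install", `INSTALL ${name} FROM community;`],
    ["load", `LOAD ${name};`],
    ["call", callSql], // the step a load-only check never reaches
  ];
  for (const [stage, sql] of stages) {
    try {
      run(sql);
    } catch (err) {
      // Report the first stage that failed, with the engine's message.
      return { ok: false, failedAt: stage, error: String(err.message || err) };
    }
  }
  return { ok: true, failedAt: null, error: null };
}
```

Under those assumptions, the jsonata case above would come back with failedAt set to "call", which is exactly the verdict a load-only check cannot produce.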

Native builds catch exactly this. make test links the extension into DuckDB’s unittest binary and runs the extension’s own test/sql/*.test sqllogictest files against it. The WASM build runs none of that, because there’s no WASM unittest binary; the test runner is native C++. So WASM has no functional test layer at all.

Why “it compiled” tells you so little in WASM

On native, a loadable DuckDB extension is a shared library the engine dlopens. In WASM it’s an emscripten side module, linked with -sSIDE_MODULE=2, that the engine’s dynamic linker loads at runtime and resolves against the main module. Anything it can’t resolve doesn’t fail the load; it becomes a stub that only blows up when something calls it.

That gap between compiling and resolving is where a clean compile can still hide a broken extension. Extensions break in four ways, and the census hit all of them: a dependency that never got linked into the .wasm, a file read that assumes a real filesystem, an HTTP call that assumes real sockets, or a hard dependency on another extension that has no WASM build. None of this shows up at compile time; you only see it when you run the extension.
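The deferred failure is easier to see in a toy model. The sketch below is not emscripten’s loader; linkSideModule is a hypothetical function that mimics the behavior described above, where an unresolved import becomes a stub and the load itself still succeeds.

```javascript
// Toy model of lazy symbol resolution. This is NOT emscripten's real loader;
// `linkSideModule` is a hypothetical name used only for illustration.
function linkSideModule(importNames, mainModuleExports) {
  const resolved = {};
  for (const name of importNames) {
    if (typeof mainModuleExports[name] === "function") {
      resolved[name] = mainModuleExports[name];
    } else {
      // Unresolved symbol: linking does not fail here. The error is
      // deferred until something actually calls the stub.
      resolved[name] = () => {
        throw new TypeError(`unresolved symbol called: ${name}`);
      };
    }
  }
  return resolved;
}
```

In this model, linking a module that imports a symbol the main module does not export returns normally, and only the first call to that symbol throws. That is the same shape as an extension that installs and loads cleanly and then fails on its first function call.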

The plan: run their own tests, against the real engine, in Node

I didn’t want to write new tests. Every extension already ships a test/sql/*.test suite, the same sqllogictest files the native build runs. Those assertions are a much better oracle than any smoke query I’d invent, so the harness runs them against the published WASM engine, the way a user would actually get the extension.

The harness, haybarn-extension-wasm-tester (now open source), does this per extension:

  1. Census. Parse the community catalog descriptors, select the ones with WASM enabled.
  2. Fetch. Shallow-clone each extension’s repo at the exact ref the catalog shipped, and find its test/sql/*.test files.
  3. Run. Spin up a fresh @haybarn/haybarn-wasm engine in Node, install and load the extension from the catalog the way a user would, and run the test file’s records against it.
  4. Compare. Match results and errors the way sqllogictest does.
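Steps 3 and 4 can be sketched roughly as follows. This is a minimal illustration, not the harness’s actual code: parseRecords and gradeRecord are hypothetical names, the outcome object is an assumed shape, and only statement and query records are handled. Real test files also use directives that this sketch simply marks as unsupported.

```javascript
// Sketch only: parseRecords and gradeRecord are hypothetical names,
// not functions from haybarn-extension-wasm-tester.

// Split a sqllogictest file into records. Records are separated by blank
// lines; the first line of each record says what kind it is.
function parseRecords(text) {
  const records = [];
  for (const block of text.split(/\n\s*\n/)) {
    const lines = block
      .split("\n")
      .map((line) => line.trimEnd())
      // Drop blank lines and comment lines (a simplification: this would
      // also drop an expected-result line that starts with "#").
      .filter((line) => line !== "" && !line.startsWith("#"));
    if (lines.length === 0) continue;

    const [header, ...body] = lines;
    const [kind, arg] = header.split(/\s+/);
    // "----" separates the SQL from the expected output.
    const sep = body.indexOf("----");
    const sql = (sep === -1 ? body : body.slice(0, sep)).join("\n");
    const expected = sep === -1 ? [] : body.slice(sep + 1);

    if (kind === "statement" || kind === "query") {
      records.push({ kind, arg, sql, expected });
    } else {
      // Directives such as require or loop are not handled in this sketch.
      records.push({ kind: "unsupported", header });
    }
  }
  return records;
}

// Grade one record against what the engine did with its SQL.
// `outcome` is an assumed shape: { ok: true, rows: [...] } on success,
// { ok: false, error: "..." } on failure. Returns null for no verdict.
function gradeRecord(record, outcome) {
  if (record.kind === "statement") {
    return record.arg === "error" ? !outcome.ok : outcome.ok;
  }
  if (record.kind === "query") {
    if (!outcome.ok) return false;
    return (
      outcome.rows.length === record.expected.length &&
      outcome.rows.every((row, i) => row === record.expected[i])
    );
  }
  return null;
}
```

Returning null for unsupported records keeps “no verdict” separate from “fail,” which mirrors the split between the skip and fail categories in the results.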

Why Node instead of a browser? Node has a real filesystem, so the engine installs and caches extensions exactly the way the published packaging does, and it runs the same wasm_eh engine binary and the same extension binary a browser would pull.

What it found

The failures fell into five groups. Each has a specific cause, and most of the fixes are a few lines. One thing to be clear about first: these are bugs in the extensions themselves — in their source, or their build configuration — not in the WASM engine. They reproduce on the official duckdb-wasm build just the same. The harness is only the thing that finally ran them.

1. The missing library (LINKED_LIBS)

This is the n is not a function class, and it was the most common. The extension depends on a vcpkg C++ library — yaml-cpp, LibXml2, libxxhash, a QuickJS runtime — and links it the normal way, with target_link_libraries:

target_link_libraries(${LOADABLE_EXTENSION_NAME} yaml-cpp::yaml-cpp)

Correct for native. Ignored by the -sSIDE_MODULE=2 link, which only honors libraries named in the extension descriptor’s LINKED_LIBS. The dependency’s symbols are left undefined in the .wasm. The module still loads, and then the first call into the missing code throws. The fix is to name the library where the WASM link will actually see it:

duckdb_extension_load(yaml
    LINKED_LIBS "$<TARGET_FILE:yaml-cpp::yaml-cpp>"
)

That one move fixed hashfuncs, marisa, textplot, json_schema, quickjs, and jsonata itself.

2. Raw file I/O instead of DuckDB’s filesystem

An extension that reads its input with fopen or std::fstream works everywhere except the one place that has no host filesystem. A Stata-file reader, a FIT-file reader, a couple of others all tripped on this. The fix is to read through DuckDB’s own filesystem abstraction:

auto &fs = FileSystem::GetFileSystem(context);
auto handle = fs.OpenFile(path, FileFlags::FILE_FLAGS_READ);
// read via the handle instead of a FILE*

The bonus: for any path it opens through that abstraction, the extension also gets s3://, https://, and registered-buffer support on native builds. Raw fopen was leaving that on the table.

3. Raw HTTP instead of HTTPUtil

An extension that opens its own httplib client needs OS sockets, which WASM doesn’t have. DuckDB ships an HTTP abstraction (HTTPUtil) that’s already wired to the browser’s HTTP stack under WASM and to the normal stack natively. Routing requests through it makes the extension work in WASM and inherit DuckDB’s proxy and TLS settings on native. (Occasionally you find the opposite: one “HTTP stats” extension already subclassed HTTPUtil correctly, and its only failure was a test that hit a live network endpoint.)

4. A dependency that doesn’t exist in WASM

The delta_classic extension delegates all its work to the core delta extension, and delta is a Rust extension (delta-kernel-rs) that no DuckDB distribution builds for WASM, upstream included. The native builds are on the catalog, but extensions.duckdb.org/<version>/wasm_eh/delta.duckdb_extension.wasm 404s for every version I checked. No amount of fixing delta_classic helps; its dependency can’t be there. The honest fix is to mark it WASM-excluded so users get a clear message instead of a baffling signature error.
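A check like this is easy to script. The sketch below builds the artifact URL from the pattern quoted above and asks whether it exists. Both function names are hypothetical, the host and version are parameters rather than known values, and it assumes the server answers HEAD requests and a Node version that ships a global fetch.

```javascript
// Sketch only: hypothetical helper names. Assumes the URL layout quoted in
// the text and a server that responds to HEAD requests.
function wasmArtifactUrl(host, version, name, platform = "wasm_eh") {
  return `https://${host}/${version}/${platform}/${name}.duckdb_extension.wasm`;
}

// Resolves to true if the artifact exists (2xx), false for a 404 or any
// other non-2xx status. `fetchImpl` is injectable so the check can be
// exercised without touching the network.
async function isDeployed(url, fetchImpl = fetch) {
  const res = await fetchImpl(url, { method: "HEAD" });
  return res.ok;
}
```

The same check would also flag the not-deployed category: extensions declared WASM-enabled with no artifact behind the URL.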

5. Tests that were never WASM’s fault

A few “failures” were the test files, not the engine — a suite asserting True/False where the engine renders true/false, a numeric test rounding to one more decimal place than the platform reproduces. Real signal, just pointed at the test rather than the binary. Fixing those (and teaching the comparator that true and 1 are the same boolean) cleared them.
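A boolean-tolerant cell comparison might look like the sketch below. It is one possible approach, not the harness’s comparator: cellsMatch is a hypothetical name, and treating every "1" or "0" as boolean-like regardless of column type is a simplifying assumption. It also ignores letter case for true and false, which is looser than fixing the test files themselves.

```javascript
// Sketch only: `cellsMatch` is a hypothetical name. It treats "true"/"1" and
// "false"/"0" as interchangeable without consulting the column type, which
// is a simplification a real comparator may not want.
const BOOLEAN_FORMS = new Map([
  ["true", true],
  ["1", true],
  ["false", false],
  ["0", false],
]);

function cellsMatch(expected, actual) {
  if (expected === actual) return true;
  const e = BOOLEAN_FORMS.get(String(expected).trim().toLowerCase());
  const a = BOOLEAN_FORMS.get(String(actual).trim().toLowerCase());
  // Only fall back to boolean comparison when both sides look boolean.
  return e !== undefined && a !== undefined && e === a;
}
```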

The scoreboard

Here’s one full census run: Haybarn engine v1.5.3, wasm_eh, every WASM-enabled community extension, tested on June 13, 2026.

Status          Count   Meaning
pass               58   every runnable record passed
fail               43   a record produced a wrong result or unexpected error
skip                9   every test file used directives the runner doesn’t support
no-tests            5   the repo ships no test/sql/*.test files
not-deployed        8   declared WASM-enabled, but no artifact on the catalog
crash               1   the engine died loading it
Total surveyed    124

That run executed 11,219 sqllogictest records across 1,088 test files, with 873 record failures. If you drop the categories that aren’t really a verdict on a shipped binary (not-deployed, no-tests, skip), 102 extensions actually ran records (pass, fail, or crash), and 58 of them pass — a 57% pass rate. A failure here usually doesn’t mean a broken extension so much as one that hasn’t been adapted to WASM’s constraints yet: no filesystem, no sockets, stricter linking.
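The arithmetic behind those figures, using only the counts from the table above:

```javascript
// Counts copied from the scoreboard table.
const counts = {
  pass: 58,
  fail: 43,
  crash: 1,
  skip: 9,
  "no-tests": 5,
  "not-deployed": 8,
};

// Everything surveyed, whatever the outcome.
const total = Object.values(counts).reduce((sum, n) => sum + n, 0);

// Only pass, fail, and crash are a verdict on a shipped binary.
const ran = counts.pass + counts.fail + counts.crash;

// Pass rate among extensions that actually ran records, rounded to a percent.
const passRate = Math.round((counts.pass / ran) * 100);

console.log(`${total} surveyed, ${ran} ran, ${passRate}% pass`);
// prints "124 surveyed, 102 ran, 57% pass"
```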

The fixes aren’t hypothetical. The 18 WASM extensions Query.Farm maintains all pass now — several of them used to load and then throw on the first call — and the harness will catch it if any regress. The same fixes apply to everyone else’s extensions, which is why I sent them upstream.

I didn’t root-cause all 43 failures, but for the third-party ones I could diagnose, I opened fifteen issues (all linked in the appendix below). Each names the file and symbol, explains the LINKED_LIBS, filesystem, or HTTP fix, and includes code to copy. They’re written against DuckDB-WASM in general rather than Haybarn specifically, since the patterns apply to any distribution.

What I’d tell anyone shipping WASM extensions

A passing compile only tells you the linker was happy. For WASM the gap between that and a working extension is wide, because most of the failures live in runtime symbol resolution and the absence of a filesystem or sockets. The only way to know is to load the artifact and call into it.

When you do, the fixes are usually small: a LINKED_LIBS entry, a FileSystem::OpenFile instead of fopen, an HTTPUtil call instead of raw sockets. The hard part was never the fix. It’s seeing the failure at all, which a green compile badge will happily hide.

There’s nothing Haybarn-specific about the check itself — it runs any extension’s own test suite against any duckdb-wasm engine, so it would fit just as well in the community-extensions release pipeline upstream. Once it’s had proper review, I’d like to see DuckDB run something like it before publishing WASM artifacts. A compile that never executed isn’t much of a guarantee, and this is a cheap way to turn it into one.

The harness is on GitHub, MIT-licensed. It’s standalone on purpose, not wired into any build pipeline, so I can point it at the catalog by hand whenever I want a real answer instead of a compile:

# one extension, or drop --only to run the whole catalog
node bin/test.mjs --community-dir ../haybarn-community-extensions --only jsonata

Appendix: full results

Engine v1.5.3, wasm_eh, tested June 13, 2026. This is a snapshot; some of the failing extensions below have fixes filed or already deployed since.

Each extension links to its source repository.

Pass (58)

Fail (43)

These didn’t pass on the test date. In most cases that’s a small, WASM-specific gap rather than a broken extension; where I diagnosed one, the specific fix is in the issues filed below.

Crash (1): netquack

Skip — unsupported test directives, no functional verdict (9): arrow, cozip, delta_export, harbor, mpduck, nsv, pivot_table, sitemap, web_search

No tests in repo (5): duckdbi, duckgl, ducklake_cdc, duckorch, sheetreader

Declared WASM-enabled but not deployed to the catalog (8): elasticsearch, flock, gdx, miint, nats_js, sitting_duck, valhalla_routing, webmacro

Issues filed (15)

The issues I opened on third-party extensions, grouped by the fix each one recommends.

Read files through DuckDB’s FileSystem instead of raw fopen/fstream:

Name the dependency in LINKED_LIBS so the WASM link includes it:

Make HTTP requests through DuckDB’s HTTPUtil (or exclude WASM) instead of raw sockets:

Exclude the WASM platforms, because the extension can’t work there as built:
