duckdb-spanner

A DuckDB extension for querying Google Cloud Spanner databases directly from DuckDB.

Experimental: This extension is under active development. The API (function signatures, named parameters, configuration) may change without notice between versions. Do not depend on backward compatibility.

Both GoogleSQL and PostgreSQL dialect databases are supported transparently. Dialect detection is automatic — no configuration needed. spanner_query and spanner_scan work the same way regardless of the database dialect.

Prerequisites

Rust (stable)
DuckDB CLI (for running the extension)
Python 3 (for make extension metadata appending)
Make

Known Issues

This project depends on a patched version of gcloud-spanner via [patch.crates-io].
The upstream RowIterator::try_recv() discards metadata and stats when a PartialResultSet has empty values, which breaks columns_metadata() and stats() for QueryMode::Plan, empty result sets, and other scenarios.
See yoshidan/google-cloud-rust#428 for details.
database_role (fine-grained access control) is not yet supported as a named parameter. The upstream gcloud-spanner SessionConfig does not expose creator_role for session creation (the underlying BatchCreateSessionsRequest.session_template supports it, but the Rust client hardcodes it to None).
Results are streamed via an internal channel. Memory usage is bounded regardless of result set size.
use_parallelism enables the partitioned API (required for Data Boost), but partitions are currently executed sequentially. True concurrent partition execution is planned for a future release.

Installation

Build from Source

Clone the repository with submodules:

git clone --recurse-submodules https://github.com/apstndb/duckdb-spanner.git
cd duckdb-spanner

Build the extension with metadata (recommended):

make extension

Or build with cargo directly (without extension metadata):

cargo build --features loadable-extension --release

The built extension is spanner.duckdb_extension (via make extension) or target/release/libduckdb_spanner.dylib (macOS) / libduckdb_spanner.so (Linux).

Loading the Extension

This is an unsigned extension, so DuckDB must be started with the -unsigned flag:

duckdb -unsigned

Then load the extension:

LOAD 'path/to/spanner.duckdb_extension';

Or set the option programmatically before loading:

SET allow_unsigned_extensions = true;
LOAD 'path/to/spanner.duckdb_extension';

For a quick build-and-launch workflow:

make duckdb

Authentication

This extension uses Application Default Credentials (ADC).
For local development with end-user credentials, install the Google Cloud SDK and run:

gcloud auth application-default login

When using the Spanner emulator, no authentication is required. Either set the SPANNER_EMULATOR_HOST environment variable or pass endpoint directly:

export SPANNER_EMULATOR_HOST=localhost:9010

Getting Started

Run a SQL Query

SELECT * FROM spanner_query(
    'projects/myproj/instances/myinst/databases/mydb',
    'SELECT Id, Name FROM Users WHERE Age > 20'
);

Read a Table

SELECT * FROM spanner_scan(
    'projects/myproj/instances/myinst/databases/mydb',
    'Users'
);

With the Spanner Emulator

export SPANNER_EMULATOR_HOST=localhost:9010

-- endpoint is inferred from SPANNER_EMULATOR_HOST
SELECT * FROM spanner_query(
    'projects/test/instances/test/databases/test',
    'SELECT * FROM Users'
);

-- Or pass endpoint explicitly
SELECT * FROM spanner_query(
    'projects/test/instances/test/databases/test',
    'SELECT * FROM Users',
    endpoint := 'localhost:9010'
);

Table Functions

`spanner_query`

Runs arbitrary Spanner SQL. Implemented as a table macro that wraps spanner_query_raw, accepting a STRUCT for params and automatically converting it to JSON.

SELECT * FROM spanner_query(
    'projects/myproj/instances/myinst/databases/mydb',
    'SELECT * FROM Users WHERE Age > @min_age',
    params := {'min_age': spanner_value(21::BIGINT)},
    exact_staleness_secs := 10,
    use_data_boost := true
);

`spanner_scan`

Reads all rows from a table using the Spanner Read API.

SELECT * FROM spanner_scan(
    'projects/myproj/instances/myinst/databases/mydb',
    'Users',
    index := 'UsersByName'
);

Named Parameters

Named parameters are inspired by BigQuery's EXTERNAL_QUERY and CloudSpannerProperties.

Both functions accept the following named parameters:

Parameter	Type	Default	Description
`endpoint`	VARCHAR	(production)	Custom gRPC endpoint (e.g., `localhost:9010` for the emulator)
`use_parallelism`	BOOLEAN	`false`	Use partitioned query/read for parallel execution
`use_data_boost`	BOOLEAN	`false`	Enable Data Boost
`max_parallelism`	INTEGER	(default)	Maximum number of partitions
`exact_staleness_secs`	BIGINT		Read at an exact staleness (seconds ago)
`max_staleness_secs`	BIGINT		Read with bounded staleness (at most N seconds ago)
`read_timestamp`	VARCHAR		Read at a specific timestamp (RFC 3339, e.g., `'2024-01-15T10:30:00Z'`)
`min_read_timestamp`	VARCHAR		Read at a timestamp no earlier than this (RFC 3339)
`priority`	VARCHAR		Request priority: `'low'`, `'medium'`, or `'high'`

At most one timestamp bound parameter can be specified. If none is set, Spanner uses a strong read (the default).

spanner_query additionally accepts:

Parameter	Type	Description
`params`	STRUCT	Query parameters (see Query Parameters)

spanner_scan additionally accepts:

Parameter	Type	Description
`index`	VARCHAR	Secondary index name to use for the read
`dialect`	VARCHAR	Database dialect: `'googlesql'` or `'postgresql'` (auto-detected if omitted)

Query Parameters

The params parameter accepts a STRUCT mapping parameter names to values. Two helper macros format values for Spanner type compatibility.

`spanner_value(val)` -- Auto-Detect Type

Infers the Spanner type from DuckDB's typeof() and converts the value to a Spanner-compatible format.

spanner_value(42::BIGINT)                           -- {"value":42,"type":"INT64"}
spanner_value('2024-01-15'::DATE)                   -- {"value":"2024-01-15","type":"DATE"}
spanner_value('\xDEAD'::BLOB)                       -- {"value":"3kFE","type":"BYTES"}
spanner_value('2024-06-15T10:30:00Z'::TIMESTAMPTZ)  -- {"value":"2024-06-15T10:30:00.000000Z","type":"TIMESTAMP"}
spanner_value(NULL::BIGINT)                         -- {"value":null,"type":"INT64"}

See DuckDB to Spanner type mapping for the full conversion table.

`spanner_typed(val, type_name)` -- Explicit Type

For cases where auto-detection is insufficient, specify the Spanner type name directly.

spanner_typed(NULL, 'INT64')    -- {"value":null,"type":"INT64"}
spanner_typed('abc', 'STRING')  -- {"value":"abc","type":"STRING"}

Plain Values

Values without spanner_value() or spanner_typed() wrappers are passed as raw JSON. Types are inferred from the JSON representation (number, string, boolean, null).

-- Mix plain and typed values
params := {'id': 1, 'name': spanner_value('Alice')}

Type Mapping

DuckDB to Spanner (Query Parameters)

spanner_value() maps DuckDB types to Spanner types. Values are serialized as JSON internally:

DuckDB Type	Spanner Type	JSON Value
BOOLEAN	BOOL	boolean
TINYINT, SMALLINT, INTEGER, BIGINT	INT64	number
FLOAT	FLOAT32	number
DOUBLE	FLOAT64	number
VARCHAR	STRING	string
DATE	DATE	string
TIMESTAMP	TIMESTAMP	string (RFC 3339: `%Y-%m-%dT%H:%M:%S.%fZ`)
TIMESTAMP WITH TIME ZONE	TIMESTAMP	string (normalized to UTC, RFC 3339)
BLOB	BYTES	string (base64 encoded)
HUGEINT, UBIGINT, DECIMAL	NUMERIC	string (preserves precision)
UUID	UUID	string
INTERVAL	INTERVAL	string (ISO 8601 duration)
JSON	JSON	any JSON value
TIME	STRING	string
BIT	BYTES	string (base64 encoded)
T[]	ARRAY<T>	JSON array (elements follow scalar conversion rules)

spanner_value() also supports array types:

spanner_value([1, 2, 3])             -- {"value":[1,2,3],"type":"ARRAY<INT64>"}
spanner_value(['a', 'b'])            -- {"value":["a","b"],"type":"ARRAY<STRING>"}
spanner_value([true, false])         -- {"value":[true,false],"type":"ARRAY<BOOL>"}

STRUCT parameters are not supported.

Spanner to DuckDB (Query Results)

Spanner Type	DuckDB Type
BOOL	BOOLEAN
INT64	BIGINT
FLOAT32	FLOAT
FLOAT64	DOUBLE
NUMERIC	DECIMAL(38,9)
STRING	VARCHAR
JSON	VARCHAR (aliased as JSON)
BYTES	BLOB
DATE	DATE
TIMESTAMP	TIMESTAMP WITH TIME ZONE
UUID	UUID
INTERVAL	INTERVAL
ARRAY<T>	LIST(T)
STRUCT<...>	STRUCT(...)
PROTO	BLOB
ENUM	BIGINT

Testing

Shell Tests

Shell tests run SQL queries against a DuckDB CLI with the extension loaded:

make test

Integration Tests

Integration tests require a Spanner emulator:

# Start the emulator
make emulator-start

# Run integration tests (uses testcontainers, but the emulator must be available)
cargo test

Advanced

`spanner_query_raw` -- Low-Level Table Function

The underlying table function that spanner_query wraps. Use this when:

You have a pre-built JSON string for params (e.g., from application code)
Table macro expansion causes issues in your client library or tooling

SELECT * FROM spanner_query_raw(
    'projects/myproj/instances/myinst/databases/mydb',
    'SELECT * FROM Users WHERE Age > @min_age',
    params := '{"min_age": {"value": 21, "type": "INT64"}}'
);

The params parameter is a VARCHAR containing a JSON object. Use spanner_params() to build it from a STRUCT:

SELECT * FROM spanner_query_raw(
    'projects/myproj/instances/myinst/databases/mydb',
    'SELECT * FROM Users WHERE Age > @min_age',
    params := spanner_params({'min_age': spanner_value(21::BIGINT)})
);

Registered Names

This extension registers the following names into the global DuckDB namespace.

Name	Kind	Description
`spanner_query`	table macro	Wraps `spanner_query_raw` with ergonomic params (see Table Functions)
`spanner_query_raw`	table function	Execute Spanner SQL (see Low-Level Table Function)
`spanner_scan`	table function	Read a Spanner table (see Table Functions)
`spanner_value(val)`	scalar macro	Auto-detect Spanner type from DuckDB type (see Query Parameters)
`spanner_typed(val, type_name)`	scalar macro	Explicit Spanner type wrapper (see Query Parameters)
`spanner_params(s)`	scalar macro	Convert a STRUCT to a JSON params string for `spanner_query_raw`
`interval_to_iso8601(i)`	scalar macro	Convert a DuckDB INTERVAL to an ISO 8601 duration string (e.g., `'P1Y3M'`, `'PT2H30M'`)
`_spanner_type_name(t)`	scalar macro	Internal: map a DuckDB type name (scalar or array) to a Spanner type name
`_spanner_scalar_type_name(t)`	scalar macro	Internal: scalar-only type name mapping used by `_spanner_type_name`

Names prefixed with _ are internal implementation details and may change without notice.

Convenience Macro Pattern

You can define a convenience macro that hardcodes your database and endpoint:

CREATE MACRO my_query(
    sql, params := NULL, use_parallelism := NULL,
    use_data_boost := NULL, max_parallelism := NULL,
    exact_staleness_secs := NULL, max_staleness_secs := NULL,
    read_timestamp := NULL, min_read_timestamp := NULL,
    priority := NULL
) AS TABLE
SELECT * FROM spanner_query(
    'projects/myproj/instances/myinst/databases/mydb',
    sql,
    endpoint := 'localhost:9010',
    params := params,
    use_parallelism := use_parallelism,
    use_data_boost := use_data_boost,
    max_parallelism := max_parallelism,
    exact_staleness_secs := exact_staleness_secs,
    max_staleness_secs := max_staleness_secs,
    read_timestamp := read_timestamp,
    min_read_timestamp := min_read_timestamp,
    priority := priority
);

-- Usage
SELECT * FROM my_query('SELECT * FROM Users');
SELECT * FROM my_query(
    'SELECT * FROM Users WHERE Age > @min',
    params := {'min': spanner_value(21::BIGINT)}
);

apstndb/duckdb-spanner

duckdb-spanner

Prerequisites

Known Issues

Installation

Build from Source

Loading the Extension

Authentication

Getting Started

Run a SQL Query

Read a Table

With the Spanner Emulator

Table Functions

`spanner_query`

`spanner_scan`

Named Parameters

Query Parameters

`spanner_value(val)` -- Auto-Detect Type

`spanner_typed(val, type_name)` -- Explicit Type

Plain Values

Type Mapping

DuckDB to Spanner (Query Parameters)

Spanner to DuckDB (Query Results)

Testing

Shell Tests

Integration Tests

Advanced

`spanner_query_raw` -- Low-Level Table Function

Registered Names

Convenience Macro Pattern

On this page

Contributors

Latest Release

apstndb/duckdb-spanner

duckdb-spanner

Prerequisites

Known Issues

Installation

Build from Source

Loading the Extension

Authentication

Getting Started

Run a SQL Query

Read a Table

With the Spanner Emulator

Table Functions

spanner_query

spanner_scan

Named Parameters

Query Parameters

spanner_value(val) -- Auto-Detect Type

spanner_typed(val, type_name) -- Explicit Type

Plain Values

Type Mapping

DuckDB to Spanner (Query Parameters)

Spanner to DuckDB (Query Results)

Testing

Shell Tests

Integration Tests

Advanced

spanner_query_raw -- Low-Level Table Function

Registered Names

Convenience Macro Pattern

On this page

Contributors

Latest Release

`spanner_query`

`spanner_scan`

`spanner_value(val)` -- Auto-Detect Type

`spanner_typed(val, type_name)` -- Explicit Type

`spanner_query_raw` -- Low-Level Table Function