Official Python client for DataSetIQ, the Modern Economic Data Platform. Access millions of datasets with pandas-ready DataFrames.
DataSetIQ Python Client
Official Python SDK for DataSetIQ, the Modern Economic Data Platform
Features
- Millions of Macro Datasets: Access FRED, BLS, Census, World Bank, IMF, OECD, and more
- Pandas-Ready: Returns clean DataFrames with date index
- Intelligent Caching: Disk-based caching with TTL (24h default)
- Automatic Retries: Exponential backoff with Retry-After support
- Free Tier: 25 requests/minute + 25 AI insights/month
- Type-Safe Errors: Helpful exception messages with upgrade paths
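The automatic-retry behavior listed above can be pictured with a short sketch. This is not the client's actual implementation: `call_with_retries`, `backoff_delays`, the `request` callable, the 0.5-second base delay, and the choice to retry on 429 and 5xx responses are all assumptions made for illustration. Only the defaults of 3 retries and a 20-second cap mirror the documented `max_retries` and `max_retry_sleep` options.

```python
import time


def backoff_delays(max_retries=3, base=0.5, max_total_sleep=20.0):
    """Yield delays that double on each attempt, capped in total."""
    total = 0.0
    for attempt in range(max_retries):
        delay = min(base * (2 ** attempt), max_total_sleep - total)
        if delay <= 0:
            return
        total += delay
        yield delay


def call_with_retries(request, max_retries=3, base=0.5,
                      max_total_sleep=20.0, sleep=time.sleep):
    """Call request() until it succeeds or the retries run out.

    request is a hypothetical callable returning (status, headers, body).
    """
    delays = backoff_delays(max_retries, base, max_total_sleep)
    while True:
        status, headers, body = request()
        # Assumption: only rate limits (429) and server errors (5xx) are retried.
        if status != 429 and status < 500:
            return status, body
        delay = next(delays, None)
        if delay is None:
            return status, body  # retries exhausted; the caller decides what to do
        # A numeric Retry-After header takes precedence over the computed delay.
        retry_after = headers.get("Retry-After")
        if retry_after is not None and retry_after.isdigit():
            delay = min(float(retry_after), max_total_sleep)
        sleep(delay)
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without actually waiting.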
Installation
pip install datasetiq
Requirements: Python 3.9+
Quick Start
1. Get Your Free API Key
Visit datasetiq.com/dashboard/api-keys to create a free account and generate your API key.
2. Fetch Economic Data
import datasetiq as iq
# Set your API key
iq.set_api_key("diq_your_key_here")
# Get time series data as a Pandas DataFrame
df = iq.get("fred-cpi")
print(df.head())
Output:
            value
date
1947-01-01  21.48
1947-02-01  21.62
1947-03-01  22.00
1947-04-01  22.00
1947-05-01  21.95
3. Plot It
import matplotlib.pyplot as plt
df['value'].plot(title="Consumer Price Index", figsize=(12, 6))
plt.ylabel("CPI")
plt.show()
API Reference
Core Functions
get(series_id, start=None, end=None, dropna=False)
Fetch time series data as a Pandas DataFrame.
Parameters:
- series_id (str): Series identifier (e.g., "fred-cpi", "bls-unemployment")
- start (str, optional): Start date in YYYY-MM-DD format
- end (str, optional): End date in YYYY-MM-DD format
- dropna (bool): Drop rows with NaN values (default: False)
Returns: pd.DataFrame with date index and value column
Example:
# Get recent data
df = iq.get("fred-gdp", start="2020-01-01", end="2023-12-31")
# Preserve data gaps (default)
df = iq.get("fred-cpi", dropna=False)
# Drop missing values
df = iq.get("fred-cpi", dropna=True)
search(query, limit=10, offset=0)
Search for datasets by keyword.
Parameters:
- query (str): Search term (searches titles, descriptions, IDs)
- limit (int): Max results to return (default: 10, max: 10)
- offset (int): Pagination offset (default: 0)
- mode (str): "keyword" (default) or "semantic" (where supported by API)
Returns: pd.DataFrame with columns: id, slug, title, description, provider, frequency, start_date, end_date, last_updated
Example:
results = iq.search("unemployment rate")
print(results[["id", "title", "provider"]])
# Output:
#                 id                     title provider
# 0      fred-unrate  Unemployment Rate (U.S.)     FRED
# 1  bls-lns14000000   Labor Force: Unemployed      BLS
Feature Engineering Helpers
add_features(series, lags=(1,3,12), windows=(3,6,12), include=None, dropna=False)
Generate common modeling features (lags, rolling stats, MoM/YoY %, z-scores) for a single series.
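To make the feature names concrete, here is a rough sketch of how lags, percentage changes, and rolling means could be computed on a plain list of monthly values. It is written in plain Python (no pandas) to stay dependency-free and is not the library's implementation. The helper names and the `value_roll_mean_<w>` column name are assumptions; `value_lag_<k>`, `value_mom_pct`, and `value_yoy_pct` follow the columns used in the example that follows.

```python
def lag(values, k):
    """Shift a series by k periods; the first k entries have no history."""
    return ([None] * k + list(values))[:len(values)]


def pct_change(values, periods=1):
    """Percentage change versus the observation `periods` steps earlier."""
    out = []
    for i, v in enumerate(values):
        prev = values[i - periods] if i >= periods else None
        if v is None or prev is None or prev == 0:
            out.append(None)
        else:
            out.append((v / prev - 1) * 100)
    return out


def rolling_mean(values, window):
    """Mean of the trailing `window` observations, None until enough history."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


def add_basic_features(values, lags=(1, 3, 12), windows=(3, 6, 12)):
    """Build a dict of feature columns, assuming monthly observations."""
    cols = {"value": list(values)}
    for k in lags:
        cols[f"value_lag_{k}"] = lag(values, k)
    cols["value_mom_pct"] = pct_change(values, 1)   # month over month
    cols["value_yoy_pct"] = pct_change(values, 12)  # year over year
    for w in windows:
        cols[f"value_roll_mean_{w}"] = rolling_mean(values, w)  # hypothetical name
    return cols
```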
df = iq.add_features("fred-cpi", lags=[1, 3, 12], windows=[3, 12])
print(df[["value", "value_yoy_pct", "value_mom_pct", "value_lag_1"]].tail())
Lightweight Insights
get_insight(series, window="1y")
Return a small dict with summary text + key metrics (latest value, MoM, YoY, volatility, trend).
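As a rough illustration of what such a summary involves, the sketch below derives the same kinds of metrics from a list of monthly values. It is not how `get_insight` is implemented. The `summarize` function, every dict key other than `summary`, the use of sample standard deviation for volatility, and the first-versus-last trend rule are all assumptions.

```python
from statistics import stdev


def pct_change(new, old):
    """Percentage change from old to new."""
    return (new / old - 1) * 100


def summarize(series_id, dates, values, periods_per_year=12):
    """Summarize a series; needs at least two observations."""
    latest, prior = values[-1], values[-2]
    window = values[-periods_per_year:]
    mom = pct_change(latest, prior)
    # YoY needs an observation exactly one year before the latest one.
    yoy = (
        pct_change(latest, values[-1 - periods_per_year])
        if len(values) > periods_per_year
        else None
    )
    # Assumed trend rule: compare the last and first values in the window.
    if window[-1] > window[0]:
        trend = "upward"
    elif window[-1] < window[0]:
        trend = "downward"
    else:
        trend = "flat"
    volatility = stdev(window)
    parts = [
        f"{series_id}: latest {latest:.2f} on {dates[-1]}",
        f"{mom:+.2f}% vs prior",
    ]
    if yoy is not None:
        parts.append(f"{yoy:+.2f}% YoY")
    parts.append(f"trend {trend}")
    parts.append(f"volatility (std) {volatility:.2f}")
    return {
        "summary": " | ".join(parts),
        "latest": latest,
        "mom_pct": mom,
        "yoy_pct": yoy,
        "trend": trend,
        "volatility": volatility,
    }
```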
insight = iq.get_insight("fred-cpi", window="1y")
print(insight["summary"])
# fred-cpi: latest 311.17 on 2023-12-01 | +0.24% vs prior | +3.12% YoY | trend upward | volatility (std) 1.23
ML-Ready Bundles
get_ml_ready(series_ids, align="inner", impute="ffill+median", features="default")
Fetch multiple series, align on date, impute gaps, and add per-series features (lags, rolling stats, MoM/YoY %, z-score).
Requires API key on a paid plan.
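The alignment and imputation steps can be sketched in plain Python. This is one possible reading of `align="inner"` and `impute="ffill+median"`, not the library's confirmed behavior: the helper names are hypothetical, the sample numbers are made up, and the sketch assumes "ffill+median" means forward-fill first, then fill any remaining leading gaps with the median of the observed values.

```python
from statistics import median


def align_inner(series_by_name):
    """Keep only the dates present in every series (an inner join on date)."""
    if not series_by_name:
        return [], {}
    common = set.intersection(*(set(s) for s in series_by_name.values()))
    dates = sorted(common)  # ISO YYYY-MM-DD strings sort chronologically
    aligned = {name: [s[d] for d in dates] for name, s in series_by_name.items()}
    return dates, aligned


def impute_ffill_median(values):
    """Forward-fill gaps; leading gaps fall back to the median of observed values."""
    observed = [v for v in values if v is not None]
    fallback = median(observed) if observed else None
    out, last = [], None
    for v in values:
        if v is not None:
            last = v
        out.append(last if last is not None else fallback)
    return out


# Made-up sample data keyed by ISO date string.
cpi = {"2020-01-01": 258.7, "2020-02-01": None, "2020-03-01": 258.1, "2020-04-01": 256.4}
gdp = {"2020-01-01": 21.5, "2020-04-01": 19.5}
dates, aligned = align_inner({"cpi": cpi, "gdp": gdp})
print(dates)  # only the dates both series share
```

An inner join keeps the result free of rows where one series has no observation at all, at the cost of dropping dates that only a higher-frequency series covers.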
df = iq.get_ml_ready(
    ["fred-cpi", "fred-gdp"],
    align="inner",
    impute="ffill+median",
    features="default",
    lags=[1, 3, 12],
    windows=[3, 12],
)
print(df.head())
Configuration
set_api_key(api_key)
Set your DataSetIQ API key.
iq.set_api_key("diq_your_key_here")
configure(**options)
Customize client behavior.
Options:
- api_key (str): Your API key
- base_url (str): API base URL (default: https://www.datasetiq.com/api/public)
- timeout (tuple): (connect_timeout, read_timeout) in seconds (default: (3.05, 30))
- max_retries (int): Max retry attempts (default: 3)
- max_retry_sleep (int): Cap total backoff time in seconds (default: 20)
- anon_max_pages (int): Safety limit for anonymous pagination (default: 200)
- data_cache_ttl (int): Cache TTL for time series data in seconds (default: 86400 / 24h)
- search_cache_ttl (int): Cache TTL for search results in seconds (default: 900 / 15m)
- enable_cache (bool): Enable/disable disk caching (default: True)
Example:
iq.configure(
    api_key="diq_your_key_here",
    max_retries=5,
    data_cache_ttl=3600,  # 1 hour cache
    enable_cache=True
)
Cache Management
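For readers curious how a disk cache with a TTL works in general, here is a minimal sketch. It is not the client's cache: the `TTLDiskCache` class, the JSON-file layout, and the hashed file names are assumptions. Only the idea of expiring entries after a configurable number of seconds (24h by default) comes from the options above.

```python
import hashlib
import json
import os
import tempfile
import time


class TTLDiskCache:
    """Store JSON-serializable payloads on disk and expire them after a TTL."""

    def __init__(self, directory, ttl_seconds=86400, clock=time.time):
        self.directory = directory
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised without waiting
        os.makedirs(directory, exist_ok=True)

    def _path(self, key):
        # Hash the key so any series ID maps to a safe file name.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return os.path.join(self.directory, digest + ".json")

    def get(self, key):
        path = self._path(key)
        try:
            with open(path, "r", encoding="utf-8") as fh:
                entry = json.load(fh)
        except (FileNotFoundError, json.JSONDecodeError):
            return None
        if self.clock() - entry["stored_at"] > self.ttl_seconds:
            os.remove(path)  # stale entry: drop it and report a miss
            return None
        return entry["payload"]

    def set(self, key, payload):
        with open(self._path(key), "w", encoding="utf-8") as fh:
            json.dump({"stored_at": self.clock(), "payload": payload}, fh)

    def clear(self):
        """Delete every cached file and return how many were removed."""
        removed = 0
        for name in os.listdir(self.directory):
            if name.endswith(".json"):
                os.remove(os.path.join(self.directory, name))
                removed += 1
        return removed


cache = TTLDiskCache(tempfile.mkdtemp(), ttl_seconds=3600)
cache.set("fred-cpi", [["1947-01-01", 21.48]])
print(cache.get("fred-cpi"))
```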
clear_cache()
Clear all cached data.
count = iq.clear_cache()
print(f"Cleared {count} cached files")
get_cache_size()
Get cache statistics.
file_count, total_bytes = iq.get_cache_size()
print(f"Cache: {file_count} files, {total_bytes / 1024 / 1024:.2f} MB")
Authentication Modes
Authenticated Mode (Recommended)
With API Key:
- Full CSV exports (all observations)
- Higher rate limits (25-500 RPM based on plan)
- Access to AI insights and premium features
- Date filtering support
iq.set_api_key("diq_your_key_here")
df = iq.get("fred-cpi")  # Full dataset
Anonymous Mode
Without API Key:
- Returns latest 100 observations only (most recent data)
- Lower rate limits (5 RPM)
- Metadata-only for some datasets
- No date filtering support
# No API key set
df = iq.get("fred-cpi") # Latest 100 observations only
print(df.tail())  # Most recent data points
Error Handling
All errors include helpful messages, with upgrade paths where relevant, to guide you toward a solution.
Authentication Required (401)
try:
    df = iq.get("fred-cpi")
except iq.AuthenticationError as e:
    print(e)
# Output:
# [UNAUTHORIZED] Authentication required
#
# GET YOUR FREE API KEY:
# → https://www.datasetiq.com/dashboard/api-keys
# ...
Rate Limit Exceeded (429)
try:
    df = iq.get("fred-cpi")
except iq.RateLimitError as e:
    print(e)
# Output:
# [RATE_LIMITED] Rate limit exceeded: 26/25 requests this minute
#
# RATE LIMIT REACHED:
# 26/25 requests this minute
#
# INCREASE YOUR LIMITS:
# → https://www.datasetiq.com/pricing
# ...
Quota Exceeded (429)
try:
    # Generate 26th basic insight on free plan
    pass
except iq.QuotaExceededError as e:
    print(e.metric)   # "insight_basic"
    print(e.current)  # 26
    print(e.limit)    # 25
Series Not Found (404)
try:
    df = iq.get("invalid-series-id")
except iq.NotFoundError as e:
    print(e)
# Output:
# [NOT_FOUND] Series not found
#
# SERIES NOT FOUND
#
# TIP: Search for series first:
# import datasetiq as iq
# results = iq.search('unemployment rate')
# ...
Advanced Examples
Comparing Multiple Series
import datasetiq as iq
import pandas as pd
# Fetch multiple series
cpi = iq.get("fred-cpi", start="2020-01-01")
gdp = iq.get("fred-gdp", start="2020-01-01")
# Merge on date
df = pd.merge(
    cpi.rename(columns={"value": "CPI"}),
    gdp.rename(columns={"value": "GDP"}),
    left_index=True,
    right_index=True,
    how="outer"
)
print(df.head())
Calculate Year-over-Year Change
df = iq.get("fred-cpi", start="2015-01-01")
# Calculate YoY % change
df['yoy_change'] = df['value'].pct_change(periods=12) * 100
print(df.tail())
Export to Excel
df = iq.get("fred-gdp")
df.to_excel("gdp_data.xlsx")
Development
Setup
git clone https://github.com/DataSetIQ/datasetiq-python.git
cd datasetiq-python
pip install -e ".[dev]"
Run Tests
pytest
Code Formatting
black datasetiq tests
ruff check datasetiq tests
Stability & API Guarantees
Current Status: Beta (0.x versions)
- Breaking changes may occur between minor versions (e.g., 0.1.x → 0.2.x)
- Core functions (get(), set_api_key()) are stable and tested
- v1.0 release will follow semantic versioning with backward compatibility guarantees
- Subscribe to GitHub releases for updates
Roadmap
- Add get_insight() for AI-generated analysis
- Support batch requests: iq.get_many(["fred-cpi", "fred-gdp"])
- Async support: await iq.get_async("fred-cpi")
- Streaming for large datasets
- Jupyter notebook integration (progress bars)
Resources
- Homepage: datasetiq.com
- API Keys: datasetiq.com/dashboard/api-keys
- Documentation: datasetiq.com/docs
- Pricing: datasetiq.com/pricing
- GitHub: github.com/DataSetIQ/datasetiq-python
- Support: support@datasetiq.com
License
MIT License. See LICENSE for details.
Contributing
Contributions are welcome! Please open an issue or submit a pull request.
Made with ❤️ by DataSetIQ