"topic:sparksql" — Search

Data Accelerator for Apache Spark simplifies onboarding to streaming of big data. It offers a rich, easy-to-use experience for creating, editing, and managing Spark jobs on Azure HDInsight or Databricks while enabling the full power of the Spark engine.

C# · 309 · 91 · Updated 2 months ago

apache-spark, azure, big-data, cosmosdb, docker, eventhub, hdinsight, iot, iothub, kafka, kafka-streams, nodejs, react, servicefabric, spark, spark-sql, spark-streaming, sparksql, streaming, streaming-data

hbutani/spark-druid-olap (Archived)

Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit.ly/2oBJSpP), an integrated BI platform on Apache Spark.

Scala · 280 · 90 · Updated 1 week ago

business-intelligence, olap-cube, query-optimization, spark, sparksql

locationtech/rasterframes

Geospatial Raster support for Spark DataFrames

Jupyter Notebook · 254 · 48 · Updated 2 weeks ago

earth-observation, geotrellis, image-processing, machine-learning, scala, spark, spark-ml, sparksql

zio/zio-protoquill

Quill for Scala 3

Scala · 223 · 58 · Updated 1 week ago

cassandra, jdbc, language-integrated-query, linq, postgresql, scala, spark, sparksql, sql

bluishglc/bdp

A prototype big data platform project, containing the source code for the book Big Data Platform Architecture and Prototype

Java · 198 · 145 · Updated 2 months ago

bigdata, demo, kafka, middle-end, middle-office, oozie, prototype, quickstart, redis, spark, spark-demo, spark-examples, spark-sql, spark-streaming, spark-streaming-examples, sparksql, sqoop, sqoop-import

ZhuXS/Spring-Shiro-Spark

Spring-Shiro-Spark is an attempt at integrating Spring-Boot, Hibernate, Spark, Spark-SQL, Shiro, iView, VueJs... ...

Java · 114 · 34 · Updated 3 months ago

hibernate-jpa, iview, shiro-security, spark, sparksql, spring-boot, vuejs

hyunjoonbok/PySpark

PySpark functions and utilities with examples. Assists the ETL process of data modeling

Jupyter Notebook · 104 · 76 · Updated 7 months ago

hadoop, pyspark, pyspark-api, pyspark-machine-learning, pyspark-notebook, pyspark-python, spark, sparksql

CybercentreCanada/jupyterlab-sql-editor

A JupyterLab extension providing a SQL formatter, auto-completion, and syntax highlighting for Spark SQL and Trino

Jupyter Notebook · 95 · 15 · Updated 5 days ago

auto-completion, dataframe, datagrid, extension, formatter, ipython-magic, json, jupyterlab, lsp, nested-structures, notebook, schema, sparksql, sql, syntax-highlighting, trino, vscode-extension

saurfang/sparksql-protobuf

Read SparkSQL parquet file as RDD[Protobuf]

Scala · 93 · 36 · Updated 6 months ago

parquet, protobuf, sparksql

microsoft/A-TALE-OF-THREE-CITIES

Analyzing the safety (311) dataset published by Azure Open Datasets for Chicago, Boston and New York City using SparkR, SparkSQL, and Azure Databricks, with visualization using ggplot2 and leaflet. The focus is on descriptive analytics, visualization, clustering, time series forecasting and anomaly detection.

R · 87 · 36 · Updated 2 months ago

311-data, aiforsocialgood, anomaly-detection, anomalydiscovery, azure, azure-databricks, data, databricks-notebooks, datascience-machinelearning, eda, geospatial, leaflet, opendata, r, sparkr, sparksql, time-series-analysis, timeseries-forecasting, visualization, workshop-materials

funkyminds/cleanframes

Type-class-based data cleansing library for Apache Spark SQL

Scala · 78 · 8 · Updated 1 year ago

apachespark, bigdata, scala, shapeless, spark, sparkscala, sparksql

zsvoboda/ngods

New-generation open-source data stack

Dockerfile · 75 · 9 · Updated 3 weeks ago

analytics, data, data-pipeline, iceberg, jdbc, presto, prestodb, prestosql, python, scala, spark, spark-sql, sparksql, sql, trino, trinodb

swoop-inc/spark-records

Bulletproof Apache Spark jobs with fast root cause analysis of failures.

Scala · 73 · 15 · Updated 8 months ago

apache-spark, big-data, scala, spark, spark-records, sparksql, swoop

yaooqinn/spark-ranger (Archived)

Already merged into apache/incubator-kyuubi. ACL Management for Apache Spark SQL with Apache Ranger.

Scala · 58 · 57 · Updated 1 month ago

acl, authorization, data-masking, ranger, row-level-security, spark, sparksql

potix2/spark-google-spreadsheets

Google Spreadsheets datasource for SparkSQL and DataFrames

Scala · 57 · 46 · Updated 1 year ago

data-frame, scala, spark, sparksql, spreadsheet

liumingmusic/HadoopLearning

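A complete set of introductory big data tutorials, starting from the very basics such as CentOS and Maven. The big data material mainly covers HDFS, MR, YARN, HBase, Kafka, Scala, SparkCore, SparkStreaming, and SparkSQL. The tutorials include full source code demonstrations and online documentation.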

Scala · 54 · 25 · Updated 11 months ago

centos, hadoop, hbase, hdfs, mapreduce, maven, scala, spake2, sparksql, sparkstreaming, yarn

jubins/Spark-And-MLlib-Projects

This repository contains Spark, MLlib, PySpark and Dataframes projects

Jupyter Notebook · 49 · 95 · Updated 4 months ago

aws-ec2, mllib, pyspark, python, spark, spark-dataframes, spark-ml, spark-streaming, sparksql

BenFradet/struct-type-encoder

Deriving Spark DataFrame schemas from case classes

Scala · 44 · 13 · Updated 3 years ago

spark, sparksql

4paradigm/DemoApps

Demo applications that show how to take offline feature engineering solutions online in one minute with fedb and nativespark

Python · 35 · 13 · Updated 3 years ago

feature-engineering, lightgbm, machine-learning, realtime, realtime-decision, sparksql, sql, tensorflow

yaooqinn/spark-postgres

PostgreSQL and GreenPlum Data Source for Apache Spark

Scala · 35 · 13 · Updated 8 months ago

greenplum, postgres, postgresql, spark, sparksql, transactional

wushengyeyouya/Hive-JDBC-Proxy

Hive-JDBC-Proxy is a high-performance proxy service for HiveServer2 and Spark ThriftServer. It provides load balancing and rule-based forwarding of Hive JDBC Client requests to HiveServer2 and Spark ThriftServer.

Scala · 33 · 16 · Updated 7 months ago

hive, hiveql, hiveserver2, jdbc, jdbc-driver, proxy, spark, spark-sql, sparksql, thrift-server

spoddutur/cloud-based-sql-engine-using-spark

Cloud-based SQL engine using Spark, where data is accessible as a JDBC/ODBC data source via Spark ThriftServer.

Java · 31 · 14 · Updated 1 year ago

apache-spark, beeline, hadoop-framework, jdbc, spark-thrift-server, sparksql, sql-engine, thrift-server

lei-zuquan/java_spark

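Spark 2.x worked examples: code samples in Scala and in Java 1.8 with lambdas. Covers the core Spark technologies SparkCore, SparkSql, and SparkStreaming. Also provides advanced Spark performance optimization, serialization, broadcast variables, data skew, operator optimization, JVM optimization, troubleshooting, and data skew solutions. Compiled from many years of work experience!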

Java · 28 · 3 · Updated 7 months ago

java-8, kafka, scala, spark, sparkcore, sparksql, sparkstreaming

bjkonglu/resume-bjkonglu

Notes on research experience with Spark and Flink

26 · 7 · Updated 12 months ago

flink, kerberos, spark, sparksql
