186 results for “topic:sqoop”
Big data getting-started guide :star:
Mirror of Apache Sqoop
💎🔥Big data study notes
Exchangis is a lightweight, highly extensible data exchange platform that supports data transmission between structured and unstructured heterogeneous data sources
A prototype project of big data platform, the source codes of the book Big Data Platform Architecture and Prototype
Educational notes and hands-on problems with solutions for the Hadoop ecosystem
The goal of this project is to build a docker cluster that gives access to Hadoop, HDFS, Hive, PySpark, Sqoop, Airflow, Kafka, Flume, Postgres, Cassandra, Hue, Zeppelin, Kadmin, Kafka Control Center and pgAdmin. This cluster is solely intended for usage in a development environment. Do not use it to run any production workloads.
Study code for big data components
This repository gathers and curates a list of resources to learn Hadoop for FREE.
Repository used for Spark Trainings
Cloud computing: environment setup and configuration files for Hadoop, Hive, Hue, Oozie, Sqoop, HBase, and ZooKeeper
IBIS is a workflow creation-engine that abstracts the Hadoop internals of ingesting RDBMS data.
Big data projects implemented by Maniram yadav
Data cleaning, pre-processing, and analytics on healthcare data using Spark and Python.
Cloudera_Material: study material to help people prepare for the Cloudera CCA Spark and Hadoop Developer Exam (CCA175). Feel free to collaborate.
Export PostgreSQL tables to Google BigQuery
Docker Big Data Tools: a docker-compose file configured to run a multi-node Hadoop cluster with the essential tools used in the big data domain, packaged as a collection of Docker containers you can use directly.
A PHP script tool for incrementally backing up relational databases (MySQL, PostgreSQL, SQL Server, SQLite, Oracle, etc.) to Hive
Offline log analysis of website clickstream data
Complete Big Data Ecosystem on Docker Desktop
Life-cycle: the internal workings of HDFS, Sqoop, Hive, Spark, HBase, and Kafka, with code.
A data pipeline moving data from a Relational database system (RDBMS) to a Hadoop file system (HDFS).
A Docker setup running Airflow with the Hadoop ecosystem (Hive, Spark, and Sqoop)
This project aims to move the data from a Relational database system (RDBMS) to a Hadoop file system (HDFS)
I implemented various ETL processes: loading data from MySQL into HDFS using Sqoop, transforming it with Spark and Scala, performing analytics with Spark and Scala, and loading the results back into HDFS.
Big Data
DS200.M21 - Big Data Analysis
Big Data
This repository contains all the documents related to HDPCD certification.