195 results for “topic:cloudera”
1000+ DevOps Bash Scripts - AWS, GCP, Kubernetes, Docker, CI/CD, APIs, SQL, PostgreSQL, MySQL, Hive, Impala, Kafka, Hadoop, Jenkins, GitHub, GitLab, BitBucket, Azure DevOps, TeamCity, Spotify, MP3, LDAP, Code/Build Linting, pkg mgmt for Linux, Mac, Python, Perl, Ruby, NodeJS, Golang, Advanced dotfiles: .bashrc, .vimrc, .gitconfig, .screenrc, tmux..
Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning
450+ AWS, Hadoop, Cloud, Kafka, Docker, Elasticsearch, RabbitMQ, Redis, HBase, Solr, Cassandra, ZooKeeper, HDFS, Yarn, Hive, Presto, Drill, Impala, Consul, Spark, Jenkins, Travis CI, Git, MySQL, Linux, DNS, Whois, SSL Certs, Yum Security Updates, Kubernetes, Cloudera etc...
CDP Public Cloud is an integrated analytics and data management platform deployed on cloud services. It offers broad data analytics and artificial intelligence functionality along with secure user access and data governance features.
Hive JDBC "uber" or "standalone" jar based on the latest Apache Hive version
80+ HAProxy Configs for Hadoop, Big Data, NoSQL, Docker, Kubernetes, Elasticsearch, SolrCloud, HBase, MySQL, PostgreSQL, Apache Drill, Hive, Presto, Impala, Hue, ZooKeeper, SSH, RabbitMQ, Redis, Riak, Cloudera, OpenTSDB, InfluxDB, Prometheus, Kibana, Graphite, Rancher etc.
Educational notes,Hands on problems w/ solutions for hadoop ecosystem
CDH安装手册
Cloudera_Material: Study Material to help people preparing for Cloudera CCA Spark and Hadoop Developer Exam (CCA175). Feel free to collaborate.
Code for the deployment of Hadoop clusters, written in Bourne or Bourne Again shell.
One Click Script to Deploy CDP (CDP PvC & HDP & CDH)
Guide to data platforms and tools
FLaNK AI Weekly covering Apache NiFi, Apache Flink, Apache Kafka, Apache Spark, Apache Iceberg, Apache Ozone, Apache Pulsar, and more...
Terraform module to deploy Cloudera on Oracle Cloud Infrastructure (OCI)
Assets used in Cloudera Tutorials
CSD for Apache Airflow
Perl Utility Library for my other repos
Parcel for Apache Airflow
CDH5.16.2 离线安装脚本
Datagenerator for Data Services
Cloudera CDP SDK for Java
DocGenius AI - Generative AI Chatbot for your Documents
Modeling Lifecycle with ACME Occupancy Detection and Cloudera
CDP command line interface (CLI)
Huemul BigDataGovernance, es una framework que trabaja sobre Spark, Hive y HDFS. Permite la implementación de una estrategia corporativa de dato único, basada en buenas prácticas de Gobierno de Datos. Permite implementar tablas con control de Primary Key y Foreing Key al insertar y actualizar datos utilizando la librería, Validación de nulos, largos de textos, máximos/mínimos de números y fechas, valores únicos y valores por default. También permite clasificar los campos en aplicabilidad de derechos ARCO para facilitar la implementación de leyes de protección de datos tipo GDPR, identificar los niveles de seguridad y si se está aplicando algún tipo de encriptación. Adicionalmente permite agregar reglas de validación más complejas sobre la misma tabla.
Create Greenplum docker files
Cloudera Flow Management Workshop with Apache NiFi
A Simple Pythonic Client wrapper for Cloudera on Cloud
Homebrew Formulas for cloudera tools
Geocoding and Reverse Geocoding at Scale