3,566 results for “topic:hadoop”
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
🏆 Real-Time no-code, powerful and secure ORM 🚀 providing APIs and Docs without coding by Backend, and Frontend(Client) can customize response JSONs 🏆 实时 零代码、全功能、强安全 ORM 库 🚀 后端接口和文档零代码,前端(客户端) 定制返回 JSON 的数据和结构
大数据入门指南 :star:
The official home of the Presto distributed SQL query engine for big data
Apache Hadoop
Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learn...
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
专注大数据学习面试,大数据成神之路开启。Flink/Spark/Hadoop/Hbase/Hive...
At LinkedIn, we are using this curriculum for onboarding our entry-level talents into the SRE role.
1000+ DevOps Bash Scripts - AWS, GCP, Kubernetes, Docker, CI/CD, APIs, SQL, PostgreSQL, MySQL, Hive, Impala, Kafka, Hadoop, Jenkins, GitHub, GitLab, BitBucket, Azure DevOps, TeamCity, Spotify, MP3, LDAP, Code/Build Linting, pkg mgmt for Linux, Mac, Python, Perl, Ruby, NodeJS, Golang, Advanced dotfiles: .bashrc, .vimrc, .gitconfig, .screenrc, tmux..
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Alluxio, data orchestration for analytics and machine learning in the cloud
Apache Hive
Apache Calcite
Apache Ignite
Example source code accompanying O'Reilly's "Hadoop: The Definitive Guide" by Tom White
DataSphereStudio is a one stop data application development& management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
大数据学习,从零开始学习大数据,包含大数据学习各阶段学习视频、面试资料
Apache Nutch is an extensible and scalable web crawler
基于Spark的电影推荐系统,包含爬虫项目、web网站、后台管理系统以及spark推荐系统
深圳地铁大数据客流分析系统🚇🚄🌟
Apache Hadoop docker image
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
winutils.exe hadoop.dll and hdfs.dll binaries for hadoop windows
🍣100本 Java电子书 技术书籍PDF(以下载阅读为荣,以点赞收藏为耻)
Apache Drill is a distributed MPP query layer for self describing data
A large-scale entity and relation database supporting aggregation of properties
AI on Hadoop
大数据知识仓库涉及到数据仓库建模、实时计算、大数据、数据中台、系统设计、Java、算法等。