List of Apache Hadoop Customers
Since 2010, our global team of researchers has been studying Apache Hadoop customers around the world, aggregating the data points that underpin our quarterly forecast assumptions and our assessments of the rise and fall of vendors and their products.
Each quarter our research team identifies companies that have purchased Apache Hadoop for Database Management from public sources (press releases, customer references, testimonials, case studies and success stories) and proprietary sources, capturing customer size, industry, location, implementation status, partner involvement, line-of-business key stakeholders and the contact details of related IT decision-makers.
Companies using Apache Hadoop for Database Management include:

- Walmart, a United States based Retail organization with 2,100,000 employees and revenues of $681.00 billion
- Apple, a United States based Manufacturing organization with 166,000 employees and revenues of $416.16 billion
- United Healthcare, a United States based Insurance organization with 400,000 employees and revenues of $400.28 billion
- CVS Health, a United States based Healthcare organization with 219,000 employees and revenues of $372.81 billion
- McKesson, a United States based Distribution organization with 45,000 employees and revenues of $359.10 billion

and many others.
Contact us if you need a complete and verified list of companies using Apache Hadoop, including breakdowns by industry (21 verticals), geography (region, country, state, city) and company size (revenue, employees, assets), along with the related IT decision-makers, key stakeholders, and business and technology executives responsible for IaaS software purchases.
Apache Hadoop customer wins are incorporated into our Enterprise Applications Buyer Insight and Technographics Customer Database, which has over 100 data fields detailing company usage of IaaS software systems and their digital transformation initiatives. Apps Run The World wants to become your No. 1 technographic data source!
| Logo | Customer | Industry | Empl. | Revenue | Country | Vendor | Application | Category | When | SI | Insight |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | 2K Games | Professional Services | 3000 | $930M | United States | Apache Software | Apache Hadoop | Database Management | 2017 | n/a | |
In 2017, 2K Games deployed Apache Hadoop to establish a centralized Big Data infrastructure supporting the company's analytics and data engineering functions. Apache Hadoop was positioned as the primary distributed storage and processing backbone, consolidating ingestion, transformation, and manipulation of data from across the organization for the analytics team.
The implementation centered on Hadoop as the storage and compute foundation, paired with category-aligned processing frameworks to host ETL and ELT pipelines that ingest from REST APIs and internal systems. Analytics engineering responsibilities emphasized building reusable, scalable data models, managing ETL processes into data warehouses, scheduling jobs to pull data from APIs, and maintaining internal databases used for reporting and dashboarding.
Integrations were aligned to the existing toolset documented in hiring and engineering notes, including AWS EC2 and S3 for cloud compute and object storage, Amazon EMR and Spark for distributed processing, and Redshift and columnar storage for analytical warehousing. The deployment also interfaced with SQL and NoSQL databases, Pentaho-style ETL patterns, and visualization backends such as Tableau to support dashboarding and access control for analytics consumers.
Governance and operational changes focused on instituting continuous integration pipelines, enforcing software development best practices on the analytics team, and formalizing access control and job scheduling for reporting workloads. Implementation emphasis was on maintainability and scalability, enabling data engineers and data scientists to build, test, and deploy pipelines while centralizing data ingestion, transformation, and model orchestration within the Big Data infrastructure.
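The extract-transform-load pattern described above can be sketched in a few lines. This is an illustrative example, not 2K Games' actual code: the record fields, the `/data/events` base path, and the stubbed API response are all hypothetical, standing in for a REST API pull that lands normalized rows in date-partitioned storage of the kind an HDFS landing zone would use.

```python
import json
from datetime import datetime

def extract():
    # Stand-in for a REST API call (e.g. requests.get(...).json()).
    return [
        {"player_id": "p1", "event": "match_start", "ts": "2017-03-01T10:00:00"},
        {"player_id": "p2", "event": "purchase", "ts": "2017-03-01T10:05:00"},
    ]

def transform(records):
    # Normalize raw records into the flat, typed shape a warehouse table expects.
    return [
        {
            "player_id": r["player_id"],
            "event": r["event"],
            "event_date": datetime.fromisoformat(r["ts"]).date().isoformat(),
        }
        for r in records
    ]

def load(rows, base_path="/data/events"):
    # Group rows under date partitions, mirroring an HDFS layout such as
    # /data/events/event_date=2017-03-01/part-0000.json
    partitions = {}
    for row in rows:
        key = f"{base_path}/event_date={row['event_date']}"
        partitions.setdefault(key, []).append(json.dumps(row))
    return partitions

batches = load(transform(extract()))
```

In a scheduled deployment, a job runner would invoke this pipeline on an interval and write each partition to distributed storage, which is where a Hadoop-backed platform earns its keep.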
| | 3M | Manufacturing | 61500 | $24.6B | United States | Apache Software | Apache Hadoop | Database Management | 2012 | n/a | |
In 2012, 3M deployed Apache Hadoop as Big Data Infrastructure to underpin the Data Science Lab's clinical data analytics platform. The Data Science Lab was responsible for leading independent R&D projects in machine learning and advanced statistical analysis of healthcare data, extracting knowledge from large and diverse datasets and building predictive models and novel algorithms for production deployment on 3M's clinical data analytics platform.
Apache Hadoop was configured to provide distributed storage and scalable batch processing, forming the core data lake and serving as persistent staging for analytic workloads. Workflows combined Apache Hadoop with Apache Spark for in-memory processing, while standard language runtimes such as Python, R and Java were used for model development and SQL was used for data interrogation. Notebook environments including Jupyter Notebook supported exploratory analysis and iterative model building.
Integrations reflected the team's documented tooling, including Spark and developer tools such as PyCharm and RStudio, with Git and Gerrit used for source control and code review to support reproducible model pipelines. The environment operated alongside the cloud and hybrid analytics platforms cited by the lab, including AWS, Databricks, Google Cloud and BigQuery, enabling a mix of on-premises Hadoop storage and cloud compute or analytics engines for model training and batch scoring. Apache Hadoop served as the canonical Big Data infrastructure layer connecting raw healthcare feeds to downstream clinical analytics and R&D consumption.
Governance and operational practices centered on the Data Science Lab, with controlled data access, code review workflows using Git and Gerrit, and lifecycle handoff patterns for moving models from experimentation to production. Operational responsibilities included configuration management, job scheduling and packaging of models for deployment into 3M's clinical data analytics platform.
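The experimentation-to-production handoff described above can be sketched as a frozen model artifact applied in a batch-scoring pass. This is a minimal, hypothetical illustration, not 3M's method: the linear model, the `lab-2012.1` version string, and the feature names are invented for the example, standing in for whatever predictive models the Data Science Lab packaged for the clinical analytics platform.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelArtifact:
    """A versioned, immutable model package handed from R&D to production."""
    version: str
    weights: dict   # feature name -> coefficient (illustrative linear model)
    intercept: float

    def score(self, record):
        # Simple linear score over the record's features; a real lab model
        # would be far richer, but the handoff pattern is the same.
        return self.intercept + sum(
            self.weights.get(k, 0.0) * v for k, v in record.items()
        )

def batch_score(artifact, records, threshold=0.5):
    # Batch-scoring pass of the kind a Hadoop/Spark job would distribute
    # across a cluster; tagging each result with the model version keeps
    # the output reproducible and auditable.
    return [
        {"score": s, "flag": s >= threshold, "model_version": artifact.version}
        for s in (artifact.score(r) for r in records)
    ]

model = ModelArtifact(version="lab-2012.1", weights={"age": 0.01}, intercept=0.1)
results = batch_score(model, [{"age": 50}, {"age": 10}])
```

Freezing the artifact (here via `frozen=True`) reflects the lifecycle discipline the lab enforced: once a model leaves experimentation, its weights and version travel together through code review and deployment.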
| | 4INFO | Professional Services | 100 | $28M | United States | Apache Software | Apache Hadoop | Database Management | 2010 | n/a | |
In 2010, 4INFO implemented Apache Hadoop as Big Data infrastructure to architect and build a new reporting infrastructure using Hadoop and Hive, focused on delivering near-real-time reporting capabilities for the business. The deployment established a distributed data storage and processing foundation with Apache Hadoop as the core platform and Apache Hive as the SQL query layer, enabling analyst-friendly reporting and ad hoc queries against large datasets.
The implementation included configuration of storage and compute tiers, a metadata and query service through Hive, and the construction of ingestion pipelines for both batch and micro‑batch data flows to support low-latency reporting. Functional capabilities emphasized scalable HDFS-based storage, a query and analytics layer via Apache Hive, and pipeline orchestration and transformation logic to normalize data for downstream reports.
Operational coverage targeted the reporting and analytics function across 4INFO, with governance workstreams for data schema management, access control and query governance to ensure consistent reporting. Apache Hadoop and Apache Hive were the platform components enabling the Big Data infrastructure, and the project outcome was near-real-time reporting capability delivered through a centralized Hadoop-based reporting infrastructure.
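The micro-batch pattern behind the near-real-time reporting described above can be sketched as incremental aggregation: small batches of events are folded into a running aggregate instead of waiting for a nightly batch job. This is an illustrative sketch, not 4INFO's actual code; the `campaign` field and the in-memory counter stand in for a Hive table that analysts would query after each partition lands.

```python
from collections import Counter

class ReportingAggregate:
    """Running aggregate updated by micro-batches, queried for reports."""

    def __init__(self):
        self.counts = Counter()

    def apply_micro_batch(self, events):
        # Each micro-batch updates the aggregate incrementally, analogous
        # to appending a new partition and refreshing a Hive table.
        for e in events:
            self.counts[e["campaign"]] += 1

    def report(self):
        # Analyst-facing snapshot: the equivalent of a Hive SELECT ... GROUP BY.
        return dict(self.counts)

agg = ReportingAggregate()
agg.apply_micro_batch([{"campaign": "a"}, {"campaign": "b"}])
agg.apply_micro_batch([{"campaign": "a"}])
```

The latency of the reporting layer is then bounded by the micro-batch interval rather than the full-batch cycle, which is what makes "near real time" achievable on a batch-oriented platform like Hadoop.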
| | | Life Sciences | 55000 | $56.3B | United States | Apache Software | Apache Hadoop | Database Management | 2014 | n/a | |
| | | Professional Services | 13000 | $7.5B | United States | Apache Software | Apache Hadoop | Database Management | 2015 | n/a | |
| | | Professional Services | 3650 | $650M | United States | Apache Software | Apache Hadoop | Database Management | 2017 | n/a | |
| | | Professional Services | 130 | $15M | United States | Apache Software | Apache Hadoop | Database Management | 2004 | n/a | |
| | | Media | 569 | $137M | United States | Apache Software | Apache Hadoop | Database Management | 2015 | n/a | |
| | | Insurance | 3500 | $650M | United States | Apache Software | Apache Hadoop | Database Management | 2013 | n/a | |
| | | Manufacturing | 27900 | $14.4B | United States | Apache Software | Apache Hadoop | Database Management | 2014 | n/a | |
Buyer Intent: Companies Evaluating Apache Hadoop
- Merck, a United States based Life Sciences organization with 73,000 employees
- Bexhill College, a United Kingdom based Education organization with 150 employees
- Horizon Forest Products, a United States based Distribution organization with 200 employees
Discover Software Buyers actively Evaluating Enterprise Applications
| Logo | Company | Industry | Employees | Revenue | Country | Evaluated |
|---|---|---|---|---|---|---|
| No data found | | | | | | |