AI Buyer Insights:

Westpac NZ, an Infosys Finacle customer, evaluated nCino Bank OS

Moog, a UKG AutoTime customer, evaluated Workday Time and Attendance

Swedbank, a Temenos T24 customer, evaluated Oracle Flexcube

Michelin, an e2open customer, evaluated Oracle Transportation Management

Cantor Fitzgerald, a Kyriba Treasury customer, evaluated GTreasury

Citigroup, a VestmarkONE customer, evaluated BlackRock Aladdin Wealth

Wayfair, a Korber HighJump WMS customer, just evaluated Manhattan WMS

List of Apache Spark Customers

Customer | Industry | Employees | Revenue | Country | Vendor | Application | Category | When | SI | Insight
2K Games | Professional Services | 3000 | $930M | United States | Apache Software | Apache Spark | Analytics and BI | 2018 | n/a | In 2018, 2K Games deployed Apache Spark as a Distributed Processing System to support data engineering and analytics workflows across its operational data estate. The Apache Spark implementation was positioned to process Hive data stored in HDFS and to analyze data held in Teradata, using the HDP platform and YARN cluster for job execution and resource management. Implementation focused on batch processing and interactive query capabilities, with engineers designing Spark batch jobs to replace slower MapReduce patterns and to accelerate ETL pipelines. Apache Spark applications were developed using Spark SQL, Spark DataFrame API, Spark RDDs, PySpark, Python and Scala, and included custom aggregate functions, data validation, cleansing, transformation, and interactive querying workflows. The data pipeline architecture integrated Apache Spark with Hive, HDFS, Teradata and custom-built input adapters to ingest and normalize disparate sources. Operational tasks included migrating Hive tables between environments, a scripted data copy from edge node to on premises data lake, and auditing YARN cluster log files while data copy processes ran. Governance and operational ownership were established through a reuse framework for interfaces and day to day coordination with offshore teams to assign tasks and maintain pipeline reliability. Engineers were responsible for building reusable Spark components, validating data with PySpark applications, and maintaining operational monitoring via log auditing in the YARN cluster.
Afiniti | Professional Services | 2000 | $350M | Bermuda | Apache Software | Apache Spark | Analytics and BI | 2020 | n/a | In 2020, Afiniti deployed Apache Spark as a central execution engine within a Distributed Processing System to operationalize real-time streaming and large scale data processing for its AI driven customer experience platform. Apache Spark supported in memory analytics and micro batch processing to enable predictive agent pairing and near real time model scoring for customer service routing. The Spark implementation covered stream processing and batch ETL workloads, integrated into a Medallion Architecture for staged data refinement. Functional capabilities included streaming ingestion and processing, micro batch aggregation, schema aware transformations, automated data validation, and support for change data capture workflows and dimensional modeling using DBT and ErWin. Integrations were explicitly instrumented with Kafka for real time streaming, Azure Databricks for Spark execution, Azure Data Factory and Airflow for orchestration, and downstream consumption by Snowflake and Apache Superset for analytics. The implementation also leveraged data ingestion and replication tooling such as AirByte, Talend, and QLIK Replicate to ingest MySQL, SQL Server, Greenplum, and PostgreSQL sources, and included C#.NET API integrations for system interoperability. Operational rollout targeted global analytics and engineering teams and included documentation driven onboarding, capacity planning, and staff training to sustain production operations. Governance changes formalized CDC pipelines and standardized dimensional models, while outcomes documented in project notes included a 30% reduction in streaming latency, a 40% reduction in infrastructure cost following Azure migration, a 50% increase in data processing speeds in cloud pipelines, automated validation that cut manual effort by 20% and maintained 99.9% data integrity, and a 25% reduction in query time for the enterprise data portal.
Allstate | Insurance | 55000 | $67.7B | United States | Apache Software | Apache Spark | Analytics and BI | 2017 | n/a | In 2017, Allstate implemented Apache Spark as its Distributed Processing System to operationalize ETL and feature engineering pipelines that support analytics and machine learning workflows. Apache Spark became the core distributed processing engine used by data engineering and data science teams to process large claims, policy and agency datasets and to enable downstream modeling efforts. The implementation centered on PySpark, Spark SQL and the DataFrame APIs to build complex ETL and feature engineering pipelines in Python, with Apache Airflow used for orchestration and job scheduling. Pipelines covered structured ETL, feature creation for Agency Analytics and Product Operations predictive models, and NLP pipelines that extract signals from unstructured claims text, while also preparing imagery datasets for computer vision models developed with TensorFlow, Keras and Scikit Learn. Apache Spark was integrated into an ecosystem that included Amazon S3 as a data lake, Oracle Database for transactional sources, and Hadoop and Hive for data storage and cataloging, with analytic outputs consumed in Tableau and model artifacts promoted into production monitoring workflows. Spark jobs were invoked and monitored by Airflow orchestrations, and outputs were surfaced to data science teams for model training and to production model deployment processes. Governance and operationalization included an internal product owner function that developed data management tools and a data catalog to simplify find and request access workflows, and an Engineering Consulting Services team that supported adoption across data science groups. Engineering best practices such as version control, unit testing and data validation were coached into teams to increase pipeline reliability while supporting machine learning initiatives focused on claims routing, handling and agency performance analytics.
(not shown) | Professional Services | 23000 | $9.0B | United States | Apache Software | Apache Spark | Analytics and BI | 2022 | n/a |
(not shown) | Banking and Financial Services | 3800 | $3.6B | Brazil | Apache Software | Apache Spark | Analytics and BI | 2020 | n/a |
(not shown) | Banking and Financial Services | 93200 | $28.4B | Brazil | Apache Software | Apache Spark | Analytics and BI | 2020 | n/a |
(not shown) | Professional Services | 840 | $350M | United States | Apache Software | Apache Spark | Analytics and BI | 2012 | n/a |
(not shown) | Banking and Financial Services | 2600 | $1.5B | Ireland | Apache Software | Apache Spark | Analytics and BI | 2017 | n/a |
(not shown) | Automotive | 450 | $65M | United States | Apache Software | Apache Spark | Analytics and BI | 2021 | n/a |
(not shown) | Banking and Financial Services | 8004 | $21.2B | United States | Apache Software | Apache Spark | Analytics and BI | 2021 | n/a |
Showing 1 to 10 of 26 entries

Buyer Intent: Companies Evaluating Apache Spark

ARTW Buyer Intent uncovers actionable customer signals, identifying software buyers actively evaluating Apache Spark. Gain ongoing access to real-time prospects and uncover hidden opportunities. Companies actively evaluating Apache Spark for Analytics and BI include:

  1. Hoag Memorial Hospital, a United States-based non-profit organization with 2,000 employees
  2. RWTH Aachen University, a Germany-based education organization with 8,540 employees
  3. JPMorgan Chase, a United States-based banking and financial services organization with 317,233 employees

FAQ - APPS RUN THE WORLD Apache Spark Coverage

Apache Spark is an Analytics and BI solution from Apache Software.

Companies worldwide use Apache Spark, from small firms to large enterprises across 21+ industries.

Organizations such as HCA Healthcare, Allstate, Royal Bank of Canada, Banco Itau and Freddie Mac are recorded users of Apache Spark for Analytics and BI.

Companies using Apache Spark are most concentrated in Healthcare, Insurance and Banking and Financial Services, with adoption spanning over 21 industries.

Companies using Apache Spark are most concentrated in the United States, Canada and Brazil, with adoption tracked across 195 countries worldwide. This global distribution highlights the popularity of Apache Spark across the Americas, EMEA, and APAC.

Companies using Apache Spark range from small businesses with 0-100 employees (3.85%) and mid-sized firms with 101-1,000 employees (19.23%) to large organizations with 1,001-10,000 employees (30.77%) and global enterprises with 10,000+ employees (46.15%).
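
The size-band percentages above are consistent with a base of 26 customers, the total reported under the customer table ("Showing 1 to 10 of 26 entries"). A minimal sketch of that arithmetic follows. The per-band counts (1, 5, 8, 12) are not stated in the source; they are inferred here as an assumption, chosen because they sum to 26 and reproduce the quoted shares.

```python
# Assumption: the shares are computed over the 26 entries reported for the table.
TOTAL_CUSTOMERS = 26

# Hypothetical per-band counts, inferred (not stated in the source) so that
# they sum to 26 and reproduce the quoted percentages.
inferred_counts = {
    "0-100": 1,
    "101-1,000": 5,
    "1,001-10,000": 8,
    "10,000+": 12,
}


def share_percent(count, total=TOTAL_CUSTOMERS):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


shares = {band: share_percent(n) for band, n in inferred_counts.items()}

for band, pct in shares.items():
    print(f"{band} employees: {pct:.2f}%")
```

If the underlying database uses a different base than the 26 listed entries, the inferred counts would not hold, although the percentages themselves would be unaffected.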

Customers of Apache Spark include firms across all revenue levels, from $0-$100M and $101M-$1B to $1B-$10B and $10B+ global corporations.

Contact APPS RUN THE WORLD to access the full verified Apache Spark customer database with detailed firmographics such as industry, geography, revenue, and employee breakdowns, as well as the key decision makers in charge of Analytics and BI.