AI Buyer Insights:

Michelin, an e2open customer, evaluated Oracle Transportation Management

Citigroup, a VestmarkONE customer, evaluated BlackRock Aladdin Wealth

Cantor Fitzgerald, a Kyriba Treasury customer, evaluated GTreasury

Westpac NZ, an Infosys Finacle customer, evaluated nCino Bank OS

Wayfair, a Korber HighJump WMS customer, just evaluated Manhattan WMS

Moog, a UKG AutoTime customer, evaluated Workday Time and Attendance

Swedbank, a Temenos T24 customer, evaluated Oracle Flexcube

List of Apache Spark MLlib Customers

Customer | Industry | Employees | Revenue | Country | Vendor | Application | Category | When | SI

CrowdStrike | Professional Services | 10363 | $4.0B | United States | Apache Software | Apache Spark MLlib | ML and Data Science Platforms | 2016 | n/a
Insight: In 2016, CrowdStrike implemented Apache Spark MLlib to perform large-scale feature extraction and to drive machine learning classification of event data ingested from Falcon Host, its software-as-a-service endpoint protection solution. The deployment embedded Apache Spark MLlib in the data processing pipeline used by the security research and engineering teams to support behavioral analysis and model training workflows. The implementation concentrated on Spark-based feature engineering and model scoring, using Apache Spark MLlib for scalable distributed machine learning workloads. CrowdStrike configured Spark jobs for batch feature extraction and iterative model development, and instrumented job lifecycle controls to align compute usage with the engineering team’s need for agility. Operational integration included coupling Apache Spark MLlib with CrowdStrike’s Apache Cassandra-backed Threat Graph and running the analytics stack on AWS infrastructure to reduce operational overhead. The architecture emphasized ephemeral instance control for Cassandra, the ability to start and stop nodes for environment rebuilds, and scalable compute provisioning for Spark to handle rapidly growing event volumes. Governance and operational requirements centered on high availability, scalability, and cost-effective storage for petabyte-scale Cassandra data. Rollout priorities included maintaining uptime for Falcon Host ingestion pipelines, enabling reproducible environment rebuilds for engineering, and ensuring Spark MLlib workflows could scale without increasing on-premises operational burden.

FINRA | Professional Services | 3600 | $1.1B | United States | Apache Software | Apache Spark MLlib | ML and Data Science Platforms | 2019 | n/a
Insight: In 2019, FINRA deployed Apache Spark MLlib on Amazon EMR to move from on-premises SQL batch processes to cloud-native distributed analytics for billions of time-ordered market events. The work was implemented within the ML and Data Science Platforms category to provide scalable machine learning infrastructure for surveillance and analytics use cases. Configuration emphasized Apache Spark MLlib-based model training and machine learning pipeline orchestration, enabling feature engineering, iterative model development, and large-scale distributed computation. Workloads were restructured from nightly SQL batch jobs to continuous Spark workflows to support faster training cycles and backtesting on historical market downturn datasets. Operationally, the deployment used Amazon EMR for compute elasticity to process high-velocity market event streams and historical order tapes at scale. These compute and ML workflows were consumed by data science teams supporting market surveillance, risk analytics, investor protection, and market integrity functions. Governance shifted from batch release cycles to pipeline and model governance, with standardized validation and backtesting workflows to ensure model integrity for surveillance and compliance. FINRA can now test models on realistic data from market downturns, enhancing its ability to provide investor protection and promote market integrity.

GumGum | Professional Services | 480 | $113M | United States | Apache Software | Apache Spark MLlib | ML and Data Science Platforms | 2017 | n/a
Insight: In 2017, GumGum implemented Apache Spark MLlib to operationalize machine learning across its advertising analytics stack and to handle extremely high event volumes. The implementation targeted a platform that ingests more than 1 billion events per day, approximately 6 TB of data daily, and was selected to support continuous processing and model-driven inventory forecasting, addressing the company's need to expedite customer decision making and scale quickly. Apache Spark MLlib was deployed on Amazon EMR as the primary machine learning runtime, with configurations for model training, batch scoring, and feature engineering pipelines. The deployment uses Apache Spark MLlib for inventory forecasting workflows and integrates standard Spark MLlib capabilities for model fitting, transformation pipelines, and distributed feature processing to support programmatic and native advertising analytics. The architecture places ad servers at the event edge, writing event logs that are uploaded to Amazon Simple Storage Service (S3) on an hourly cadence. AWS Data Pipeline orchestrates production, testing, and development workflows; Amazon EMR runs Apache Spark MLlib workloads alongside Hadoop for hourly data processing; and processed outputs are persisted into Amazon Redshift for downstream analytics and reporting. Operational coverage includes production, testing, and development environments and affects the ad operations and analytics functions responsible for campaign forecasting and reporting. Governance and operationalization relied on pipeline-driven environment segregation and hourly ingestion patterns to remove processing bottlenecks and keep processing continuous. The implementation of Apache Spark MLlib at GumGum is positioned as a scalable, EMR-hosted machine learning layer within the larger AWS-based data pipeline, designed to support programmatic advertising, image-recognition-derived signals, and customer-facing analytics.

(name not shown) | Professional Services | 76453 | $37.9B | United States | Apache Software | Apache Spark MLlib | ML and Data Science Platforms | 2020 | n/a

(name not shown) | Professional Services | 5116 | $1.4B | United States | Apache Software | Apache Spark MLlib | ML and Data Science Platforms | 2018 | n/a

Buyer Intent: Companies Evaluating Apache Spark MLlib

ARTW Buyer Intent uncovers actionable customer signals, identifying software buyers actively evaluating Apache Spark MLlib. Gain ongoing access to real-time prospects and uncover hidden opportunities.

FAQ - APPS RUN THE WORLD Apache Spark MLlib Coverage

Apache Spark MLlib is an ML and Data Science Platforms solution from Apache Software.

Companies worldwide use Apache Spark MLlib, from small firms to large enterprises across 21+ industries.

Organizations such as Salesforce, CrowdStrike, Yelp, FINRA and GumGum are recorded users of Apache Spark MLlib for ML and Data Science Platforms.

Companies using Apache Spark MLlib are most concentrated in Professional Services, with adoption spanning over 21 industries.

Companies using Apache Spark MLlib are most concentrated in the United States, with adoption tracked across 195 countries worldwide. This global distribution highlights the popularity of Apache Spark MLlib across the Americas, EMEA, and APAC.

Companies using Apache Spark MLlib range from small businesses with 0-100 employees (0%) to mid-sized firms with 101-1,000 employees (20%), large organizations with 1,001-10,000 employees (40%), and global enterprises with 10,000+ employees (40%).
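
The employee-size percentages above can be reproduced from the five employee counts shown in the customer table. The following is a minimal Python sketch, assuming the shares are computed over only those five listed rows and that each band's upper bound is inclusive; the names `BANDS`, `band_for`, and `band_shares` are illustrative and not part of any APPS RUN THE WORLD tooling.

```python
from collections import Counter

# Employee counts copied from the five rows of the customer table.
employee_counts = [10363, 3600, 480, 76453, 5116]

# Size bands as described in the FAQ text (assumed inclusive upper bounds).
BANDS = [
    ("0-100", 0, 100),
    ("101-1,000", 101, 1_000),
    ("1,001-10,000", 1_001, 10_000),
    ("10,000+", 10_001, None),  # None means no upper bound
]


def band_for(employees):
    """Return the label of the size band that contains `employees`."""
    for label, low, high in BANDS:
        if employees >= low and (high is None or employees <= high):
            return label
    raise ValueError(f"no band for employee count {employees}")


def band_shares(counts):
    """Return each band's share of `counts` as a whole-number percentage."""
    tally = Counter(band_for(n) for n in counts)
    total = len(counts)
    return {label: round(100 * tally[label] / total) for label, _, _ in BANDS}


shares = band_shares(employee_counts)
for label, pct in shares.items():
    print(f"{label} employees: {pct}%")
```

Under these assumptions the five listed customers fall into the bands as zero, one, two, and two companies, which matches the 0%, 20%, 40%, and 40% split quoted above.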

Customers of Apache Spark MLlib include firms across all revenue levels, from $0-100M to $101M-$1B, $1B-$10B, and $10B+ global corporations.

Contact APPS RUN THE WORLD to access the full verified Apache Spark MLlib customer database, with detailed firmographics such as industry, geography, revenue, and employee breakdowns, as well as the key decision makers in charge of ML and Data Science Platforms.