List of Apache PySpark Customers

Logo	Customer	Industry	Employees	Revenue	Country	Vendor	Application	Category	When	SI	Insight	Insight Source
	NextEra Energy	Utilities	16800	$24.8B	United States	Apache Software	Apache PySpark	API Management	2018	n/a		In 2018, NextEra Energy deployed Apache PySpark as part of a big data analytics build within its IT organization, establishing a cloud-native analytics foundation to process IoT sensor telemetry from Florida solar assets. A contract Cloud Engineer from ProTek Consulting led the end-to-end design and deployment on AWS, aligning the effort with corporate migration sequencing while enabling departmental analytics ahead of broader schedules. Apache PySpark was embedded in the ETL layer to perform scalable transformations as sensor data moved into the cloud. The ingestion and orchestration architecture used AWS DMS, AWS Glue, Amazon S3, and AWS Lambda to stage and transform records before they landed in Amazon Redshift. Infrastructure provisioning was automated through AWS CloudFormation to ensure repeatability across environments, and ETL jobs and data pipelines were instrumented for operational visibility. The implementation integrated Amazon Redshift with Power BI (via ODBC and JDBC connectors) and with Amazon QuickSight, enabling a hybrid BI strategy that supported centralized corporate reporting alongside agile internal dashboards. Security and governance controls were implemented using IAM policies, VPC isolation, and KMS encryption, and the solution maintained close alignment with on-premises network, data, and compliance stakeholders to preserve hybrid operational continuity. Operational governance included automated provisioning workflows, observability through Amazon CloudWatch to track ETL job health, data latency, and Redshift query performance, and a structured handoff to IT and analytics teams. As reported by the implementation team, the build reduced infrastructure setup time by 25 percent and improved issue detection and incident response by 40 percent, and the solution served as a reference model for subsequent departmental cloud initiatives.
	Synchrony	Banking and Financial Services	20000	$16.1B	United States	Apache Software	Apache PySpark	API Management	2018	n/a		In 2018, Synchrony implemented Apache PySpark as the primary engine for large-scale ETL and big data processing. Implementation scope centered on designing and developing scalable ETL pipelines using AWS Glue, PySpark, and SQL, with Python-based orchestration using AWS Lambda and Step Functions. Functional capabilities included JSON encoding and decoding with PySpark to transform semi-structured data into analytics-ready tables, automated ingestion and validation pipelines, and data validation frameworks built with PySpark and pandas. The operational architecture relied on AWS services: S3 for landing and persistent storage, Glue for cataloging and ETL orchestration, Redshift for analytic modeling, EMR for Spark-based batch processing, IAM and KMS for security, and CloudWatch for monitoring. Migration work included moving on-premises DB2 datasets to Amazon S3 using AWS Glue and PySpark, and the Redshift schema design used distribution keys, sort keys, and materialized views to improve query execution. Governance and operational controls emphasized secure access and compliance through IAM role management, S3 bucket policies, and KMS encryption, with monitoring and incident workflows run through CloudWatch, Jira, and ServiceNow in an Agile Scrum delivery model. The implementation supported cross-functional data engineering and analytics teams and included automated source file ingestion, data quality checks, and cleanup workflows that reduced manual intervention and addressed performance bottlenecks through SQL and ETL optimization.
