Apache Spark

Apache Spark is an open source cluster computing framework that is frequently used in big data processing. 
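To make that concrete, here is a minimal, hedged sketch of what a Spark job looks like in PySpark; the file name data.txt and the word-count task are illustrative placeholders, not taken from any of the articles below.

```python
# Minimal PySpark sketch: count words in a text file distributed across a cluster.
# "data.txt" is a placeholder path; assumes pyspark is installed.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("WordCount").getOrCreate()

counts = (
    spark.sparkContext.textFile("data.txt")      # read the file as an RDD of lines
    .flatMap(lambda line: line.split())          # split each line into words
    .map(lambda word: (word, 1))                 # pair each word with a count of 1
    .reduceByKey(lambda a, b: a + b)             # sum counts per word across the cluster
)

for word, count in counts.take(10):              # pull a small sample back to the driver
    print(word, count)

spark.stop()
```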

Case study with NASA logs to show how Spark can be leveraged for analyzing data at scale.
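In the spirit of that case study, the sketch below counts HTTP status codes in raw web server logs with Spark. The file path and the log format regex are assumptions for illustration, not details taken from the article.

```python
# Hypothetical sketch: count HTTP status codes in Common Log Format records.
# "access_log.txt" and the regex are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("LogAnalysis").getOrCreate()

# Read raw log lines; each row ends up in a single "value" column.
logs = spark.read.text("access_log.txt")

# Extract the HTTP status code (the three digits after the quoted request string).
status = logs.select(
    F.regexp_extract("value", r'"\s(\d{3})\s', 1).alias("status")
)

# Count requests per status code and show the most common ones.
status.groupBy("status").count().orderBy(F.desc("count")).show(10)

spark.stop()
```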

As the Apache Software Foundation turns 20, let's celebrate by recognizing 20 influential and up-and-coming Apache projects.

Dani and Jon will give a three-hour tutorial at OSCON this year called: Becoming friends with...

Apache Spark is an open source cluster computing framework. In contrast to Hadoop’s two-stage disk-...
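The in-memory model alluded to here is easiest to see with caching: a dataset read once can be kept in memory and reused by several subsequent actions, rather than being re-read from disk between stages. A rough sketch, with an invented file name and a "level" column that is assumed for illustration:

```python
# Rough sketch of Spark's in-memory reuse; file name and "level" column are assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("CacheDemo").getOrCreate()

events = spark.read.json("events.json").cache()   # keep the parsed data in memory

total = events.count()                            # first action materializes the cache
errors = events.filter(events["level"] == "ERROR").count()  # reuses cached data, no re-read

print(total, errors)
spark.stop()
```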

ApacheCon is coming up, and within that massive conference there will be a glimmering gem: a forum...

Spark's new DataFrame API is inspired by data frames in R and Python (Pandas), but designed from...
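A hedged sketch of what that API looks like in PySpark follows; the input file and the column names (age, city) are invented for illustration.

```python
# Sketch of Pandas-like, declarative DataFrame operations that Spark optimizes
# and runs across the cluster. "people.csv", "age", and "city" are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("DataFrameDemo").getOrCreate()

people = spark.read.csv("people.csv", header=True, inferSchema=True)

(people
    .filter(people["age"] > 30)                   # keep rows where age exceeds 30
    .groupBy("city")                              # group the survivors by city
    .agg(F.avg("age").alias("avg_age"))           # compute the average age per city
    .show())

spark.stop()
```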

How Databricks set a new world record for sorting 100 terabytes (TB) of data, or 1 trillion 100-...
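At a vastly smaller scale, the operation behind that benchmark is just a sort over a distributed dataset. A toy sketch, with generated data that bears no resemblance to the benchmark's 100 TB:

```python
# Toy illustration of a distributed sort in Spark; the data is generated here,
# not drawn from the benchmark described in the article.
from pyspark.sql import SparkSession
import random

spark = SparkSession.builder.appName("SortDemo").getOrCreate()

# Create an RDD of random (key, value) pairs spread across 8 partitions.
pairs = spark.sparkContext.parallelize(
    [(random.randint(0, 1_000_000), i) for i in range(100_000)], 8
)

# sortByKey shuffles the data so each partition holds a contiguous, sorted key range.
sorted_pairs = pairs.sortByKey()

print(sorted_pairs.take(5))
spark.stop()
```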