Apache Spark

Apache Spark is an open source cluster computing framework that is frequently used in big data processing. 
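For readers new to the project, here is a minimal sketch of what a Spark job looks like in PySpark: a distributed word count over a text dataset. The input and output paths are placeholders, not a real dataset.

```python
# Minimal PySpark sketch: count words across a large text dataset.
# The HDFS paths below are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("word-count").getOrCreate()

lines = spark.read.text("hdfs:///data/corpus/*.txt")   # distributed read

counts = (
    lines.rdd
    .flatMap(lambda row: row.value.split())   # split each line into words
    .map(lambda word: (word, 1))              # pair each word with a count of 1
    .reduceByKey(lambda a, b: a + b)          # sum counts per word across the cluster
)

counts.saveAsTextFile("hdfs:///out/word-counts")
spark.stop()
```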

A case study using NASA logs to show how Spark can be leveraged for analyzing data at scale.
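The case study itself isn't reproduced here, but the general approach looks something like the sketch below: parse server-style log lines with PySpark and aggregate them across the cluster. The log format, regular expression, and paths are assumptions for illustration, not the case study's actual code.

```python
# Sketch of log analysis at scale with PySpark, assuming logs in
# Common Log Format; the input path is a placeholder.
import re
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("log-analysis").getOrCreate()

LOG_PATTERN = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\S+)')

def parse(line):
    m = LOG_PATTERN.match(line)
    if not m:
        return None                         # skip malformed lines
    host, ts, method, path, status, _size = m.groups()
    return (host, path, int(status))

records = (
    spark.sparkContext.textFile("hdfs:///data/logs/*")
    .map(parse)
    .filter(lambda r: r is not None)
)

# Ten most requested paths across the whole dataset
top_paths = (
    records.map(lambda r: (r[1], 1))
    .reduceByKey(lambda a, b: a + b)
    .takeOrdered(10, key=lambda kv: -kv[1])
)
for path, hits in top_paths:
    print(path, hits)

spark.stop()
```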

As the Apache Software Foundation turns 20, let's celebrate by recognizing 20 influential and up-and-coming Apache projects.

Dani and Jon will give a three-hour tutorial at OSCON this year called Becoming friends with...

Apache Spark is an open source cluster computing framework. In contrast to Hadoop’s two-stage disk-...
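The contrast comes down to where intermediate data lives. A rough sketch, with placeholder paths and a trivial computation, of how Spark keeps a dataset cached in memory so repeated passes don't reread it from disk between stages:

```python
# Sketch of Spark's in-memory model: cache a dataset once and reuse it
# across iterations instead of rereading from disk between stages.
# The path and the loop body are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("in-memory-iteration").getOrCreate()
sc = spark.sparkContext

points = (
    sc.textFile("hdfs:///data/points.csv")
      .map(lambda line: [float(x) for x in line.split(",")])
      .cache()                              # keep parsed records in cluster memory
)

total = 0.0
for _ in range(10):                         # each pass reuses the cached RDD
    total = points.map(lambda p: sum(p)).sum()

print(total)
spark.stop()
```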

ApacheCon is coming up, and within that massive conference there will be a glimmering gem: a forum...

Spark's new DataFrame API is inspired by data frames in R and Python (Pandas), but designed from...
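As a rough illustration of the style, the sketch below uses the DataFrame API from PySpark; the file path and column names are made up for the example.

```python
# Sketch of the DataFrame API: Pandas/R-like operations that Spark
# plans and executes in parallel. Path and columns are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dataframe-demo").getOrCreate()

df = spark.read.csv("hdfs:///data/flights.csv", header=True, inferSchema=True)

summary = (
    df.filter(F.col("distance") > 500)           # declarative filter
      .groupBy("carrier")                        # distributed group-by
      .agg(F.avg("delay").alias("avg_delay"))    # aggregate per group
      .orderBy(F.desc("avg_delay"))
)
summary.show(10)

spark.stop()
```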

How Databricks set a new world record for sorting 100 terabytes (TB) of data, or 1 trillion 100-...