apache

Apache Spark is an open source cluster computing framework. In contrast to Hadoop’s two-stage disk-based MapReduce paradigm, Spark’s in-memory primitives provide performance up to 100 times faster for certain applications.
0 comments Posted 23 Apr 2015 by Arush Kharbanda
Chris Mattmann is a frequent speaker at ApacheCon North America and has a wealth of experience in software design and the construction of large-scale data-intensive systems. His work has infected a broad set of communities, ranging from helping NASA unlock data from its next generation of earth...
0 comments Posted 8 Apr 2015 by Jan Iversen
ApacheCon is coming up, and within that massive conference there will be a glimmering gem: a forum dedicated to Spark. The Spark Forum will have speakers from the Hive project, the Pig project, and the Sqoop project. There will also be two talks about Spark Streaming: one will be introductory, and the other...
0 comments Posted 6 Apr 2015 by Jen Wike Huger (Red Hat)
University of Southern California postdoctoral fellow and NASA/JPL researcher Annie Bryant Burgess explains how her PhD is related to her involvement in open source, and tells us what Apache Tika has to do with studying polar data.
0 comments Posted 2 Apr 2015 by Jan Iversen
Spark's new DataFrame API is inspired by data frames in R and Python (Pandas), but designed from the ground up to support modern big data and data science applications.
1 comment Posted 26 Mar 2015 by Reynold Xin
How does OpenStack differ from other large, popular open source projects and how do these differences affect the way the project is growing and maturing?
1 comment Posted 24 Mar 2015 by Stephen R. Walli
Initially, Hadoop implementation required skilled teams of engineers and data scientists, making Hadoop too costly and cumbersome for many organizations. Now, thanks to a number of open source projects, big data analytics with Hadoop has become much more affordable and mainstream. Here's a look at...
3 comments Posted 4 Mar 2015 by Jonathan Buckley
The Opensource.com Weekly Top 5: the best and brightest articles from the week of January 26-30.
0 comments Posted 30 Jan 2015 by Jen Wike Huger (Red Hat)
How Databricks set a new world record by sorting 100 terabytes (TB) of data, or 1 trillion 100-byte records, in 23 minutes using the open source software Apache Spark and the public cloud infrastructure EC2.
4 comments Posted 15 Jan 2015 by Reynold Xin
The annual list of the top 10 open source projects covered on Opensource.com in 2014. From cloud computing to containers to project management, this year's showing in open source has been phenomenal.
2 comments Posted 16 Dec 2014 by Jen Wike Huger (Red Hat)