Enterprise IT Context for the CTO

Bob Gourley



Igniting the Spark Curiosity; 3-Part Vodcast Series

Check out Bright's three-part vodcast series on Apache Spark™ and how it can be used for large-scale data processing:

In Part I, a Bright expert discusses Bright Cluster Manager for Big Data and walks you through the fully integrated support for Apache Spark that it now includes. The vodcast highlights how the Spark integration can directly help end users, for instance through the flexibility of running Apache Spark with or without the Hadoop Distributed File System (HDFS).

In Part II, the vodcast digs a bit deeper into the details of Apache Spark and offers eight reasons why Spark is gaining such a following. For example, Spark handles iterative algorithms and interactive mining tools much more efficiently than MapReduce. Spark also provides a converged analytics platform, a single comprehensive engine for big data analytics that lets users move rapidly from building simple interactive apps to building sophisticated distributed apps.

Part III discusses how using Apache Spark without Hadoop means customers can have a comprehensive platform for big data analytics while using a variety of HDFS alternatives. We know that installing a brand-new file system just to get a big data solution can be a real problem, considering the significant amount already invested in high-volume, distributed, and scalable parallel file systems. We discuss some of the alternatives available, such as Amazon S3, OpenStack Swift, and IBM’s GPFS, among others, and explain why users might want to make this choice.
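To make the HDFS-alternatives point a little more concrete, here is a minimal sketch (not taken from the vodcast) of how the storage system behind a dataset is usually selected in Spark: by the URI scheme of the path. The scheme names are the ones commonly used with Spark's Hadoop-compatible connectors. The helper `describe_backend`, the host, the bucket and container names, and the paths are all hypothetical, and treating GPFS as a POSIX mount read through `file://` is an assumption.

```python
from urllib.parse import urlparse

# Illustrative mapping from URI scheme to the storage system behind it.
# The host, bucket, container, and path names used below are made up.
BACKENDS = {
    "hdfs": "HDFS",
    "s3a": "Amazon S3",
    "swift": "OpenStack Swift",
    "file": "local or POSIX-mounted file system (e.g. a GPFS mount)",
}


def describe_backend(uri: str) -> str:
    """Return a readable name for the storage system a URI points at."""
    scheme = urlparse(uri).scheme or "file"  # bare paths are treated as local
    return BACKENDS.get(scheme, f"unknown scheme: {scheme}")


EXAMPLE_INPUTS = [
    "hdfs://namenode:8020/data/events",
    "s3a://example-bucket/data/events",
    "swift://example-container.provider/data/events",
    "file:///gpfs/data/events",
]

if __name__ == "__main__":
    for uri in EXAMPLE_INPUTS:
        print(f"{uri} -> {describe_backend(uri)}")
        # With PySpark installed and the matching connector configured,
        # the read call itself would look the same for every URI:
        #     spark.read.text(uri)
```

The application code stays the same across back ends in this sketch. Only the path and the connector configuration on the cluster change, which is why an existing parallel file system can be reused instead of standing up HDFS.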


Credit for information: Lionel Gibbons.


More Stories By Bob Gourley

Bob Gourley writes on enterprise IT. He is a founder of Crucial Point and publisher of CTOvision.com.