By David Strom
February 18, 2013 08:00 AM EST
It isn’t often that you can get access to a thousand-node network to test your latest app, but thanks to the efforts of EMC’s Greenplum unit and several other computing vendors, you can. More amazingly, it is free of charge.
The network was announced last fall at Strata and connects 1,000 specialized servers from Supermicro running dual Intel Xeon processors with 48 GB of RAM apiece, along with Mellanox 10 Gb Ethernet adapters and switches and a total of 12,000 Seagate 2 TB drives. It is all contained within Greenplum’s Las Vegas data center, with the goal of having the largest publicly accessible Hadoop cluster around. While Yahoo, eBay, and others have some fairly large Hadoop clusters, they generally don’t let anyone else come in and try out their apps. The cluster goes under the name of Analytics Workbench. On this page, you can click on the “learn more” button and submit your name if you are interested in using the cluster.
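For a sense of scale, the per-node figures above can be rolled up into cluster totals. The following is a minimal Python sketch, not anything published by Greenplum: the function name is made up for illustration, and the replication factor of 3 is an assumption (it is Hadoop's stock HDFS default, but the article does not say how this cluster is configured).

```python
# Figures taken from the article.
NODES = 1_000
RAM_PER_NODE_GB = 48
TOTAL_DRIVES = 12_000
DRIVE_SIZE_TB = 2

# Assumption: HDFS default replication factor; not stated in the article.
HDFS_REPLICATION = 3


def cluster_totals(nodes, ram_per_node_gb, total_drives, drive_size_tb, replication):
    """Roll per-node specs up into cluster-wide totals (decimal units)."""
    raw_disk_pb = total_drives * drive_size_tb / 1000
    return {
        "drives_per_node": total_drives / nodes,
        "total_ram_tb": nodes * ram_per_node_gb / 1000,
        "raw_disk_pb": raw_disk_pb,
        # Each block is stored `replication` times, so usable space shrinks.
        "usable_disk_pb": raw_disk_pb / replication,
    }


totals = cluster_totals(
    NODES, RAM_PER_NODE_GB, TOTAL_DRIVES, DRIVE_SIZE_TB, HDFS_REPLICATION
)
for name, value in totals.items():
    print(f"{name}: {value:g}")
```

Under those assumptions the cluster works out to a dozen drives per server, with raw disk measured in tens of petabytes.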
The goal, according to Greenplum staffers, is to have a community and collaborative big data platform that can be applied to a set of analytical problems that have wide appeal. When the Strata announcement was made last fall, Greenplum stated that it wanted to eventually publish results from the cluster, but it hasn’t yet. Intel was one of the first clients to use the workbench (running a thousand-node job, too), but it is still reviewing its results.
Other clients that are running tests on the cluster include Mellanox and VMware, both of which donated gear to power it, and a research team from the University of Central Florida. A group from NASA Goddard is using it to perform an analysis of historical weather patterns. The cluster formally opened up in July, and yes, it really is free of charge. Applicants need to be vetted and must work closely with the Greenplum engineers to get their apps uploaded and configured on the cluster.
“We accept bids based on any submitted application and developers can request specific time and resources,” says William Davis, one of the Greenplum product marketers involved with the cluster’s creation. Applications are reviewed by an internal group of Hadoop experts called the Jedi Council, which tries to select the applicant that is the best fit for the next test run on the cluster.
Greenplum intends to use the cluster in a variety of ways besides public testing. Sometime next quarter it will launch a training program for Hadoop. A unique aspect of the program is that each member of the course will be granted access to the cluster to use as a sandbox environment for their own project. The company is still working out the details of how this will work. It also has other fee-based programs that draw on its experience with this cluster, including what it calls its Analytics Lab packages. These put its team of data scientists to work on specific vertical markets or particular custom applications.
Several other tools are offered on the cluster in addition to Hadoop, including MapReduce, the parallel job processing software; VMware’s Rubicon system management tool; and standard Hadoop add-ons such as Hive, Pig, and Mahout.
Greenplum isn’t the first to assemble such a large test bed, but it is probably the first to use this level of gear for Hadoop and other data science activities. In the late 1980s, a group of Novell engineers in Utah created the “SuperLab,” which eventually grew to 1,700 PCs connected together. The lab was used to prove the features and scalability of Novell’s NetWare network operating system, a piece of software that at one time could be found in most enterprises but now is largely a historical curiosity. Just to give you some perspective, in 1999 the PCs in Novell’s lab had a whopping 256 MB of RAM and 8 GB of storage (try buying that on today’s PCs). How times have changed.
Anyway, the SuperLab team left Novell a few years later and built their own private test lab for a startup called Keylabs. I was one of their early customers, using the facility to run some of the first Web server comparison tests and publishing the results in cNet and other IT publications.
The Keylabs engineers very quickly discovered that automating the sequencing and actions of the individual PCs was tedious, and they wrote software that eventually spawned Altiris. Part of the assets of this company was later purchased by Symantec and is still used for their desktop imaging and management tool line.
Speaking of scaling up to a thousand machines automatically, running tests on this scale can be tricky. Greenplum has already seen several hardware failures take down particular nodes as it has begun using the cluster. And as the Keylabs engineers found, working out how to sequence all this gear to come online quickly can be vexing. Imagine that each machine takes just ten minutes to boot up and launch an app: with ten or twenty nodes that isn’t a big deal, but when you are trying to bring up hundreds of nodes one after another, it could tie up the cluster for the better part of a week just starting up the tests. “It is a bit of a challenge in educating our customers on how to use and manage something of this size and how to deploy their software across the entire cluster. You can’t deploy software serially, and we have to make sure that our customers understand these issues,” says Davis.
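To put rough numbers on the serial-deployment problem Davis describes, here is a minimal Python sketch. It uses the ten-minutes-per-node figure from the example above; the function names and the batch size of 100 are hypothetical, chosen only for illustration, and do not describe how Greenplum actually brings its nodes online.

```python
import math

MINUTES_PER_NODE = 10      # from the example in the article
MINUTES_PER_DAY = 24 * 60


def serial_startup_minutes(nodes, minutes_per_node=MINUTES_PER_NODE):
    """Total startup time when nodes are brought up one after another."""
    return nodes * minutes_per_node


def batched_startup_minutes(nodes, batch_size, minutes_per_node=MINUTES_PER_NODE):
    """Total startup time when `batch_size` nodes start at the same time.

    batch_size is a hypothetical knob; each batch is assumed to take as
    long as a single node.
    """
    batches = math.ceil(nodes / batch_size)
    return batches * minutes_per_node


for nodes in (10, 20, 500, 1000):
    serial = serial_startup_minutes(nodes)
    batched = batched_startup_minutes(nodes, batch_size=100)
    print(
        f"{nodes:>5} nodes: serial {serial / MINUTES_PER_DAY:5.2f} days, "
        f"batches of 100 {batched / 60:4.2f} hours"
    )
```

Started one at a time, a thousand nodes need 10,000 minutes, or nearly seven days, before any test runs. Starting them in parallel batches cuts that to hours, which is why serial deployment is not an option at this scale.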
So get your application in now to test your app on the cluster. You could be making computing history.
Copyright © 2013 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By David Strom
David Strom is an international authority on network and Internet technologies. He has written extensively on the topic for 20 years for a wide variety of print publications and websites, such as The New York Times, TechTarget.com, PC Week/eWeek, Internet.com, Network World, Infoworld, Computerworld, Small Business Computing, Communications Week, Windows Sources, c|net and news.com, Web Review, Tom's Hardware, EETimes, and many others.