By Jnan Dash
October 10, 2012 05:16 PM EDT
Hadoop traces its origins to Google, where two early projects, GFS (Google File System) and GMR (Google MapReduce), were written alongside BigTable to manage large volumes of data. These systems excel at crunching large volumes of data in batch mode in a distributed computing environment built from commodity servers. Any change to the data requires streaming over the entire data set, which means high latency. So this approach works well for “Data at Rest”, or static data.
Now Google finds itself limited by its own inventions, GFS, GMR, and BigTable. Hence it has been working on a post-Hadoop set of data crunching tools: Percolator, Dremel, and Pregel. Here is a brief description of each.
Percolator is a system for incrementally processing updates to a large data set. Replacing a batch-based indexing system with one based on incremental processing in Percolator significantly speeds up the process and reduces analysis time. Percolator’s architecture provides horizontal scalability and resilience. The best candidates are large indexes, where the performance improvement can reach a factor of 100. The big advantage of Percolator is that indexing time is now proportional to the size of the page being updated, not to the size of the whole index.
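To make the last point concrete, here is a minimal sketch of incremental indexing in Python. It is not Percolator's API (Percolator itself runs as transactions and observers on top of BigTable); the class `IncrementalIndex` and its methods are hypothetical names chosen for illustration. The sketch only shows why updating one page costs work proportional to that page rather than to the index.

```python
from collections import defaultdict


class IncrementalIndex:
    """Toy inverted index that is updated one document at a time.

    Hypothetical illustration only; not Percolator's actual interface.
    """

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it
        self.doc_terms = {}               # document id -> terms it currently contains

    def update(self, doc_id, text):
        """Re-index a single document and return how many postings changed."""
        new_terms = set(text.lower().split())
        old_terms = self.doc_terms.get(doc_id, set())

        # Only terms that appeared or disappeared in this document are touched,
        # so the work depends on the document, not on the size of the index.
        for term in old_terms - new_terms:
            self.postings[term].discard(doc_id)
            if not self.postings[term]:
                del self.postings[term]
        for term in new_terms - old_terms:
            self.postings[term].add(doc_id)

        self.doc_terms[doc_id] = new_terms
        return len(old_terms ^ new_terms)

    def search(self, term):
        """Return the sorted ids of documents containing the term."""
        return sorted(self.postings.get(term.lower(), ()))


index = IncrementalIndex()
index.update("p1", "hadoop batch")
index.update("p2", "hadoop graph")
changed = index.update("p1", "incremental index")  # only p1's postings are rewritten
```

A batch system would rebuild every posting list after the change to `p1`; here the second update to `p1` touches four postings and leaves the entry for `p2` alone.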
Dremel is a scalable, interactive ad-hoc query system for analysis of read-only nested data. By combining multi-level execution trees and a columnar data layout, it is capable of running aggregation queries over trillion-row tables in seconds. Dremel claims to be about 100 times faster than MapReduce. Its architecture is similar to Pig and Hive, but instead of MapReduce, its engine is based on aggregator trees.
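The aggregator-tree idea can be sketched in a few lines of Python. This is an assumption-laden toy, not Dremel's implementation: the function names (`leaf_partial`, `merge_partials`, `tree_average`) and the `fanout` parameter are hypothetical, and each "shard" is simply a slice of one column held in memory. Leaves compute partial results over their slice of the column, and each level of the tree merges the partials below it until the root holds the answer.

```python
def leaf_partial(column_shard):
    """Leaf server: scan one shard of a single column, return (sum, count)."""
    return (sum(column_shard), len(column_shard))


def merge_partials(partials):
    """Intermediate server: combine the partial results of its children."""
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return (total, count)


def tree_average(column_shards, fanout=2):
    """Compute the average of a column by merging partials up a tree."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    if not column_shards:
        raise ValueError("need at least one shard")

    level = [leaf_partial(shard) for shard in column_shards]
    # Each pass of the loop is one level of the tree closer to the root.
    while len(level) > 1:
        level = [merge_partials(level[i:i + fanout])
                 for i in range(0, len(level), fanout)]

    total, count = level[0]
    if count == 0:
        raise ValueError("column is empty")
    return total / count


# One column, split across four hypothetical leaf servers.
shards = [[1, 2, 3], [4, 5], [6], [7, 8, 9, 10]]
average = tree_average(shards)
```

Because only the queried column is read and only small `(sum, count)` pairs travel up the tree, the amount of data moved stays small even when the table is very large.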
Pregel is a system for large-scale graph processing and graph data analysis. It is designed to execute graph algorithms quickly, and its API is easy to use. As expected, Pregel is architected for efficient, scalable, and fault-tolerant execution on clusters of thousands of commodity computers. Graphs are everywhere: social networks, computer network topologies, games among soccer teams, citations among scientific papers, and the most pervasive graph of all, the web itself. Pregel is a scalable infrastructure for mining a wide range of graphs, with programs expressed as a sequence of iterations. Google has been using Pregel internally for some time now.
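The "sequence of iterations" model can be illustrated with a small Python sketch. It is a single-machine toy, not Pregel's API: the function name `propagate_max` and the dictionary-based graph format are assumptions made for this example. In each iteration (a superstep), a vertex reads the messages sent to it in the previous superstep, updates its own value, and sends messages to its neighbours; a vertex that learns nothing new stays silent, and the computation ends when no messages remain.

```python
from collections import defaultdict


def propagate_max(graph, initial_values):
    """Spread the largest vertex value to every reachable vertex.

    graph:          dict mapping each vertex to a list of its out-neighbours
    initial_values: dict mapping each vertex to a number
    Returns (final values, number of supersteps executed).
    """
    values = dict(initial_values)

    # Superstep 0: every vertex is active and sends its value to its neighbours.
    inbox = defaultdict(list)
    for vertex, neighbours in graph.items():
        for neighbour in neighbours:
            inbox[neighbour].append(values[vertex])
    supersteps = 1

    # Later supersteps: only vertices that received messages do any work.
    while inbox:
        outbox = defaultdict(list)
        for vertex, messages in inbox.items():
            best = max(messages)
            if best > values[vertex]:
                values[vertex] = best
                for neighbour in graph[vertex]:
                    outbox[neighbour].append(best)
            # Otherwise the vertex sends nothing, i.e. it votes to halt.
        inbox = outbox
        supersteps += 1

    return values, supersteps


# A small chain of four vertices; every key also appears as a vertex in the graph.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
values, supersteps = propagate_max(graph, {"a": 3, "b": 6, "c": 2, "d": 1})
```

In a real deployment the vertices would be partitioned across many machines and the messages would cross the network between supersteps; the per-vertex logic would stay this simple.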
Besides Google, Facebook and Twitter are also working on new innovations. Twitter recently released its Storm project as open source. One key trend is “Data in Motion”, or how to deal with data as it moves. This is the velocity aspect of Big Data.