Archive for the ‘cluster’ Category

Yahoo! distribution of Hadoop

Tuesday, November 3rd, 2009

Yahoo! has released the Yahoo! Distribution of Hadoop, a source code distribution that is based entirely on code found in the Apache Hadoop project. This source distribution includes code patches that Yahoo! has added to improve the stability and performance of its clusters.

The Yahoo! Distribution of Hadoop can be downloaded from:

http://github.com/yahoo/hadoop-common/tree/yahoo-hadoop-0.20

If you want an installable image of the code, Cloudera has made the Yahoo! Distribution of Hadoop available in raw package form at:

http://www.cloudera.com/yahoo-packages/

AMI MySQL cluster database access

Friday, October 2nd, 2009

One of the challenges when deploying MySQL databases and clusters on Amazon EC2 is that the IP addresses of instances launched from AMIs are assigned dynamically. If your topology involves only a single instance, you can simply use localhost to access your MySQL server. With more than one instance, the application needs some other way to find the database host.

Cloud Foundry solves this problem in one of two ways: it ensures that ‘dbmaster’ always resolves to the IP address of the MySQL server, or it lets you launch the application with a system property that specifies the MySQL server hostname.

Using the JVM option “-DdbHostName=${databasePrivateDnsName}” sets the system property ‘dbHostName’ to the MySQL server’s host name. A Spring/Java application can then use a PropertyPlaceholderConfigurer bean to substitute this value into the database URL, e.g. jdbc:mysql://${dbHostName}:3306/.
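The substitution step can be sketched in plain Java, without Spring. This is only an illustration of the idea, not Cloud Foundry's or Spring's actual code: the class name DbUrlResolver, the resolve method, and the fallback to localhost when the property is not set are all assumptions made for this example.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: a stand-in for what a property placeholder
// mechanism does with ${...} tokens in a JDBC URL template.
public class DbUrlResolver {

    // Matches ${name} and captures the name.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces every ${name} in the template with the value of that property.
    // Throws if a referenced property is missing, so a bad launch fails early.
    public static String resolve(String template, Properties props) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String key = m.group(1);
            String value = props.getProperty(key);
            if (value == null) {
                throw new IllegalStateException("Missing property: " + key);
            }
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumption: fall back to localhost for the single-instance case.
        Properties defaults = new Properties();
        defaults.setProperty("dbHostName", "localhost");

        // System properties (including any -DdbHostName=...) override the default.
        Properties props = new Properties(defaults);
        props.putAll(System.getProperties());

        System.out.println(resolve("jdbc:mysql://${dbHostName}:3306/", props));
    }
}
```

Launched as `java -DdbHostName=<private DNS name of the MySQL instance> DbUrlResolver`, the sketch prints the JDBC URL with the host name filled in. Launched without the option, it uses localhost, which matches the single-instance case described above.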

Sun Cloud

Friday, March 20th, 2009

Sun Cloud (does that make sense?). Anyway, have a look at http://www.sun.com/solutions/cloudcomputing/index.jsp

Sun Cloud API: http://kenai.com/projects/suncloudapis/pages/Home

Sun Cloud Beta Signup: https://www2.sun.de/dct/forms/reg_us_2409_516_0.jsp

EUCALYPTUS for “cloud computing” on clusters

Thursday, March 19th, 2009

EUCALYPTUS (Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems) is an open-source software infrastructure for implementing “cloud computing” on clusters. The current interface to EUCALYPTUS is compatible with Amazon’s EC2 interface, but the infrastructure is designed to support multiple client-side interfaces. EUCALYPTUS is implemented using commonly available Linux tools and basic Web-service technologies, which makes it easy to install and maintain.

Amazon EC2 and a Hadoop cluster

Friday, December 12th, 2008

According to http://www.ibm.com/developerworks/linux/library/l-hadoop/, the New York Times used Hadoop and Amazon EC2 to convert 4TB of TIFF images (including 405K large TIFF images, 3.3M SGML articles, and 405K XML files) into 800K Web-friendly PNG images in 36 hours.