
Bryan and Vivek from Rapleaf invited me to talk at their Hadoop event this Tuesday. Thanks again.
I presented an experience report on an in-production system we built over the last year with Hadoop and Katta (Lucene, the grid style). It was a fun event and quite a lot of people showed up.
Here you can find my Katta, Pig and Hadoop in production – experience report slides.

And here are the videos:

Part 1 (Bryan):

Part 2 (Stefan):

Part 3 (Arun):



Yeah!!! Yesterday we released our first Katta version!!
My colleagues worked very hard on this, and yesterday we finished our first big milestone. I also gave a Katta talk at the Hadoop conference and I will post the slides as soon as I can.

Here goes the release announcement:



After 5 months of work we are happy to announce the first developer preview release of Katta. This release contains all the functionality needed to serve a large, sharded Lucene index on many servers. Katta is standing on the shoulders of the giants Lucene, Hadoop and ZooKeeper.


Main features:

+ Plays well with Hadoop
+ Apache License, Version 2.0
+ Node failure tolerance
+ Master failover
+ Shard replication
+ Pluggable network topologies (shard distribution and selection policies)
+ Node load balancing at client

Please give Katta a test drive and give us some feedback!



Getting started in less than 3 min:

Installation on a grid:

Katta presentation today (09/17/08) at Hadoop User, Yahoo Mission College:
* slides will be available online later

Many thanks for the hard work:
Johannes Zillmann, Marko Bauhardt, Martin Schaaf (101tec)
