
Bryan and Vivek from Rapleaf invited me to talk at their Hadoop event this Tuesday. Thanks again.
I presented an experience report on a production system we built over the last year with Hadoop and Katta (Lucene, the grid style). It was a fun event and quite a lot of people showed up.
Here you can find my "Katta, Pig and Hadoop in production" experience report slides.

And here are the videos

Part 1 (Bryan):

Part 2 (Stefan):

Part 3 (Arun):


Yeah!!! Yesterday we released our first Katta version!!
My colleagues worked very hard on this and yesterday we finished our first big milestone. I also gave a Katta talk at the Hadoop conference and I will post the slides as soon as I can.

Here goes the release announcement:



After 5 months of work we are happy to announce the first developer preview release of Katta. This release contains all the functionality needed to serve a large, sharded Lucene index on many servers. Katta stands on the shoulders of the giants Lucene, Hadoop and ZooKeeper.


Main features:

+ Plays well with Hadoop
+ Apache Version 2 License.
+ Node failure tolerance
+ Master failover
+ Shard replication
+ Pluggable network topologies (shard distribution and selection policies)
+ Node load balancing at client

Please give katta a test drive and give us some feedback!



Getting started in less than 3 min:

Installation on a grid:

Katta presentation today (09/17/08) at hadoop user, yahoo mission college:
* slides will be available online later

Many thanks for the hard work:
Johannes Zillmann, Marko Bauhardt, Martin Schaaf (101tec)

Maven 1 was terrible, since the XML scripting engine was buggy. Maven 2 was great, I thought, and I used it heavily. But now, a couple of years later, I understand why people are still using Ant. I had great hopes for Maven, but reality proved that Maven is one of the biggest time wasters in my developer life.
It is nice and low effort if you do simple one-project-just-build-me-a-jar kinds of projects. But as soon as you start working on serious projects, there is no chance you will be productive with Maven.
So what is good about Maven:
* convention over configuration
* transitive dependency management
* a lot of plugins
The bad parts, though, are:
* all those plugins are very buggy
* it uses a declarative language (XML), so there is no real possibility to script logic
* doing some custom logic is a pain
** as soon as you need any kind of different behavior your XML blows up
** custom plugins need to be external projects

And I could go on for hours. So it is time for something new. Sure, there is buildr, but I don't want to learn yet another language just for builds. So why not use Java? Right, use Groovy.
So here is my new star in the build sky: Gradle.
It has all the nice features I want:
* very lightweight
* convention over configuration
* transitive dependency management (we can still use all our Maven repositories)
* a script language to do my custom logic
* Java language syntax (via Groovy)
* it works great with all build servers, and moving to it is super easy since it comes with a script that downloads the release if needed.

So you should give it a try.

If you are a software developer you might already know this: in case there is a performance problem, you should profile your application first and then improve performance at the hot spots. Unfortunately, in my career I have seen many developers do it wrong and optimize where they guess the problem is. Of course they end up optimizing in the wrong place. Joshua Bloch talks about this problem in Effective Java.

Anyhow, different topic: being a productivity freak and trying to optimize all my work on my computer, I am always looking for tune-ups, such as how to save myself from typing in passwords, autocompleting text, etc.

But wait, didn't I just say profile before you optimize? Right. So I took this approach to my computer user life. I installed Slife a couple of weeks ago and profiled my daily work. So where do I really spend time?


I learned that I pretty frequently spend over 2 hours out of my 8 to 12 working hours just reading through all the mails. Wow!

Definitely time to start improving this, and Merlin Mann's talk looks like a very good starting point:

Do I have all the keywords to get a good search score? Good! I just spent 1 hour figuring out a problem where we use the JAVA_HOME environment variable in a unit test. This test works great when running it with Maven but did not in Eclipse on OS X. The problem is of course that environment variables set in the shell are not available in apps like Eclipse. So it is worth spreading the word on how to solve this.
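Independent of how you get the variable into Eclipse, a test can be made less fragile by not assuming JAVA_HOME is always set. Here is a minimal sketch of that idea, not the fix from the write-up: the class name JavaHomeResolver and the fallback to the "java.home" system property are my own assumptions for illustration. Keep in mind that "java.home" points at the running JVM, which may be a JRE directory rather than the JDK that JAVA_HOME usually names.

```java
import java.util.Map;

// Hypothetical helper, not part of any library.
class JavaHomeResolver {

    // Returns JAVA_HOME from the given environment if it is set and not blank,
    // otherwise returns the supplied fallback value.
    static String resolve(Map<String, String> env, String fallback) {
        String value = env.get("JAVA_HOME");
        if (value == null || value.trim().isEmpty()) {
            return fallback;
        }
        return value;
    }

    public static void main(String[] args) {
        // System.getenv() is empty of shell-only variables when launched from a GUI app,
        // so fall back to the home directory of the JVM that is running this code.
        String javaHome = resolve(System.getenv(), System.getProperty("java.home"));
        System.out.println("Using Java home: " + javaHome);
    }
}
```

Passing the environment in as a map keeps the lookup easy to exercise in a unit test without touching the real process environment.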

Here is the best write-up I found:

It is the perfect day to make one of these once-in-a-while blog posts. I pimped up my blog a little just for this post.

During the last weeks we worked hard on our new open source project, and we could finally release the code into the SourceForge svn. We named the project Katta since it is an African kind of monkey that loves to live in groups and is very social, and so is our project.

Katta is a distributed Lucene grid. It allows you to serve very large Lucene indexes distributed over many servers. To do that, we divide the index into shards and each server serves a couple of those shards. Each shard is replicated, so in case one server crashes Katta can still serve the complete index without any hiccups.
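To make the shard and replica idea a bit more concrete, here is a small sketch in plain Java. It is not Katta's actual distribution policy and uses no Katta API; the round-robin placement and the names ShardPlacementSketch, assign and survivesFailure are assumptions made up for this illustration. It only shows why replicating each shard onto more than one server lets the index survive a single crash.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; Katta's real placement logic may differ.
class ShardPlacementSketch {

    // Places each shard on `replicas` distinct nodes, walking the node list round-robin.
    static Map<String, List<String>> assign(List<String> shards, List<String> nodes, int replicas) {
        if (replicas < 1 || replicas > nodes.size()) {
            throw new IllegalArgumentException("replicas must be between 1 and the number of nodes");
        }
        Map<String, List<String>> placement = new LinkedHashMap<String, List<String>>();
        for (int i = 0; i < shards.size(); i++) {
            List<String> holders = new ArrayList<String>();
            for (int r = 0; r < replicas; r++) {
                // Consecutive offsets stay distinct because replicas <= nodes.size().
                holders.add(nodes.get((i + r) % nodes.size()));
            }
            placement.put(shards.get(i), holders);
        }
        return placement;
    }

    // True if every shard still has at least one holder while `failedNode` is down.
    static boolean survivesFailure(Map<String, List<String>> placement, String failedNode) {
        for (List<String> holders : placement.values()) {
            boolean alive = false;
            for (String node : holders) {
                if (!node.equals(failedNode)) {
                    alive = true;
                }
            }
            if (!alive) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> shards = Arrays.asList("shard-0", "shard-1", "shard-2", "shard-3");
        List<String> nodes = Arrays.asList("node-a", "node-b", "node-c");
        Map<String, List<String>> placement = assign(shards, nodes, 2);
        System.out.println(placement);
        System.out.println("survives loss of node-b: " + survivesFailure(placement, "node-b"));
    }
}
```

With a replication level of 1 the same check fails for whichever node holds a shard alone, which is exactly the case replication is meant to avoid.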

We do not yet allow realtime/hot index updates, but people from Bailey, another young SourceForge project, are working on that. However, Mark Butler was so kind as to contribute some code that solves this problem and I guess we will work hard to get this integrated soon.

So check it out and subscribe to the mailing list to give us some feedback.



Spring 2.5 annotation support is pretty useful; however, it makes PropertyPlaceholderConfigurer unusable.

In all our projects we have heavily used PropertyPlaceholderConfigurer to inject configuration values via constructor injection to configure our object stack. We ran into problems using PropertyPlaceholderConfigurer in our latest project, which uses Spring 2.5 and autowired injection. Since we usually run our projects in different deployment modes (local, test, live), we want to switch configuration values based on a system property.

We ended up with a small hack that I want to share with those who search for it with any search engine.

We basically created an object, added to our context.xml, that implements BeanFactoryPostProcessor. Within the postProcessBeanFactory method we check the deployment mode property and, based on that, load the matching property file. Then we deploy property by property as a bean of type String, Integer or Boolean, with the property key name as the bean name. For example, this looks like this:

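RootBeanDefinition rbd = new RootBeanDefinition();
ConstructorArgumentValues constructorArgumentValues = new ConstructorArgumentValues();
boolean success = false;
try {
  // first try to interpret the property value as an Integer
  int parseInt = Integer.parseInt(value);
  // the next three lines are missing from the original snippet and are an
  // assumption about how the bean definition was wired up
  rbd.setBeanClass(Integer.class);
  constructorArgumentValues.addGenericArgumentValue(parseInt);
  rbd.setConstructorArgumentValues(constructorArgumentValues);
  registry.registerBeanDefinition(keyName, rbd);
  success = true;
} catch (NumberFormatException e) {
  // not an integer, nothing to do here; try Boolean or String next
}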

Unfortunately, Spring 2.5 annotation autowiring (see the Spring JIRA) does not support primitives, so we had to use non-primitives in the constructor as well. However, it worked pretty well. For example, for the properties “maxHitsPerPage” and “maxPagesToShow” a constructor looks like this:

 public UsersController(IDesignDao designDao, IUserDao userDao,
    @Qualifier("maxHitsPerPage") Integer hitsPerPage,
    @Qualifier("maxPagesToShow") Integer pagesToShow) {

Here “maxHitsPerPage” and “maxPagesToShow” are the property key names in our configuration files. Hope that helps a little; shoot me a mail if you need more sample code.
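The type detection step of the hack (try Integer, then Boolean, otherwise keep the String) can be sketched without Spring. This is only a rough sketch of the idea, not our production code: the class name PropertyTypeSketch and its methods are made up for this example, and the real post processor registers bean definitions instead of filling a map.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of Spring.
class PropertyTypeSketch {

    // Converts a raw property value to Integer or Boolean where possible,
    // otherwise returns the trimmed String unchanged.
    static Object coerce(String raw) {
        String value = raw.trim();
        try {
            return Integer.valueOf(value);
        } catch (NumberFormatException e) {
            // not an integer, try the next type
        }
        if ("true".equalsIgnoreCase(value) || "false".equalsIgnoreCase(value)) {
            return Boolean.valueOf(value);
        }
        return value;
    }

    // Builds a map from property key to typed value, sorted by key.
    static Map<String, Object> typedValues(Properties properties) {
        Map<String, Object> result = new TreeMap<String, Object>();
        for (String key : properties.stringPropertyNames()) {
            result.put(key, coerce(properties.getProperty(key)));
        }
        return result;
    }

    public static void main(String[] args) {
        Properties properties = new Properties();
        properties.setProperty("maxHitsPerPage", "20");
        properties.setProperty("maxPagesToShow", "10");
        properties.setProperty("debug", "false");
        properties.setProperty("title", "my site");
        System.out.println(typedValues(properties));
    }
}
```

In the real post processor each entry of such a map would become one bean definition, with the key as the bean name, so that a @Qualifier with that name can pick it up.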


We use Parameterized REST URLs with Spring MVC in many of our projects. In our latest project we used Spring 2.5 annotations. 

We ran into problems using the method-level annotation.

For example we have a method signature like this:

@RequestMapping(value = "/tags/(*:tagName)/(*:page)")
public String getDetailsByTagName(ModelMap modelMap, @RequestParam("tagName") String tagName, @RequestParam("page") int page) {

URLs like “/tags/(*:tagName)/(*:page)” work without any problem if we place the RequestMapping annotation on the type level, but they did not work on the method level. The problem was that Spring 2.5 installs a set of default handlers, in particular the AnnotationMethodHandlerAdapter. However, the AnnotationMethodHandlerAdapter uses the AntPathMatcher, which of course cannot handle parameterized URLs like “/tags/(*:tagName)/(*:page)”. The solution is simple:

Create a class like this:

public class ParameterizedAnnotationMethodHandlerAdapter extends AnnotationMethodHandlerAdapter {

    public ParameterizedAnnotationMethodHandlerAdapter() {
        // replace the default AntPathMatcher with one that understands (*:name) placeholders
        setPathMatcher(new ParameterizedPathMatcher());
    }
}

Now you need to install this handler by simply creating this bean in your Spring context.xml file. But this has a little side effect: Spring will not install the other HandlerAdapters once you install one manually, so just install all handlers like this:

<!-- we need to override the path matcher in AnnotationMethodHandlerAdapter, therefore we need to install all handlers manually -->
<bean class="com.IOItec.urags.util.spring.url.ParameterizedAnnotationMethodHandlerAdapter"/>
<bean class="org.springframework.web.servlet.mvc.HttpRequestHandlerAdapter"/>
<bean class="org.springframework.web.servlet.mvc.SimpleControllerHandlerAdapter"/>
<bean class="org.springframework.web.servlet.mvc.throwaway.ThrowawayControllerHandlerAdapter"/>


Hope that helps..
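In case you wonder what matching such a parameterized URL involves, here is a small stand-alone sketch. It is not the ParameterizedPathMatcher used above and does not implement Spring's PathMatcher interface; the class ParameterizedPathSketch and its extract method are hypothetical names for this example. It simply turns every (*:name) placeholder into a regex group and reads the values back out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; the real matcher has to implement Spring's PathMatcher.
class ParameterizedPathSketch {

    // Recognizes placeholders of the form (*:name).
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\(\\*:([A-Za-z_][A-Za-z0-9_]*)\\)");

    // Returns the placeholder values found in `path`, or null if the path
    // does not match the template.
    static Map<String, String> extract(String template, String path) {
        List<String> names = new ArrayList<String>();
        StringBuilder regex = new StringBuilder();
        Matcher placeholder = PLACEHOLDER.matcher(template);
        int last = 0;
        while (placeholder.find()) {
            // literal text between placeholders is quoted so it matches verbatim
            regex.append(Pattern.quote(template.substring(last, placeholder.start())));
            // each placeholder matches exactly one path segment
            regex.append("([^/]+)");
            names.add(placeholder.group(1));
            last = placeholder.end();
        }
        regex.append(Pattern.quote(template.substring(last)));

        Matcher pathMatcher = Pattern.compile(regex.toString()).matcher(path);
        if (!pathMatcher.matches()) {
            return null;
        }
        Map<String, String> values = new LinkedHashMap<String, String>();
        for (int i = 0; i < names.size(); i++) {
            values.put(names.get(i), pathMatcher.group(i + 1));
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(extract("/tags/(*:tagName)/(*:page)", "/tags/java/2"));
        System.out.println(extract("/tags/(*:tagName)/(*:page)", "/users/1"));
    }
}
```

An Ant-style matcher only knows wildcards like * and **, so it has no way to name the matched segments; that is the gap the custom path matcher fills.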
