Apache Kudu (incubating) Weekly Update June 1, 2016

Posted 01 Jun 2016 by Jean-Daniel Cryans

Welcome to the eleventh edition of the Kudu Weekly Update. This weekly blog post covers ongoing development and news in the Apache Kudu (incubating) project.

If you find this post useful, please let us know by emailing the kudu-user mailing list or tweeting at @ApacheKudu. Similarly, if you’re aware of some Kudu news we missed, let us know so we can cover it in a future post.

Development discussions and code in progress

  • Jean-Daniel Cryans, the release manager for 0.9.0, indicated that the release is almost ready and the first release candidate will be put up for vote this week.

  • Dan Burkert pushed a change that disallows default partitioning when creating a new table. The change responds to many reports from users who saw poor performance because their table had been created with only one tablet. Kudu now requires users to choose a partitioning scheme explicitly when they create a table.

  • Todd Lipcon ran YCSB stress tests on a cluster and discovered that compactions were taking hours instead of seconds. He pushed a change that solves the issue as part of our general effort to improve performance for Zipfian update workloads.

  • Todd also changed some flush-related defaults to encourage parallel IO and larger flushes. The new defaults build on earlier work that he documented in a previous blog post.

  • Will Berkeley made a few improvements last week. One we’d like to call out is that he removed the Java kudu-mapreduce module’s dependency on Hadoop’s hadoop-common test jar. This solved build issues while also removing a nasty dependency.
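
To illustrate why the partitioning change above matters, here is a small conceptual sketch in Python. It is not Kudu code: the function names, the MD5-based hash, and the bucket counts are all assumptions chosen for illustration, and Kudu's real hash partitioning uses its own hash function and client API. The sketch only shows that a table with a single tablet puts every row in one place, while hashing the key into several buckets spreads rows (and therefore load) across tablets.

```python
import hashlib
from collections import Counter


def tablet_for_key(key: str, num_buckets: int) -> int:
    """Map a row key to a bucket index using a stable hash.

    Illustrative only: this is NOT the hash function Kudu uses.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def distribute(keys, num_buckets):
    """Return the number of rows that land in each bucket."""
    counts = Counter(tablet_for_key(k, num_buckets) for k in keys)
    return [counts.get(b, 0) for b in range(num_buckets)]


# Hypothetical workload: 10,000 distinct row keys.
keys = [f"user{i}" for i in range(10_000)]

single = distribute(keys, 1)   # one tablet: every row lands in it
spread = distribute(keys, 8)   # eight hash buckets: rows are spread out

print("1 tablet :", single)
print("8 tablets:", spread)
```

With one bucket, a single tablet (and so a single tablet server) receives all reads and writes. With several buckets, the same keys are divided among tablets that can be served in parallel, which is the behavior the new requirement nudges users toward.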