Blog Archives

Abusing Hadoop

By Robin Morris in A Word from the Engineers, Featured on July 11, 2013

Hadoop has, for good reasons, become the platform of choice for big data processing. It’s open source, it’s being developed rapidly, it runs on commodity hardware with no per-node licensing fees, and it has an active community and a steadily growing body of knowledge and experience built around it. All of which make adopting Hadoop an easy sell for both start-ups and enterprise users who wish to migrate away from ... Read More »

Machine Learning in Hive

By Robin Morris in A Word from the Engineers on March 20, 2013

As I’ve written before, we do a lot of our development here at Baynote in Hive, allowing us to leverage the power of our Hadoop cluster, whilst insulating us from writing low-level MapReduce jobs. One question that comes up on the Hive mailing list from time to time is how to implement machine learning algorithms within Hive. Twitter told ... Read More »

Writing a Hive Generic UDF

By Robin Morris in A Word from the Engineers, Featured on November 13, 2012

(How to convert an array<struct<target: bigint,quantity:int,price:float>> into an array<struct<target: bigint,quantity:int,price:float,externalid:string>>) If you’ve been reading my blog posts over the last few months, you will have noticed that they’ve been focused more on the issues on the engineering side of Baynote, and less on the technical details. This blog post is an exception. It’s going to get very technical very quickly. Sorry. The back-end component of our ... Read More »