Nathan Bijnens




On request
9042 Ghent
Hadoop and Big Data Consultant at DataCrunchers

I am Nathan Bijnens, a developer with a passion for great code, the web and Big Data. I am interested in programming and system administration, especially where they meet, from scaling platforms to designing the architecture of new and existing products and everything in between.

I am focused on data analysis and building Big Data applications, using Hadoop in combination with Hadoop Pig, Hive and Cascading. I follow the rise of real-time big data closely, actively developing applications on top of Storm and designing Lambda-like architectures. The infrastructure side interests me as well, and I am learning more about Business Intelligence and visualizing big data. I advise on Big Data strategies and evangelise Big Data to clients and at conferences.

I have a lot of experience with PHP, Java and related technologies like MySQL, NoSQL, memcached, nginx and more. I strongly believe in unit tests and design patterns as a way to write precise, easily maintainable code that works.

I am a passionate Linux system engineer and a follower of the DevOps movement, using Puppet and Ganglia to automate and monitor deployments.

I enjoy working with clients and partners, from giving advice and talking about the business and technological value of Big Data to requirement analysis.

I am inquisitive: I love learning about new things and improving what I know. I am very passionate about what I do, and I have strong analytical skills.


Since April 2012 I have worked for DataCrunchers as a Hadoop and Big Data Consultant. DataCrunchers is the leading Belgian consultancy firm around everything Big Data. I co-develop our internal Semantic Analysis Engine on top of Storm, in Java. I co-presented our Hadoop Ecosystem course, created a new website using Drupal, and am responsible for our Microsoft Big Data partnership.
I have been working for the following clients: Technicolor, Flemish Government, AWV (Macq), Octopin, HSHMRK & Ewals Intermodal.
  • Big Data & DevOps Engineer, Technicolor
    From October 2013.

    As a Big Data and DevOps Engineer, I am responsible for the scalability, quality and Operational Intelligence of a new platform. I introduced the Continuous Integration platform, using Jenkins, and added unit tests. The platform is built using Chef and aims to be cloud independent (it runs on Amazon AWS, Softlayer, ...).

    DevOps Big Data Java jUnit maven Jenkins Unit Testing Storm Spark HDFS Ruby Chef Ironfan Fog Scalability Amazon AWS Amazon IAM Amazon EC2 Softlayer Vagrant Linux
  • Project Manager ABBAMelda, Flemish Government (Macq)
    From June 2013 till October 2013.

    Project Manager and Lead Developer of ABBAMelda, a ticket and maintenance management system, originally developed by Siemens. ABBAMelda consists of a Java EE backend and an Informix database, with a PHP/jQuery frontend. Under my lead, we created an enhanced tablet intranet site, improved the bulk upload possibilities, added REST services, introduced unit testing and switched to git. I was responsible for coordinating with different teams within the Flemish Government, as well as with the contractor (Macq).

    Software Architecture Project Management Agile Scrum Java JPA3 EclipseLink maven jUnit Informix REST SOAP PHP jQuery Refactoring Unit Testing Puppet Linux Mobile Development
  • Octopin
    From February 2013 till June 2013.

    Defining and implementing the architecture for Octopin, a Pinterest social media analytics startup. I designed and implemented a Lambda Architecture (in Java) on top of Storm and Hadoop, using Redis, Voldemort, Cascading and Thrift.

    Java Spring DI maven Storm Hadoop Thrift Pail Cascading Kafka Zookeeper Voldemort Redis JSoup Guava jUnit Puppet Nagios Logstash Ganglia Linux Software Architecture Lambda Architecture
  • HSHMRK
    From December 2012 to January 2013.

    Defining and implementing the architecture for hshmrk, a data visualization startup. The application backend is written as a Jersey REST service (in Java), using ElasticSearch as storage. The frontend is an AngularJS and D3 web application. This approach allows us to scale easily.

    Java ElasticSearch AngularJS D3 Jersey maven Jackson Spring DI jUnit Data Visualization Software Architecture Puppet
  • Ewals Intermodal
    From October 2012, ongoing.

    Developing Oracle database views for integration of Greencat and Crystal Reports.

    Oracle Greencat Crystal Reports SQL
  • IHarvest

    I co-develop and am the current lead on the IHarvest project. It is a distributed HTTP Fetcher & Parser on top of Storm, written in Java. The results are stored on HDFS for more extensive querying using Hadoop.

    Storm Java HDFS Hadoop
  • Semantic Analysis Engine

    I co-develop our internal Semantic Analysis Engine on top of Storm and ElasticSearch. The web interface is built around a Java Jersey backend and an AngularJS frontend.

    Storm Java ElasticSearch AngularJS Jersey Bootstrap
  • Microsoft Big Data Partnership

    Responsible for the contact with Microsoft.

    Speaker at the Microsoft Inspirience Day about Big Data.
  • website

    Creating a Drupal based website, hosted on Windows Azure. I touched all aspects of creating this website, from design and implementation to copywriting.

    Windows Azure Bootstrap CSS HTML5 Copy Writing IIS Management Drupal
  • Business cards & various graphic design

    Designing the DataCrunchers business cards and a company flag.

    Graphic Design Inkscape QR
  • Internal test cluster

    Setting up an internal Hadoop, ElasticSearch & Storm test cluster using Puppet and Cloudera Manager. I created some Puppet modules to manage Storm, ElasticSearch and Ganglia.

    Cloudera Manager Storm ElasticSearch Puppet Hadoop Hadoop Pig HDFS Ganglia Zentyal

At Devoxx '13, the biggest European Java conference, I presented A real-time architecture using Hadoop and Storm.


At JAX London '13, a Java Enterprise and Big Data conference, I presented A real-time architecture using Hadoop and Storm.

At the SQLUser Group Belgium meetup of February 2013 I presented Big Data, Hadoop and HDInsight, together with Wesley Backelant of Microsoft Belgium.
At the 2013 edition of an open source conference, I gave a presentation about A real-time architecture using Hadoop & Storm, which has been viewed over 9500 times.
At the 2013 Big Data & Security Conference, I gave a presentation about A Vision on Data.

Since March 2007 I have been Consultant, Managing Director and founder of a BVBA, mainly active as an IT consultant in the field of Big Data & Hadoop, web applications (PHP, SQL, scalability, best practices, …), DevOps & Linux system administration and hosting applications.
From March 2011 till April 2012 I worked as Lead Application and Warehouse Developer. We built a credit management web application, written in PHP. I managed a small group of developers, took the lead on everything technical and coordinated with the directors, partners and clients.
My job involved lots of PHP, refactoring, performance tuning, system engineering and bits of data analysis.
  • Lead Developer iController

    I developed our credit management web application in PHP, I managed a small group of developers, took the lead on everything technical and coordinated with the directors, partners and clients.

    Project Management Requirement Analysis PHP symfony Propel PHPUnit FPDF Jenkins ant git MySQL SQL jQuery Prototype PDO XML XSchema
  • iController Server Setup

    I virtualized and automated the whole iController server setup using Puppet. I created and extended several open-source Puppet modules.

    Puppet libvirt KVM Ubuntu MySQL PHP Release Management Jenkins git Amazon S3 BIND ssh LVM
  • iSubscriber

    Creating a small web application to organize and input subscriptions into Octopus.

    MySQL PHP symfony
  • Various clients
    From July till November 2011.
    1. Analyzing the requirements, estimating and taking the lead.
    2. Extracting data from various Accounting software to iController.
    Accounting Software AS400 ETL XML XSchema Oracle SQL MSSQL Navision
  • LeasePlan
    From July till November 2011.
    1. Analyzing the requirements, coaching my co-developers and coordinating with the client.
    2. Extracting data from and linking SAP FI to iController.
    3. Developing a custom workflow engine.
From April till August 2011 I worked as a consultant for the Belgian yellow pages, doing a Hadoop deduplication job. I used Hadoop & Hadoop Pig, wrote custom UDFs in Java for Ngram matching and solved performance issues. At first Amazon Elastic MapReduce was used; later I set up a Hadoop cluster.
At the 2011 edition of the FOSDEM DataDevRoom, an open source conference, I gave a presentation about Hadoop Pig: MapReduce the easy way. It was featured on the front page and has been viewed over 13000 times.
From September 2010 till February 2011 I was employed at Netlog as a Warehouse and Web Developer, creating an analytical warehouse based on Hbase. I also created large parts of the processing infrastructure using Hadoop and Hadoop Pig, and advised on new technologies, git migration, best practices & unit tests. Netlog is a social community network with over 70 million members, mainly active in Europe and the Arabic world.
  • Warehouse Developer

    Setting up the Hadoop infrastructure, analyzing data with Hadoop Pig and creating a dashboard using Symfony2 and Hbase & Thrift.

    Hadoop HDFS Hadoop Pig Syslog Thrift Hive Hbase git Symfony2 Google Graphs PDO Memcached Redis MySQL SQL
  • Twoo Core Framework Developer

    Creating a new PHP Framework (no existing frameworks were allowed) as the base for the new dating site Twoo. I advised on Design Patterns and Best Practices. I developed the Security and ACL platform as well.

    PHP HTTP Request routing Security ACL Design Patterns Best Practices Unit Testing PHPUnit git
  • Web Developer (Netlog)

    I introduced an open source Event framework and presented it to my co-developers. I introduced new application log functionality, for logging to Hadoop. I evangelized Unit tests, including an initial implementation and presentations. I evaluated Redis, Memcached & Membase (now Couchbase).

    PHP PHPUnit Unit Testing Syslog Event Processing MySQL Design Patterns Best Practices git svn git-svn Memcached Redis Performance testing Membase
From October 2009 to August 2010 I worked at a small SME in Ghent as web application developer. I developed a new, greatly improved debtor & credit management web application. We used the symfony framework as a starting point to create a stable and easy to maintain application.
In 2009 I was active as a consultant to a Dutch services organization. The assignment mainly consisted of creating a link between existing systems and a new website.

Since 2008 I have performed various commissioned short term consultancy tasks. For a semi-government organisation I set up a new Tomcat server and did a general check-up on their Linux servers. For another company I configured an ASP.Net application and installed an IIS server. For Sio Hosting (formerly Sinergio Hosting) I planned and executed the move from traditional servers to a virtualized Xen cluster. I also check the integrity and security of their hosting platform on a regular basis and advise them on strategic planning.
From 2004 to 2010 I advised on and co-developed several websites, backend systems and web applications.
During my secondary education I created a system with features such as native PHP templates and module in module support (recursion), following a very basic MVC pattern. It was used and expanded by two web agencies to create over 30 websites.


I am very interested in Big Data, from the processing and storage of large volumes to real-time stream processing and machine learning. I read, tweet, and try out as much as I can about new Big Data technologies like Spark and Storm, as well as more established technologies like Hadoop and MapReduce, learning as much as I can in the process. I regularly program in Cascading and Hadoop Pig Latin and write queries in Hive. The setup, administration and monitoring of Hadoop, Hbase, Storm & Zookeeper clusters also interests me a lot.
Hadoop & Big Data related skills:
  • Using & administering Hadoop clusters, including Storm, ElasticSearch, Cassandra, Hbase and Zookeeper.
    • Deploying Hadoop on Azure, Amazon & Softlayer.
    • Using Amazon Elastic MapReduce.
    • Using Whirr or Chef (fog) to deploy Hadoop to Amazon EC2.
  • Querying data with Cascading, Hadoop Pig & Hive.
  • Writing Pig UDFs in Java.
  • Developing solutions for real-time Big Data using Storm and Kafka.
  • Combining batch and real-time technologies to create a Lambda architecture (as described by Nathan Marz) that is resilient to failure, scalable and fast.
Next to programming I have always been passionate about Linux and open source. Over the last 10 years I have used several distributions, from Debian & Ubuntu and CentOS to Gentoo and Linux From Scratch. I have set up countless servers, from virtualized (Xen, VirtualBox, libvirt, Vagrant) to almost bare metal web servers. I administered Apache and Nginx servers, configured and tuned MySQL and played with Postfix extensively. I also configured and monitored Cassandra, Redis, Voldemort, ElasticSearch and Hbase clusters.

I am following the DevOps movement. I use both Puppet and Chef to automate, and Ganglia to monitor, critical infrastructure. I have open sourced and contributed to several Puppet modules.

In this context I also took my first steps with Ruby.

I also develop Continuous Integration strategies, with related tools like Jenkins.

I follow and try out with great interest Cloud related techniques and technologies, in all their forms: IaaS, PaaS, MaaS, … I have used, as a test or in production, Amazon S3, Amazon EC2, Amazon MapReduce, Amazon IAM, Google BigQuery2 (private beta tester), Windows Azure Platform, Hadoop on Azure (private beta tester) and Softlayer.

I am using Java, mostly with Spring, maven and Jersey, in combination with the JavaScript MVC framework AngularJS.
  • Developing a Java EE application, with Glassfish, using JPA with EclipseLink.
  • Creating Threaded servers, using Thrift.
  • Autowiring & Dependency Injection using Spring.
  • Consuming the Twitter & LinkedIn APIs, using OAuth.
  • Creating REST and SOAP based services, using Jersey or JAX-WS.
  • Using the common libraries like Guava, Apache Commons, Joda Time & slf4j.
  • Unit testing using JUnit and Mockito.
I am interested in Functional Programming, mostly looking at Clojure. Functional programming in relation to Big Data especially has my focus.
I have used PHP for over 10 years, from namespaces, closures and Iterators to PDO, as well as some more exotic multibyte & UTF8 functions, xml & xslt and pecl extensions. For the last five years I have been using PHP in combination with the symfony & Symfony2 frameworks. I use PHPUnit extensively, integrated with Jenkins, to ensure that what I write works and keeps on working.
Some PHP related skills:
  • MySQL, but also various other databases, like Oracle, MSSQL, Sybase, ODBC using plain PDO, the Doctrine & Doctrine2 ORM and Propel.
  • Memcached & Redis.
  • html5, CSS & jQuery
  • Creating a Drupal site, using Bootstrap and deploying on Windows Azure.
  • git, Mercurial & svn.
I am taking my first steps into the Python world. I used Python to create a Ganglia proxy and aggregator, consumed REST and SOAP APIs, wrote threaded applications and deployed to the Google App Engine.
Some basic knowledge, but willing to improve:
  • Graph databases: Neo4J, HyperGraphDB & Titan.
  • Various NoSQL data stores, like MongoDB, Voldemort, Riak, ...
  • Ruby, Clojure and Scala.
  • Scrum, Extreme Programming
  • Security & penetration testing.


Mother tongue
Very fluent

Other skills

In possession of a driver's license.


IBM Big Data Fundamentals Technical Mastery (N32)

IBM | 000-28363001 | January 2013

IBM InfoSphere BigInsights (Hadoop & Big Data) Technical Professional

IBM | 000-28146057 | October 2012
University of Antwerp
Social-Economic Sciences
2005 - 2007
Not Achieved
Hibernia Steinerschool in Antwerp
ASO, Higher Secondary education
1999 - 2005