Nathan Bijnens

Introduction

Name

Nathan Bijnens

Address
On request
9042 Ghent
Belgium
E-mail
Website
GitHub
Twitter
LinkedIn
Skype
Employment
Hadoop and Big Data Consultant at DataCrunchers

I am Nathan Bijnens, a developer with a passion for great code, the web and Big Data. I am interested in programming and system administration, especially where they meet, from scaling platforms to designing the architecture of new and existing products and everything in between.

I am focused on data analysis and building Big Data applications using Hadoop, in combination with Hadoop Pig, Hive and Cascading. I follow the rise of real-time big data closely, actively developing applications on top of Storm and designing Lambda-like architectures. The infrastructure side interests me as well, and I am learning more about Business Intelligence and visualizing big data. I advise on Big Data strategies and evangelise Big Data to clients and at conferences.

I have a lot of experience with PHP, Java and related technologies such as MySQL, NoSQL, memcached, nginx and many more. I strongly believe in unit tests and design patterns as a way to write precise, easily maintainable code that works.

I am a passionate Linux system engineer and a follower of the DevOps movement, using Puppet and Ganglia to automate and monitor deployments.

I am inquisitive: I love learning about new things and improving what I know. I am very passionate about what I do, and I have strong analytical skills.

Experience

Since April 2012 I have been working for DataCrunchers as a Hadoop and Big Data Consultant. DataCrunchers is the leading Belgian consultancy firm around everything Big Data. I co-develop our internal Semantic Analysis Engine on top of Storm, I co-presented our Hadoop Ecosystem course, I created a new website using Drupal and I am responsible for our Microsoft Big Data partnership.
I have been working for the following clients: Octopin, HSHMRK & Ewals Intermodal.
  • Octopin
    From February 2013 till May 2013.

    Defining and implementing the architecture for Octopin, a Pinterest social media analytics startup. I designed and implemented a Lambda Architecture on top of Storm and Hadoop, using Redis, Voldemort, Cascading and Thrift.

    Java Spring DI maven Storm Hadoop Thrift Pail Cascading Kafka Zookeeper Voldemort Redis JSoup Guava jUnit Puppet Nagios Logstash Ganglia Linux Software Architecture Lambda Architecture
  • HSHMRK
    From December 2012 to January 2013.

    Defining and implementing the architecture for HSHMRK, a data visualization startup. The application backend is written as a Jersey REST service (Java), using ElasticSearch as storage. The frontend is an AngularJS and D3 web application. This approach allows us to scale easily.

    Java ElasticSearch AngularJS D3 Jersey maven Jackson Spring DI jUnit Data Visualization Software Architecture Puppet
  • Ewals Intermodal
    From October 2012, ongoing.

    Developing Oracle database views for integration of Greencat and Crystal Reports.

    Oracle Greencat Crystal Reports SQL
  • IHarvest

    I co-develop and am the current lead on the IHarvest project. It is a distributed HTTP Fetcher & Parser on top of Storm; the results are stored on HDFS for more extensive querying using Hadoop.

    Storm Java HDFS Hadoop
  • Semantic Analysis Engine

    I co-develop our internal Semantic Analysis Engine on top of Storm and ElasticSearch. The frontend is being written in Jersey & AngularJS.

    Storm Java ElasticSearch AngularJS Jersey Bootstrap
  • Microsoft Big Data Partnership

    Responsible for the contact with Microsoft.

    Speaker at the Microsoft Inspirience Day about Big Data.
  • Website

    Creating a Drupal-based website, hosted on Windows Azure. I touched all aspects of creating this website, from design and implementation to copywriting.

    Windows Azure Bootstrap CSS HTML5 Copy Writing IIS Management Drupal
  • Business cards & various graphic design

    Designing the DataCrunchers business cards and a company flag.

    Graphic Design Inkscape QR
  • Internal test cluster

    Setting up an internal Hadoop, ElasticSearch & Storm test cluster using Puppet and Cloudera Manager. I created some Puppet modules to manage Storm, ElasticSearch and Ganglia.

    Cloudera Manager Storm ElasticSearch Puppet Hadoop Hadoop Pig HDFS Ganglia Zentyal
At the SQLUser Group Belgium meetup of February 2013 I presented "Big Data, Hadoop and HDInsight", together with Wesley Backelant of Microsoft Belgium.
At the 2013 edition of the FOSDEM NoSqlRoom, an open source conference, I gave a presentation about "A real-time architecture using Hadoop & Storm".
At the 2013 Big Data & Security Conference of LSEC, I gave a presentation about "A Vision on Data".
Since March 2007 I have been Consultant, Managing Director and founder of Servs BVBA, mainly active as an IT consultant in the field of Big Data & Hadoop, web applications (PHP, SQL, scalability, best practices, …), DevOps & Linux system administration and hosting applications.
From March 2011 till April 2012 I worked for iController as Lead Application and Warehouse Developer. We built a credit management web application, written in PHP. I managed a small group of developers, took the lead on everything technical and coordinated with the directors, partners and clients.
My job involved lots of PHP, refactoring, performance tuning, system engineering and bits of data analysis.
  • Lead Developer iController

    I developed our credit management web application in PHP, managed a small group of developers, took the lead on everything technical and coordinated with the directors, partners and clients.

    Requirement Analysis PHP Symfony Propel PHPUnit FPDF Jenkins ant git MySQL SQL jQuery Prototype PDO XML XSchema
  • iController Server Setup

    I virtualized and automated the whole iController server setup using Puppet. I created and extended several open-source Puppet modules.

    Puppet libvirt KVM Ubuntu MySQL PHP Release Management Jenkins git Amazon S3 BIND ssh LVM
  • iSubscriber

    Creating a small web application to organize and input subscriptions into Octopus.

    MySQL PHP Symfony
  • Various clients
    From July till November 2011.
    1. Analyzing the requirements, estimating and taking the lead.
    2. Extracting data from various accounting software packages into iController.
    Accounting Software AS400 ETL XML XSchema Oracle SQL MSSQL Navision
  • LeasePlan
    From July till November 2011.
    1. Analyzing the requirements, coaching my co-developers and coordinating with the client.
    2. Extracting data from and linking SAP FI to iController.
    3. Developing a custom workflow engine.
    SAP FI AS400 ETL XML SQL
From April till August 2011 I worked as a consultant for Truvo, the Belgian yellow pages, doing a Hadoop deduplication job. I used Hadoop & Hadoop Pig, wrote custom UDFs in Java for n-gram matching and solved performance issues. At first Amazon Elastic MapReduce was used; later I set up a Hadoop cluster.
At the 2011 edition of the FOSDEM DataDevRoom, an open source conference, I gave a presentation about "Hadoop Pig: MapReduce the easy way", which was featured on the front page of slideshare.net.
From September 2010 till February 2011 I was employed at Netlog as a Warehouse and Web Developer, creating an analytical warehouse based on HBase. I also created large parts of the processing infrastructure using Hadoop and Hadoop Pig, and advised on new technologies, Git migration, best practices & unit tests. Netlog is a social community network with over 70 million members, mainly active in Europe and the Arab world.
  • Warehouse Developer

    Setting up the Hadoop infrastructure, analyzing data with Hadoop Pig and creating a dashboard using Symfony2, HBase & Thrift.

    Hadoop HDFS Hadoop Pig Syslog Thrift Hive Hbase git Symfony2 Google Graphs PDO Memcached Redis MySQL SQL
  • Twoo Core Framework Developer

    Creating a new PHP framework (no existing frameworks were allowed) as the base for the new dating site Twoo. I advised on design patterns and best practices. I developed the security and ACL platform as well.

    PHP HTTP Request routing Security ACL Design Patterns Best Practices Unit Tests PHPUnit git
  • Web Developer (Netlog)

    I introduced an open source event framework and presented it to my co-developers. I introduced new application log functionality, for logging to Hadoop. I evangelized unit tests, including an initial implementation and presentations. I evaluated Redis, Memcached & Membase (now Couchbase).

    PHP PHPUnit Unit tests Syslog Event framework MySQL Design Patterns Best Practices git svn git-svn Memcached Redis Performance testing Membase
From October 2009 to August 2010 I worked at iController, a small SME in Ghent, as a web application developer. I developed a new, greatly improved debtor & credit management web application. We used the Symfony PHP framework as a starting point to create a stable and easy to maintain application.
In 2009 I was active as a consultant to a Dutch services organization. The assignment mainly consisted of creating a link between existing systems and a new website.
Since 2008 I have performed various short-term consultancy tasks commissioned by Sinergio. For a semi-governmental organisation I set up a new Tomcat server and did a general check-up on their Linux servers. For another company I configured an ASP.NET application and installed an IIS server. For Sio Hosting (formerly Sinergio Hosting) I planned and executed the move from traditional servers to a virtualized Xen cluster. I also check the integrity and security of their hosting platform on a regular basis and advise them on strategic planning.
From 2004 to 2010 I advised on and co-developed several websites, backend systems and web applications for 4levels.
During my secondary education I created a PHP CMS, with features such as native PHP templates and module-in-module support (recursion), following a very basic MVC pattern. It was used and expanded by two web agencies to create over 30 websites.

Skills

I am very interested in the processing and storage of Big Data. I read and try out as much as I can about Hadoop and MapReduce, and I'm learning Java in the process. I regularly program in Cascading and Hadoop Pig Latin and write queries in Hive. The setup and administration of Hadoop, HBase, Storm & ZooKeeper clusters also interests me a lot.
Hadoop & Big Data related skills:
  • Using & administering a Hadoop cluster, including HBase and ZooKeeper.
    • Deploying Hadoop on Azure.
    • Using Amazon Elastic MapReduce.
    • Using Whirr to deploy Hadoop to Amazon EC2.
  • Querying data with Cascading, Hadoop Pig & Hive.
  • Writing Pig UDFs in Java.
  • Developing solutions for real-time big data using Storm.
Next to programming I have always been passionate about Linux and open source. Over the last 10 years I have used several distributions, ranging from Debian, Ubuntu and CentOS to Gentoo and Linux From Scratch. I have set up countless servers, from virtualized (Xen, VirtualBox, libvirt) to almost bare metal web servers. I administered Apache and Nginx servers, configured and tuned MySQL and played with Postfix extensively.

I am following the DevOps movement. I am using Puppet to automate and Ganglia to monitor critical infrastructure. I have open sourced and contributed to several Puppet modules.

I follow and try out cloud-related techniques and technologies with great interest, in all their forms: IaaS, PaaS, MaaS, … I have used, for testing or in production, Amazon S3, Amazon EC2, Amazon MapReduce, Google BigQuery2 (private beta tester), Windows Azure Platform, Hadoop on Azure (private beta tester) and OpenStack.

I am learning Java, mostly using Spring, Maven and Jersey in combination with the JavaScript MVC framework AngularJS.
  • Creating threaded servers, using Thrift.
  • Autowiring & Dependency Injection using Spring.
  • Consuming the Twitter & LinkedIn API, using OAuth.
  • Creating a JSON-based REST API using Jersey.
  • Using common libraries like Guava, Apache Commons & slf4j.
  • Unit testing using JUnit.
I have used PHP for over 9 years, from the new namespaces and closures through Iterators to PDO, as well as some more exotic multibyte & UTF-8 functions, XML & XSLT and PECL extensions. For the last three years I have been using PHP in combination with the symfony & Symfony2 frameworks. I use PHPUnit extensively, integrated with Jenkins, to ensure that what I write works and keeps on working.
Some PHP related skills:
  • MySQL, but also various other databases, like Oracle, MSSQL, Sybase, ODBC using plain PDO, the Doctrine & Doctrine2 ORM and Propel.
  • PHP to Hbase communication using Thrift.
  • Memcached & Redis.
  • HTML5, CSS & jQuery.
  • Creating a Drupal site, using Bootstrap and deploying on Windows Azure.
  • Git, Mercurial & SVN.
I am taking my first steps in the Python world. I used Python to create a Ganglia proxy and aggregator, consumed REST and SOAP APIs, wrote threaded applications and deployed to Google App Engine.
Some basic knowledge, but willing to improve:
  • Graph databases (Neo4J, HyperGraphDB).
  • NoSQL data stores, like CouchDB, MongoDB, Cassandra, Riak, …
  • Scrum, Extreme Programming
  • Security & penetration testing.

Languages

Dutch
Mother tongue
English
Very fluent

Other skills

In possession of a driver's license.

Education

IBM Big Data Fundamentals Technical Mastery (N32)

IBM | 000-28363001 | January 2013

IBM InfoSphere BigInsights (Hadoop & Big Data) Technical Professional

IBM | 000-28146057 | October 2012
Period
2005 - 2007
Institute
University of Antwerp
Program
Social-Economic Sciences
Period
1999 - 2005
Institute
Hibernia Steinerschool in Antwerp
Program
ASO, Higher Secondary education
Achieved in
2005