Hadoop is not an island. Delivering a complete Big Data solution means building a data pipeline that incorporates and orchestrates many diverse technologies. A Hadoop-focused data pipeline must not only coordinate the running of multiple Hadoop jobs (MapReduce, Hive, Pig or Cascading), but also encompass real-time data acquisition and the analysis of reduced data sets extracted into relational/NoSQL databases or dedicated analytical engines.
This session looks at the architecture of Big Data pipelines, the challenges ahead, and how to build manageable and robust solutions using open-source software such as Apache Hadoop, Hive, Pig, Spring Hadoop, Spring Batch and Spring Integration.
Costin Leau is an engineer at Elasticsearch, currently working with NoSQL and Big Data technologies. An open-source veteran, Costin led various Spring projects (Spring OSGi, GemFire, Redis, Hadoop) and authored an OSGi spec.
He has spoken at various editions of EclipseCon/OSGi DevCon, JavaOne, Devoxx/Javapolis, JavaZone, SpringOne and TSSJS on Java, Hadoop and Spring related topics.