
Pentaho has announced version 5.1 of its business analytics and data integration platform, which enables code-free analytics directly on MongoDB, simplifies the data preparation process for data scientists, and offers full support for YARN.

Analytics on NoSQL database MongoDB without manual coding

Pentaho version 5.1 enables MongoDB data collections to be analyzed directly "at the source," without hand-coding or the requirement to prepare data in a staging area. This gives companies a greater ability to reach reliable insights faster, and decreases the need for specialist skills.

Operationalizing R and Weka for data scientists

Pentaho 5.1 also enables large-scale analytics with the availability of its new Data Science Pack, announced earlier this month. The Pack equips data analysts and scientists with a toolkit to build a "360 degree customer view" that blends different data sources, like social and MongoDB, and enables advanced analytics like churn prediction and customer sentiment. By operationalizing two commonly used technologies, R and Weka, Pentaho offloads the burden of the data flow process. The Pack includes an R Script Executor for Pentaho Data Integration (PDI), which removes the burden of data preparation; Weka Scoring for PDI, which allows the user to "score" data as part of a PDI transformation by applying classification, clustering, and regression models constructed in Weka; and Weka Forecasting for PDI, which uses forecasting models created in Weka's time series analysis and forecasting environment to create future predictions on incoming data within a PDI transformation.

Full YARN support

And finally, YARN (MapReduce 2.0) integration, announced earlier this year by Pentaho Labs, is now available in Pentaho 5.1. Pentaho developers familiar with Pentaho Data Integration can exploit the computational power of Hadoop without having to write complex MapReduce code. With YARN, Pentaho Data Integration jobs can make flexible use of Hadoop resources, expanding and contracting as data volumes and processing requirements change.
