Pentaho Data Integration For The First Time Part I
Before, running PDI, make sure your PC environtment is like the below table :
|An operating system:
• Linux (Red Hat Enterprise Linux 5, SUSE Linux Enterprise 10)
• Windows (XP, 7)
• Solaris 10
• Apple OS X (10.5 or newer)
A Java Runtime Environment
One or more data sources:
• Any JDBC-compliant database
• A spreadsheet
• A flat file containing comma-separated values
I will use Windows 7 as operating system, and mySQL as database.
FYI, Pentaho Data Integration consists of these packet :
• Spoon (graphical interface)
Spoon is a graphical user interface that allows you to design transformations and jobs that can be run with the Kettle tools — Pan and Kitchen
• Kitchen (CLI job interface)
Kitchen is a program that executes jobs designed by Spoon in XML or in a database repository.
• Pan (CLI transformation interface)
Pan is a data transformation engine that performs a multitude of functions such as reading, manipulating, and writing data to and from various data sources.
• Carte (CLI execution engine for PDI content)Carte is a simple web server that allows you to execute transformations and jobs remotely. It does so by accepting XML (using a small servlet) that contains the transformation to execute and the execution configuration. It also allows you to remotely monitor, start and stop the transformations and jobs that run on the Carte server.
A server that is running Carte is called a Slave Server in the Pentaho Data Integration terminology.
Acording to me, Spoon is the main packet of PDI. So, we will try to run spoon, create a simple transformation to make sure that PDI installation is OK. To run spoon (in windows environtment) we can execute spoon.bat. This file is in the folder where we extract the PDI files before. If everything is ok, we will see the spoon splash screen.
and the below picture is welcome screen of PDI.
1. Create Transformation
Transformation is set of steps to do something. To create a transformation, follow these steps :
a. Click CTRL+N or File > New > Transformation or Click Transformation Button in main toolbar.
b. Save , name it : MyTransformation
2. Create Database Connection
In this article, we use mySQL as database. Before creating database connection, we must provide a database. We can use phpmyadmin to do this job. Create a database, name it : kettle (you can use other name)
To create a new connection, Click View Table, right click the Database Connections in the tree and select New or New Connection Wizard. You can also double click Database Connections, or press F3. Then, there will be new window (Database concetion). Give it "DB_MySQL" as its name. Choose MySQL as Connection type. Then, fulfill settings data located at righside of Connection type. Don't forget to choose "Native(JDBC)" Access type. For details see the following picture
In this tutorial, we will import data from txt file to database. To do this, we must prepare data in a txt file. Open a notepad, write on notepad the below data :
Name;Age John;32 Hendry;40 Dunant;50 Kevin;23 Duran;19
Save it as people.txt. After creaing txt file, let's back to Spoon. Click Design tab. Then look at steps tree. Look for Input , and then drag text file input to myTransformation tab.
Labels: Data Integration