Download as pdf or txt
Download as pdf or txt
You are on page 1of 57

Tuesday, July 7, 2020 5:40 PM

Data Virtualization Basics


Folders, data sources, combination and execution
• INDEX

• NEXT TUTORIAL >


Welcome to the Denodo Platform Basic Tutorials! All new technologies require some kind of training
before you can use them capably on your own. Here you will find a complete set of lessons that
briefly describe the main topics you need to know to start your Data Virtualization project. To get
started, please download the sample data sources included in the Tutorial Files, as we will be using
them throughout the tutorials to demonstrate the features of the Denodo Data Virtualization software.

Problem to solve
First you have to stop and understand the problem before starting your project. Let's take the
following scenario:
• You are working in a company that has information about its customers (CRM) stored in a
MySQL database.
• The complete billing information for any customer is exposed in an internal Web Service.
• End users do not want to use different applications to get available information of customers
(CRM, Sales Application, etc.).
• The IT department does not like the idea of creating a specific application for this business
need and it would like to reuse this customer global view, if possible, in any other current
application.
We said in the previous paragraph: "End users do not want to use different applications to get all the
information of customers", so the problem here is to connect disparate data to create a single view of
the customers, and the IT deparment wants that single view to be reusable.
If you follow a bad design (not reusing components, creating ad-hoc code/applications, etc.) your
systems will grow as in the image below. This is a typical example of exactly what you do not want:

New Section 15 Page 1


Why data virtualization?
Data virtualization creates a single virtual layer that connects disparate data and provides unified
access for consuming applications. These applications will use the semantic components defined
in the virtual layer and reuse them as needed. In this way, your applications will
be independent from the physical sources where the data is stored.

At the end of this Tutorial you will learn that the Denodo Platform provides:
• Easy to generate data services.
• Data Services independent of the physical source(s).
• A single point to control your data sources.
• Short and Agile development cycles.
• Little to no coding!
• Intuitive solutions to simple needs.
• Reusabilty of your models by all clients
Before starting this tutorial, make sure you have all the necessary materials by checking
the Installation & Bootstrapping section.
If you have already installed all the components, please go ahead to the First Steps section.

Installation & Bootstrapping - Pre-


New Section 15 Page 2
Installation & Bootstrapping - Pre-
installation requirements
This section, Installation & Bootstrapping, shows the prerequisites to start using Denodo and
instructions for configuring your Denodo environment and starting to play with Data Virtualization.

Product Version Notes


Denodo 7.0 Go to https://community.denodo.com/express and download the installer
Platform and license
MySQL 5.0 or You will need MySQL (or any other relational database) to store some of the
higher data sources used in the tutorial. You can download it from
here: http://dev.mysql.com/downloads/mysql/
MySQL 5.0 or This is the JDBC driver that will enable Denodo to connect to
Connector/J higher MySQL: http://dev.mysql.com/downloads/connector/j/
MySQL 5.0 or These are useful tools you might want to use with MySQL but are not strictly
Workbench higher necessary for the
tutorial: http://dev.mysql.com/downloads/tools/workbench/
Denodo - Download and extract the contents of the compressed Tutorial files into a
Tutorial Files convenient local directory, which will be referred to as <tutorial_directory>
throughout this guide.

Installation & Bootstrapping - Installation


steps
• < PREV

• NEXT >
Before you start, be sure you have your development environment set up. You need to:
1. Install Denodo into a directory (avoid using the %Program Files% folder). This directory will be
referred to as <DENODO_HOME> throughout this tutorial.
2. Copy the mysql-connector-java-.jar to <DENODO_HOME>/lib-external/jdbc-drivers/mysql-5
3. Install & configure the database:
1. Install MySQL server.
2. Start MySQL and launch the MySQL Workbench application.
3. Connect to your MySQL server and then open
the <tutorial_directory>/MySQL/schema.sql script by choosing "Open SQL Script" from the
File menu.
4. Once the script has been opened, click on Execute (you can use any other method to
load the database).
5. After doing this, you should see a new database schema called "acme_crm" with three
tables defined (address, client and client_type). Test MySQL by logging in to the
acme_crm database with the credentials: acme_user / acme_user
4. Install a web server:
We are going to use Jetty to run some of the examples in the tutorials:
1. Go to <tutorial_directory>/jetty and run: java -jar start.jar from the command line. If you do
not have a Java Virtual Machine installed on your system you can use the JVM installed
with the Denodo Platform under <DENODO_HOME>/jre/bin.
2. Test the billing Web Service to see if it has been properly deployed, direct the web

New Section 15 Page 3


2. Test the billing Web Service to see if it has been properly deployed, direct the web
browser to: http://localhost:8080/billing/services. If successfully deployed, you will see
there a list of available services "BillProvider", "AdminService", and "Version", along with
their exposed methods and an option to view the WSDL descriptor.
This tutorial will incrementally build a set of views in Denodo to learn the basic concepts about Data
Virtualization, it is important to follow each step.

Denodo Installation

At this point, you should have already downloaded the Denodo Express installation package from
your user account.

The installation package is not the only file to download, make sure
you have your personal license too.
TIP
The installation package is a .zip file. After decompressing the package you will see the files shown
in the image below:

To start the installation GUI:


• Windows OS: right-click the install.exe file and select Run as administrator.
• Linux OS: execute install.sh.
The installer will show up.

New Section 15 Page 4


After accepting the terms of the license, you have to select the installation directory (for
example: C:/Denodo/7.0 or /opt/denodo/7.0).
If you already have a Denodo license file, you can select it clicking on the "Browse" button.
Otherwise, you can install the license later from the Denodo Control Center

In the next step you have to select the modules to be installed. This tutorial covers every module so
we suggest installing them all, but you only need Virtual DataPort to get started.

New Section 15 Page 5


You can leave the rest of the options with their default values and complete the installation.
Once the installation is complete, you can choose to create a desktop shortcut that can be used to
start the Denodo Control Center.
If you did not select any Denodo license during the installation process, you can do it from
the Denodo Control Center when you first start the program. (By first clicking on
the Configure button and then selecting Use a license file, a new dialog will be opened to select
the license file and confirm its installation.

The installer includes a Java Runtime Environment (JRE), so a


previously installed Java is not necessary.
New Section 15 Page 6
previously installed Java is not necessary.
TIP
That's all!
Thanks for installing Denodo Express! Now it's time to start playing with Denodo.

First steps - Index


In this first tutorial we will cover the most basic functionality to use the Denodo Platform.
The list of topics is the following:
• Start Denodo server and applications.
• Discover the Denodo Administration tool.
• Create a development database.
• Create folders for organizing your database.
• Import relational databases.
• Execute queries.
• Create a simple combination view.

We recommend that you follow the guide for Installing Denodo before
starting this tutorial.

First steps - Launching Denodo


• < PREV

• NEXT >
Denodo is a global solution for heterogeneous and dispersed data source integration using a virtual
approach. It can connect to a wide range of data sources like relational databases, web services,
XML documents, flat files, multidimensional databases, JSON sources, etc.
Denodo will create wrappers on top of those data sources to create a common interface to access
them. Then, a user can combine the data coming from different data sources by defining views,
using the Administrator Tool GUI.
The diagram below shows the general architecture of the Denodo Platform:

New Section 15 Page 7


Control Center

After the Denodo installation, a desktop icon is generated for Denodo 7.0.

First, we have to double-click on that icon to launch the Denodo Platform Control Center.

Administrator privileges are required for installing/starting the servers


in Windows environments (Right-click > Run as administrator).

In Linux environments the user who performs the installation/starting


of the servers (could be different) must have write/execute privileges
or else the error "Error accessing the metadata while loading/storing
objects" may appear.
NOTE

New Section 15 Page 8


1. Click on Virtual DataPort in the top-left.
2. Click on the blue play button next to Virtual DataPort Server (the first one from the top) to start
the Virtual Dataport Server.
3. When the text before that play button says "Running", launch the Virtual DataPort
Administration Tool by clicking the "LAUNCH" button.

First steps - Administration Tool


• < PREV

• NEXT >
The Denodo Administration Tool allows the development and administration of your Data
Virtualization projects. Specifically, you can perform the following tasks:
• Create/Edit/Drop Denodo Virtual Databases.
• Create/Edit/Drop Data Sources.
• Create/Edit/Drop Views.
• Publish Data Services.
• Execute Queries.
• Add Extensions.
• Configure the Cache System.
• Import/Export Metadata.
• Configure the Denodo Server.
At the end of the previous section you launched the Administration Tool. The first screen that the
application shows is a login dialog; the credentials you type here will be used to connect to a running
Denodo server.

New Section 15 Page 9


The installation process of the Denodo Platform creates one user by default: Login: admin /
Password: admin. Input these credentials now to connect to the Denodo Server.

The Denodo Administration Tool allows the user to connect to local or


remote servers.
NOTE
Check the URL format: //localhost:9999/admin
• Server (localhost): server name or IP address where Denodo is installed.
• Port (9999): Denodo port number (this is the default value).
• Database (admin): database name (by default, "admin" database was created at installation
time).
Once you fill the login form, click on the Connect button. The Administration Tool will show an
empty admin and itpilot databases.

New Section 15 Page 10


Now you can see the four main areas of the Denodo Administration Tool:
• Menu Bar: where you can find the options for all the sections of the administration tool.
• Elements Tree: this panel on the left will show a tree with the different components (data
sources, views, etc.) created in the database.
• Quick Search: this box allows the user to filter specific elements in the Elements Tree.
• Workspace: this is the main panel where the selected element from the Elements Tree and
opened views will be shown.

First steps - Creating a Virtual Database


• < PREV

• NEXT >
We saw in the previous section that we need valid credentials to connect to a Denodo database. In
our case, we used the default admin user to connect to the admin database.
Now, we are going to learn how to create another database.

To complete the tutorial this step is not neccessary. If you are going
to use the default admin database you can continue to the next
section.
NOTE
The Denodo server can contain different virtual databases. A virtual database is a schema
comprised of data sources, views, stored procedures, web services, etc. Each virtual database is
independent of the rest of the virtual databases created in the Denodo server (and different users
can have different privileges for each virtual database).
We are going to create a new database called tutorial.

New Section 15 Page 11


We are going to create a new database called tutorial.
1. First, click on Administration in the Menu Bar and select the Database Management option.

2. In the workspace, you will see the predefined Denodo databases (these databases cannot be
dropped):
○ admin: default database for Denodo Virtual DataPort.
○ itpilot: default database for Denodo ITPilot.

3. You are going to start a new project, so you will need to create a new database (this is a best
practice):
1. Click on the New button.
2. Specify the name of the database: tutorial.
3. Click Ok.
That's all!
Now, you can follow this Tutorial using this new database, so disconnect your current session (File >
Disconnect) and log in again into the tutorial database:

• Login: admin
• Password: admin
• URI: //localhost:9999/tutorial
When changing the user, it is not necessary to disconnect and reconnect. Another option is to
navigate through the Elements Tree and simply selecting the tutorial database.

New Section 15 Page 12


First steps - Organizing the Database
• < PREV

• NEXT >
Now it is time to create elements in our virtual database. But wait, first we have to stop and think
about what are good practices when creating elements. One of the best practices is to have good
organization of elements inside of our database.
For that purpose, the Denodo Administration Tool offers the option to organize elements
inside folders in the Elements Tree, making it easier to work with them.

Creating a folder

To create a new folder follow these steps:


1. Right-click on the database name and select New > Folder or simply press Ctrl+Alt+F at any
time.

2. Name the folder 1 - first steps.


3. Later you can simply drag & drop elements into this folder.

Folders can be nested and also rearranged by dragging and dropping


New Section 15 Page 13
Folders can be nested and also rearranged by dragging and dropping
them.
TIP

If a folder is deleted, all the children elements are also deleted so


double-check that the contents of the folder can be safely deleted.
NOTE

In the next section you will learn how to create elements inside this folder.

First steps - Connecting to Sources


• < PREV

• NEXT >
The MySQL database that you installed in the Installation & Bootstrapping section contains the data
of the CRM of a company. This data is split into several tables:
• a table for client data,
• a table for client types (a client can be residential or business)
• and a table for addresses.
The diagram of the organization of this database is the following:

Your goal here is to combine this data using a Data Virtualization approach, this will enable us to
create views that are more meaningful for the consumers of the data without having to modify the
underlying data source (in many real-world scenarios we are not the owners of the data, just
consumers, so changing the data schema will not be possible).
The first step we need to follow to virtualize this relational database is to connect to it using the
Denodo Platform. Connecting to the data source will allow us to introspect it and graphically select
which of its tables are to be virtualized within the Denodo Platform. Once connected, we will create
one base view per table in the CRM.
A base view is a representation, in the Denodo Platform, of existing data in a remote data source.
This base view is only metadata that describes how the information is stored and accessed in the

New Section 15 Page 14


This base view is only metadata that describes how the information is stored and accessed in the
original datasource but it does not contain any data. When a base view is queried, the data is
retrieved in real-time from the source, unless it is cached (see the Performance tutorial).

Importing relational databases

First, let's create the data source for the CRM database. In this case, we will create a JDBC
connection to MySQL, but other possibilities are Oracle, Microsoft SQL Server, DB2, PostgreSQL,
Hive, Nettezza, Teradata, Denodo VDP, etc.
The recommended way to connect to databases when using Denodo is through JDBC (this is an
acronym referring to Java Database Connectivity), so let's start this tutorial creating a new JDBC
data source to import a table with a primary key.
In the Installation & Bootstrapping section, you installed a MySQL database server and copied its
driver into the Denodo Virtual DataPort extensions folder: <DENODO_HOME>/lib-external/jdbc-
drivers/mysql-5. With this driver added to the Denodo installation, you are ready to create the JDBC
Data Source following these steps:
1. Create two folders nested under the "1 - first steps" folder you have made, one for data
sources called "1 - Data Sources" and another for base views called "2 - Base Views".
2. Right-click on the "1 - Data Sources" folder and select "New > Data source > JDBC".

3. Name the data source, for example, acme_crm_ds.


4. Select "MySQL 5" as Database Adapter.
5. Change the Database URI to match your MySQL installation, for
example, jdbc:mysql://localhost:3306/acme_crm
6. Enter the login and password fields: acme_user / acme_user.
7. Click on "Connection Pool configuration" and check "Test connections".
8. Click on the "

Save" button.
9. Click the "Create base view" button at the top.
The Administration Tool will show the introspected schema of the relational database:

New Section 15 Page 15


To incorporate some of the tables into the Denodo virtual schema, you have to check the box near
the tables or views you want to import. In this case, check address, client and client_type and then
click on the Create selected button.

Later, you will be able to query these base views or combine them with other views.
When the importing process is finished, you will see the new views in the elements tree panel. If you
double-click on the view name, the schema of the base view will be shown in the workspace.

New Section 15 Page 16


Any primary key found in the source table gets marked in the
Denodo base view. This is important because this information will be
exposed to Denodo clients (for example, reporting tools).
NOTE
As you can see, each new base view will be listed in the Elements Tree in two places:
• In the folder where the view has been created ("1 - Data Sources").
• As a child of the data source the base view belongs to.

Data Source child nodes cannot be moved to other folders. They are
added to provide an easy way to see the base views created from a
data source.
TIP
Finally, let's move the base views to the folder that we created for them by dragging them to the "2 -
Base Views" folder.
After these steps are completed, we have a virtual representation of our CRM in Denodo. In the next
section, we are going to learn how to query it to see how the data comes in real time from our
MySQL database, and, after that, we will start creating data combinations that will add semantic
value to the client applications that are consuming this data.

First steps - Execution of Queries


• < PREV

• NEXT >

New Section 15 Page 17


• NEXT >
After creating the base views in the previous section, we are ready to query our Data Virtualization
server to retrieve the data from our CRM. Let's learn how to issue queries to the Denodo server.
There are several ways to perform the execution of a view, we will use the "client" view for this
example, from the Administration Tool:

Context menu

To use the Context menu you have to:


1. Right-click on the name of the view in the Elements Tree panel.
2. Select "Execute...".

3. Then click the Execute button at the bottom of the panel to send the current sentence (shown
at the top) to the Denodo server.

View detail window

In this case you have to:


1. Double-click on the name of the view.
2. Press the Execution panel button at the top of the workspace.

VQL shell

Another option is to use the VQL Shell Client:


1. In the Menu Bar, go to Tools > VQL Shell

New Section 15 Page 18


1.

2. Right-click on the name of the view and select "VQL Shell > Select ..." (a sentence is created
in the top-right panel).

3. Then click the Execute button.

Show results

In the three examples above, a simple query is executed:

SELECT * FROM client


CODE
After receiving that query, the Denodo server will create a physical query plan to query the data
sources (in this case, your MySQL database) and get the results:

New Section 15 Page 19


You can filter the results obtained in the output using a WHERE condition in the sentence. This can
be done using the graphical interface in the execution window shown when using the Context
menu or View detail window options.
In this panel, you have to click on the

button, then select the field, the operator and the value for the filter:

To summarize, now you have learned how to execute queries over Denodo views and how the
server is querying the source database in real-time to return the results. In the next section, we are
going to learn how to create new views using combinations between existing Denodo views.

First steps - Simple Derived Views


• < PREV

• NEXT >
Now you are ready to explore the capabilities of the Denodo Platform that make Data Virtualization a
very powerful tool. In the previous sections, you have connected to the CRM database and queried

New Section 15 Page 20


very powerful tool. In the previous sections, you have connected to the CRM database and queried
its tables. Client, client_type and address data is now available within Denodo, so you can start to
build a unified view of a customer that will contain the information of the three tables and remove the
need for each consumer of the data to create this unified view.
We create this unified view within the virtualization platform so it can be reused by all clients. Having
it defined in a single point will make it easier to manage changes to our definition of the customers
(For example, if, in the future, we only want to expose residential customers to the consuming
applications or the source database is changed.) With Denodo, you will maintain the proper data
definition and your final applications won't require any changes.
The way of creating this unified view of a customer is through a JOIN operation, just like it's done in
a traditional relational database.

Join operation

Let's see an example of a derived view creation process using the join Operation. Before we begin,
let's create a new folder named 3 - Derived Views to stay organized. Then, right-click in the elements
tree and select New > Join. Now, an empty Join View panel will be shown in the Administration
Tool workspace.

To select the views on which the join operation is going to be executed, you have to drag &
drop them from the list of views that appear on the Elements Tree. As input views are added, the
schema of the resulting join view is generated automatically.
In our example, you have to follow these steps:
1. Drag & drop the client, client_type and address views into the workspace.
2. Drag the client.client_id field and connect the arrow to the address.client_fid field to set one of
the join conditions.
3. Drag the client.client_type field and drop the arrow on the client_type.code field.

New Section 15 Page 21


4. Then, go to the Output tab. Here you will see both address and client views are producing the
same field with different names: client_id and client_fid. You have to remove one of them
selecting the field and clicking on "Remove selected" button.

5. Rename the view "personal_data_crm" by typing in the input box labeled "View name" at the
top.

New Section 15 Page 22


6. Click on the

button to save the view.


7. Drag & drop the new element into our new 3 - Derived Views folder.
The Administration Tool will show the schema of the new derived view:

Once these steps are completed, you will have a derived view (virtual, data is not stored in Denodo)
that represents the concept of a customer within your organization. This view can be queried in the
same way that you did for the base views (check how to in the previous section). Now that this data
is defined, your client applications can just retrieve this information directly from the virtualization
server without having to define the data combination themselves.
The Denodo Administration Tool also provides a full set of relational operations, in addition to the
join, to create new views:
• UNION
• PROJECTION
• SELECTION
• AGGREGATION
• MINUS/INTERSECTION
• FLATTEN
These operations can be used in the same way than the Join operation (Right-click >
New > OPERATION), you can try yourself!

The next section (Advanced Operations) will show examples


of Aggregation and Flatten operations.
New Section 15 Page 23
of Aggregation and Flatten operations.

Advanced Operations - Index


• < PREV

• NEXT >
In the previous section, you learned how to access a customer database to get personal and contact
information about the company's customers.
The billing department of the sample company exposes the billing information using an internal web
service that exposes all the open bills for each customer: amount due, due date, etc. In this section
you will combine the unified customer view that you have built with this billing information so you can
have a report that lists the total amount that is due for each of your customers.
List of topics:
• Create a Web Service data source.
• Flatten: a hierarchical structure.
• See how derived views are constructed using Tree View.
• Create a join view between heterogeneous sources.
• Create an Aggregation view.

Advanced Operations - Importing SOAP


Web services
• < PREV

• NEXT >
As part of the installation steps, you have deployed a billing web application that exposes several
SOAP web services. The different services are available at http://localhost:8080/billing/services.
Make sure the billing web service is up and running before following the steps of this section
(Installation & Bootstrapping)
These web services have been created by the billing department and expose the information about
the customers' bills using three different operations:
• getBills: returns a list with all the bills in the system.
• getBillByCustomerID: returns a list with all the bills for the specified input customer id.
• getBillByPhoneCenter: returns a list with all the bills for the specified input phone center.
All the different operations will return the billing information using a hierarchical structure: the bills
will be returned as part of a list. For instance, if you invoke the getBillByCustomerId operation
using a customer id as input parameter, you will get a list with all the bills for that customer. For each
item (bill) in the list, you will see: the customer id and ssn, the amount due for that bill, the billing
start date, end date and due date, the phone center that provided the service for that bill, and the bill
id.

New Section 15 Page 24


To import a SOAP web service as a data source you will just have to
point to the location of the WSDL descriptor for the web service.
TIP
In this case, you are going to use the BillProvider service from the list of available services. The
descriptor for this service will be available at http://localhost:8080/billing/services/BillProvider?wsdl.
The Denodo Platform can integrate both SOAP and REST web services. For SOAP web services
you will use the Web service type of data source, for REST web services you could use
either JSON or XML data sources depending on the output format of the REST web service. Follow
these steps to create a SOAP data source:
1. Right click on the '1 - Data Sources' folder and select New > Data source > Web service.
2. In the New Web service Data Source creation wizard enter:
○ Name: soap_billing_ds.
○ WSDL: http://localhost:8080/billing/services/BillProvider?wsdl

3. Click on the

button (you can leave the default values for the remaining options).
Now you have to click on Create base view button. You will see on the screen a list with the
different operations available in the web service and the option of creating a new base view for each
one of them.
In this case, you are interested in the getBillByCustomerId operation, so click on the Create Base
View link associated with this operation.

New Section 15 Page 25


In the Creating Web service View for operation <getBillByCustomerId> screen just choose Do
not stream output and click Ok.
Before confirming the creation of the view, rename it to get_bill_by_customer_id and rename the
input parameter for the web service operation from in0 to customer_id by simply clicking on the in0
field name.

Click on

New Section 15 Page 26


after these changes.
The web service operation that you just imported has a mandatory input parameter: the customer
id.
To reflect this restriction in the data source, the new base view will also have limited query
capabilities and when querying the view (clicking the 'Execute' button) we will have to provide a
value for this input parameter (customer_id) as a WHERE condition.

Any time a data source has a mandatory input parameter, it is going


to become a mandatory WHERE condition that you cannot remove
from the view execution dialog when querying the view.
NOTE
Actually, if you try to execute a query from the VQL shell without providing the WHERE condition you
will get an error.
If we execute and query and use 'C003' as the value for the customer id we will get the complete list
of bills for that customer as the result of the query.

New Section 15 Page 27


A list of items is represented with an Array data type. You can see in the results of the query that
the return column is displayed as [Array]... If you double-click on this value you can see the
contents of the array. In this example, you can see the 2 bills that belong to this customer.

Advanced Operations - Flatten Operation


• < PREV

• NEXT >
In the previous section, you have created a base view on top of a SOAP web service data source.
This web service returns a hierarchichal structure that includes a return element of type array.

New Section 15 Page 28


This web service returns a hierarchichal structure that includes a return element of type array.
To represent these hierarchical structures, Denodo supports two different complex data types:
• Registers or Records: a register or record represents an element that is formed by several
subelements or subfields. Each one of the subelements in a record will have their own name,
data type and value.
• Arrays: an array represents multi-valued data that is presented as a list of items. An element
of the type array can be thought of as a sub-view included in a main view. An array type
always has an associated record type that acts like the schema of the sub-view and defines
the columns that make up each item in the list.

Although Denodo is capable of working with these kinds of complex


data types, these are not standard relational data types and in most
of the scenarios it will be more useful and easier to simplify these
hierarchies.
TIP
To simplify array columns, Denodo has the special Flatten operation.
Let's see how to create a new view that flattens the get_bill_by_customer_id base view:
1. Right-click on the get_bill_by_customer_id view and select New > Flatten.
2. In the Flatten View wizard, we will select the element that needs to be flattened. In this case,
the only array that we have in the base view is: return.

3. Right-click on return and click Flatten array 'return':

New Section 15 Page 29


4. Go to the Output tab, rename the view to billing_information and then click on

New Section 15 Page 30


You will see that the array has dissapeared from the schema of the
new view and the fields that were inside the array in the base view
are now regular fields.
NOTE
The new derived view has the same query capabilites as get_bill_by_customer_id but if you query
the new view using the same customer_id = 'C003' condition, you will get the fields that were part
of the array in the base view schema. This customer has two pending bills so there are two rows in
the result set of the query, one for each bill.

Advanced Operations - Tree View


• < PREV

• NEXT >
In the previous section, you saw how to create new derived views by combining other views. Once
you start combining views the complexity of the new views will grow and it will be useful to visualize
how those views are built. For this, you have the Tree View functionality.
In the Simple Derived Views section, you created the personal_data_crm view using two join
operations over several base views. To see the Tree View you have to right-click on the view name
and select Tree View (see image below).

New Section 15 Page 31


In the tree view you can see how the view has been created. In this example, you can see that the
view personal_data_crm is created as a join

between three views: client, address and client_type.


At the same time that the join is performed, a projection

operation takes place because only a subset of fields is projected.


In addition to the general overview, you can see the details of the different operations involved in the
creation of the view as well as the data sources and wrappers information by clicking on the different
elements.
For instance, clicking on a join icon will display information about the join type, method used and the
join condition.

New Section 15 Page 32


If you want to save a snapshot for documentation purposes just click on the Save image button.
You can open several elements at once in the Workspace. They will be opened in tabs. Any tab can
be moved to a new window by clicking the button

Advanced Operations - Combination


Between Different Data Sources
• < PREV

• NEXT >
At the beginning of the tutorial, you saw how to create new views using the join operation but the
views involved in the joins were all coming from the same datasource. In this section, you will see
how you can create a new join view using the exact same procedure but coming from two different
and heterogeneous data sources.
To create the derived view you can follow these steps:
1. Right-click on the personal_data_crm view and select New > Join.
2. Drag & drop the billing_information view to the Join View wizard Model tab.
3. We will use client_id = customer_id as the join condition so drag & drop a line from
the client_id field in the personal_data_crm view to the customer_id field in
the billing_information view.

New Section 15 Page 33


In the Output tab:
• Name the view: client_with_bills.
• Remove the field return_customer_id from the output schema of the view (you already have
the customer_id field).

• click

New Section 15 Page 34


• click

Save.
Now, if you execute the new view we will get the information about the bills from the different clients.

The Web service needs a mandatory input parameter and Denodo is


automatically taking care of providing this parameter in the join view
using the data coming from the personal_data_crm view.

Advanced Operations - Aggregation


Operation
• < PREV

• NEXT >
In the previous section,you have created a view that obtained the billing information for all the
customers in your database. In this view we have one record for each bill, but we want to calculate
the total amount due by the customers instead of having the separate bills.
To do so, you can create a new view that aggregates (group by) the different customers using
the customer_id to compute the total amount.
To create the new aggregation view you can follow these steps:

New Section 15 Page 35


To create the new aggregation view you can follow these steps:
1. Right click on the client_with_bills view and select New > Selection.
2. In the Projection / Selection View wizard go to the Group by tab and check the Use group
by checkbox.
3. Click on Ok in the pop-up dialog.
4. From the list of fields on the left Add the following to the Selected group by fields and
expressions box: name, surname and client_id.

5. Go to the Output tab and rename the view to amount_due_by_client.


6. Now, you have to create a new field that computes the addition of the amount due for each
client. Click on New aggr. field button and define a new field called total_amount that will be
calculated using the SUM aggregation function and the amount_due field.

SUM(CAST('float', client_with_bills.amount_due))
CODE

New Section 15 Page 36


7. Click Ok to add the new field to the view.
8. And click

to confirm the creation of the view.

Since the amount_due field comes as a text from the Web service,
you will have to cast the field to a numeric value using
the CAST function.
NOTE
Now, if you execute the new view, you will get as result one record per customer with the total billing
balance:

New Section 15 Page 37


Connecting from your application - Index
• < PREV

• NEXT >
In previous sections, you created several views with Denodo to allow client applications to retrieve
the information directly from the Denodo server. In particular, you have created the
view 'amount_due_by_client' which combines data from several sources and exposes the
information about the billing balance of a company's clients.
You already know how to execute queries over that view, but now it's time to connect to the Denodo
server from your external applications. Denodo Platform is based on a client-server architecture,
where clients issue requests to the server. These requests can be sent using one of the following
interfaces:
• JDBC: Denodo provides its own JDBC driver.
• ODBC: Denodo provides an ODBC interface (requires the installation of additional
components).
• ADO .Net: Denodo is compatible with the Npgsql ADO.Net provider for PostgreSQL.
• RESTful Web service (XML, JSON, HTML outputs): useful for applications that cannot use
the JDBC or ODBC interfaces to connect to Denodo.

New Section 15 Page 38


In this section, you will see several ways to access Denodo from external applications:
• Using a Third Party JDBC Client.
• Using ODBC.
• Consume a Denodo RESTful Web service.
• Browse Linked Data.

Connecting from your application - JDBC


Clients
• < PREV

• NEXT >
JDBC (Java DataBase Connectivity) is a Java data access technology from Oracle Corporation.
JDBC provides an API for the Java programming language for database-independent connectivity,
and it is based on the use of drivers for each database. A client application requires separate
drivers, usually vendor supplied, to connect to different types of databases.
Denodo includes a JDBC driver jar file named denodo-vdp-jdbcdriver.jar, and it is located under
the <DENODO_HOME>/tools/client-drivers/jdbc/ directory.

Accessing Denodo using a JDBC client

In this section, you are going to see how to access to the Denodo server using a JDBC client. This
information is valid for any Java-based application. For the example, we will use DBVisualizer (a
generic database management tool for developers) but feel free to use any other JDBC client.
The first thing that you have to do when connecting using JDBC is to add the Denodo's JDBC driver
to the client application.
To use the JDBC driver in your client, you have to add the .jar file to the classpath of your
application.
In DBVisualizer, you have to go to Tools > Driver Manager... and in the Driver Manager window go
to Driver > Create Driver and then browse to the VDP's driver file. Use the following driver settings
and close the window to save the configuration:

New Section 15 Page 39


and close the window to save the configuration:
• Name: Denodo 7.0
• URL Format: jdbc:vdb://<hostname>:<port>/<database>
• Driver Class: com.denodo.vdp.jdbc.Driver

Now that you have added the driver, you can configure a connection to your Denodo virtual
database. Go to Database > Create Database Connection and use the following settings for the
connection:
• Driver (JDBC): Denodo 7.0
• Database URL: jdbc:vdb://localhost:9999/tutorial
• Database Userid: admin
• Database Password: admin

Click on the Connect button, and you will establish a connection to the tutorial database. In the left

New Section 15 Page 40


Click on the Connect button, and you will establish a connection to the tutorial database. In the left
panel of the window, you will see that the base views are listed as tables and the derived
views are listed as views.
If you run any query from the SQL Commander, you will get the same results that you were seeing
before from the Administration Tool. Try, for example:

SELECT * FROM amount_due_by_client


CODE

Connecting from your application - ODBC


Clients
• < PREV

• NEXT >
ODBC (Open DataBase Connectivity) is a standard to access databases originally developed by
Microsoft. ODBC provides an API to make the code independent of database systems and operating
systems.
Denodo provides an ODBC interface, but it requires the installation of the ODBC driver. Like any
other ODBC driver, you have to install it on the machine where the client application is running.
In this section you will learn how to access to the Denodo server using an ODBC client. This
information is also valid for any other ODBC connection. For the example, we will use MS Excel but
feel free to use any other ODBC client.

New Section 15 Page 41


feel free to use any other ODBC client.

Creating the DSN

The first thing that we have to do when connecting using ODBC is to install the Denodo ODBC
driver. Denodo Platform 7.0 includes an ODBC driver named DenodoODBC and it is located under
the <DENODO_HOME>\tools\client-drivers\odbc directory. Extract the folders in this directory and run
the programs inside to install the drivers. Once this is complete, restart your Virtual DataPort Server
from the Denodo Control Panel.

Select the 32-bit or 64-bit version depending on the client that will use
it.
E.g. Clients such as old MS Excel versions can use only the 32-bits
ODBC driver, even if it is running on a 64 bits O.S.
NOTE
Once you've installed the ODBC driver you will need to add a new user data source:
1. Go to Control Panel > Administrative Tools > Data Sources (ODBC).
2. Select Add User DSN or Add System DSN. The difference is that "User DSN" can only be used
by the current user and "System DSN" can be used by all the users of the system.
3. Select the DenodoODBC ANSI or Unicode driver, and click on the Finish button.

In the configuration dialog fill in the following information:


1. Data Source: name of the ODBC source (e.g. Denodo Tutorial).
2. Database: database in Denodo. (e.g. tutorial).
3. Server: host name of the Denodo server (e.g. localhost).
4. Port: port of the Denodo server (e.g. 9996).
5. User Name / Password: credentials to connect to Denodo (e.g. admin/admin).

New Section 15 Page 42


Now, you have to configure some of the Advanced properties by clicking on the Datasource button
(a pop-up will open). Select the same options shown in the screenshots below and write "SET
QUERYTIMEOUT TO 3600000" in the Connect Settings Box on page 2:

Finally, click the Ok button and then click Save button to finish.

Accessing Denodo using an ODBC client

Now, you have your environment ready to connect to Denodo using ODBC (remember than the
previous steps are only valid to connect to the "tutorial" virtual database, so if you want to connect to
another database you will have to create a new DSN).
For an example of an ODBC client application you can use the well-known Microsoft Excel. You will
only have to select this DSN as a data provider to import the customer data into the spreadsheet.

This example was tested in the MS Office 2010 version


NOTE
Please, follow these steps and see the results:
1. Open a new workbook.
2. Select Data > From Other Sources > From Microsoft Query
3. In the pop-up, select "Denodo Tutorial" and click on Ok.

New Section 15 Page 43


4. Select "amount_due_by_client" view.

5. Click the > button in the middle and you should see name, surname, client_id,
and total_amount appear underneath "Columns in your query:"
6. Click on Next (three times) and then Finish.
And voilà! The results from Denodo are populated into the MS Excel spreadsheet!

New Section 15 Page 44


Connecting from your application - RESTful
Web service
• < PREV

• NEXT >
RESTful Web service apply the ideas of the web to data-delivery, by providing scalable, flexible and
stateless access to data assets based on well-known protocols and formats like HTTP, HTML, XML
and JSON. Additionally to traditional SQL type access methods such as JDBC or ODBC, all the
views in Denodo can be accessed using the RESTful interface.
The Denodo RESTful Web service is an HTTP service deployed by default in the
URL http://localhost:9090/denodo-restfulws that exposes resources like databases and views in the
following standard representation formats:
• XML
• JSON
• XHTML (more user-friendly)
This Web service allows Denodo to work inside applications following the REST service architecture
style and provides support for linked data in the enterprise deployment (see next section for more
information about this).
When accessed from a browser, the Denodo RESTful endpoint will look like this:

See the examples below to know how to query this service:


• Example 1: http://localhost:9090/denodo-restfulws/tutorial -> returns the list of views of the
virtual database tutorial.

• Example 2: http://localhost:9090/denodo-restfulws/tutorial/views/client -> returns the content of


the view client of the database tutorial.

New Section 15 Page 45


the view client of the database tutorial.

• Example 3: http://localhost:9090/denodo-restfulws/tutorial/views/client?client_id=C005 ->


returns the content of the view client whose client_id is C005.

Representations

We mentioned at the beginning of this section that Denodo supports three representation formats
(XML, JSON and XHTML). You saw in the previous examples the XHTML format using a browser.
How can you get the response in the other formats? The answer is easy, by adding a query
parameter to the URL:
• JSON output: you have to add $format=json. For example,
http://localhost:9090/denodo-restfulws/tutorial/views/client?client_id=C005&$format=json

• XML output: you have to add $format=xml. For example,


http://localhost:9090/denodo-restfulws/tutorial/views/client?client_id=C005&$format=xml

New Section 15 Page 46


Another way to choose the desired representation is using HTTP
'Accept' headers.
TIP
In the next section, you will learn how to link your Denodo views to allow browsing linked data
through the RESTful Web service.

Connecting from your application - Linked


Data
• < PREV

• NEXT >
In the previous section, you learned how to issue queries from the RESTful interface of Denodo.
Now you will see how to enable linked data using a new Denodo element: associations.

Associations in Denodo

Associations represent a relationship between elements of two Denodo views. The concept is very
similar to the Primary Key / Foreign Key restrictions in relational databases.

Associations allow third-party consumers (like Modeling or Analytical


BI tools) to instrospect relationships between virtual views.
NOTE
For example, the elements of the client view can be related with the elements of the address view
(every customer is related with an address).

Based on the definition of the associations, the Denodo RESTful Web service will show links that will
allow you to traverse the associations. Let's see how it works with an example, by using the views
created in previous sections of the tutorial.

New Section 15 Page 47


created in previous sections of the tutorial.

How to create a new association

To create a new association in Denodo, you can follow these steps:


1. Right-click inside the elements tree and select New > Association.

2. Drag & drop the client and address views involved in the association into the workspace.
3. Link the client_id and client_fid fields that map the association.
4. Go to the Output tab and give a name for the association: client_address.
5. You have to provide names for each of the endpoints (Role name fields).
○ Endpoint 'client': address.
○ Endpoint 'address': belongs_to_client.

New Section 15 Page 48


6. Mark the Referential contraint checkbox: this will allow the association to be exposed like as a
Primary Key / Foreign Key restriction (for JDBC and ODBC clients).
7. Finally, click

Save.

Using the association

Now it's time to return to the RESTful Web service and get the results of
the client view: http://localhost:9090/denodo-restfulws/tutorial/views/client (see the previous
section for more information about how to query using the RESTful Web Sevice).
As you can see in the screenshot below, a new column with a link is added to the output table and
the text of the link is the Role name configured in the association for the endpoint. In this example, if
we click on address link for the customer John Smith, Denodo will follow the association and display
his address.

New Section 15 Page 49


In the same way, since we made the association, a belongs_to_client column appears with the
address that loads the associated customer information when clicked.

Agile performance - Index


• < PREV

• NEXT >
Data Virtualization software accesses and extracts information from target sources at runtime and
combines them in real-time to get the results. As you know, no local copy of the data will be
available within Denodo.
With this in mind, it is clear that some of the more traditional performance optimization practices
used in database and data warehouse implementations, such as index construction, will fall outside
the scope of a real-time Data Virtualization framework but strategies such as caching can help to
improve the performance of real-time source access and combination goals.
The Denodo advanced cache system is based on a relational database (traditional or in-memory
database).
Denodo is an important component of any data management infrastructure, but not the only one.
When measuring performance, it is important to make sure which of the elements are bottlenecks.
For example, a data source might be returning data in a slow fashion; in some cases you will be able
to increase the performance by adding a new index to that source. If these actions cannot be
perfomed, you can configure an intelligent caching system in Denodo to speed up your queries.

What are the motivations for using a cache?

1. Some data sources might be slow and you want to speed up your queries.
2. You want to avoid workloads in the data sources.
3. Pre-computed transformations are done in the Denodo layer, so they do not need to be
recomputed every time.
4. You want to delegate some queries with data coming from several different data sources.
5. Data sources temporal unavailability (especially when they are external sources).

New Section 15 Page 50


In this section you will see how to:
• Configure Denodo to Use Cache.
• Activate Cache in Base/Derived Views
• View Some Examples of Different Cache Modes

Agile performance - Cache at Server Level


• < PREV

• NEXT >
As you already know, Denodo includes a module to store local copies of the data as required. This
cache will use a Relational Database accessible through JDBC protocol (MySQL, Microsoft SQL
Server, Oracle, DB2, Netezza, Oracle TimesTen, etc).
To use the cache system it can be enabled at server level or at database level. For this tutorial we
will configure the cache at server level. Let's see how to configure the Denodo server to use cache:
1. Log-in using a global administrator user (for example, the default admin).
2. In the Menu Bar, go to Administration > Server configuration and then click Cache.

3. Set the Cache status to On.

4. The default Embedded Derby server is the Database adapter that will be used this tutorial.

New Section 15 Page 51


5. Click Ok.

The Embedded Derby Server should only be used for testing


purposes with very low loads.
TIP
In the next section, you will learn how to configure what views have to be cached.

Agile performance - Cache at View Level


• < PREV

• NEXT >
In the previous section, you activated the cache module in your Denodo server. Now, you need to
configure your views to make use of the cache.
For example, let's activate the cache in the client_with_bills view. We have two main reasons to
select this view as a cached view:
• It queries two data sources (a MySQL database and a SOAP Web Service).
• Usually Web Service response times are worse than traditional databases response times.

New Section 15 Page 52


How to activate cache for a view

Please, follow these steps:


1. In the Elements Tree, double-click on the view name client_with_bills. The view schema will
show in the Workspace.
2. Click on the Options tab at the top (third tab).
3. Under Cache mode:, check Partial. It will enable a partial cache for the client_with_bills view.

4. Leave the rest of fields with the default values.


5. Finally, click

Save.
Now test if the cache works as expected by performing the following test:
1. Open Tools > VQL Shell.
2. Execute the following query (make sure you have the tutorial database selected from the drop-
down Database menu):
SELECT * FROM client_with_bills TRACE
3. After the execution, click on the Execution Trace button (above the results) to see the query
execution plan. You can see that the data comes directly from the different data sources.

New Section 15 Page 53


4. Click Execute and perform the same query again.
5. Click on the Execution Trace button again and you will see that the data comes from the cache
(displayed with a box a icon), and the execution time of the query is faster.

In the next section, you will learn more about the available cache modes. In the above example,

New Section 15 Page 54


In the next section, you will learn more about the available cache modes. In the above example,
you configured the view to use Partial mode, in the next section we will learn the the behavior of the
different cache modes.

Agile performance - Cache Modes


• < PREV

• INDEX
In the previous section, you activated the cache in one of your views. Now, it's time to learn more
about how the Denodo Cache works. Denodo has the following cache modes:
• Partial:
the first time a query over the view is executed, the cache table will be populated with the
tuples in the output from the datasource. At runtime, when a user queries the view, the Denodo
server checks if the cache contains the data required to answer the query. If it does not have
this data, Denodo will query the data source and populate the cache with that output.
When the Time To Live (TTL) of the data has passed, the cache system will invalidate the
cached data of the view so the next query will hit the data source.
This mode supports the following options:
○ With explicit loads: if this option is selected, the cache has to be loaded explicitly.
○ Match exact queries only: if this option is selected, the cache stores the result of each
query. Then, if the same query is executed, and the entries of this query in cache have
not expired (TTL), the data returned to the client is retrieved from the cache.
• Full:
The data of the view is always retrieved from the cache engine instead of from the source, this
mode always requires explicit cache loads.
The main benefit of this mode over the partial cache is that complex operations (joins, unions,
group by…) involving several views (even from different data sources) can be delegated to the
cache database. Therefore, the performance of these operations is significantly improved.

Cache examples

Let's see how the cache works using the following example (make sure you have
the tutorial database selected from the drop-down Database menu):

SELECT * FROM client_with_bills


WHERE client_id = 'C077'
CODE

New Section 15 Page 55


CODE
Run this query and check the Execution Trace with these different cache configurations:
• With the Cache Off
The data sources are always queried.
• With a Partial Cache
The data source will be queried the first time and subsequent queries will retrieve the data
from the cache, if it is available.
To demonstrate, execute these queries:
1. SELECT * FROM client_with_bills WHERE CLIENT_ID = 'C066' will query the data source
again.
2. SELECT * FROM client_with_bills WHERE CLIENT_ID = 'C077' AND CODE = '02' will query
the cache.
3. SELECT * FROM client_with_bills WHERE CLIENT_ID = 'C066' queries the cache now.
• Partial Cache with Explicit loads
Denodo server will query the data source until the cache is told to be loaded
with CONTEXT('cache_preload' = 'true').
○ SELECT * FROM client_with_bills WHERE CLIENT_ID = 'C077' will query the data source.
○ SELECT * FROM client_with_bills WHERE CLIENT_ID = 'C077' CONTEXT('cache_preload' =
'true') will query the data source and load the cache.
○ SELECT * FROM client_with_bills WHERE CLIENT_ID = 'C077' will now query the cache.
• Partial cache with Match exact queries only
○ SELECT * FROM client_with_bills WHERE CLIENT_ID = 'C077' will query the cache.
○ SELECT * FROM client_with_bills WHERE CLIENT_ID = 'C077' AND CODE = '02' will query
the data source and load the cache for this query.
○ SELECT * FROM client_with_bills WHERE CLIENT_ID = 'C077' AND CODE = '02' will query
the cache now.
• Full Cache
All of the view's underlying data will be loaded into the cache, and all queries will only go to the
cache.

In Full mode, you have to explicitly load the cache.(this action can be
scheduled)
SELECT * FROM <view> CONTEXT('cache_preload'='true')
TIP
Congratulations! You have completed the Denodo Basics Tutorial.
This is just the first step in understanding the Denodo Data Virtualization software. Now, you are
prepared to go even further with the rest of our tutorials and become a Master Data Ninja!

New Section 15 Page 56


New Section 15 Page 57

You might also like