
Nemo Analyze Training

Custom KPI Workbench & Custom SQL Queries


Custom queries and KPIs


SQL queries and KPI Workbench

[Diagram: the Nemo Analyze database is accessed through the ODBC interface (SQL queries) by a 3rd party query tool, by Nemo Analyze directly (visualisation in map, graph, etc.), or by the KPI workbench, whose results are likewise visualised in Nemo Analyze.]

Two ways of creating a custom KPI:

- SQL query
- KPI workbench

SQL queries can be used as custom KPIs as such in Nemo Analyze, or from any 3rd party tool that has an ODBC interface.
The KPI workbench can be used to further process the output of SQL queries.

SQL Queries vs. KPI Workbench

SQL pros
- Efficient for power users
- Standard language
- Scalability of the queries is good
- Excellent tool for grouping and filtering of data
SQL cons
- Requires SQL and database schema knowledge
- Temporal relations between tables are difficult or impossible to build
- Difficult or impossible to track sequences of events and to do other more advanced logic
KPI workbench pros
- Graphical UI, no need for SQL knowledge
- Possibility to create temporal correlations between tables
- Possibility to create state machines and other advanced sequences
KPI workbench cons
- Currently data can be correlated only per file, e.g. MMS end-to-end delivery time cannot be calculated from two separate files
- Scalability is not as good as with SQL; may run into a Java heap space error with large datasets

Use SQL if you need to add complex filtering or formatting to one of the existing queries in the parameter tree.
Use the KPI workbench if you cannot do what you want with SQL.

KPI Workbench


General

The KPI workbench is a graphical scripting environment for creating user-defined custom parameters and key performance indicators.
The KPI workbench is an additional data processing layer on top of the database SQL interface.
Predefined SQL queries (parameter tree) or custom SQL queries are used as input for the KPI workbench.
Results of a custom KPI can be visualised in maps, charts, and grids in the Nemo Analyze user interface.

Why use the KPI workbench instead of custom SQL queries?
- Easier to use: no need for SQL knowledge or knowledge of the Nemo Analyze database schema
- Many advanced queries are not possible with SQL: tracking certain signaling message sequences, call drop root cause analysis, etc.

Input and Output of the Custom KPI

A custom KPI can have one or multiple input parameters.
Input parameters can be selected from the parameter tree:
- Predefined parameters
- Custom SQL queries
- Existing custom KPIs
An input parameter is a dataset in tabular format that has a column containing the parameter values, and typically other related columns.
- E.g. Ec/No best active set contains columns for Ec/No, scrambling code, channel number, time, latitude, and longitude.
The output is always one dataset.

Correlations: Merging Input Datasets Together

The input can contain multiple datasets, but the output always has only one; therefore the input datasets need to be merged, i.e. correlated into one table.
Mathematical and logical operation nodes take only one input dataset; therefore, if input data is needed from multiple datasets (parameters), they need to be combined into one dataset with Correlations.
Time-based correlations:
- Previous value
- Current value
- Next value
- All values within time range
SQL-type correlations:
- Inner join
- Left outer join
- Union
- Cartesian product

Correlation Using Time Intervals

The underlying data recording principle of all Nemo measurement tools is that the data is
written to the log file only when values change, i.e. values are not written periodically.
More precisely, if any of the information elements of a given event change, the event is
written to the log file.

Correlations
Previous, Current, and Next Value

Each row of the leftmost input dataset is taken directly to the output.
From the other input datasets, the previous/current/next row relative to the timestamp of each leftmost dataset row is taken into the output.
Useful when taking snapshots of the network, e.g. the value of each radio parameter at the moment of a dropped call.

Correlations
All Values Within Time Range

Useful when combining non-event type data (RSCP, RxLevel, Tx power, etc.) into one table:
- Scatter plots (Ec/N0 vs. RSCP, throughput vs. RxLevel, etc.)
- Custom events with multiple conditions: filter rows where Ec/N0 < x AND RSCP > y AND so on

Correlations
All Values Within Time Range

Correlates two or more input datasets based on time.
Validity time intervals of the samples are taken into account in the correlation so that no data is lost.
An output row is written for each timestamp where any of the input datasets has a row.
Output is also written when the validity time interval of a non-first input dataset ends.
The leftmost input dataset works as a filter: output rows are only written if the leftmost input dataset has a valid row for the given timestamp.
The output contains all columns from all input tables.

[Diagram: example correlation of Table A (leftmost input) and Table B (input) over time. Each output row of Table A+B combines the A and B rows that are valid at a given timestamp (A1+B1, A1+B2, A2+B2, A2+B3, A2+B4, ...). One output row is written when the validity of B4 ends even though there is no new row in the input datasets; the Table B columns of that output row are empty. No output is written from B7 because Table A (the leftmost input) does not have a valid row at the time of B7. Legend: timestamps and validity time intervals of the rows are marked for each table.]

Correlations
Union

Each row of each input dataset is put to the output.
Union does not merge rows together from the input tables.
Rows are not in time order.
Union + order by time is used as state machine input.

[Diagram: example Union correlation. All rows of Table A (A1-A6) and Table B (B1-B7) are copied as-is into the output Table A+B; the rows are not merged and are not in time order.]

Aggregate Functions

Average, Minimum, Maximum, Sum, and Count
- Input: a dataset of N rows
- Output: one row with the aggregate of the selected input column
Definitions needed for the node:
- Column: selects the column of the input dataset for which the aggregate will be calculated
- Weight by: each sample can be weighted (multiplied) by another column of the input data. Typical usage is to weight an average by time or by distance
- Group by: data can be grouped, e.g. Avg RSCP per scrambling code

State Machine

[Diagram: example state machine with states Idle, Call attempt, Connecting call, Alerting, Call connected, Call disconnected, Call dropped, and Call failure.]

A powerful tool for various applications:
- Calculating the delay of any signaling procedure, e.g. call setup delay, handover delay
- Tracking a certain signaling message sequence and creating an event on its occurrence
- Tracking the call or packet data transfer state, and calculating statistics binned by call/data transfer, e.g. avg throughput per data transfer
Input: one dataset with the needed input data in time order (UNION + Order by time)
Output: a row per each transition arrow in the state machine, if output is set
- start_time: timestamp of the transition
- end_time: timestamp of the next transition
- time_interval: time difference in milliseconds between start_time and end_time, that is, the time spent in the target state of the transition
- In addition, the output contains columns for the time spent in the old state, a running index number, and user-definable text

Configuring State Machine

The required states are added from the properties of the state machine.
Transitions are defined in the properties of the source state.
For each transition, the conditions must be defined based on the values of the input data row.
If text in the State transition Output field is defined, output will be written from every occurrence of the transition.

GROUP BY Node

The GROUP BY node can be used to group a table of data based on any column or columns of the table, and to calculate aggregates (min, max, avg, etc.) per group for the other columns.
E.g. a statistics per cell table, with average RSCP, average Ec/N0, number of dropped calls, and number of failed calls.

Creating Per Session Aggregation

It is often necessary to aggregate data per session (call, attach, PDP context, data transfer, etc.).
For this purpose, session parameters are available for every session type in the parameter tree.
The session parameters return one row per session, with session-specific information such as status (fail/success), cause code, call setup time, data connection setup time, data transfer protocol, etc.
The session parameters have a timestamp at the beginning of the session, with a time range/time interval extending to the end of the session.
This makes it possible to correlate any metrics with the session by using the session as the master input in All Values Within Time Range (see next slide).
By adding a Group By after that, the metrics can be aggregated per session.

All Values Within Time Range with Session as Master Input

[Diagram: example correlation with Call session as the master input and BLER as the other input. Output rows Call1+B1 ... Call1+B4 are written for the BLER samples within Call 1, and Call2+B5 and Call2+B6 within Call 2. One output row is written when the validity of B4 ends even though there is no new row in the input datasets; the BLER columns of that output row are empty. No output is written from BLER sample B7 because the Call session input does not have a valid row at the time of B7 (the call had already ended).]

KPI Workbench improvements (Professional Edition)

Possibility to execute a KPI simultaneously on all measurement files
- Needed when correlating data across files, e.g. comparing scanner and mobile data
Time-triggered state transition in the state machine
- Enables event creation when a given message was not received within a defined time period
State transition triggered by a changed value in the state machine
- Enables event creation when e.g. the serving cell changes
New session parameters in the parameter tree
- Aggregation of data per call, data transfer, PDP context, or attach, without having to create a state machine to track the session

Time shift node

The Time shift node can be used to modify the timestamp and the time range of any input data.
One of the most relevant use cases of this node is when one wants to automatically capture details before and/or after a particular event for custom root cause analysis purposes.
The picture below illustrates capturing RF details 4 s before and after each dropped call using the Time shift node.

Resample node

The Resample node can be used to resample any input data containing the time column.
The sampling period is user-definable in milliseconds.
Nemo logging tools write data to the log file when the value changes, not periodically.
The data is handled properly in Nemo Analyze, but if the data is to be exported to a 3rd party tool, it is usually better to export the data with a constant sample period.
Together with the Time shift and Group by nodes, Resample can also be used to bin data over a longer period to reduce the data amount.

Running KPIs per File/Measurement/All

A KPI can be run in three modes:
- Per file: input queries are run per file
- Per measurement: input queries are run over all files of one measurement at a time
- All: input queries are run over the whole dataset in one shot
Per file should always be used when possible:
- Input data is processed one logfile at a time, even when running the KPI over multiple logfiles
- Lower memory consumption
- Can always be used when data is not correlated across multiple logfiles
Execute per measurement when comparing data across two or more logfiles of the same measurement.
Execute per all when comparing data across arbitrary sets of data, e.g. two measurements collected with different drive test kits at the same time, or when comparing imported OSS data to drive test data.
- Inputs must be sorted by time

Execute per file, Example

In this example, Ec/No, RSCP, and Tx power per logfile are correlated to the same table.

Execute per Measurement, Example

For example, comparing the scanner RSRP and the mobile RSRP of the same drive.
Note that all the input queries are run over all the files of a measurement; when comparing e.g. the RSCPs of two terminals, the RSCP queries must be filtered by device number.

SQL Queries


To get started:

- Read this document
- Get an SQL editor. Queries can be written with the Nemo Analyze database browser, but better SQL freeware editors are available, such as http://gpoulose.home.att.net/Tools/QTODBC61.msi
- Login/password for the database: administrator/password
- Get the following reference documents from the Nemo User Club, under User manuals and user documents downloads:
  - Nemo Analyze database schema (describes the table structure of the database)
  - Nemo file format specification (describes the events in the Nemo measurement file format)
  - OpenAccess SQL reference (describes the supported SQL syntax)
  - Nemo Analyze user manual (describes the special, Nemo-specific SQL scalar functions and stored procedures supported by the Nemo Analyze SQL driver)

Nemo Analyze Database Solution

Database solution from ObjectStore
- ODBC interface from OpenAccess
- Plus Nemo-added scalar functions and stored procedures
The solution is optimized for fast and convenient use as a standalone tool:
- No high system requirements: standalone Analyze runs on a standard PC
- Queries over a single measurement file are always fast regardless of the number of files in the database
- Maintenance-free, no DB administrator needed

Tables

Data in a relational DB is stored in relations, which are perceived by the user as tables.
A table represents an object or an event (e.g. employee, sales order, etc.).
Each row of the table represents a unique instance of the object or event.
Each column represents a different piece of information regarding the object or event (e.g. first name, last name, salary, customer name, order charge, etc.).
When referring to a table in a query, the syntax is {schema name}.{table name}
- E.g. "Nemo.UMTS"."BLER"
- Schema: Nemo.UMTS
- Table: BLER
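
A minimal sketch of the syntax in a query (the value column name "bler" is an assumption; check the database schema for the actual column names):

SELECT "bler"              -- value column, name assumed
FROM "Nemo.UMTS"."BLER"    -- {schema name}.{table name}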

Relations Between Tables

If rows in a given table can be associated with rows in another table, the tables are said to have a relationship between them.
One-to-one, one-to-many, and many-to-many relations are possible (the first two exist in the Nemo Analyze DB).
- E.g. one-to-one: in a given table, each row has a relation with a single row of another table.
Every table has a primary key (red arrow):
- The primary key is a column that uniquely identifies the row within a table. Using the scalar OID_SHORT(oid) will convert oid columns into a 9-digit integer.
The foreign key (yellow arrow) is the oid of the related row in another table:
- E.g. BLER and Event: the_event of each row in the BLER table equals the oid of one row in the Event table.
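
For illustration only, a hedged sketch of a manual join over this key pair (the Event schema name "Nemo" and the BLER value column are assumptions; in practice the + views described later do this join for you):

SELECT OID_SHORT(b.oid) AS "BLER row id",   -- primary key shortened to a 9-digit integer
       e.time,                              -- data from the related Event row
       b."bler"                             -- value column, name assumed
FROM "Nemo.UMTS"."BLER" b, "Nemo"."Event" e
WHERE b.the_event = e.oid                   -- foreign key equals the primary key of the related row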

Overview of the Database Schema

The database schema is based on the Nemo logfile format:
- The Nemo file format is event-based
- A new event is recorded when a recorded value changes, not periodically
- Data is grouped into different events based on relations between data, e.g. the SHO event records all data that is related to soft handover
Every event of the Nemo file format has a corresponding table in the database
- See the Nemo file format specification and database schema for more information about the different events and the mapping of events to tables
The oid (Object ID) column is present in every table of the database
- oid is the unique identifier of a row in the table
The Measurement table has one row per measurement. Measurement in Nemo language means a logfile, or a set of logfiles collected as a multi-device measurement.
The Device table has one row per logfile.
The Event table contains the time and GPS information of all data.

[Diagram: Measurement (oid, title, ...) is referenced by Device (oid, the_measurement, device_extension), which is referenced by Event (oid, time, gps_latitude, gps_longitude, the_device); all event tables such as ECNO, SHO, and CAA reference Event through their the_event column.]
Nemo Analyze Database Schema


Mapping of Fixed Size Events to DB Tables

If an event always has a fixed number of parameters, it is mapped straight to the corresponding table.
For example, mapping of the SHO event to the DB:
- For each SHO event, there is one row in the SHO table
- For each measurement there is one row in the Measurement table
- The SHO and Event tables have a one-to-one relation
- Device and Event have a one-to-many relation
- Measurement and Device have a one-to-many relation

Nemo Analyze Database Schema


Mapping of Variable Size Events to DB Tables

If an event has a variable number of parameters, it is mapped to multiple tables in the DB.
The mapping of each event is described in the database schema documentation.
Example: the ECNO event. The number of parameters in an ECNO event depends on the number of cells recorded:
- For each ECNO event, there is one row in the ECNO table
- For each carrier measured in one ECNO event, there is a row in the Channel table
- For each cell (active, monitored, etc.) measured in the ECNO event, there is a row in the Cell table
- The Event, Device, and Measurement relations are as with fixed-size events

+ Views

There is a + view for every table that has a relation with the Event table.
The views are displayed as tables, where the table name has + at the end.
In the view, the corresponding table has been joined with the Event, Device, and Measurement tables:
- Time, GPS data, and the measurement file name are associated with the actual parameter
- For example: the BLER+ view also contains the time and coordinates, joined from the Event table
+ views are faster to query, and easier to use, than a self-made JOIN.
+ views should always be used in queries if time or latitude/longitude is needed!

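
A minimal sketch of querying a + view (the value column name "bler" is an assumption); time and coordinates come directly from the view, with no self-made JOIN:

SELECT time, gps_latitude, gps_longitude, "bler"   -- time and GPS joined in by the + view
FROM "Nemo.UMTS"."BLER+"
ORDER BY sql_time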

Timestamp in the Database

Time is recorded to the database in:
- Timestamp format (column: sql_time)
- Native binary format (column: time)
Data is recorded event-based, not periodically, in the Nemo file format.
In order to get correct statistics, and to plot correct graphs, two things are needed:
- Timestamp: the point of time when the event occurred
- Interval: the elapsed time for how long the recorded event has been/will be valid
Both the timestamp and the interval are embedded in the binary timestamp:
- The timestamp can be fetched with the scalar function T_(native binary timestamp)
- The interval in milliseconds can be fetched with the scalar function TI_INTERVAL(native binary timestamp)
When custom queries are made for the Analyze UI, the binary time should be used:
- The Analyze UI fetches both the timestamp and the time interval from the native binary timestamp automatically
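
A hedged sketch of fetching both pieces of information from the binary time column in a 3rd party tool (the table and the rscp/cell_type columns are reused from the Tx power vs. RSCP example later in this document):

SELECT T_(time)          AS "Timestamp",      -- point of time when the event occurred
       TI_INTERVAL(time) AS "Interval (ms)",  -- how long the value stays valid
       "rscp"
FROM "Nemo.UMTS.ECNO"."Cell+"
WHERE cell_type = 0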

Filtering Results of Certain Measurement File(s)

The fastest-to-query and easiest way to do this is to use a HINT.
The HINT is added at the end of the query, in the following format:

/* MEAS({measurement_file_name_1}:{file_extension}|{measurement_file_name_2}:{file_extension}|...) */

When using the HINT, queries over certain file(s) are always fast, regardless of the number of other files in the DB.
A JOIN can also be done with regular SQL syntax:
- This is needed only when data from the Device or Measurement table needs to be returned
- Even in that case it is recommended to use the HINT to perform the actual filtering, because it makes the query faster
When custom queries are made for the Analyze UI, measurement filtering MUST NOT be done, unless the query is meant to be statically limited to certain measurement file(s):
- The HINT is added automatically to the query at runtime
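
A hedged sketch of the HINT in use (the file name and extension are placeholders; the table and columns are reused from the later Tx power vs. RSCP example):

SELECT time, "rscp"
FROM "Nemo.UMTS.ECNO"."Cell+"
WHERE cell_type = 0
/* MEAS(example_logfile:nmf) */   -- limit the query to one measurement file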


Value Enumeration

A lot of data is recorded in number format, where the meaning of each value is enumerated in the file format.
The numbers are written to the DB.
In the DB there is a ValueEnum table, which has the enumeration of all the number parameters.
The real meaning of a number value can be fetched with the following scalar function:
VAL_TO_STRING({parameter_name_in_ValueEnum}, {value})
When custom queries are made for the Analyze UI, VAL_TO_STRING is not needed:
- Number values are displayed automatically in decoded format
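
A hedged sketch of decoding an enumerated value in a 3rd party tool (the ValueEnum parameter name 'cell_type' is an assumption; check the ValueEnum table for the actual parameter names):

SELECT VAL_TO_STRING('cell_type', cell_type) AS "Cell type",   -- decoded meaning of the number
       "rscp"
FROM "Nemo.UMTS.ECNO"."Cell+"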

Connections 1/2

Certain tables, like DAS, have a relation to the Connection table.
Every DAS event belongs to some connection:
- A DAS event can belong to Attach, PdpContextActivation, Data, and DataTransfer connection(s)
- Attach is the parent connection of PdpContextActivation, PdpContextActivation is the parent connection of Data, etc.
Two scalars are available related to connections:
- CONN_IS_SHARED(conn1.oid, conn2.oid): checks if the connections are the same, or if one is the parent of the other
- CONN_IS_TYPE(conn.oid, numeric_exp): checks if a connection is of the given type

Connections 2/2

Examples:
1. Get only those throughput samples that were recorded during a data transfer
2. Get all throughput samples recorded while a PDP context has been activated and the access point has been Internet
(A sketch of example 1 in SQL is shown after the connection type list below.)
If the table does not have a relation to a connection (the column the_connection does not exist), these things have to be done with time correlation (described later).

Connection types:
- 0 = "Unknown"
- 1 = "Voice"
- 2 = "Handover"
- 3 = "Attach"
- 4 = "PDP context activation"
- 5 = "Data"
- 6 = "RRC"
- 7 = "Data transfer"
- 8 = "MMS"
- 9 = "SMS"
- 10 = "POC"
- 11 = "LAU"
- 12 = "RAU"
- 13 = "Ping"
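
A hedged sketch of example 1 (the DAS+ view name, its schema, and the throughput column name are assumptions; 7 = "Data transfer" per the type list above):

SELECT a.time, a."throughput_downlink"    -- throughput column name assumed
FROM "Nemo"."DAS+" a                      -- schema and view name assumed
WHERE CONN_IS_TYPE(a.the_connection, 7)   -- keep only samples belonging to a Data transfer connection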

Correlating Tables Without Relation 1/3

Most tables do not have any relation between each other, but you may still want to join data between such tables.
Two types of JOINs can be made based on time:
1. Sample-based correlation
- Each row of table x is joined with a row from table y WHERE the timestamp of the x row is within the time range of the y row
- E.g. Tx power vs. RSCP scatter
2. Time range condition
- Each row of table x WHERE the timestamp is in a certain range, defined by some other tables
- E.g. Tx power samples between the Attach Request and Attach Accept L3 messages

Correlating Tables Without Relation 2/3


Sample-Based Correlation

1. List the tables to be joined in the FROM clause: FROM table x, table y
2. Make an x.time = y.time condition in the WHERE clause
- When the x table is the first one in the FROM clause: each y sample is taken and checked whether there is an x sample that has the same timestamp as the y sample, OR whether the x timestamp falls into the validity time interval of the y sample
- When the y table is the first one in the FROM clause, the comparison is vice versa
- The comparison is not bi-directional! If there is a big difference in the sample periods of x and y, the one that has the smaller sample period should be selected as y in order to get the highest possible resolution
3. Make an x.the_device = y.the_device condition in the WHERE clause (this limits the time correlation within each measurement file and improves the performance)
4. Add the /* OPTIONS(USE_TIME_SCOPE) */ hint at the end of the SQL query. This enables the time scope join.

For example: get Tx power vs. RSCP, when RSCP < -95 dBm
- Tx power is in the Nemo.UMTS.TXPC table, RSCP is in the Nemo.UMTS.ECNO.Cell table, and there is no relation between the tables
- All Tx power samples are taken and checked whether they fall into the validity time interval of an RSCP sample that is < -95

SELECT a.sql_time AS "Time", "tx_power_umts", "rscp"
FROM "Nemo.UMTS.ECNO"."Cell+" b, "Nemo.UMTS"."TXPC+" a
WHERE b.order = 1 AND b.cell_type = 0 AND a.time = b.time AND a.the_device = b.the_device AND rscp < -95
/* OPTIONS(USE_TIME_SCOPE) */

Correlating Tables Without Relation 3/3


Time Range Correlation

The sql_time column is the timestamp in datetime format.
The sql_time column always has to be used in conditions where it is needed to test whether a timestamp is within the range set by two other timestamps.
Time indexing does not work if T_(time) is used, or if sql_time is used inside a scalar function; this will degrade the performance of this kind of query.
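
A hedged sketch of a time range condition (the signaling message tables and their names are placeholders; the point is that sql_time is compared directly, without wrapping it in a scalar function):

SELECT a.sql_time, a."tx_power_umts"
FROM "Nemo.UMTS"."TXPC+" a,
     "Nemo"."AttachRequest+" b,   -- placeholder table providing the start timestamp
     "Nemo"."AttachAccept+" c     -- placeholder table providing the end timestamp
WHERE a.the_device = b.the_device
  AND a.the_device = c.the_device
  AND a.sql_time >= b.sql_time    -- sql_time compared directly so time indexing is used
  AND a.sql_time <= c.sql_time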

Guidelines for Custom Queries in Analyze UI

The binary timestamp (time) can (and should) be returned in the SELECT row; the UI converts it automatically to a readable timestamp.
The binary timestamp MUST be used if the query is to be displayed correctly in line and bar graphs:
- This is because the validity time interval of the samples is also needed to plot correct graphs
The measurement file filter HINT must not be used in the query, unless the query is meant to be statically limited to certain measurement file(s); the UI filters the query results at runtime.
The VALUE_ENUM scalar is not needed; the UI applies it automatically.
Results have to be ordered by time (ORDER BY sql_time) in order to see the results correctly in graphs and on the map.
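
A minimal sketch of a query written for the Analyze UI following these guidelines (the table and columns are reused from the earlier Tx power vs. RSCP example): binary time is returned, no MEAS hint or enumeration scalar is used, and the results are ordered by sql_time:

SELECT time, "rscp"              -- binary time returned; the UI decodes it and reads the interval
FROM "Nemo.UMTS.ECNO"."Cell+"
WHERE cell_type = 0              -- no MEAS hint: the UI filters files at runtime
ORDER BY sql_time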

Adding Custom Queries to Analyze UI


Manual Query

Your own queries can be added from the Query manager:
1. Query manager --> Add manual query
2. Give a name to the query
3. Select the Edit SQL manually checkbox
4. Type in the query
5. Set aliases for the different graphs, i.e. which column will be on the x-axis, which will be on the y-axis, etc.
6. When finished, the query is available in the User branch of the parameter tree in the Analyze workspace.

Adding Custom Queries to Analyze UI


Query Wizard

Simple queries can be made with the query wizard:
- Joins are made automatically between tables that have relationships
- Time-based correlation is not done
- If two selected tables don't have a relation, the result set will be the Cartesian product of the tables!

Step-by-step procedure:
1. Select the tables from which data is fetched
2. Select the columns to be displayed from the table(s)
3. Select a column for ordering the results, if time-based ordering is not sufficient
4. Set filtering conditions for the results
5. Define aliases between controls and result columns (which columns will be x and which will be y in a graph, etc.)

COMPLETE TOOLSET FOR WIRELESS NETWORK TESTING

www.anite.com/nemo
