• Statistical functions
• SQL model clause
• Introduction of Window functions
• Partition Outer Join
• Pattern matching
• Data mining I
• Top N clause
• Data Mining III
Partitions
– Groupings of rows within a query result set
Orderings
– Rows can be ordered within a partition
Windows (logical or physical)
– A moving group of rows within a partition
– Defines the range of an aggregate calculation
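The three concepts above map directly onto the clauses of an analytic function call. The following is a minimal sketch in Oracle SQL, assuming a hypothetical table trades(symbol, trade_day, price) that does not appear in the slides:

```sql
-- Hypothetical table: trades(symbol VARCHAR2, trade_day DATE, price NUMBER)
SELECT symbol,
       trade_day,
       price,
       -- Physical window: the current row and the two rows before it
       AVG(price) OVER (
         PARTITION BY symbol           -- partition: one group of rows per symbol
         ORDER BY trade_day            -- ordering of rows within the partition
         ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
       ) AS moving_avg_3_rows,
       -- Logical window: every row whose trade_day lies within the
       -- 7 days up to and including the current row's trade_day
       SUM(price) OVER (
         PARTITION BY symbol
         ORDER BY trade_day
         RANGE BETWEEN INTERVAL '7' DAY PRECEDING AND CURRENT ROW
       ) AS sum_trailing_week
FROM   trades
ORDER  BY symbol, trade_day;
```

A ROWS window counts rows regardless of their values, while a RANGE window is defined by the ordering value itself, so the two can return different results when days are missing from the data.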
[Figure: window diagram marking the Current Row]
Apply expressions across rows
Soon to be in ANSI SQL Standard
[Figure: sample rows "A 2 LAX", "B 2 SFO", "C 2 LAX", "C 3 LAS", "A 3 SFO", "B 3 NYC", "C 4 NYC", annotated "> 1 min"]
[Figure: Big Data use cases by industry. Industries: Financial Services, Law & Order, Utilities, Retail, Telcos. Use cases named: Money Laundering, Fraud, Tracking Stock Market, Monitoring Suspicious Activities, Network Analysis, Unusual Usage, Buying Patterns, Returns Fraud, Sessionization, Call Quality, SIM Card Fraud]
Conceptual Example
* For conceptual clarity, the statement is simplified and ignores an always-true start event.
16 Copyright © 2012, Oracle and/or its affiliates. All rights reserved.
See the notes or documentation for further explanation
SQL Pattern Matching in Action
Example: Find W-Shape*
Stock price
[Figure: stock price chart built up over successive slides; the legs of the W are labelled X, Y, W and Z in turn]
Matches found for the W-shape, reported as First_x and Last_z:
First_x  Last_z
1        9
13       19
Example: Find W-Shape that lasts < 7 days*
– Conditions can refer to previous variables (the chart marks X and Z)
[Figure: stock price chart]
Example: Find average price within W-Shape*
– Average stock price: $52.00
[Figure: stock price chart]
2. Define the pattern of events and pattern variables identifying the individual
events within the pattern
Use the framework of Perl regular expressions, where each pattern variable stands for a condition on rows
– PATTERN (X+ Y+ W+ Z+)
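To show how such a pattern might look as a complete statement, here is a minimal sketch using the MATCH_RECOGNIZE clause (available in Oracle Database 12c and later). The table ticker(symbol, tstamp, price), the column aliases, and the exact conditions are assumptions made for illustration and are not taken from the slides. Like the slides, the sketch omits the always-true start event.

```sql
-- Hypothetical table: ticker(symbol VARCHAR2, tstamp DATE, price NUMBER)
SELECT *
FROM   ticker
MATCH_RECOGNIZE (
  PARTITION BY symbol
  ORDER BY tstamp
  MEASURES
    FIRST(X.tstamp) AS first_x,     -- first row of the first falling leg
    LAST(Z.tstamp)  AS last_z,      -- last row of the final rising leg
    AVG(price)      AS avg_price    -- average over all rows in the match
  ONE ROW PER MATCH
  PATTERN (X+ Y+ W+ Z+)
  DEFINE
    X AS X.price < PREV(X.price),   -- price falls
    Y AS Y.price > PREV(Y.price),   -- price rises
    W AS W.price < PREV(W.price),   -- price falls again
    Z AS Z.price > PREV(Z.price)    -- price rises again ...
         AND Z.tstamp - FIRST(X.tstamp) < 7   -- ... and the whole W lasts < 7 days
);
```

The condition on Z refers back to X, which is how a later pattern variable can constrain the match using rows already mapped to an earlier one. Dropping the second half of the Z condition gives the unrestricted W-shape search.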
Java vs. SQL: Stock Markets - Searching for ‘W’ Patterns in Trade Data
– 12 lines of SQL
– 20x less code, 5x faster
Excerpt of the Java version shown on the slide (fragmentary):
if (gt(q, prev) && gt(q, next)) {
}
state = "T";
return state;
@Override
long c = 0;
String line = "";
String pbkey = "";
V0Line nextLine;
V0Line thisLine;
V0Line processLine;
V0Line evalLine = null;
V0Line prevLine;
boolean noMoreValues = false;
String matchList = "";
ArrayList<V0Line> lineFifo = new ArrayList<V0Line>();
boolean finished = false;
if (input == null) {
    return null;
}
if (input.size() == 0) {
    return null;
Analytical SQL in the Database
Summary
Ranking functions
– rank, dense_rank, cume_dist, percent_rank, ntile
Window Aggregate functions (moving and cumulative)
– Avg, sum, min, max, count, variance, stddev, first_value, last_value
LAG/LEAD functions
– Direct inter-row reference using offsets
Reporting Aggregate functions
– Sum, avg, min, max, variance, stddev, count, ratio_to_report
Statistical Aggregates
– Correlation, linear regression family, covariance
Linear regression
– Fitting of an ordinary-least-squares regression line to a set of number pairs.
– Frequently combined with the COVAR_POP, COVAR_SAMP, and CORR functions
Descriptive Statistics
– DBMS_STAT_FUNCS: summarizes numerical columns of a table and returns count, min, max, range, mean, stats_mode, variance, standard deviation, median, quantile values, +/- n sigma values, top/bottom 5 values
Correlations
– Pearson’s correlation coefficients, Spearman's and Kendall's (both nonparametric).
Cross Tabs
– Enhanced with % statistics: chi squared, phi coefficient, Cramer's V, contingency coefficient, Cohen's kappa
Hypothesis Testing
– Student t-test, F-test, Binomial test, Wilcoxon Signed Ranks test, Chi-square, Mann Whitney test, Kolmogorov-Smirnov test, One-way ANOVA
Distribution Fitting
– Kolmogorov-Smirnov Test, Anderson-Darling Test, Chi-Squared Test, Normal, Uniform, Weibull, Exponential
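As an illustration of the linear regression family used together with the covariance and correlation aggregates named in the summary, the following is a minimal sketch. It assumes a hypothetical table sales(ad_spend, revenue), which is not part of the original material.

```sql
-- Hypothetical table: sales(ad_spend NUMBER, revenue NUMBER)
-- Fits revenue = slope * ad_spend + intercept by ordinary least squares.
SELECT REGR_SLOPE(revenue, ad_spend)     AS slope,
       REGR_INTERCEPT(revenue, ad_spend) AS intercept,
       REGR_R2(revenue, ad_spend)        AS r_squared,
       CORR(revenue, ad_spend)           AS pearson_r,
       COVAR_POP(revenue, ad_spend)      AS population_cov,
       COVAR_SAMP(revenue, ad_spend)     AS sample_cov
FROM   sales;
```

In the REGR_ functions the dependent variable comes first and the independent variable second. Rows where either argument is NULL are ignored.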