Oracle analytic function-percentile_cont

percentile_cont是连续分布模型的逆向函数。这个函数接收一个百分位数和一个排序说明,返回一个与输入的分布百分比值相对应的数据值,分布百分比的计算方法与percent_rank相同。Nulls在计算中被忽略。
如果输入的百分位数没有正好对应的数据值,就使用如下的算法来计算得到:
RN = 1+ (P*(N-1)) 其中P是输入的分布百分比值,N是组内的行数
CRN = CEIL(RN) FRN = FLOOR(RN)
If (CRN = FRN = RN) then the result is
(value of expression from row at RN)
Else the result is
(CRN – RN) * (value of expression for row at FRN) + (RN – FRN) * (value of expression for row at CRN)

当没有值与指定的percent_rank精确匹配的时候,percentile_cont(百分位数)会计算两个离得最近的值的平均值。

SQL> select first_name || ' ' || last_name as Name, salary
from employees
where department_id = 30 order by salary;

NAME                                               SALARY
---------------------------------------------- ----------
Karen Colmenares                                     2500
Guy Himuro                                           2600
Sigal Tobias                                         2800
Shelli Baida                                         2900
Alexander Khoo                                       3100
Den Raphaely                                        11000

SQL> select department_id,percentile_cont(0.5) within group (order by salary) percent_count_salary
from employees where department_id = 30
group by department_id;
DEPARTMENT_ID PERCENT_COUNT_SALARY
------------- --------------------
           30                 2850

计算步骤:
RN = 1 + (0.5 * (6-1)) = 3.5
CRN=4 FRN=3
计算结果为(4-3.5) * 2800 +(3.5-3) * 2900 = 1400 + 1450 = 2850

SQL> select first_name,last_name,salary,
percent_rank() over(partition by department_id order by salary)  percent_rank,
percentile_cont(0.5) within group (order by salary) over(partition bydepartment_id) percentile_cont
from employees
where department_id = 30;

FIRST_NAME           LAST_NAME                     SALARY PERCENT_RANK    PERCENTILE_CONT
-------------------- ------------------------- ---------- ---------------- -------------------
Karen                Colmenares                      2500                0           2850
Guy                  Himuro                          2600               .2           2850
Sigal                Tobias                          2800               .4           2850 --与计算的相同,为0.4和0.6分别对应的值的平均值
Shelli               Baida                           2900               .6           2850
Alexander            Khoo                            3100               .8           2850
Den                  Raphaely                       11000                1           2850

SQL> select first_name,last_name,department_id,salary,percent_rank() over(partition by department_id order by salary) percent_rank,
percentile_cont(0.7) within group (order by salary) over(partition bydepartment_id) percentile_cont
from employees
where department_id = 30;  

FIRST_NAME           LAST_NAME                 DEPARTMENT_ID     SALARY PERCENT_RANK    PERCENTILE_CONT
-------------------- ------------------------- ------------- ---------- ---------------- -------------------
Karen                Colmenares                           30       2500                0                3000
Guy                  Himuro                               30       2600               .2                3000
Sigal                Tobias                               30       2800               .4                3000
Shelli               Baida                                30       2900               .6                3000    ----(2900 + 3100)/2
Alexander            Khoo                                 30       3100               .8                3000
Den                  Raphaely                             30      11000                1                3000

SQL> select first_name,last_name,department_id,salary,percent_rank() over(partition by department_id order by salary) percent_rank,
percentile_cont(0.5) within group (order by salary) over(partition bydepartment_id) percentile_cont
from employees;

FIRST_NAME           LAST_NAME                 DEPARTMENT_ID     SALARY PERCENT_RANK    PERCENTILE_CONT
-------------------- ------------------------- ------------- ---------- ---------------- -------------------
Jennifer             Whalen                               10       4400                0                4400
Pat                  Fay                                  20       6000                0                9500
Michael              Hartstein                            20      13000                1                9500
Karen                Colmenares                           30       2500                0                2850
Guy                  Himuro                               30       2600               .2                2850
Sigal                Tobias                               30       2800               .4                2850
Shelli               Baida                                30       2900               .6                2850
Alexander            Khoo                                 30       3100               .8                2850
Den                  Raphaely                             30      11000                1                2850
Susan                Mavris                               40       6500                0                6500
TJ                   Olson                                50       2100                0                3100
Hazel                Philtanker                           50       2200       .022727273                3100
Steven               Markle                               50       2200       .022727273                3100
Ki                   Gee                                  50       2400       .068181818                3100
James                Landry                               50       2400       .068181818                3100
Joshua               Patel                                50       2500       .113636364                3100
Peter                Vargas                               50       2500       .113636364                3100
James                Marlow                               50       2500       .113636364                3100
Randall              Perkins                              50       2500       .113636364                3100
Martha               Sullivan                             50       2500       .113636364                3100
Donald               OConnell                             50       2600       .227272727                3100
Douglas              Grant                                50       2600       .227272727                3100
Randall              Matos                                50       2600       .227272727                3100
John                 Seo                                  50       2700       .295454545                3100
Irene                Mikkilineni                          50       2700       .295454545                3100
Girard               Geoni                                50       2800       .340909091                3100
Vance                Jones                                50       2800       .340909091                3100
Mozhe                Atkinson                             50       2800       .340909091                3100
Timothy              Gates                                50       2900       .409090909                3100
Michael              Rogers                               50       2900       .409090909                3100
Anthony              Cabrio                               50       3000       .454545455                3100
Kevin                Feeney                               50       3000       .454545455                3100
Curtis               Davies                               50       3100               .5                3100    --对应0.5存在
Alana                Walsh                                50       3100               .5                3100
Jean                 Fleaur                               50       3100               .5                3100
Winston              Taylor                               50       3200       .568181818                3100
Stephen              Stiles                               50       3200       .568181818                3100
Samuel               McCain                               50       3200       .568181818                3100
Julia                Nayer                                50       3200       .568181818                3100
Jason                Mallin                               50       3300       .659090909                3100
Laura                Bissot                               50       3300       .659090909                3100
Julia                Dellinger                            50       3400       .704545455                3100
Trenna               Rajs                                 50       3500       .727272727                3100
Jennifer             Dilly                                50       3600              .75                3100
Renske               Ladwig                               50       3600              .75                3100
Kelly                Chung                                50       3800       .795454545                3100
Britney              Everett                              50       3900       .818181818                3100
Sarah                Bell                                 50       4000       .840909091                3100
Alexis               Bull                                 50       4100       .863636364                3100
Nandita              Sarchand                             50       4200       .886363636                3100
Kevin                Mourgos                              50       5800       .909090909                3100
Shanta               Vollman                              50       6500       .931818182                3100
Payam                Kaufling                             50       7900       .954545455                3100
Matthew              Weiss                                50       8000       .977272727                3100
Adam                 Fripp                                50       8200                1                3100
Diana                Lorentz                              60       4200                0                4800
David                Austin                               60       4800              .25                4800
Valli                Pataballa                            60       4800              .25                4800
Bruce                Ernst                                60       6000              .75                4800
Alexander            Hunold                               60       9000                1                4800
Hermann              Baer                                 70      10000                0               10000
Sundita              Kumar                                80       6100                0                8900
Charles              Johnson                              80       6200        .03030303                8900
Amit                 Banda                                80       6200        .03030303                8900
Sundar               Ande                                 80       6400       .090909091                8900
David                Lee                                  80       6800       .121212121                8900
Sarath               Sewall                               80       7000       .151515152                8900
Oliver               Tuvault                              80       7000       .151515152                8900
Mattea               Marvins                              80       7200       .212121212                8900
Elizabeth            Bates                                80       7300       .242424242                8900
William              Smith                                80       7400       .272727273                8900
Nanette              Cambrault                            80       7500       .303030303                8900
Louise               Doran                                80       7500       .303030303                8900
Lindsey              Smith                                80       8000       .363636364                8900
Christopher          Olsen                                80       8000       .363636364                8900
Jack                 Livingston                           80       8400       .424242424                8900
Jonathon             Taylor                               80       8600       .454545455                8900
Alyssa               Hutton                               80       8800       .484848485                8900
Allan                McEwen                               80       9000       .515151515                8900
Peter                Hall                                 80       9000       .515151515                8900
Danielle             Greene                               80       9500       .575757576                8900
David                Bernstein                            80       9500       .575757576                8900
Patrick              Sully                                80       9500       .575757576                8900
Tayler               Fox                                  80       9600       .666666667                8900
Harrison             Bloom                                80      10000       .696969697                8900
Peter                Tucker                               80      10000       .696969697                8900
Janette              King                                 80      10000       .696969697                8900
Eleni                Zlotkey                              80      10500       .787878788                8900
Clara                Vishney                              80      10500       .787878788                8900
Gerald               Cambrault                            80      11000       .848484848                8900
Ellen                Abel                                 80      11000       .848484848                8900
Lisa                 Ozer                                 80      11500       .909090909                8900
Alberto              Errazuriz                            80      12000       .939393939                8900
Karen                Partners                             80      13500        .96969697                8900
John                 Russell                              80      14000                1                8900
Lex                  De Haan                              90      17000                0               17000
Neena                Kochhar                              90      17000                0               17000
Steven               King                                 90      24000                1               17000
Luis                 Popp                                100       6900                0                8000
Ismael               Sciarra                             100       7700               .2                8000
Jose Manuel          Urman                               100       7800               .4                8000   ---(7800 + 8200)/2
John                 Chen                                100       8200               .6                8000
Daniel               Faviet                              100       9000               .8                8000
Nancy                Greenberg                           100      12008                1                8000
William              Gietz                               110       8300                0               10154
Shelley              Higgins                             110      12008                1               10154
Kimberely            Grant                                         7000                0                7000
发表在 analytic function, SQL, Uncategorized | 留下评论

Oracle analytic function-percentile_disc

语法:
PERCENTILE_DISC(expr) WITHIN GROUP (ORDER BY expr [ DESC | ASC ]) [ OVER (query_partition_clause) ]

函数返回一个与输入的分布百分比值相对应的数据值,分布百分比的计算方法同函数cume_dist(),如果没有正好对应的数据值,就取大于该分布百分比值的下一个值对应的值。

percentile_disc函数在功能上类似于percentile_cont函数,只是percentile_cont函数使用了连续分布模型,而percentile_disc函数使用了离期分布模型。

SQL> select department_id,salary,CUME_DIST() OVER (PARTITION BY department_id ORDER BY salary) Cume_Dist,
     percentile_disc(0.5) within group (order by salary) over(partition by department_id) percentile_dist_sal
from employees where department_id not in (50,80);  

DEPARTMENT_ID     SALARY  CUME_DIST PERCENTILE_DIST_SAL
------------- ---------- ---------- -------------------
           10       4400          1                4400
           20       6000         .5                6000    ----取cume_dist为0.5的数据值
           20      13000          1                6000
           30       2500 .166666667                2800
           30       2600 .333333333                2800
           30       2800         .5                2800
           30       2900 .666666667                2800
           30       3100 .833333333                2800
           30      11000          1                2800
           40       6500          1                6500
           60       4200         .2                4800
           60       4800         .6                4800
           60       4800         .6                4800
           60       6000         .8                4800
           60       9000          1                4800
           70      10000          1               10000
           90      17000 .666666667               17000
           90      17000 .666666667               17000
           90      24000          1               17000
          100       6900 .166666667                7800
          100       7700 .333333333                7800
          100       7800         .5                7800
          100       8200 .666666667                7800
          100       9000 .833333333                7800
          100      12008          1                7800
          110       8300         .5                8300
          110      12008          1                8300

27 rows selected.

SQL> select department_id,salary,CUME_DIST() OVER (PARTITION BY department_id
         ORDER BY salary) Cume_Dist,
         percentile_disc(0.6) within group (order by salary) over(partition by department_id) percentile_dist_sal
from employees 
where department_id not in (50,80);  
DEPARTMENT_ID     SALARY  CUME_DIST PERCENTILE_DIST_SAL
------------- ---------- ---------- -------------------
           10       4400          1                4400
           20       6000         .5               13000 --百分比为0.6的时候,没有对应的cume_dist的值,取大于该百分比的下一个cust_dist对应的数值
           20      13000          1               13000
           30       2500 .166666667                2900
           30       2600 .333333333                2900
           30       2800         .5                2900
           30       2900 .666666667                2900    --取2900
           30       3100 .833333333                2900
           30      11000          1                2900
           40       6500          1                6500
           60       4200         .2                4800
           60       4800         .6                4800
           60       4800         .6                4800
           60       6000         .8                4800
           60       9000          1                4800
           70      10000          1               10000
           90      17000 .666666667               17000
           90      17000 .666666667               17000
           90      24000          1               17000
          100       6900 .166666667                8200
          100       7700 .333333333                8200
          100       7800         .5                8200
          100       8200 .666666667                8200   --取8200
          100       9000 .833333333                8200
          100      12008          1                8200
          110       8300         .5               12008
          110      12008          1               12008

27 rows selected.
发表在 analytic function, SQL | 留下评论

Oracle analytic function-percent_rank

聚合语法:PERCENT_RANK (expression1, … expression_n) WITHIN GROUP (ORDER BY expression_order1, …expression_order_n)
分析语法:PERCENT_RANK () OVER ([ query_partition_clause ] order_by_clause)

percent_rank和分析函数cume_dist类似,也是计算累计分布的函数。不同的地方是percent_rank计算的时候是:(相对位置-1)/总行数。对于重复行,计算的时候取重复行的第一行的位置。函数返回值为[0,1]之间。

原始数据

SQL> select first_name || ' ' || last_name as Name, salary
  2    from employees
  3   where department_id = 30 order by salary;

NAME                                                                                             SALARY
-------------------------------------------------------------------------------------------- ----------
Karen Colmenares                                                                                   2500
Guy Himuro                                                                                         2600
Alexander Khoo                                                                                     2600
Sigal Tobias                                                                                       2600
Shelli Baida                                                                                       2900
Den Raphaely                                                                                      11000
已选择6行。

作为聚合函数使用,当使用percent_rank计算薪水值为2600的累计分布的时候,2600为重复行,
计算的时候取2600出现的第一行即所在的集合的第二行。计算公式为2-1/6=0.1666667。

SQL> select percent_rank(2600.00) within group(order by salary) as percent_rank_sal_2600
  2    from employees
  3   where department_id = 30;

PERCENT_RANK_SAL_2600
---------------------
           .166666667

作为分析函数使用,计算每个人在部门按照薪水排序中的相对位置。

SQL> select first_name,last_name,salary,percent_rank() over(order by salary) as percent_rank_sal_analytic
  2  from employees
  3  where department_id = 30;

FIRST_NAME                               LAST_NAME                               SALARY PERCENT_RANK_SAL_ANALYTIC
---------------------------------------- -------------------------------------- ---------- -------------------------
Karen                                    Colmenares                                2500                         0
Guy                                      Himuro                                    2600                        .2
Alexander                                Khoo                                      2600                        .2
Sigal                                    Tobias                                    2600                        .2
Shelli                                   Baida                                     2900                        .8
Den                                      Raphaely                                 11000                         1
已选择6行。
发表在 analytic function, SQL | 留下评论

Oracle analytic function-cume_dist

函数语法如下:
聚合函数的语法
CUME_DIST (expression1, … expression_n) WITHIN GROUP (ORDER BY expression_order1, … expression_order_n)

分析函数的语法
CUME_DIST() OVER ([ query_partition_clause ] order_by_clause)

数据的累积分布,也即小于等于当前数据值的所有数据的概率分布,对于表示数据点在某个区间内出现的概率有很大的帮助。

累积分布函数为Cumulative Distribution Function, 简称CDF。

cume_dist()计算某个值在一组值中的累计分布,为相对位置/总行数,对于重复行,计算的时候取重复行中的最后一行的位置,返回值为(0,1]之间。这个函数可以使用数值数据类型也可以使用非数值类型作为参数,原因是函数会对非数值类型的参数进行隐式转换。

cume_dist()函数在聚合函数的用途中,用参数中的指定的数据构造一条假定的数据并插入到现存行中,然后计算这条假定数据在所有行中的相对位置。

聚合函数的例1:
下面的实验中,原始数据如下

SQL> select first_name || ' ' || last_name as Name, salary
  2    from employees
  3   where department_id = 30 order by salary;

NAME                                                                                             SALARY
-------------------------------------------------------------------------------------------- ----------
Karen Colmenares                                                                                   2500
Guy Himuro                                                                                         2600
Alexander Khoo                                                                                     2600
Sigal Tobias                                                                                       2600
Shelli Baida                                                                                       2900
Den Raphaely                                                                                      11000
已选择6行。

当计算薪水值2500.00在部门中的累计分布时,当2500.00加入到记录中后所谓的位置为第2行。Cume_dist计算的结果为2/7= 0.2857142857

SQL> select cume_dist(2500.00) within group(order by salary) as cume_dist_sal_2500
  2  from employees
  3  where department_id = 30;

CUME_DIST_SAL_2500
------------------
        .285714286
计算薪水值为2500在组内的累计分布情况。

聚合函数例2

SQL> select first_name ||' '|| last_name as Name, salary,commission_pct
  2    from employees
  3   where department_id = 80
  4   order by 2 desc;

NAME                                                                                             SALARY COMMISSION_PCT
-------------------------------------------------------------------------------------------- ---------- --------------
John Russell                                                                                      14000             .4
Karen Partners                                                                                    13500             .3
Alberto Errazuriz                                                                                 12000             .3
Lisa Ozer                                                                                         11500            .25
Ellen Abel                                                                                        11000             .3
Gerald Cambrault                                                                                  11000             .3
Eleni Zlotkey                                                                                     10500             .2
Clara Vishney                                                                                     10500            .25
Harrison Bloom                                                                                    10000             .2
Peter Tucker                                                                                      10000             .3
Janette King                                                                                      10000            .35
Tayler Fox                                                                                         9600             .2
David Bernstein                                                                                    9500            .25
Patrick Sully                                                                                      9500            .35
Danielle Greene                                                                                    9500            .15
Peter Hall                                                                                         9000            .25
Allan McEwen                                                                                       9000            .35
Alyssa Hutton                                                                                      8800            .25
Jonathon Taylor                                                                                    8600             .2
Jack Livingston                                                                                    8400             .2
Christopher Olsen                                                                                  8000             .2
Lindsey Smith                                                                                      8000             .3
Nanette Cambrault                                                                                  7500             .2
Louise Doran                                                                                       7500             .3
William Smith                                                                                      7400            .15
Elizabeth Bates                                                                                    7300            .15
Mattea Marvins                                                                                     7200             .1
Oliver Tuvault                                                                                     7000            .15
Sarath Sewall                                                                                      7000            .25
David Lee                                                                                          6800             .1
Sundar Ande                                                                                        6400             .1
Charles Johnson                                                                                    6200             .1
Amit Banda                                                                                         6200             .1
Sundita Kumar                                                                                      6100             .1

已选择34行。

SQL> select cume_dist(7500.00, 0.20) within group(order by salary, commission_pct) as cume_dist_sal_7500
  2    from employees
  3   where department_id = 80;

CUME_DIST_SAL_7500
------------------
        .342857143
上述的SQL反应的结果:薪水值为7500,commission_pct为0.20的员工在部门ID为80的组内的累计分布。

分析函数例1:

SQL> col first_name for a20
SQL> col last_name for a20
SQL>
SQL> select first_name,last_name,salary,cume_dist() over(order by salary desc) as cume_dist_sal_analytic
  2  from employees
  3  where department_id = 80;

FIRST_NAME           LAST_NAME                SALARY CUME_DIST_SAL_ANALYTIC
-------------------- -------------------- ---------- ----------------------
John                 Russell                   14000             .029411765
Karen                Partners                  13500             .058823529
Alberto              Errazuriz                 12000             .088235294
Lisa                 Ozer                      11500             .117647059
Ellen                Abel                      11000             .176470588
Gerald               Cambrault                 11000             .176470588
Eleni                Zlotkey                   10500             .235294118
Clara                Vishney                   10500             .235294118
Harrison             Bloom                     10000             .323529412
Peter                Tucker                    10000             .323529412
Janette              King                      10000             .323529412
Tayler               Fox                        9600             .352941176
David                Bernstein                  9500             .441176471
Patrick              Sully                      9500             .441176471
Danielle             Greene                     9500             .441176471
Peter                Hall                       9000                     .5
Allan                McEwen                     9000                     .5
Alyssa               Hutton                     8800             .529411765
Jonathon             Taylor                     8600             .558823529
Jack                 Livingston                 8400             .588235294
Christopher          Olsen                      8000             .647058824
Lindsey              Smith                      8000             .647058824
Nanette              Cambrault                  7500             .705882353
Louise               Doran                      7500             .705882353
William              Smith                      7400             .735294118
Elizabeth            Bates                      7300             .764705882
Mattea               Marvins                    7200             .794117647
Oliver               Tuvault                    7000             .852941176
Sarath               Sewall                     7000             .852941176
David                Lee                        6800             .882352941
Sundar               Ande                       6400             .911764706
Charles              Johnson                    6200             .970588235
Amit                 Banda                      6200             .970588235
Sundita              Kumar                      6100                      1

已选择34行。
上述语句计算每行的salary的累计分布。

分析函数例2:

SQL> select job_id,first_name,last_name,salary,cume_dist() over(partition by job_id order by salary) as cume_dist_sal_analytic
  2  from employees
  3  where department_id = 80
  4  order by job_id, last_name, salary,cume_dist_sal_analytic;
JOB_ID               FIRST_NAME                               LAST_NAME                       SALARY CUME_DIST_SAL_ANALYTIC
-------------------- ---------------------------------------- --------------------------- ---------- ----------------------
SA_MAN               Gerald                                   Cambrault                        11000                     .4
SA_MAN               Alberto                                  Errazuriz                        12000                     .6
SA_MAN               Karen                                    Partners                         13500                     .8
SA_MAN               John                                     Russell                          14000                      1
SA_MAN               Eleni                                    Zlotkey                          10500                     .2
SA_REP               Ellen                                    Abel                             11000             .965517241
SA_REP               Sundar                                   Ande                              6400             .137931034
SA_REP               Amit                                     Banda                             6200             .103448276
SA_REP               Elizabeth                                Bates                             7300             .310344828
SA_REP               David                                    Bernstein                         9500              .75862069
SA_REP               Harrison                                 Bloom                            10000             .896551724
SA_REP               Nanette                                  Cambrault                         7500             .413793103
SA_REP               Louise                                   Doran                             7500             .413793103
SA_REP               Tayler                                   Fox                               9600             .793103448
SA_REP               Danielle                                 Greene                            9500              .75862069 
SA_REP               Peter                                    Hall                              9000             .655172414
SA_REP               Alyssa                                   Hutton                            8800             .586206897
SA_REP               Charles                                  Johnson                           6200             .103448276
SA_REP               Janette                                  King                             10000             .896551724
SA_REP               Sundita                                  Kumar                             6100             .034482759
SA_REP               David                                    Lee                               6800             .172413793
SA_REP               Jack                                     Livingston                        8400             .517241379
SA_REP               Mattea                                   Marvins                           7200             .275862069
SA_REP               Allan                                    McEwen                            9000             .655172414
SA_REP               Christopher                              Olsen                             8000             .482758621
SA_REP               Lisa                                     Ozer                             11500                      1
SA_REP               Sarath                                   Sewall                            7000              .24137931
SA_REP               William                                  Smith                             7400             .344827586
SA_REP               Lindsey                                  Smith                             8000             .482758621
SA_REP               Patrick                                  Sully                             9500              .75862069
SA_REP               Jonathon                                 Taylor                            8600             .551724138
SA_REP               Peter                                    Tucker                           10000             .896551724
SA_REP               Oliver                                   Tuvault                           7000              .24137931
SA_REP               Clara                                    Vishney                          10500             .931034483

根据上述的查询结果,SA_REP工作类型的员工中,有75.86%的人薪水低于或等于Greene的薪水(9500)。

发表在 analytic function, SQL | 一条评论

Oracle analytic function-ntile

NTILE的语法如下
NTILE(expr) OVER ([ query_partition_clause ] order_by_clause)

NTILE函数对一个数据分区中的有序结果集进行划分,将其分组为各个桶,并为每个小组分配一个唯一的组编号。这个函数在统计分析中是很有用的。例如,如果想移除异常值,你可以将它们分组到顶部或底部的桶中,然后在统计分析的时候将这些值排除。ORACLE数据库统计信息收集也使用NTILE函数来计算直方图信息边界。在统计学术语中,NTILE函数创建等宽直方图信息。
原始数据

SQL> select first_name ||' '|| last_name as Name, salary
  2    from employees
  3   where department_id = 30;

NAME                                                                                             SALARY
-------------------------------------------------------------------------------------------- ----------
Den Raphaely                                                                                      11000
Alexander Khoo                                                                                     2600
Shelli Baida                                                                                       2900
Sigal Tobias                                                                                       2600
Guy Himuro                                                                                         2600
Karen Colmenares                                                                                   2500
已选择6行。

接下来对上述6个员工的薪水分配4,5,6三种桶数,观察一下当分配不同的桶数的情况下,员工的薪水值落在的桶号。

SQL> select first_name || ' ' || last_name as Name,
  2         salary,
  3         ntile(4) over(order by salary desc) as n4,
  4         ntile(5) over(order by salary desc) as n5,
  5         ntile(6) over(order by salary desc) as n6
  6    from employees
  7   where department_id = 30;
NAME                                                             SALARY         N4         N5         N6
--------------------------------------------------------------- ---------- ---------- ---------- ----------
Den Raphaely                                                      11000          1          1          1
Shelli Baida                                                       2900          1          1          2
Sigal Tobias                                                       2600          2          2          3
Guy Himuro                                                         2600          2          3          4
Alexander Khoo                                                     2600          3          4          5
Karen Colmenares                                                   2500          4          5          6
已选择6行。

当分配的桶数为6个的时候,因为部门ID=30的员工有6个人,每个员工的薪水值都被分配到不同的桶中。当桶数小于6的时候,会有多个值分配到相同的桶中的情况发生。

发表在 analytic function, SQL | 留下评论

Oracle analytic function-ratio_to_report

分析函数RATIO_TO_REPORT 用来计算当前记录的指标expr占开窗函数over中包含记录的所有同一指标的百分比. 这里如果开窗函数的统计结果为null或者为0,就是说占用比率的被除数为0或者为null, 则得到的结果也为0.
开窗条件query_partition_clause决定被除数的值, 如果用户忽略了这个条件, 则计算查询结果中所有记录的汇总值.
用户不能使用其他分析函数或者ratio_to_report作为分析函数ratio_to_report的参数expr, 也就是说这个函数不能循环使用. 但我们可以使用其他普通函数作为这个分析函数的查询结果.

计算每个员工的工资在部门中的所占的比例

SQL> col employee_name for a35
SQL>
SQL> select department_id,
  2         first_name || ' ' || last_name employee_name,
  3         hire_date,
  4         salary,
  5         ratio_to_report(salary) over(partition by department_id) as ratio_salary   --over()每个员工占全体员工的百分比
  6    from employees
  7   where department_id <= 40;

DEPARTMENT_ID EMPLOYEE_NAME                       HIRE_DATE               SALARY RATIO_SALARY
------------- ----------------------------------- ------------------- ---------- ------------
           10 Jennifer Whalen                     2003-09-17 00:00:00       4400            1
           20 Michael Hartstein                   2004-02-17 00:00:00      13000   .684210526
           20 Pat Fay                             2005-08-17 00:00:00       6000   .315789474
           30 Den Raphaely                        2002-12-07 00:00:00      11000   .454545455
           30 Alexander Khoo                      2003-05-18 00:00:00       2600   .107438017
           30 Shelli Baida                        2005-12-24 00:00:00       2900   .119834711
           30 Sigal Tobias                        2005-07-24 00:00:00       2600   .107438017
           30 Guy Himuro                          2006-11-15 00:00:00       2600   .107438017
           30 Karen Colmenares                    2007-08-10 00:00:00       2500   .103305785
           40 Susan Mavris                        2002-06-07 00:00:00       6500            1

已选择10行。
发表在 analytic function, SQL | 留下评论

Oracle analytic function-first_value and last_value

First_value返回排序结果集中的第一个值,如果结果集中第一个值是NULL,则函数返回NULL,当然你也可以指定IGNORE NULLS.
示例:

SQL> select department_id,
  2         first_name || ' ' || last_name employee_name,
  3         hire_date,
  4         salary,
  5         first_value(salary) ignore nulls over (partition by department_id order by salary) as sal_lowest_dep
  6    from employees
  7   where department_id <= 40;

DEPARTMENT_ID EMPLOYEE_NAME                       HIRE_DATE               SALARY SAL_LOWEST_DEP
------------- ----------------------------------- ------------------- ---------- --------------
           10 Jennifer Whalen                     2003-09-17 00:00:00       4400           4400
           20 Pat Fay                             2005-08-17 00:00:00       6000           6000
           20 Michael Hartstein                   2004-02-17 00:00:00      13000           6000
           30 Karen Colmenares                    2007-08-10 00:00:00       2500           2500
           30 Alexander Khoo                      2003-05-18 00:00:00       2600           2500
           30 Guy Himuro                          2006-11-15 00:00:00       2600           2500
           30 Sigal Tobias                        2005-07-24 00:00:00       2600           2500
           30 Shelli Baida                        2005-12-24 00:00:00       2900           2500
           30 Den Raphaely                        2002-12-07 00:00:00      11000           2500
           40 Susan Mavris                        2002-06-07 00:00:00       6500           6500
已选择10行。
SQL> select department_id,
  2         first_name || ' ' || last_name employee_name,
  3         hire_date,
  4         salary,
  5         first_value(salary) ignore nulls over (partition by department_id order by salary rows 1 preceding) as sal_lowest_dep---类似lag函数
  6   from employees
  7   where department_id <= 50;

DEPARTMENT_ID EMPLOYEE_NAME                       HIRE_DATE               SALARY SAL_LOWEST_DEP
------------- ----------------------------------- ------------------- ---------- --------------
           10 Jennifer Whalen                     2003-09-17 00:00:00       4400           4400
           20 Pat Fay                             2005-08-17 00:00:00       6000           6000
           20 Michael Hartstein                   2004-02-17 00:00:00      13000           6000
           30 Karen Colmenares                    2007-08-10 00:00:00       2500           2500
           30 Guy Himuro                          2006-11-15 00:00:00       2600           2500
           30 Sigal Tobias                        2005-07-24 00:00:00       2600           2600
           30 Alexander Khoo                      2003-05-18 00:00:00       2600           2600
           30 Shelli Baida                        2005-12-24 00:00:00       2900           2600
           30 Den Raphaely                        2002-12-07 00:00:00      11000           2900
           40 Susan Mavris                        2002-06-07 00:00:00       6500           6500

Last_value返回排序结果集中的最后一个值,如果最后一个值是NULL,则函数返回时NULL,可以使用IGNORE NULLS来处理。

SQL> select department_id,
  2         first_name || ' ' || last_name employee_name,
  3         hire_date,
  4         salary,
  5         last_value(salary) ignore nulls over (partition by department_id order by salary desc) as sal_latest_dep
  6    from employees
  7   where department_id <= 40;

DEPARTMENT_ID EMPLOYEE_NAME                       HIRE_DATE               SALARY SAL_LATEST_DEP
------------- ----------------------------------- ------------------- ---------- --------------
           10 Jennifer Whalen                     2003-09-17 00:00:00       4400           4400
           20 Michael Hartstein                   2004-02-17 00:00:00      13000          13000
           20 Pat Fay                             2005-08-17 00:00:00       6000           6000
           30 Den Raphaely                        2002-12-07 00:00:00      11000          11000
           30 Shelli Baida                        2005-12-24 00:00:00       2900           2900
           30 Alexander Khoo                      2003-05-18 00:00:00       2600           2600
           30 Guy Himuro                          2006-11-15 00:00:00       2600           2600
           30 Sigal Tobias                        2005-07-24 00:00:00       2600           2600
           30 Karen Colmenares                    2007-08-10 00:00:00       2500           2500
           40 Susan Mavris                        2002-06-07 00:00:00       6500           6500

Last_value的窗后函数默认是RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW,按照官方文档的说法,这个默认的特性可能生成非预期结果,为了避免这个情况,可以使用
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING。

SQL> select department_id,
  2         first_name || ' ' || last_name employee_name,
  3         hire_date,
  4         salary,
  5         last_value(salary) ignore nulls over (partition by department_id order by salary) as sal_last_dep
  6    from employees
  7   where department_id <= 40;

DEPARTMENT_ID EMPLOYEE_NAME                       HIRE_DATE               SALARY SAL_LAST_DEP
------------- ----------------------------------- ------------------- ---------- ------------
           10 Jennifer Whalen                     2003-09-17 00:00:00       4400         4400
           20 Pat Fay                             2005-08-17 00:00:00       6000         6000
           20 Michael Hartstein                   2004-02-17 00:00:00      13000        13000
           30 Karen Colmenares                    2007-08-10 00:00:00       2500         2500
           30 Alexander Khoo                      2003-05-18 00:00:00       2600         2600
           30 Guy Himuro                          2006-11-15 00:00:00       2600         2600
           30 Sigal Tobias                        2005-07-24 00:00:00       2600         2600
           30 Shelli Baida                        2005-12-24 00:00:00       2900         2900
           30 Den Raphaely                        2002-12-07 00:00:00      11000        11000
           40 Susan Mavris                        2002-06-07 00:00:00       6500         6500

已选择10行。

RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW

SQL> select department_id,
  2         first_name || ' ' || last_name employee_name,
  3         hire_date,
  4         salary,
  5         last_value(salary) ignore nulls over (partition by department_id order by salary RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as sal_last_dep
  6    from employees
  7   where department_id <= 40;

DEPARTMENT_ID EMPLOYEE_NAME                       HIRE_DATE               SALARY SAL_LAST_DEP
------------- ----------------------------------- ------------------- ---------- ------------
           10 Jennifer Whalen                     2003-09-17 00:00:00       4400         4400
           20 Pat Fay                             2005-08-17 00:00:00       6000         6000
           20 Michael Hartstein                   2004-02-17 00:00:00      13000        13000
           30 Karen Colmenares                    2007-08-10 00:00:00       2500         2500
           30 Alexander Khoo                      2003-05-18 00:00:00       2600         2600
           30 Guy Himuro                          2006-11-15 00:00:00       2600         2600
           30 Sigal Tobias                        2005-07-24 00:00:00       2600         2600
           30 Shelli Baida                        2005-12-24 00:00:00       2900         2900
           30 Den Raphaely                        2002-12-07 00:00:00      11000        11000
           40 Susan Mavris                        2002-06-07 00:00:00       6500         6500

已选择10行。

RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING

SQL> select department_id,
  2         first_name || ' ' || last_name employee_name,
  3         hire_date,
  4         salary,
  5         last_value(salary) ignore nulls over (partition by department_id order by salary RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as sal_last_dep
  6    from employees
  7   where department_id <= 40;

DEPARTMENT_ID EMPLOYEE_NAME                       HIRE_DATE               SALARY SAL_LAST_DEP
------------- ----------------------------------- ------------------- ---------- ------------
           10 Jennifer Whalen                     2003-09-17 00:00:00       4400         4400
           20 Pat Fay                             2005-08-17 00:00:00       6000        13000
           20 Michael Hartstein                   2004-02-17 00:00:00      13000        13000
           30 Karen Colmenares                    2007-08-10 00:00:00       2500        11000
           30 Alexander Khoo                      2003-05-18 00:00:00       2600        11000
           30 Guy Himuro                          2006-11-15 00:00:00       2600        11000
           30 Sigal Tobias                        2005-07-24 00:00:00       2600        11000
           30 Shelli Baida                        2005-12-24 00:00:00       2900        11000
           30 Den Raphaely                        2002-12-07 00:00:00      11000        11000
           40 Susan Mavris                        2002-06-07 00:00:00       6500         6500

已选择10行。
发表在 analytic function, SQL | 留下评论