Wednesday, December 29, 2010

SUM with NULL Values in SQL Server

Follow on the SUM with NULL Values in Teradata post, I wanted to check the same thing in SQL Server.

When you SUM a value that is nullable, does it work correctly if there are null values? Here's the test and the result. The answer is yes, it works fine here too; it just requires some change in syntax.

Note that if you set up your flag column as a bit column, you have to add a CONVERT in the SUM since you can't add bit columns.

It is also interesting to note that SQL Server throws the following warning after the results of the first query: "Warning: Null value is eliminated by an aggregate or other SET operation."

DECLARE @rgWork1 TABLE (
    
client      INT         NOT NULL,
    
flag        bit 

);

INSERT INTO @rgWork1 (client, flag) VALUES (1, 1); 

INSERT INTO @rgWork1 (client, flag) VALUES (2, 0);
INSERT INTO @rgWork1 (client, flag) VALUES (3, NULL);

SELECT SUM(CONVERT(INT, flag)) AS FlagSum 

FROM @rgWork1;

SELECT SUM(CONVERT(INT, flag)) AS FlagSum 

FROM @rgWork1
WHERE flag = 1;

SELECT SUM(CONVERT(INT, flag)) AS FlagSum

FROM @rgWork1 
WHERE flag IS NOT NULL;

SELECT SUM(CONVERT(INT, flag)) AS FlagSum 

FROM @rgWork1
WHERE (flag = 1 OR flag IS NOT NULL);

All of these queries return the correct answer of 1.

SUM with NULL Values in Teradata

In Teradata, when you SUM a value that is nullable, does it work correctly if there are null values? Here's the test and the result. The answer is yes, it works fine.

CREATE VOLATILE TABLE rgWork1 (
    
client      INT         NOT NULL,
    
flag        byteint 

) UNIQUE PRIMARY INDEX (client) 
ON COMMIT PRESERVE ROWS;

INSERT INTO rgWork1 (client, flag) VALUES (1, 1); 

INSERT INTO rgWork1 (client, flag) VALUES (2, 0);
INSERT INTO rgWork1 (client, flag) VALUES (3, NULL);

SELECT SUM(flag)

FROM rgWork1;
 

SELECT SUM(flag)
FROM rgWork1 

WHERE flag = 1;
 

SELECT SUM(flag)
FROM rgWork1

WHERE flag IS NOT NULL;
 

SELECT SUM(flag)
FROM rgWork1

WHERE (flag = 1 OR flag IS NOT NULL);

All of these queries return the correct answer of 1.

Volatile (Temp) Tables in Teradata

I keep having to look this up, so I'm going to record it here. This is the syntax for creating temp tables (volatile tables) in Teradata.

CREATE VOLATILE TABLE rgWork1 (
    
client      INT         NOT NULL,
    
flag        byteint
)
UNIQUE PRIMARY INDEX (client)
ON COMMIT PRESERVE ROWS;

DROP TABLE rgWork1;


The table is linked to the session, so you don't have to drop it if you're going to log out.

Using UNIQUE PRIMARY INDEX (...) or PRIMARY INDEX (...) is optional. It works fine without it, but leaving it off can affect performance.

However, if you leave off ON COMMIT PRESERVE ROWS, you can INSERT rows, but they won't be there when you SELECT from the table.

Tuesday, December 14, 2010

Teradata Date Dimension

In our project, we needed a date dimension. I did a quick check on-line, but most of the examples were not specific to Teradata.

Teradata has views and an underlying table that made this relatively easy:
  • sys_calendar.CALENDAR: This is just a simple view on top of the CALENDARTMP view.
  • sys_calendar.CALENDARTMP: This is a set of calculations on top of the CALBASICS view.
  • sys_calendar.CALBASICS: This is a set of calculations on top of the CALDATES table.
  • sys_calendar.CALDATES: This is a single-column table (cdate) with 73,414 rows and dates from 01/01/1900 to 12/31/2100.

Here is the actual code I used. I'm sure there are more elegant ways to pull this data, but this was straightforward and maintainable.

(Code updated 2010-12-16. All columns are now NOT NULL, and calculation for week_end_dt has been updated to deal with last six days of internal calendar.)

DROP TABLE leap;
CREATE VOLATILE TABLE leap (
    
leap_year_nbr   INT
) ON COMMIT PRESERVE ROWS;

INSERT INTO leap (leap_year_nbr)
SELECT year_of_calendar
FROM sys_calendar.CALENDAR
WHERE month_of_year = 2
  
AND day_of_month  = 29;

--SELECT * FROM leap;
--SELECT COUNT(*) AS LeapCt FROM leap;

-- sys_calendar.calendar
-- 73,414 entries (01/01/1900 to 12/31/2100)
-- ================================================
-- date_dim
-- ================================================
DROP TABLE date_dim;
CREATE TABLE date_dim (
    
calendar_dt                     date                        NOT NULL,
    
year_nbr                        smallint                    NOT NULL,
    
day_of_week_nbr                 byteint                     NOT NULL,
    
day_of_month_nbr                byteint                     NOT NULL,
    
day_of_year_nbr                 smallint                    NOT NULL,
    
week_of_month_nbr               byteint                     NOT NULL,
    
week_of_year_nbr                byteint                     NOT NULL,
    
month_of_quarter                byteint                     NOT NULL,
    
month_of_year                   byteint                     NOT NULL,
    
month_of_calendar               smallint                    NOT NULL,
    
quarter_of_year                 byteint                     NOT NULL,
    
day_of_week_nm                  CHAR(9)                     NOT NULL,
    
day_of_week_abbr2               CHAR(2)                     NOT NULL,
    
day_of_week_abbr3               CHAR(3)                     NOT NULL,
    
month_nm                        CHAR(15)                    NOT NULL,
    
month_abbr2                     CHAR(2)                     NOT NULL,
    
month_abbr3                     CHAR(3)                     NOT NULL,
    
week_end_dt                     date                        NOT NULL,
    
end_of_month_flag               CHAR(1)                     NOT NULL,
    
first_of_quarter_flag           CHAR(1)                     NOT NULL,
    
end_of_quarter_flag             CHAR(1)                     NOT NULL,
    
end_of_year_flag                CHAR(1)                     NOT NULL,
    
weekday_flag                    CHAR(1)                     NOT NULL,
    
weekend_flag                    CHAR(1)                     NOT NULL,
    
leap_year_flag                  CHAR(1)                     NOT NULL
)
UNIQUE PRIMARY INDEX (calendar_dt);

INSERT INTO date_dim (
    
calendar_dt
    
,year_nbr
    
,day_of_week_nbr
    
,day_of_month_nbr
    
,day_of_year_nbr
    
,week_of_month_nbr
    
,week_of_year_nbr
    
,month_of_quarter
    
,month_of_year
    
,month_of_calendar
    
,quarter_of_year
    
,day_of_week_nm
    
,day_of_week_abbr2
    
,day_of_week_abbr3
    
,month_nm
    
,month_abbr2
    
,month_abbr3
    
,week_end_dt
    
,end_of_month_flag
    
,first_of_quarter_flag
    
,end_of_quarter_flag
    
,end_of_year_flag
    
,weekday_flag
    
,weekend_flag
    
,leap_year_flag
)
SELECT
    
--TOP 1000
    
calendar_date,
    
year_of_calendar,
    
day_of_week,
    
day_of_month,
    
day_of_year,
    
week_of_month,
    
week_of_year,
    
month_of_quarter,
    
month_of_year,
    
month_of_calendar,
    
quarter_of_year,
    
CASE day_of_week            -- day_of_week_nm
        
WHEN 1 THEN 'Sunday'
        
WHEN 2 THEN 'Monday'
        
WHEN 3 THEN 'Tuesday'
        
WHEN 4 THEN 'Wednesday'
        
WHEN 5 THEN 'Thursday'
        
WHEN 6 THEN 'Friday'
        
WHEN 7 THEN 'Saturday'
    
END,
    
CASE day_of_week            -- day_of_week_abbr2
        
WHEN 1 THEN 'Su'
        
WHEN 2 THEN 'Mo'
        
WHEN 3 THEN 'Tu'
        
WHEN 4 THEN 'We'
        
WHEN 5 THEN 'Th'
        
WHEN 6 THEN 'Fr'
        
WHEN 7 THEN 'Sa'
    
END,
    
CASE day_of_week            -- day_of_week_abbr3
        
WHEN 1 THEN 'Sun'
        
WHEN 2 THEN 'Mon'
        
WHEN 3 THEN 'Tue'
        
WHEN 4 THEN 'Wed'
        
WHEN 5 THEN 'Thu'
        
WHEN THEN 'Fri'
        
WHEN 7 THEN 'Sat'
    
END,
    
CASE month_of_year          -- month_nm
        
WHEN  1 THEN 'January'
        
WHEN  2 THEN 'February'
        
WHEN  3 THEN 'March'
        
WHEN  4 THEN 'April'
        
WHEN  5 THEN 'May'
        
WHEN  6 THEN 'June'
        
WHEN  7 THEN 'July'
        
WHEN  8 THEN 'August'
        
WHEN  9 THEN 'September'
        
WHEN 10 THEN 'October'
        
WHEN 11 THEN 'November'
        
WHEN 12 THEN 'December'
    
END,
    
CASE month_of_year          -- month_abbr2
        
WHEN  1 THEN 'Ja'
        
WHEN  2 THEN 'Fe'
        
WHEN  3 THEN 'Mr'
        
WHEN  4 THEN 'Ap'
        
WHEN  5 THEN 'My'
        
WHEN  6 THEN 'Jn'
        
WHEN  7 THEN 'Jl'
        
WHEN  8 THEN 'Au'
        
WHEN  9 THEN 'Se'
        
WHEN 10 THEN 'Oc'
        
WHEN 11 THEN 'No'
        
WHEN 12 THEN 'De'
    
END,
    
CASE month_of_year          -- month_abbr3
        
WHEN  1 THEN 'Jan'
        
WHEN  2 THEN 'Feb'
        
WHEN  3 THEN 'Mar'
        
WHEN  4 THEN 'Apr'
        
WHEN  5 THEN 'May'
        
WHEN  6 THEN 'Jun'
        
WHEN  7 THEN 'Jul'
        
WHEN  8 THEN 'Aug'
        
WHEN  9 THEN 'Sep'
        
WHEN 10 THEN 'Oct'
        
WHEN 11 THEN 'Nov'
        
WHEN 12 THEN 'Dec'
    
END,
    
CASE    -- week_end_dt
        
WHEN calendar_date BETWEEN DATE '2100-12-26' AND DATE '2100-12-31' THEN DATE '2101-01-01'   -- For these dates, the calculations run off the end of the internal calendar
        
WHEN day_of_week = 1 THEN (SELECT calendar_date FROM sys_calendar.CALENDAR AS t WHERE t.day_of_calendar = c.day_of_calendar + 6)
        
WHEN day_of_week = 2 THEN (SELECT calendar_date FROM sys_calendar.CALENDAR AS t WHERE t.day_of_calendar = c.day_of_calendar + 5)
        
WHEN day_of_week = 3 THEN (SELECT calendar_date FROM sys_calendar.CALENDAR AS t WHERE t.day_of_calendar = c.day_of_calendar + 4)
        
WHEN day_of_week = 4 THEN (SELECT calendar_date FROM sys_calendar.CALENDAR AS t WHERE t.day_of_calendar = c.day_of_calendar + 3)
        
WHEN day_of_week = 5 THEN (SELECT calendar_date FROM sys_calendar.CALENDAR AS t WHERE t.day_of_calendar = c.day_of_calendar + 2)
        
WHEN day_of_week = 6 THEN (SELECT calendar_date FROM sys_calendar.CALENDAR AS t WHERE t.day_of_calendar = c.day_of_calendar + 1)
        
WHEN day_of_week = 7 THEN calendar_date
        
ELSE DATE '9999-12-31'
    
END,
    
CASE    -- end_of_month_flag
        
WHEN month_of_year =  1 AND day_of_month = 31 THEN 'Y'
        
WHEN month_of_year =  2 AND (l.leap_year_nbr IS NOT NULL AND day_of_month = 29) THEN 'Y'
        
WHEN month_of_year =  2 AND (l.leap_year_nbr IS     NULL AND day_of_month = 28) THEN 'Y'
        
WHEN month_of_year =  3 AND day_of_month = 31 THEN 'Y'
        
WHEN month_of_year =  4 AND day_of_month = 30 THEN 'Y'
        
WHEN month_of_year =  5 AND day_of_month = 31 THEN 'Y'
        
WHEN month_of_year =  6 AND day_of_month = 30 THEN 'Y'
        
WHEN month_of_year =  7 AND day_of_month = 31 THEN 'Y'
        
WHEN month_of_year =  8 AND day_of_month = 31 THEN 'Y'
        
WHEN month_of_year =  9 AND day_of_month = 30 THEN 'Y'
        
WHEN month_of_year = 10 AND day_of_month = 31 THEN 'Y'
        
WHEN month_of_year = 11 AND day_of_month = 30 THEN 'Y'
        
WHEN month_of_year = 12 AND day_of_month = 31 THEN 'Y'
        
ELSE 'N'
    
END,
    
CASE    -- first_of_quarter_flag
        
WHEN month_of_year =  1 AND day_of_month = 1 THEN 'Y'
        
WHEN month_of_year =  4 AND day_of_month = 1 THEN 'Y'
        
WHEN month_of_year =  7 AND day_of_month = 1 THEN 'Y'
        
WHEN month_of_year = 10 AND day_of_month = 1 THEN 'Y'
    
ELSE 'N' END,
    
CASE    -- end_of_quarter_flag
        
WHEN month_of_year =  3 AND day_of_month = 31 THEN 'Y'
        
WHEN month_of_year =  6 AND day_of_month = 30 THEN 'Y'
        
WHEN month_of_year =  9 AND day_of_month = 30 THEN 'Y'
        
WHEN month_of_year = 12 AND day_of_month = 31 THEN 'Y'
    
ELSE 'N' END,
    
CASE WHEN month_of_year = 12 AND day_of_month = 31 THEN 'Y' ELSE 'N' END,   -- end_of_year_flag
    
CASE day_of_week WHEN 1 THEN 'N' WHEN 7 THEN 'N' ELSE 'Y' END,              -- weekday_flag
    
CASE day_of_week WHEN 1 THEN 'Y' WHEN 7 THEN 'Y' ELSE 'N' END,              -- weekend_flag
    
CASE    -- leap_year_flag
        
WHEN l.leap_year_nbr IS NOT NULL THEN 'Y' ELSE 'N'
    
END
FROM
sys_calendar.CALENDAR  AS c
LEFT OUTER JOIN leap        AS l ON c.year_of_calendar = l.leap_year_nbr
;

Tuesday, December 07, 2010

Using Teradata's CURRENT_TIMESTAMP(n)

CURRENT_TIMESTAMP can take an argument stating the precision. This allows the result to easily be passed to a column with matching precision.

CREATE MULTISET TABLE test1,
    
NO FALLBACK,
    
NO BEFORE JOURNAL,
    
NO AFTER JOURNAL (
    
id          INT           NOT NULL,
    
start_dtm0  TIMESTAMP(0)  NOT NULL DEFAULT CURRENT_TIMESTAMP(0),
    
start_dtm1  TIMESTAMP(1)  NOT NULL DEFAULT CURRENT_TIMESTAMP(1),
    
start_dtm2  TIMESTAMP(2)  NOT NULL DEFAULT CURRENT_TIMESTAMP(2),
    
start_dtm3  TIMESTAMP(3)  NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
    
start_dtm4  TIMESTAMP(4)  NOT NULL DEFAULT CURRENT_TIMESTAMP(4),
    
start_dtm5  TIMESTAMP(5)  NOT NULL DEFAULT CURRENT_TIMESTAMP(5),
    
start_dtm6  TIMESTAMP(6)  NOT NULL DEFAULT CURRENT_TIMESTAMP(6)
);


Results:

Column Value
start_dtm0 12/07/2010 3:10:30
start_dtm1 12/07/2010 3:10:30.8
start_dtm2 12/07/2010 3:10:30.82
start_dtm3 12/07/2010 3:10:30.820
start_dtm4 12/07/2010 3:10:30.8200
start_dtm5 12/07/2010 3:10:30.82000
start_dtm6 12/07/2010 3:10:30.820000

Does Teradata Support BIT Columns? Yes and No.

In Teradata 13.0, I can create a column with a type of bit.

CREATE MULTISET TABLE test1,
    
NO FALLBACK,
    
NO BEFORE JOURNAL,
    
NO AFTER JOURNAL 

(
    
id         INT  NOT NULL,
    
bitColumn  bit

);

But did it really create a bit column? No.

Query

SELECT RequestText
FROM DBC.Tables
WHERE DatabaseName = 'rgarrison'
  
AND TableName    = 'test1';


Result

CREATE MULTISET TABLE test1,
    NO FALLBACK,
    NO BEFORE JOURNAL,
    NO AFTER JOURNAL
(
    id INT NOT NULL,
    bitColumn BYTEINT FORMAT '9'
);

Query

SELECT ColumnType, ColumnLength
FROM DBC.Columns
WHERE DatabaseName = 'rgarrison'
  
AND TableName    = 'test1'
  
AND ColumnName   = 'bitColumn';


Result

ColumnType ColumnLength
I1 1

So, Teradata will allow you to specify bit as a datatype, but it will silently translate that request to byteint.