meta data for this page
Data Manipulation Language (DML)
In this chapter are the additions and improvements that have been added to the SQL data manipulation language subset in Firebird 2.5.
Quick links
RegEx search support using SIMILAR TO
Adriano dos Santos Fernandes
Tracker reference CORE-769.
A new SIMILAR TO predicate is introduced to support regular expressions. The predicate's function is to verify whether a given SQL-standard regular expression matches a string argument. It is valid in any context that accepts Boolean expressions, such as WHERE clauses, CHECK constraints and PSQL IF() tests.
Syntax patterns
<similar predicate> ::= <value> [ NOT ] SIMILAR TO <similar pattern> [ ESCAPE <escape character> ] <similar pattern> ::= <character value expression: regular expression> <regular expression> ::= <regular term> | <regular expression> <vertical bar> <regular term> <regular term> ::= <regular factor> | <regular term> <regular factor> <regular factor> ::= <regular primary> | <regular primary> <asterisk> | <regular primary> <plus sign> | <regular primary> <question mark> | <regular primary> <repeat factor> <repeat factor> ::= <left brace> <low value> [ <upper limit> ] <right brace> <upper limit> ::= <comma> [ <high value> ] <low value> ::= <unsigned integer> <high value> ::= <unsigned integer> <regular primary> ::= <character specifier> | <percent> | <regular character set> | <left paren> <regular expression> <right paren> <character specifier> ::= <non-escaped character> | <escaped character> <regular character set> ::= <underscore> | <left bracket> <character enumeration>... <right bracket> | <left bracket> <circumflex> <character enumeration>... <right bracket> | <left bracket> <character enumeration include>... <circumflex> <character enumeration exclude>... <right bracket> <character enumeration include> ::= <character enumeration> <character enumeration exclude> ::= <character enumeration> <character enumeration> ::= <character specifier> | <character specifier> <minus sign> <character specifier> | <left bracket> <colon> <character class identifier> <colon> <right bracket> <character specifier> ::= <non-escaped character> | <escaped character> <character class identifier> ::= ALPHA | UPPER | LOWER | DIGIT | SPACE | WHITESPACE | ALNUM
Note:
1. <non-escaped character> is any character except <left bracket>, <right bracket>, <left paren>, <right paren>, <vertical bar>, <circumflex>, <minus sign>, <plus sign>, <asterisk>, <underscore>, <percent>, <question mark>, <left brace> and <escape character>.
2. <escaped character> is the <escape character> succeeded by one of <left bracket>, <right bracket>, <left paren>, <right paren>, <vertical bar>, <circumflex>, <minus sign>, <plus sign>, <asterisk>, <underscore>, <percent>, <question mark>, <left brace> or <escape character>.
Table 10.1. Character class identifiers
Identifier | Description | Note |
---|---|---|
ALPHA | All characters that are simple latin letters (a-z, A-Z). | Includes latin letters with accents when using accent-insensitive collation. |
UPPER | All characters that are simple latin uppercase letters (A-Z). | Includes lowercase latters when using case-insensitive collation. |
LOWER | All characters that are simple latin lowercase letters (a-z). | Includes uppercase latters when using case-insensitive collation. |
DIGIT | All characters that are numeric digits (0-9). | |
SPACE | All characters that are the space character (ASCII 32). | |
WHITESPACE | All characters that are whitespaces (vertical tab (9), newline (10), horizontal tab (11), carriage return (13), formfeed (12), space (32)). | |
ALNUM | All characters that are simple latin letters (ALPHA) or numeric digits (DIGIT). |
Usage Guide
Return true for a string that matches <regular expression> or <regular term>:
<regular expression> <vertical bar> <regular term> 'ab' SIMILAR TO 'ab|cd|efg' -- true 'efg' SIMILAR TO 'ab|cd|efg' -- true 'a' SIMILAR TO 'ab|cd|efg' -- false
Match zero or more occurrences of <regular primary>: <regular primary> <asterisk>
'' SIMILAR TO 'a*' -- true 'a' SIMILAR TO 'a*' -- true 'aaa' SIMILAR TO 'a*' -- true
Match one or more occurrences of <regular primary>: <regular primary> <plus sign>
'' SIMILAR TO 'a+' -- false 'a' SIMILAR TO 'a+' -- true 'aaa' SIMILAR TO 'a+' -- true
Match zero or one occurrence of <regular primary>: <regular primary> <question mark>
'' SIMILAR TO 'a?' -- true 'a' SIMILAR TO 'a?' -- true 'aaa' SIMILAR TO 'a?' -- false
Match exact <low value> occurrences of <regular primary>: <regular primary> <left brace> <low value> <right brace>
'' SIMILAR TO 'a{2}' -- false 'a' SIMILAR TO 'a{2}' -- false 'aa' SIMILAR TO 'a{2}' -- true 'aaa' SIMILAR TO 'a{2}' -- false
Match <low value> or more occurrences of <regular primary>: <regular primary> <left brace> <low value> <comma> <right brace>
'' SIMILAR TO 'a{2,}' -- false 'a' SIMILAR TO 'a{2,}' -- false 'aa' SIMILAR TO 'a{2,}' -- true 'aaa' SIMILAR TO 'a{2,}' -- true
Match <low value> to <high value> occurrences of <regular primary> <regular primary> <left brace> <low value> <comma> <high value> <right brace>:
'' SIMILAR TO 'a{2,4}' -- false 'a' SIMILAR TO 'a{2,4}' -- false 'aa' SIMILAR TO 'a{2,4}' -- true 'aaa' SIMILAR TO 'a{2,4}' -- true 'aaaa' SIMILAR TO 'a{2,4}' -- true 'aaaaa' SIMILAR TO 'a{2,4}' -- false
Match any (non-empty) character: <underscore>
'' SIMILAR TO '_' -- false 'a' SIMILAR TO '_' -- true '1' SIMILAR TO '_' -- true 'a1' SIMILAR TO '_' -- false
Match a string of any length (including empty strings): <percent>
'' SIMILAR TO '%' -- true 'az' SIMILAR TO 'a%z' -- true 'a123z' SIMILAR TO 'a%z' -- true 'azx' SIMILAR TO 'a%z' -- false
Group a complete <regular expression> to use as one single <regular primary> as a sub-expression: <left paren> <regular expression> <right paren>
'ab' SIMILAR TO '(ab){2}' -- false 'aabb' SIMILAR TO '(ab){2}' -- false 'abab' SIMILAR TO '(ab){2}' -- true
Match a character identical to one of <character enumeration>: <left bracket> <character enumeration>… <right bracket>
'b' SIMILAR TO '[abc]' -- true 'd' SIMILAR TO '[abc]' -- false '9' SIMILAR TO '[0-9]' -- true '9' SIMILAR TO '[0-8]' -- false
Match a character not identical to one of <character enumeration>: <left bracket> <circumflex> <character enumeration>… <right bracket>
'b' SIMILAR TO '[^abc]' – false 'd' SIMILAR TO '[^abc]' – true Match a character identical to one of <character enumeration include> but not identical to one of <character enumeration exclude>: <left bracket> <character enumeration include>… <circumflex> <character enumeration exclude>…
'3' SIMILAR TO '[[:DIGIT:]^3]' -- false '4' SIMILAR TO '[[:DIGIT:]^3]' -- true
Match a character identical to one character included in <character class identifier>. Refer to the table of Character class identifiers, above. May be used with <circumflex> to invert the logic (see above): <left bracket> <colon> <character class identifier> <colon> <right bracket>
'4' SIMILAR TO '[[:DIGIT:]]' -- true 'a' SIMILAR TO '[[:DIGIT:]]' -- false '4' SIMILAR TO '[^[:DIGIT:]]' -- false 'a' SIMILAR TO '[^[:DIGIT:]]' -- true
Examples
create table department ( number numeric(3) not null, name varchar(25) not null, phone varchar(14) check (phone similar to '\([0-9]{3}\) [0-9]{3}\-[0-9]{4}' escape '\') ); insert into department values ('000', 'Corporate Headquarters', '(408) 555-1234'); insert into department values ('100', 'Sales and Marketing', '(415) 555-1234'); insert into department values ('140', 'Field Office: Canada', '(416) 677-1000'); insert into department values ('600', 'Engineering', '(408) 555-123'); -- check constraint violation select * from department where phone not similar to '\([0-9]{3}\) 555\-%' escape '\';
Hex literal support
Bill Oliver
Adriano dos Santos Fernandes
Tracker reference CORE-1760.
Support for hexadecimal numeric and binary string literals has been introduced.
Syntax patterns
<numeric hex literal> ::= { 0x | 0X } <hexit> [ <hexit>... ] <binary string literal> ::= { x | X } <quote> [ { <hexit> <hexit> }... ] <quote> <digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 <hexit> ::= <digit> | A | B | C | D | E | F | a | b | c | d | e | f
Numeric hex literals
- The number of <hexit> in the string cannot exceed 16.
Tip: That means 0xF0000000 is -268435456 and 0x0F0000000 is 4026531840.
Binary string literals
The resulting string is defined as a CHAR(n/2) CHARACTER SET OCTETS, where n is the number of <hexit>.
Examples
select 0x10, cast('0x0F0000000' as bigint) from rdb$database; select x'deadbeef' from rdb$database;
Important change to GEN_UUID() function
Adriano dos Santos Fernandes
Prior to Firebird 2.5.2, the built-in function GEN_UUID() was returning completely random strings, making it non-compliant with the RFC-4122 (UUID specification). From Firebird 2.5.2 forward, GEN_UUID() returns a compliant UUID version 4 string, where some bits are reserved and the others are random.
Note: The string format of a compliant UUID is
XXXXXXXX-XXXX-4XXX-YXXX-XXXXXXXXXXXX
where 4 is fixed (version) and Y is 8, 9, A or B.
See Tracker item CORE-3238.
New UUID conversion functions
Adriano dos Santos Fernandes
Tracker references CORE-1656 and CORE-1682.
Two new built-in functions, UUID_TO_CHAR and CHAR_TO_UUID, enable conversion between a UUID in the form of a CHAR(16) OCTETS string and the RFC4122-compliant form.
Important for big-Endian servers:
It was discovered that CHAR_TO_UUID and UUID_TO_CHAR worked incorrectly in Firebird 2.5 and 2.5.1 on big-Endian servers, where bytes or characters were swapped and went into the wrong positions when converting. The bug was fixed in versions 2.5.2 and above: see Tracker item CORE-3887. However, it means that, from v.2.5.2 onward, CHAR_TO_UUID and UUID_TO_CHAR return different values than in the earlier versions, for the same input parameter.
CHAR_TO_UUID()
The function CHAR_TO_UUID() converts the CHAR(32) ASCII representation of a UUID (XXXXXXXXXXXX-XXXX-XXXX-XXXXXXXXXXXX) to the CHAR(16) OCTETS representation, optimized for storage.
Syntax model
CHAR_TO_UUID( <string> )
Example
select char_to_uuid('93519227-8D50-4E47-81AA-8F6678C096A1') from rdb$database;
UUID_TO_CHAR()
The function UUID_TO_CHAR() converts a CHAR(16) OCTETS UUID (as returned by the GEN_UUID() function) to the CHAR(32) ASCII representation (XXXXXXXX-XXXX-XXXX-XXXXXXXXXXXXXXXX).
Syntax model
UUID_TO_CHAR( <string> )
Example
select uuid_to_char(gen_uuid()) from rdb$database;
SOME_COL = ? OR ? IS NULL predication
Adriano dos Santos Fernandes
Tracker reference CORE-2298.
By popular request, particularly from Delphi programmers, support has been implemented for a predication that is able to OR test both the equivalence between a column and a parameter and whether the value passed to the parameter is NULL. This construct is often desired as a way to avoid the need to prepare one query to request a filtered result set and another for the same query without the filter.
Users of Delphi and other programming interfaces that apply client-side object names to parameters wanted the ability for the DSQL engine to recognise a usage like the following:
WHERE col1 = :param1 OR :param1 IS NULL
At the API level, the language interface translates the query to
WHERE col1 = ? OR ? IS NULL
That presented two problems:
1. While the programmer treated the parameter :param1 as though it were a single variable with two references, the API could not: it is presented with two parameters.
2. The second parameter is of an unknown data type and the program has no way to assign to it.
What was needed to solve this problem was to introduce a new data type to handle the ? is NULL condition and teach Firebird to do the right thing when it received such a request.
The implementation works like this. To resolve the first problem, the request must supply two parameters (for our Delphi example):
WHERE col1 = :param1 OR :param2 IS NULL
- If param1 is not NULL, the language interface is required to assign the correct value for the first parameter, set the XSQLVAR.sqlind to NOT NULL and leave XSQLVAR.sqldata NULL.
- If param2 is NULL, the language interface is required to set the XSQLVAR.sqlind of both parameters to NULL and leave the XSQLVAR.sqldata NULL.
In other words, for the parameter ( ? ) in ? IS NULL:
- XSQLVAR.sqlind should be set in accordance with NULL/NON-NULL state of the parameter. This is the type of parameter that is described by the new constant SQL_NULL.
- The XSQLVAR.sqldata of a SQL_NULL type of parameter should always be passed by the client application as a NULL pointer and should never be accessed.
NULL specified in the output set: When NULL is specified as an output constant (select NULL from …), it continues to be described as CHAR(1), rather than by SQL_NULL. That may change in a future version.
Extension to LIST() function
Adriano dos Santos Fernandes
Tracker reference CORE-1453.
A string expression is now allowed as the delimiter argument of the LIST() function.
Example
SELECT DISCUSSION_ID, LIST(COMMMENT, ASCII_CHAR(13)) FROM COMMENTS GROUP BY DISCUSSION_ID;
Extension to DATEADD and DATEDIFF() functions
Adriano dos Santos Fernandes
The WEEK unit was introduced for functions DATEADD and DATEDIFF.
MILLISECOND, SECOND, MINUTE and HOUR units are no longer invalid units to use with DATE arguments.
BIN_NOT() function added
Adriano dos Santos Fernandes
Completing the set of built-in binary functions added in v.2.1, new function BIN_NOT() returns the result of a bitwise NOT operation performed on its argument.
Syntax pattern
BIN_NOT( <number> )
Example
select bin_not(flags) from x;
Write to temporary tables in a read-only database
Vladyslav Khorsun
(V.2.5.1) Write operations to global temporary tables in a read-only database are enabled.
Tracker reference CORE-3399.
Optimizer improvements
Changes in optimizer logic that address recognised problems include:
CROSS JOIN logic (D. Yemanov)
When a CROSS JOIN involved an empty table, the optimizer had no special logic to detect that the query was futile and return the empty set immediately. That shortcut logic has now been implemented.
Tracker reference CORE-2200.
Note: The same change was implemented in v.2.1.2.
Derived tables (A. dos Santos Fernandes)
The limit on the number of contexts available when using derived tables has been raised.
Tracker reference CORE-2029.
Timing of DEFAULT evaluation (A. dos Santos Fernandes)
Under rare conditions, the early evaluation of a DEFAULT value or expression defined for a column might give rise to a confused situation regarding the evaluation of an input value supplied for that column. The possibility was addressed by deferring the evaluation of DEFAULT and not actually performing the evaluation at all unless it was actually necessary.
Tracker reference CORE-1842.
Index use for NOT IN searches (A. dos Santos Fernandes)
Better performance has been achieved for the NOT IN predicate by enabling the use of an index.
Tracker reference CORE-1137.
Undo log memory consumption (D. Yemanov)
Excessive memory consumption by the Undo log after a lengthy series of updates in a single transaction has been avoided.
Tracker reference CORE-1477.
Other improvements
Other changes to smooth out the little annoyances include:
FREE_IT error detection (A. dos Santos Fernandes)
Previously, a UDF declared with FREE_IT would crash if the pointer returned had not been allocated by the ib_util_malloc() function. Now, such a condition is detected, an exception is thrown and the pointer remains in its allocated state.
Tracker reference CORE-1937.
Expression evaluation not supported message improved (C. Valderrama)
A number of secondary GDS codes were introduced to provide more details about an operation that fails with an Expression evaluation not supported exception, for example:
'Argument for @1 in dialect 1 must be string or numeric' 'Strings cannot be added to or subtracted from DATE or TIME types' 'Invalid data type for subtraction involving DATE, TIME or TIMESTAMP types' etc.
These detailed messages follow the GDS code for the isc_expression_eval_err (expression evaluation not supported) error in the status vector.
Tracker reference CORE-1799.