<<toc>>
Dynamic columns allows you to have a different set of "virtual columns" for
each row in your table.
You can at any time add or remove columns from a row.
This allows you to to solve problems like a store application, where you have a
lot of different things <<entity>>mdash<</entity>> like t-shirts and phones
<<entity>>mdash<</entity>> and you want to store different attributes for each
item. This is something that traditionally is very hard to do in a relational
database.
Dynamic columns works by storing the extra columns in a blob and having a small
set of functions to manipulate it. It was first implemented in
[[what-is-mariadb-53|MariaDB 5.3]].
You can either manipulate the dynamic columns in SQL or in the client.
The original architecture of dynamic columns was done in MWL#34. This includes
information on how the data is stored.
The first implementation of dynamic columns is meant to be highly efficient for
programs that generate sql code (which is what we belive most store
applications today that try to handle objects with different attributes use).
This is why access to dynamic columns is via by numbers. See the TODO
section for more information on the future roadmap of dynamic columns.
== SQL functions
=== Creating a dynamic column object
<<code>>
COLUMN_CREATE(column_nr, value [as type], [column_nr, value [as type]]...)
<</code>>
* '##type##' is not required and is mostly there for testing except in cases
like date/time/datetime where you can't create literals of the type you need.
=== Adding or updating dynamic columns
<<code>>
COLUMN_ADD(dynamic_col, column_nr, value [as type], [column_nr, value [as type]]...)
<</code>>
* '##dynamic_col##' must either be data created by ##COLUMN_CREATE()## or an
empty string.
* If '##dynamic_col##' is ##NULL## then the function returns ##NULL##.
* If you add a column that already exists, the new value will replace it.
* If you add ##NULL## for a column, the entry will be deleted.
* The type is not required and is there mostly for testing.
Typical usage:
<<sql>>
UPDATE t1 set dynamic_col=COLUMN_ADD(dynamic_col, 5, "test") WHERE id=1;
<</sql>>
The above adds or replaces ##dynamic_column## '##5##', with value ##"test"##,
stored in the '##dynamic_col##' column, for the row with ##id=1##
The reason you have ##dynamic_col## twice is the ##COLUMN_ADD()## is a normal
function, like ##[[concat|CONCAT()]]## and you want to replace the old value in
the '##dynamic_col##' value with the new value from the ##COLUMN_ADD()##
function.
=== Deleting a dynamic column
<<code>>
COLUMN_DELETE(dynamic_column, column_nr, column_nr...);
<</code>>
=== Querying a dynamic column
<<code>>
COLUMN_GET(dynamic_column, column_nr as type)
<</code>>
Type here can be one of the same ones that you would use in ##[[cast|CAST]]##
or ##[[convert|CONVERT]]##:
* ##BINARY[(N)]##
* ##CHAR[(N)]##
* ##DATE##
* ##DATETIME[(D)]##
* ##DECIMAL[(M[,D])]##
* ##DOUBLE[(M[,D])]##
* ##INTEGER##
* ##SIGNED [INTEGER]##
* ##TIME[(D)]##
* ##UNSIGNED [INTEGER]##
The type is required because SQL is a strongly typed language and the rest of
the functions (and the client) need to know what the type is before fetching
any data.
If the dynamic column has a different type than the requested one, the value
will be converted to the requested type.
Note that if you are using ##CHAR## or ##BINARY## without a length, MariaDB
will handle it as a blob with length up to ##536870911## (the max length of
data in a dynamic column). This buffer is a dynamically allocated in the server
but because some clients may allocate buffers up to the max field length, it's
best to always supply a length.
==== Checking if a column exists ===
<<code>>
COLUMN_EXISTS(dynamic_column, column_nr);
<</code>>
This returns ##1## if the column exists, ##0## if not, and ##NULL## if
##//dynamic_column//## is ##NULL##.
<<code>>
COLUMN_LIST(dynamic_column);
<</code>>
Returns a string of all columns in the ##//dynamic_column//## separated with
'##,##'.
== Example Using SQL with Dynamic Columns
First create the table. Dynstr will be our place where we store the
dynamic columns:
<<sql>>
create table t1 (id int auto_increment primary key,
name varchar(40),
type enum ("shirt", "phone", "computer"),
price decimal(10,2),
dynstr mediumblob);
<</sql>>
Insert some random items, each with some common and own attributes.
<<sql>>
insert into t1 (name, type, price, dynstr) values
("Funny shirt", "shirt", 10.0, COLUMN_CREATE(1, "blue", 10, "XL")),
("nokia", "phone", 649, COLUMN_CREATE(1, "black", 2, "touchscreen")),
("htc Desire hd", "phone", 579, COLUMN_CREATE(1, "black", 3, "Android")),
("BM/Lenovo Thinkpad X60s", "computer", 419, COLUMN_CREATE(1, "black", 3, "Linux"));
<</sql>>
Lets check the data:
<<sql>>
select id, name, type, price, length(dynstr) as len, column_list(dynstr) as list from t1;
<</sql>>
<<code>>
+----+-------------------------+----------+--------+------+------+
| id | name | type | price | len | list |
+----+-------------------------+----------+--------+------+------+
| 1 | Funny shirt | shirt | 10.00 | 17 | 1,10 |
| 2 | nokia | phone | 649.00 | 27 | 1,2 |
| 3 | htc Desire hd | phone | 579.00 | 23 | 1,3 |
| 4 | BM/Lenovo Thinkpad X60s | computer | 419.00 | 21 | 1,3 |
+----+-------------------------+----------+--------+------+------+
<</code>>
<<sql>>
SELECT name FROM t1 WHERE COLUMN_GET(dynstr, 1 as char(10)) = "black";
<</sql>>
<<code>>
+-------------------------+
| name |
+-------------------------+
| nokia |
| htc Desire hd |
| BM/Lenovo Thinkpad X60s |
+-------------------------+
<</code>>
<<sql>>
SELECT name, COLUMN_GET(dynstr, 1 as char(10)) FROM t1 WHERE COLUMN_EXISTS(dynstr, 1);
<</sql>>
<<code>>
+-------------------------+--------+
| name | colour |
+-------------------------+--------+
| Funny shirt | blue |
| nokia | black |
| htc Desire hd | black |
| BM/Lenovo Thinkpad X60s | black |
+-------------------------+--------+
<</code>>
Lets update all items with ##colour="black"## to red
<<sql>>
UPDATE t1 set dynstr=COLUMN_ADD(dynstr, 1, "red") where COLUMN_GET(dynstr, 1 as char(10)) = "black";
SELECT name, COLUMN_GET(dynstr, 1 as char(10)) FROM t1 WHERE COLUMN_EXISTS(dynstr, 1);
<</sql>>
<<code>>
+-------------------------+-----------------------------------+
| name | COLUMN_GET(dynstr, 1 as char(10)) |
+-------------------------+-----------------------------------+
| Funny shirt | blue |
| nokia | red |
| htc Desire hd | red |
| BM/Lenovo Thinkpad X60s | red |
+-------------------------+-----------------------------------+
<</code>>
<<sql>>
SELECT COLUMN_GET(dynstr, 1 as char(10)) as colour, count(*) FROM t1 group by COLUMN_GET(dynstr, 1 as char(10));
<</sql>>
<<code>>
+--------+----------+
| colour | count(*) |
+--------+----------+
| blue | 1 |
| red | 3 |
+--------+----------+
<</code>>
Adding a new column:
<<sql>>
UPDATE t1 set dynstr=COLUMN_ADD(dynstr, 15, "4G ram") where type="computer";
SELECT name, type, price, length(dynstr) as len, column_list(dynstr) as list from t1 where type="computer";
<</sql>>
<<code>>
+-------------------------+----------+--------+------+--------+
| name | type | price | len | list |
+-------------------------+----------+--------+------+--------+
| BM/Lenovo Thinkpad X60s | computer | 419.00 | 29 | 1,3,15 |
+-------------------------+----------+--------+------+--------+
<</code>>
== The C client functions
The dynamic column functions are included in ##libmysqld.a## in MariaDB 5.3
This gives you the option to retrive the full dynamic column object (normally a
blob) from MariaDB and then unpack it in the client.
The definitions for the structures and functions are in
##include/ma_dyncol.h##. The implementation can be found in
##mysys/ma_dyncol.c##. You can find a C usage example at:
##unittest/mysys/ma_dyncol-t.c##.
=== value object
When asking for a value with ##dynamic_column_get()## you get the result in the
following structure:
<<code lang=c inline=false>>
enum enum_dynamic_column_type
{
DYN_COL_NULL= 0,
DYN_COL_INT,
DYN_COL_UINT,
DYN_COL_DOUBLE,
DYN_COL_STRING,
DYN_COL_DECIMAL,
DYN_COL_DATETIME,
DYN_COL_DATE,
DYN_COL_TIME
};
typedef enum enum_dynamic_column_type DYNAMIC_COLUMN_TYPE;
struct st_dynamic_column_value
{
DYNAMIC_COLUMN_TYPE type;
union
{
long long long_value;
unsigned long long ulong_value;
double double_value;
struct {
LEX_STRING string_value;
CHARSET_INFO *charset;
};
struct {
decimal_digit_t decimal_buffer[DECIMAL_BUFF_LENGTH];
decimal_t decimal_value;
};
MYSQL_TIME time_value;
};
};
<</code>>
The '##type##' tells you what kind of data is in the object and the
date is then stored in the respective union part.
=== Function results
All functions return an ##enum## with one of the following results
<<code lang=c inline=false>>
enum enum_dyncol_func_result
{
ER_DYNCOL_OK= 0,
ER_DYNCOL_YES= 1, /* For functions returning 0/1 */
ER_DYNCOL_FORMAT= -1, /* Wrong format of the encoded string */
ER_DYNCOL_LIMIT= -2, /* Some limit reached */
ER_DYNCOL_RESOURCE= -3, /* Out of resourses */
ER_DYNCOL_DATA= -4, /* Incorrect input data */
ER_DYNCOL_UNKNOWN_CHARSET= -5 /* Unknown character set */
};
<</code>>
=== Initializing and freeing a dynamic column object
The following should be used to initialize your ##DYNAMIC_COLUMN## object (It
just resets all bits of the object to ##0##):
<<code lang=c inline=false>>
void dynamic_column_initialize(DYNAMIC_COLUMN *str);
<</code>>
When you are done with the object, you should free the memory
associated with it:
<<code lang=c inline=false>>
void dynamic_column_column_free(DYNAMIC_COLUMN *str);
<</code>>
=== Creating a dynamic column object
Creating an object with one value:
<<code lang=c inline=false>>
enum enum_dyncol_func_result
dynamic_column_create(DYNAMIC_COLUMN *str,
uint column_nr, DYNAMIC_COLUMN_VALUE *value);
<</code>>
Creating an object with many values:
<<code lang=c inline=false>>
enum enum_dyncol_func_result
dynamic_column_create_many(DYNAMIC_COLUMN *str,
uint column_count,
uint *column_numbers,
DYNAMIC_COLUMN_VALUE *values);
<</code>>
In this case you create an array of column numbers and another array of values.
'##str##' is either an object initialized with
##dynamic_column_initialize(DYNAMIC_COLUMN *str)## or an old
##DYNAMIC_COLUMN *## object that is reused.
=== Deleting a dynamic column
<<code lang=c inline=false>>
enum enum_dyncol_func_result
dynamic_column_delete(DYNAMIC_COLUMN *org, uint column_nr);
<</code>>
=== Checking which columns exists
Checking if one column exists:
<<code lang=c inline=false>>
enum enum_dyncol_func_result
dynamic_column_exists(DYNAMIC_COLUMN *org, uint column_nr);
<</code>>
Getting a list of all columns that exist:
<<code lang=c inline=false>>
enum enum_dyncol_func_result
dynamic_column_list(DYNAMIC_COLUMN *org, DYNAMIC_ARRAY *array_of_uint);
<</code>>
In the later case, the ##DYNAMIC_ARRAY## object will be initialized as part of
the call. The functions to manipualte the ##DYNAMIC_ARRAY## are:
<<code>>
Free data in the array
delete_dynamic(DYNAMIC_ARRAY *array);
Number of elements in the array:
array.elements
Accessing one element:
unsigned int column= dynamic_element(array, index, unsigned int)
<</code>>
=== Using dynamic columns with your own objects
If you read a blob from MariaDB and want to use this with the above
functions you probably don't want to duplicate it just to access
data in it. Then you can do the following trick to create a
DYNAMIC_COLUMN that you use for those functions that will
not change it:
<<code lang=c inline=false>>
char *blob; /* Your data */
size_t blob_length; /* blob length */
DYNAMIC_STRING dynstr;
struct st_dynamic_column_value value;
dynamic_column_initialize(DYNAMIC_COLUMN &str);
dynstr.str= blog;
dynstr.length= length;
if (dynamic_column_get(&str, 1, &value))
{
/* process value for column 1 */
}
<</code>>
== Implementation goals:
The following are the goals we had when implementing dynamic columns:
* One should be able to use dynamic columns anywhere you use a normal column. (In ORDER BY, GROUP BY etc.)
* Dynamic columns should be implemented trough functions (not operators) to make it easy to use them from any SQL client (like ODBC).
* Very low storage overhead compared to a normal column (Overhead is now 4-6 bytes per column)
* Optimized for read; Read speed should be close to a normal column.
* We don't have to support having NULL values in a dynamic columns; If you don't have a value for a column, you don't store it in the dynamic column. This is why we delete a column if you change it's value to NULL.
* Small set of functions to make the functionally easy to use and understand.
* Free license (BSD) to make it easy to adopt to other languages than C.
* Storage format should be possible to extend in the future without having to modify old data. (Done by having a 'flag' byte first in the storage format).
* First version optimized for computer generated SQL and then if the feature gets adoption add new things like column names.
== Limitations:
* Columns have to be referenced by numbers (see also TODO)
* You can use any number 0-65535 for a column
* Max size of an dynamic column object is 536870911 bytes. (29 bits).
* The only way to add an index on a dynamic column is by creating a [[virtual-columns|virtual column]]. However in this case the virtual column will be instantiated.
* You can't specify a column number twice in COLUMN_CREATE() or COLUMN_ADD(). If you try you will get the not that informative error **"Illegal value used as argument of dynamic column function"**. This will be fixed in the future so that the last argument will be used.
== TODO:
* Add a ##COLUMN_CHECK()## function to check if the context of the dynamic
column is correct. (Good for checking integrity and finding errors).
* Allow one to specify same column number twice in COLUMN_CREATE() / COLUMN_ADD().
* Change type syntax for COLUMN_CREATE() to be same as for COLUMN_GET().
* Add functional indexes. This will partly solve the problem of adding an index
on virtual columns (partly because adding a functional index is not
'instant').
* Provide a way to use names for the virtual columns. The problem with this
is to decide where to store the name: In each row, in the .frm file or in an
extra file. All solutions have different space/speed advantages and we are
waiting for a bit more usage of dynamic functions until we decide which way
to go.
== See Also:
* [[what-is-mariadb-53|What is MariaDB 5.3]]
* [[microseconds-in-mariadb|Microseconds in MariaDB]]