אוק29
Written by:
ronen ariely
29/10/2014 16:18
Introduction
In some case we have to use single value, in order to represent a list of elements (values). This is a very common needs in hardware and software developing. All the logics that we discus in this article, can be implements regardless of database use. In this article we will focus on using SQL Server to implement our logics. This might be an external demand (applications for example) or even a result of bad database design. Yet, the needs is clear, we want to store multiple elements in one column. Those elements can be properties list, options, security permissions, dates, or any other data.
In this article we will go over several solutions, using different logics. Our challenge is to find a good logic, which give us a onetoone correspondence, between each available combination of our elements, to a single value which represent this combination.
Our case study
For the purpose of this article we will talk in general about permission's rules, as example. A user can have any combination of security rules which together describe the user's permissions. In our theoretical case, we need a way, to store all the user's permissions, using a single value (for example in the Users table).
Permission Description

Read

Write

Execute

Important! This is not Relational Data Base Management (RDBM) best solution! The obvious solution using RDBM will base on something like creating Users table, Permissions table, and a junction table for ManytoMany relationship. This solution might involve with using Tablevalued parameters (TVP) to send multiple rows of data to a TransactSQL statement. This is probably the best solution for most cases, as far as we discuss RDBM database server side.
Note! A column cannot be of a userdefined table type, we cannot use this for our case (store single element which represent multiple values). Moreover, using TVP is probably the best solution for most cases on the database side, but it is not so easy to implement sometimes with all developing technologies and languages (for example using JAVA).
Solution 01: CommaSeparated List
The simple's way, and probably the most expensive regarding resource, will be using a single string, separated by comma, with all our user permissions. For example the user A with permissions to Read and Write will get the value "Read, Write".
Note! If you did chose this option, then please remember that TSQL do not deal with text parsing well, and it is highly recommended to work with CLR function!
The description of the permission might be very long string. We can represent the SET of those descriptions, using integer primary Key (or in most cases better a surrogate primary key). For example same user A have permissions "1, 2".
PermissionID

Permission Description

1

Read

2

Write

3

Execute

In most case we will prefer to use a single integer value, and not to parse in every use, this string of CommaSeparated list.
Solution 02: Explicitly using all available combinations.
The basic logic behind this solution, is to create a list of all available combination, and use a unique value in order to represent each of those options. For example:
Unique value

Permissions Combination

1

Read

2

Write

3

Execute

4

Read + Write

5

Read + Execute

6

Read + Write + Execute

Now we know that if a user A have permission 6, it mean that he have Read, write, and Execute permissions, while a user B with permission 3 have only Execute permission, and so on.
This solution can be very useful using database, especially for a small SET of options, but it need much more managing when permissions changed. The big disadvantage of this option, is that the amount of row growth exponentially, when we add more permissions. The table growing rapidly. For amount of n elements, we will need to hold factorial (N!) rows. Moreover, this solution fit databases which work well with SETs, but might not work well for our application (we don’t want to store all the information in memory, and we don’t want to go back and forward to the database, each time we use it).
Solution 03: Using groups of three.
The basic logic behind this solution, is to represent our elements list, in ordered list, and represent each element with a different number using base 2.
For example we can use this table below, and use the SUM of the permissions in order to represent the user's permissions. A user that have permission 1 can only read, permission 3 is the SUM of 1+2 therefore the user have permissions to read and write, permission 4 is only Execute, and 5 is Execute + Read. Permission 6 is write + Execute and 7 is full permission Read, Write, and Execute.
Note! You can notice that the unique values that we use are base 2 exponent (or power) n. Using TSQL we can use the function POWER(2,n1)
Uniqe Number

Permission Description

1

Read

2

Write

4

Execute

The Linux File Permissions is a good example of this case. Each file and directory has three user based permission groups: Owner (u), Group (g), All Users (a). Each file or directory, has three permissions rules: Read (r), Write (w), Execute (x). We can represent the permissions groups and permissions types, by using this small table, using bit type (1 if the group and permission rule applied, and 0 if not).

Owner

Group

All Users

Rule type

r

W

x

r

w

x

r

w

x

Is Rule active on File or folder?

1

1

1

1

1

0

1

0

0

This table include the permissions for a single file or folder. Each row represent a file or a folder permissions. In our example Owner have READ, WRITE, and EXECUTE permission, while the group have only READ AND WRITE permissions, and the use have only READ permission.
As we mentioned above each three permissions (each group of permissions) can be represent in a single number. In our example the first number is 1+2+3 = 7, the second number is 1+2 = 3, and the third number is 1. Therefore, we can represent all the permission using a number with 3 characters (one character for each group of permissions): 731
The big advantage in this solution is the visualization of the value. It is very fast and easy for us, as humans, without any calculation, to convert the number 731 to a SET of options. Since each characters represent a group of 3 options.
Why do we use groups of three and not groups of two or four for example?
The answer simply lies in the visualization of the value. The SUM of all the values should be the smaller number that we can, in order to store the maximum amount, of permissions. (1) We will choose the biggest group that we can represent. Therefore if three work well there is no reason to work only with two permissions in a group. (2) At the same time we want each group to be displayed separately, in a single character (for the visualization). Therefore we cannot choose four permissions, since the next value using our logic POWER(2,n1), will give us POWER(2,3)= 8, and in order to get all permissions we get the value 1+2+4+8 = 15 which will use two characters.
Solution 04: Using base two.
The basic logic is the same as above, but in this case we are not limiting ourself to groups of three permission, but only one group with infinite number of permissions. As explain above we are losing the virtualization advantage, but we gain smaller value in order to represent the same amount of permissions.
Uniqe Number

Permission Description

1

Rule1

2

Rule2

4

Rule3

8

Rule4

16

Rule5

32

Rule6

64

Rule7

128

Rule8

256

Rule9

Example: User A with permissions Rule1+Rule2+Rule6 will get the value 1+2+32 = 35. As we can see, getting the permissions from the value is not intuitive as in previous solution. But in previous solution we need to use the number 777 in order to represent all first 9 permissions, while in this case we only need to use the number 511.
select
SUM
(P)
from
(
select
top
9 n, POWER(2,n1) P
from
Ari_Numbers_Tbl
) t
Clarification on our format:
We can represent our permissions as a list of binary data, like this:
Rule Number

9

8

7

6

5

4

3

2

1

Is Rule active on File or folder?

0

0

0

1

0

0

0

1

1

For each rule that the user get, we use 1 and for each rule that is not implied on the user, will get the value 0. We can notice that the user's permission is actually a number on base 2. In our case 100011. In order to get the value which represent those permission, we calculate the SUM of (2 power n). This is just a simple converting from base 2, to a number on base 10. And 100011 in base two is equivalent to 35 in base 10 (same as we got above).
Note: In order to use this logic we will need a function to convert from binary to base 10 and vice versa. We will create a table with all the permission rules. And each user will get one value using base 10 which represent all his permissions, according to the permission's table.
Solution 05: using Fibonacci base.
Fibonacci numbers or Fibonacci sequence are the numbers in integer sequence, which by definition, the first two numbers are 1 and 1, or 0 and 1, depending on the chosen starting point of the sequence, and each subsequent number is the sum of the previous two.
 select all numbers in Fibonacci sequence
 which are smaller then 1000000000:
WITH
Fibonacci (PrevN, N)
AS
(
SELECT
0, 1
 starting point
UNION
ALL
SELECT
N, PrevN + N
FROM
Fibonacci
WHERE
N < 1000000000
)
SELECT
PrevN
as
Fibo
FROM
Fibonacci
OPTION
(MAXRECURSION 0);
GO
Every integer number is the sum of set of distinct (no number used more than once) Fibonacci Numbers. This forms the idea behind representing numbers using Fibonacci numbers.
Converting between different number bases
In order to convert from base 2 to decimal, we represent each number as column of POWER(2,n1), and we uses 1 or 0 to indicate that this column should be calculate. This is actually the base 2 number. For example the base 2 number 000100011 was represent like this:
Value in base 10

256

128

64

32

16

8

4

2

1

Is this place in base 2 should be calculate?

0

0

0

1

0

0

0

1

1

The equivalent decimal number got by calculating the SUM of the Values in base 10. Therefore the decimal number in our case is 1+2+32 = 35.
Can we do the same with any sequence numbers? The answer is no. A number representation system (a number base) is most useful if it has a unique representation of every integer (onetoone correspondence).
Using the same displayed as above for Fibonacci base we can see that the number 000110011 in Fibonacci base is equal to 1+2+8+13 = 24, but this is not the only option to get the value 24! We could use the SUM of anyone of those options: {3,21} {1,2,21} {3,8,13} {1,2,8,13} {1,2,3,5,13}. It is not enough that we can represent each integer in base 10 as SUM of Fibonacci numbers, but we need it to be a unique option.
Value in base 10

55

34

21

13

8

5

3

2

1

Fibonacci base

0

0

0

1

1

0

0

1

1

The reason that in our example we got multiple options, lies in the use of two consecutive Fibonacci numbers. After all the definition of Fibonacci sequence is that the next number is the SUM of the last two numbers. Therefore we can replace two consecutive Fibonacci numbers with the next number (1+2 = 3 which is Fibonacci number by itself, same with 8+13).
We can find a single way to write every number as a sum of Fibonacci numbers if we also have the rule that no two consecutive Fibonacci numbers can be used in the same sum. In terms of the base system with Fibonacci numbers as the headings, this means that no two ones can occur next to each other.
Implementing Fibonacci base
It is very simple to see that if we have a SET of values (permission's rules in our case) and those permissions sorted in a way that there is no way that we will use two consecutive rules, then we can use the Fibonacci base as it is, in the same way that we used Base 2 in previous solutions. For example if we have a system that serve two (or more) companies (or other group of users), and each company has its own SET of rules, then we can use the even places to keep the rules of one company, and odd places to keep the rules of the second one.
Another option where we can implement the Fibonacci base, is by only the numbers in the odd places or only the numbers that are in the even places. For example using those bases (those are not full number bases and do not enable us to represent every integer from base 10, but it is fit our needs):
Value in base 10

55

21

8

3

1

Fibonacci half base

1

1

1

1

1

The number 11111 which is full permissions is equal to 1+3+8+21+55 = 88
Why to use Fibonacci base?
While in base 2 in order to use the 9^{th} rule, we needed to SUM the POWER(2,91)= 256 , in Fibonacci Base the 9^{th} rule is only getting the number 55. Therefore the value that represent our total permissions can be much smaller.
Solution 06: using view with INSTEAD of trigger.
We mentioned in the beginning of the article that using RDBM approach will give us much better solution for most cases, while implementing in the database, but there are some cases that we need to use one column which store multiple values, which might be an external needs (like external application). There is a simple solution which we must mention here, and that is hiding the real database structure and given the external client to get exactly what he want without the need of actually store several values in a single column. We can use view with INSTEAD of trigger, as our solution.
The external client will only contact the view, which will use anyone of the logic that we show here in order to use dynamic calculate virtual column. This column will give us the one value that represent all the permissions, and using INSTEAD of trigger the use could handle this view as it was a real table (INSERT, DELETE, and UPDATE).
For more information about this option you can check this article: http://social.technet.microsoft.com/wiki/contents/articles/28152.insteadoftriggers.aspx
Conclusions
In this article we showed several logics to represent a list of values, using a single value. We demonstrate those options using TSQL language and SQL Server, but you can use the same logics in in programing using any language that you want.
Our challenge is to find a good logic, which give us a oneone correspondence, between each available combination of our elements, to the single value which represent this combination.
Resources and More information
Using the Fibonacci numbers to represent whole numbers
http://www.maths.surrey.ac.uk/hostedsites/R.Knott/Fibonacci/fibrep.html
Understanding Linux File Permissions
http://www.linux.com/learn/tutorials/309527understandinglinuxfilepermissions
Representations of N as a sum of distinct elements from special sequences by D. A. Klarner, Fibonacci Quarterly, vol 4 (1966), pages 289306 and 322.