Parsing delimited string in Redshift

SQL server developers are very much familiar with split string function. These types of functions generally parse delimited string and return single column table type. Recently SQL Server 2016 has given native function “STRING_SPLIT” for parsing.

Redshift has provided “split_part” function for parsing string which returns a part of delimiter string. But developers are always like to convert delimited string to table rows so that they can join the result. Yes this can be also possible in Redshift by utilizing “split_part”.

First, we will have to create a number series. PostgreSQL have nice function called “generate_series” to generate series of integer values. Through Redshift was build from PostgreSQL 8.1.x, “generate_series” function is not fully supported in Redshift. Below code will generate series of integer values between 0 to 255 (collected from here).

CREATE TEMPORARY TABLE numbers AS (
 SELECT 
 p0.n 
 + p1.n*2 
 + p2.n * POWER(2,2) 
 + p3.n * POWER(2,3)
 + p4.n * POWER(2,4)
 + p5.n * POWER(2,5)
 + p6.n * POWER(2,6)
 + p7.n * POWER(2,7) 
 as num
 FROM 
 (SELECT 0 as n UNION SELECT 1) p0,
 (SELECT 0 as n UNION SELECT 1) p1,
 (SELECT 0 as n UNION SELECT 1) p2,
 (SELECT 0 as n UNION SELECT 1) p3,
 (SELECT 0 as n UNION SELECT 1) p4,
 (SELECT 0 as n UNION SELECT 1) p5,
 (SELECT 0 as n UNION SELECT 1) p6,
 (SELECT 0 as n UNION SELECT 1) p7
);

Suppose, we have a delimited string and then create a single row temp table to use in join easily.

CREATE TEMPORARY TABLE delimatedtagid AS(
 SELECT '32,64,256' as tagidtext
);

Now below code will convert delimited text to table rows:

 SELECT 
 TRIM(
    split_part(dti.tagidtext, 
                ',',
               (numbers.num+1)::int)
    ) as tagid
 FROM delimatedtagid dti
 JOIN numbers 
 ON numbers.num <= regexp_count(dti.tagidtext, ',')

 

Complete script:

CREATE TEMPORARY TABLE numbers AS (
 SELECT 
 p0.n 
 + p1.n*2 
 + p2.n * POWER(2,2) 
 + p3.n * POWER(2,3)
 + p4.n * POWER(2,4)
 + p5.n * POWER(2,5)
 + p6.n * POWER(2,6)
 + p7.n * POWER(2,7) 
 as num
 FROM 
 (SELECT 0 as n UNION SELECT 1) p0,
 (SELECT 0 as n UNION SELECT 1) p1,
 (SELECT 0 as n UNION SELECT 1) p2,
 (SELECT 0 as n UNION SELECT 1) p3,
 (SELECT 0 as n UNION SELECT 1) p4,
 (SELECT 0 as n UNION SELECT 1) p5,
 (SELECT 0 as n UNION SELECT 1) p6,
 (SELECT 0 as n UNION SELECT 1) p7
);

CREATE TEMPORARY TABLE delimatedtagid AS(
 SELECT '32,64,256' as tagidtext
);

CREATE TEMPORARY TABLE tagidlist AS( 
 SELECT 
 TRIM(
      split_part(dti.tagidtext, 
                 ',',
                 (numbers.num+1)::int)
    ) as tagid
 FROM delimatedtagid dti
 JOIN numbers 
 ON numbers.num <= regexp_count(dti.tagidtext, ',')
);

select * from tagidlist;

Output is like below:

[tagid]
..................
32
64
256
Advertisements