This post shows the invisible characters that we might encounter in tsql string.
We often face problem in data matching when invisible characters are available in string. When we select the string we see something that seems to us as space. But sometimes they are not really spaces. So when we compare strings having these invisible characters, we think that they should be same but sql server does not find them equal and does not match. Often in our queries we face problems due to this reason.
It often occurs when we load data from some flat file or csv file and from some other sources in relational tables. Most often the invisible characters are newline, carriage return, line feed or tab. But I have faced a problem where the invisible character was a character whose Unicode value is 160. So first of all we will see how we can find which the invisible characters are.
We can use loop through from 1 to 255 and see the character using sql CHAR() function.
declare @loop_counter int= 1, @all_unicode_characters nvarchar(max)=''
declare @loop_counter int = 1, @all_unicode_characters nvarchar(max) = ''
while @loop_counter<256
begin
set @all_unicode_characters = @all_unicode_characters + ',' +
cast(@loop_counter as varchar) + '=' + char(@loop_counter)
set @loop_counter = @loop_counter + 1
end
select stuff(@all_unicode_characters, 1, 1, '') we will get the following output that shows us the characters together with their values.
1=, 2=, 3=, 4=, 5=, 6=, 7=, 8=, 9= , 10= , 11=, 12=, 13= , 14=, 15=, 16=, 17=, 18=, 19=, 20=, 21=, 22=, 23=, 24=, 25=, 26=, 27=, 28=, 29=, 30=‑, 31=, 32= , 33=!, 34=", 35=#, 36=$, 37=%, 38=&, 39=', 40=(, 41=), 42=*, 43=+, 44=,, 45=-, 46=., 47=/, 48=0, 49=1, 50=2, 51=3, 52=4, 53=5, 54=6, 55=7, 56=8, 57=9, 58=:, 59=;, 60=<, 61==, 62=>, 63=?, 64=@, 65=A, 66=B, 67=C, 68=D, 69=E, 70=F, 71=G, 72=H, 73=I, 74=J, 75=K, 76=L, 77=M, 78=N, 79=O, 80=P, 81=Q, 82=R, 83=S, 84=T, 85=U, 86=V, 87=W, 88=X, 89=Y, 90=Z, 91=[, 92=\, 93=], 94=^, 95=_, 96=`, 97=a, 98=b, 99=c, 100=d, 101=e, 102=f, 103=g, 104=h, 105=i, 106=j, 107=k, 108=l, 109=m, 110=n, 111=o, 112=p, 113=q, 114=r, 115=s, 116=t, 117=u, 118=v, 119=w, 120=x, 121=y, 122=z, 123={, 124=|, 125=}, 126=~, 127=, 128=€, 129=, 130=‚, 131=ƒ, 132=„, 133=…, 134=†, 135=‡, 136=ˆ, 137=‰, 138=Š, 139=‹, 140=Œ, 141=, 142=Ž, 143=, 144=, 145=‘, 146=’, 147=“, 148=”, 149=•, 150=–, 151=—, 152=˜, 153=™, 154=š, 155=›, 156=œ, 157=, 158=ž, 159=Ÿ, 160= , 161=¡, 162=¢, 163=£, 164=¤, 165=¥, 166=¦, 167=§, 168=¨, 169=©, 170=ª, 171=«, 172=¬, 173=, 174=®, 175=¯, 176=°, 177=±, 178=², 179=³, 180=´, 181=µ, 182=¶, 183=·, 184=¸, 185=¹, 186=º, 187=», 188=¼, 189=½, 190=¾, 191=¿, 192=À, 193=Á, 194=Â, 195=Ã, 196=Ä, 197=Å, 198=Æ, 199=Ç, 200=È, 201=É, 202=Ê, 203=Ë, 204=Ì, 205=Í, 206=Î, 207=Ï, 208=Ð, 209=Ñ, 210=Ò, 211=Ó, 212=Ô, 213=Õ, 214=Ö, 215=×, 216=Ø, 217=Ù, 218=Ú, 219=Û, 220=Ü, 221=Ý, 222=Þ, 223=ß, 224=à, 225=á, 226=â, 227=ã, 228=ä, 229=å, 230=æ, 231=ç, 232=è, 233=é, 234=ê, 235=ë, 236=ì, 237=í, 238=î, 239=ï, 240=ð, 241=ñ, 242=ò, 243=ó, 244=ô, 245=õ, 246=ö, 247=÷, 248=ø, 249=ù, 250=ú, 251=û, 252=ü, 253=ý, 254=þ, 255=ÿ
From this we see that there are a lot of characters that seem to us as space but they are not actually space. Unicode 32 is space only.
So here we see a tsql function that will take a string and return whether the string has any invisible characters other than space.
We will see here several utility functions related to this.
First we see a table valued function that returns a table containing all the characters in a string.
The output table has 2 columns: one contains the character and the other contains the Unicode value of that character.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author: Maeenul
-- Create date:
-- Description: This table - valued function returns a table containing the individual characters of a string
-- and corresponding unicode value
-- =============================================
CREATE FUNCTION AllCharactersInString
(
@str nvarchar(max)
)
RETURNS TABLE
AS
RETURN
(
select substring(B.main_string, C.int_seq, 1) as character,
Unicode(substring(B.main_string, C.int_seq, 1)) as unicode_value
from
( select @str as main_string ) B,
(
select A.int_seq
from
(
select row_number() over (order by name) as int_seq
from sys.all_objects
) A
where A.int_seq <= len(@str)
) C
)
GO
select *
from AllCharactersInString('PRESCRIPTION STATUS') This will give us the following output:

The Following function we can use for checking whether our string contains any invisible character or not. This function uses the previous function.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author: Maeenul
-- Create date:
-- Description: This function returns 1 if the string contains invisible character other than space.
-- Returns 0 is it does not contain invisible character
-- =============================================
CREATE FUNCTION ContainsInvisibleCharacter
(
@str nvarchar(max)
)
RETURNS int
AS
BEGIN
DECLARE @Result int
if exists(
select *
from AllCharactersInString(@str)
where unicode_value in (1, 9, 10, 11, 12, 13, 14, 28, 29, 31, 129, 141, 143, 144, 157, 160)
)
select @Result = 1
else
select @Result = 0
RETURN @Result
END
GO
select dbo.ContainsInvisibleCharacter('This string has invisible character')
n Returns 1
select dbo.ContainsInvisibleCharacter('This string does not have invisible character')
n Returns 0
Now we will see a function that we can use to remove our invisible characters with space. Afterwards we can decide whether we want to keep the spaces or want to trim. When all of our invisible characters are replaced with space, we can use tsql ltrim or rtrim functions or take other steps to fit our requirements.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author: Maeenul
-- Create date:
-- Description: This function replaces the invisible characters from a string with space (unicode 32)
-- =============================================
CREATE FUNCTION ReplaceInvisibleCharactersWithSpace
(
@str nvarchar(max)
)
RETURNS nvarchar(max)
AS
BEGIN
SELECT @str = replace(@str, char(invisible_char), ' ')
from (
select 1 as invisible_char
union
select 9
union
select 10
union
select 11
union
select 12
union
select 13
union
select 14
union
select 28
union
select 31
union
select 129
union
select 141
union
select 143
union
select 144
union
select 157
union
select 160
) A
RETURN @str
END
GO
select dbo.ReplaceInvisibleCharactersWithSpace('Testing '+CHAR(10)+' and '+CHAR(160))