TSQL delete statement is a well used statement. Sometimes we need to delete some rows of a table that match a certain condition. We can easily use where clause together with the delete statement. But sometimes we want to delete some rows of a table matching multiple columns of the rows with another table. This becomes a bit problematic and in this post we will see a very easy solution to this problem.
First we discuss about simple delete statement with where clause. Suppose we have a table named Raw_Data_Eucerin where we have the following columns: [Product_code] ,[Pharmacy_Code] ,[Period] ,[Units_Sales] ,[Value_Sales] ,[Net_Value_Sales] ,[Last_Modified_Date] Some sample rows of this table is shown below.

Now for example, we want to delete the rows containing a specific Pharmacy_Code.
We write the query like this:
delete from Raw_Data_Eucerin where Pharmacy_Code = 40170878
Again if we want to delete the rows containing some of the specific Pharmacy_code, then we write:
delete from Raw_Data_Eucerin where Pharmacy_Code in(40170878, 40170927, 40171029)
Now suppose we have a Pharmacy table where we have the following columns:
Pharmacy_Code Product_Code
This table contains the product pharmacy relation.
Now if we want to delete the rows from Raw_Data_eucerin having Pharmacy_Code that is not avaiable in the Pharmacy table, then we can write a simple delete statement like this:
delete from dbo.Raw_Data_Eucerin where Pharmacy_Code notin(selectdistinct Pharmacy_Code from Pharmacy)
Now comes an interesting scenario. If we want to delete the rows from Raw_Data_Eucerin having the combination of Pharmacy_Code and Product_Code that is not in the Pharmacy table, what should be the delete statement? At first thought it might come to our mind like this: we will take distinct Pharmacy_Code and Product_Code in the not in clause.
But it won't work, as sql does not allow multiple column checking in the not in clause. So we may try something like this.
delete from dbo.Raw_Data_Eucerin where Pharmacy_Code notin(selectdistinct Pharmacy_Code from Pharmacy) and Product_code notin(selectdistinct Product_code from Pharmacy)
But this will not work also. This will delete the rows matching both the conditions, but if we think deeply, we see that it will delete the rows if the Pharmacy_Code is not in the Pharmacy table and the Product_Code is not in the Pharmacy table. It is not deleting the Pharmacy-Product combination that is not in the Pharmacy table. For example, if we have a row having Pharmacy_Code = 40171503 and Product_Code = 63304 and we don't have this Pharmacy_Code in the Pharmacy table, it will not delete this row as the Product_Code is available.
But this should be deleted because we don't have this combination in the Pharmacy table. So the next thought could be, ok, we can use or instead of and. So the delete statement may look like this:
delete from dbo.Raw_Data_Eucerin where Pharmacy_Code notin(selectdistinct Pharmacy_Code from Pharmacy) or Product_code notin(selectdistinct Product_code from Pharmacy)
This will delete the rows for which Pharmacy_Code or Product_Code is not available in the Pharmacy table. But this also has some problem. Suppose we have a row in Raw_Data_Eucerin having Pharmacy_Code = 40171751 and Product_Code = 63304. Both the Product_Code and Pharmacy_Code are available in the Pharmacy table, but this Pharmacy does not have this product. This Pharmacy has some other products and this product is available in some other Pharmacies. So this should be deleted but it will not. So I think, it clearly explains the problem.
Any time we need to match combination of columns and want to delete the nonmatching rows, we are in trouble. If sql where not in clause gave the facility of checking multiple columns, our life would be easier. But we done have this. Now we can play a simple trick to resolve this problem.
delete from dbo.Raw_Data_Eucerin where Pharmacy_Code+'#'+Product_code notin ( selectdistinct Pharmacy_Code+'#'+Product_code from Pharmacy )
If we make a single column with concatenated string of the columns we want to match with then we can use the where not in clause. And this will resolve our problem. We can concatenate any number of columns to get the combined string, so we don't have the limitation column numbers. But if we have a numeric column, we might be in problem. But of course we can cast the numeric column to varchar and have the concatenated string.
delete from dbo.Raw_Data_Eucerin wherecast(Pharmacy_Code asvarchar)+'#'+Product_code notin ( selectdistinctcast(Pharmacy_Code asvarchar)+'#'+Product_code from Pharmacy )
One very important thing here is to consider the column separator character. Here I have used # because my Pharmacy_Code and Product_Code does not contain #. If any of our columns that we want to match contains #, then we cannot use # as the separator. In case of varchar columns, this might be problem. Because, we don't know before whether our column will contain # or not. In such cases we can use a multi character separator that is very unlikely to occur in our varchar columns.
For example, we can use #?#$$$#?#.
delete from dbo.Raw_Data_Eucerin wherecast(Pharmacy_Code asvarchar)+'#?#$$$#?#'+Product_code notin ( selectdistinctcast(Pharmacy_Code asvarchar)+'#?#$$$#?#'+Product_code from Pharmacy )
So, this is a very simple but useful trick (I think) to resolve the multiple column matching for deletion.