The rules are fairly simple, once you have the actual challenge. The SQL must be encapsulated into a single SQL statment - be that select, delete, insert or update. It can have all the joins, subclauses, derived tables, and whatever else SQL provides, as long as it is included in a single statement. If inserting, updating or deleting, you are allowed a separate SQL statement for each table, but that is the only exception to the one statement rule. So, no cursors or loops. You get a bonuses for using ANSI SQL syntax and for how quickly you finish.
Today, I got one of those challenges. Let's start with the givens.
1. We have one or more email addresses stored in customer.email.
2. These emails can be comma delmited or semicolon delimited.
3. There may be spaces embedded before or after commas/semicolons.
4. The emails may be hosted internally or by a 3rd party provider, or a combination of the two.
5. A list of domains is stored in domain_item.domain. These domains are hosted internally.
6. A list of email addresses is stored in email_data.emailaddr. These emails are hosted internally.
7. When sending email messages to the email addresses stored on customer.email, if one of the email addresses is on a hosted domain but the email address is not in the list of hosted emails, the SMTP server will return a 550 (mailbox not found) error.
Now, the challenge. Given all of the above, we want to find a list of customers with email addresses on customer.email that will generate a 550. This means we want to find a list of email addresses attached to a hosted domain but are not a hosted email address. After a little less than an hour, I made this...
select
/* The
unique customer id */
customer.id,
/* The
delimited list of contact email addresses */
customer.email,
/* A
hosted domain matching one of the email addresses */
domain_item.domain,
/* The
number of email addresses in customer.email */
len(customer.email) - len(replace(customer.email, '@', '')) as email_count,
/* The
number of emails that match the hosted domain */
(
len(replace(';'+replace(customer.email,' ','')+';',',', ';'))
- len(replace(replace(';'+replace(customer.email,' ','')+';',',', ';'), '@'+domain_item.domain+';',''))
) / len('@' + domain_item.domain + ';') as domain_count,
/* The
number of hosted email addresses that match the hosted domain */
(
select count(*)
from email_data
where customer.id = email_data.d_custid
and ';'+replace(replace(customer.email,' ',''),',',';')+';' like '%;'+email_data.emailaddr+';%'
and email_data.emailaddr + ';' like '%@' + domain_item.domain + ';'
) as match_count
from customer
inner join domain_item
on replace(';'+replace(customer.email,' ','')+';',',',';') like '%@'+domain_item.domain+';%'
/* Strip
the hosted domain from the delimited list */
/*
Comparing the difference divided by the length of the hosted domain name */
/* This
will let us know how many emails _should_ be hosted for this domain */
where
(
len(replace(';'+replace(customer.email,' ','')+';',',',';'))
- len(replace(replace(';'+replace(customer.email,' ','')+';',',',';'), '@'+domain_item.domain+';', ''))
) / len('@'+domain_item.domain+';')
/*
Compare the _should_ total against the _actual_ total */
/* If
they don't match exactly, include the customer in the result */
<> (
/* Find
the number of hosted email addresses for this customer */
/* That
match one of the email addresses on the delimited list */
/* This
will let us know how many emails are _actually_ hosted for this domain */
select count(*)
from email_data
where customer.id = email_data.d_custid
and replace(';'+replace(customer.email,' ','')+';',',',';') like '%;'+email_data.emailaddr+';%'
and email_data.emailaddr+';' like '%@'+domain_item.domain+';'
)
go
No comments:
Post a Comment