Browsing the archives for the SQL tag.


  • Anthony Stevens

Comparing SQL Queries That Solve My Thorny SQL Set Operation Problem

Software

For comparison’s sake, the following is a runnable T-SQL script that contains both the solutions provided by helpful friends to my previous “Thorny SQL Set Operation” problem.

set nocount on
go

drop table person
drop table category
drop table personcategory
drop table groupcategory

go

/* create the person table */
create table person (personid int, name varchar(50))
/* create the category table */
create table category (categoryid int, name varchar(50))
/* create the personcategory table */
create table personcategory(personid int, categoryid int)
/* create the group table */
create table groupcategory (groupid int, categoryid int)

go

/* insert the person data */
insert person values (1, “Hillary Clinton”)
insert person values (2, “Segolene Royal”)
insert person values (3, “Barack Obama”)

/* insert the category data */
insert category values (1, “Man”)
insert category values (2, “Woman”)
insert category values (3, “Politician”)
insert category values (4, “American”)
insert category values (5, “French”)

/* insert the personcategory data */
insert personcategory values (1, 2)
insert personcategory values (1, 3)
insert personcategory values (1, 4)
insert personcategory values (2, 2)
insert personcategory values (2, 3)
insert personcategory values (2, 5)
insert personcategory values (3, 1)
insert personcategory values (3, 3)
insert personcategory values (3, 4)

/* insert the group data */

/* GROUP 1: American Politicians */
insert groupcategory values (1, 3)
insert groupcategory values (1, 4)

/* GROUP 2: Female French Politicians */
insert groupcategory values (2, 2)
insert groupcategory values (2, 3)
insert groupcategory values (2, 5)

/* local vars */
declare @american_pols int
select @american_pols = 1
declare @female_french_pols int
select @female_french_pols = 2

/* QUERY METHOD 1: “Hans Method” */

select distinct p.*
from Person p
join PersonCategory c on (p.personid = c.personid)
where c.personid in (
select c.personid from PersonCategory c
where c.categoryid in (
select g.categoryid from GroupCategory g
where g.groupid = @female_french_pols
)
group by c.personid
having count(*) = (
select count(*) from GroupCategory g
where g.groupid = @female_french_pols
)
)
and c.categoryid in (
select gc.categoryid from GroupCategory gc
where gc.groupid = @female_french_pols
)

/* QUERY METHOD 2: “Tom Music / Brian Dorsey Method” */

select p.personid, p.name, count(*) from person p
join personcategory pc on pc.personid = p.personid
join category c on c.categoryid = pc.categoryid
where c.categoryid in (select gc.categoryid from groupcategory gc where groupid = @female_french_pols)
group by p.personid, p.name
having count(*) = (select count(gc.categoryid) from groupcategory gc where groupid = @female_french_pols)
order by p.personid

Comments Off

Bitmasking Solution For Thorny SQL Set Operation

Software

My previous post outlined a tricky little SQL query problem dealing with set intersection. In addition to the solution found by Hans, I thought for completeness I would post a little scalar function that does a bitmask comparison to achieve the same result – with the notable, and possibly untenable solution that it limits the number of categories in your database to 64 (the number of bits in a BIGINT).


create function IsMember
(
@personid int,
@groupid int
)
returns int
as
begin
declare @ret bigint
declare @personsum bigint
select @personsum = (select sum(power(2, pc.categoryid - 1))
from personcategory pc
where pc.personid = @personid)
declare @groupsum bigint
select @groupsum = (select sum(power(2, gc.categoryid - 1))
from groupcategory gc
where gc.groupid = @groupid)
if ((@personsum & @groupsum) = @personsum)
select @ret = 1
else
select @ret = 0
return @ret
end

Comments Off

Thorny SQL Set Operation

Software

UPDATE 2: I posted a full script that contains both working solutions in a related post.

UPDATE: Hans developed a query that works. See the comments. It works even if my group is (Female + French), but my Person is (Female + French + Politician) – e.g., extra categories not part of the group are ignored.

Thanks for all the help!

********************************

I’ve been working for a bit on a (possibly) hard SQL query and thought I would write out my problem here, in the hopes that (a) I’ll solve it, and (b) I’ll make the solution available to others via the magic of Teh Google.

Here’s the problem: I have a concept of “category”. Example categories: Man, Woman, Movie Star, Politician, American, French.

I have a concept of “person”. Persons can belong to one or more categories.

I have a concept of “group”. A Group is merely a listing of one or more categories. For example, “Male Movie Stars” is a Group (Man + Movie Star). “Female French Politicians” is a group (Woman + Politician + French).

What I want to do in my query is this: Given a Group, find all Persons who belong to that Group.

pseudo-code query:

select * from Person p
join PersonCategory pc on (p.personid = pc.personid)
join GroupCategory gc ON (pc.categoryid = gc.categoryid)
where gc.groupid = @groupid

However, normal JOIN syntax using the IN operator does an “OR”-type operation, when what I really want is an “AND”-type operation. Using the “Female French Politicians” example from above, this query will return French OR Woman OR Politician. Not what I want.

Thinking…

Comments Off

The Linq FirstOrDefault() Method and Null Resultsets

Software

I was doing some Linq programming and kept hitting the problem where if there were no results to a query, but I wanted to call First() or Single() to get a single row, I kept getting “Sequence contains no elements” exceptions.

There didn’t seem to be an easy way to tell if the resultset was null, because it wasn’t really NULL, not in a C# sense. And trying to get Count() would throw a separate exception if there really was at least one element, because I couldn’t enumerate a sequence more than once.

Luckily an offhand comment by Scott Guthrie on his blog helped me out a lot. Use FirstOrDefault() with a dummy criteria as an argument:

var record = <blah>.GetResult().FirstOrDefault(p => p.id <0)

If there are no results, this code will return NULL to the variable, which you can then easily check in your code.

Thanks Scott!

10 Comments

Creating a Comma-Delimited String in SQL Server

Software

I love 4 Guys From Rolla. Most of the time when I have a particular tool I’m looking for, I can find it.

In this case, I had a many-to-many relationship that I wanted to join together to concatenate one half of the relationship to get a comma-delimited string. The DB entities are JOB and KEYWORD. One job can have many keywords, and one keyword can be associated with many jobs. I use a standard join table, JOB_X_KEYWORD, to accomplish this in the DB.

Here’s the code, with a hat tip to Scott Mitchell:

CREATE FUNCTION dbo.udf_GetKeywords(@jobid int)
RETURNS VARCHAR(1000) AS
BEGIN
DECLARE @k varchar(1000)
SELECT @k = COALESCE(@k + ', ', '') + k.keyword_value
FROM keyword k JOIN job_x_keyword jxk ON (k.keyword_id = jxk.keyword_id)
WHERE jxk.job_id = @jobid
RETURN @k
END
go
select j.title, dbo.udf_getkeywords(j.job_id)
from job j

Comments Off

Linq to SQL Stumper

Software

See if you can spot the contradiction in the following two FAQs:

Q: Should I create a new DataContext in every business logic method?
A: The DataContext conforms to the Unit of Work design pattern. Unless you are moving data between physical tiers between each operation you should keep your DataContext alive for the duration of work.
Q: Should I keep my DataContext in a static/global/shared variable?
A: The DataContext is not thread safe and is not meant to be shared. A DataContext is meant to be used for a single unit or work or at most for multiple consecutive units of work.

Hint: I’ve highlighted it for you.

1 Comment