Monday, February 27, 2012

Help: Shortest Path in sql

Hello to all,

help, help,...

i have with this problem since 3 weeks, until now i cann't resolve this problem. Maybe can somebody help me. I am hopeless.Crying

i have a data table ValidRelationship, i will check if there is a relationship between two members by this table.

datas in the table ValidRelationship:

IDMember IDOtherMember IDType

3700

372610000372637001000037423672100003672374210000342235481000035483422100003548371710000371735481000036753695100003695367510000

I will give two member and check their Relationship with a sql query. but it can be that this two member have no relationship. So i define here that man should search processor <= 6 . To better describe i use a example: max. Result of this query is: 1-2-3-4-5-6. If this is a relationship between 1-7 is 1-2-3-4-5-6-7, but i will give a answer that this is no relationship between 1-7. because processor > 6.

But my problem is: this query executing is too slow. if i habe two member no relationship, the time of this complete sql query to execute is more than 1 minutes. Is my algorithm wrong, or where is the problem, why this executing is so slow? How can i quickly get the relationships between two member, if my algorithms is not right? The following Query is only to processor = 3, but it works too slowly, so i don't write remaining processors.

declare @.IDMint;

declare @.IDOint;

set @.IDM= 3418;

set @.IDO= 4270

selecttop 1 IDMember

from v_ValidRelationships

where IDMember= @.IDM

and @.IDOin(select a.IDOtherMemberfrom v_ValidRelationshipsas awhere a.IDMember= @.IDM)

selecttop 1 a.IDMember, b.IDMember

from v_ValidRelationshipsas a, v_ValidRelationshipsas b

where a.IDMember= @.IDM

and b.IDMemberin(select c.IDOtherMemberfrom v_ValidRelationshipsas cwhere c.IDMember= @.IDM)

and @.IDOin(select d.IDOtherMemberfrom v_ValidRelationshipsas dwhere d.IDMember= b.IDMember)

selecttop 1 a.IDMember, b.IDMember, e.IDMember

from v_ValidRelationshipsas a, v_ValidRelationshipsas b, v_ValidRelationshipsas e

where a.IDMember= @.IDM

and b.IDMemberin(select c.IDOtherMemberfrom v_ValidRelationshipsas cwhere c.IDMember= @.IDM)

and e.IDMemberin(select f.IDOtherMemberfrom v_ValidRelationshipsas fwhere f.IDMember= b.IDMember)

and @.IDOin(select d.IDOtherMemberfrom v_ValidRelationships as d where d.IDMembe = e.IDMember)

If someone has a idea, please help me. Thank a million

Best Regards

Shasha

The reason it starts gumming up is because you are potentially creating millions of records for this process - in step one, you are only scanning the records in the table, but then you're exponentially scanning the table - like this example with just 10 rows:

1) Look for relationship in 10 rows

2) Look for relationship in 10 x 10 rows (100)

3) Look for relationship in 10 x 10 x 10 rows (1000)

4) Look for relationship in 10 x 10 x 10 x 10 rows (10,000)

5) 100,000

6) 1,000,000

You would have to totally revisit the structure to avoid this.

|||

Hello Sohnee,

I'm sorry that i can not good understand what you mean. Could you please explain more some details? Thank you very much.

Regards

Shasha

|||

Sorry, I don't have time to write the exact query, but perhaps someone else here wants to take a shot.

Assuming the that IDMember is "parent", and IDOtherMember is "child". Build a recursive CTE, that returns all the children of the parent up to x levels deep (columns parent,child,path_from_parent_to_child,depth).

Then using the CTE, filter only the rows where parent is equal to an input parameter, child is equal to an input parameter, and depth is equal to or less than an input parameter.

|||

@.MaxDepth determines the "number of processors". If you want to search up to 7 deep, set @.MaxDepth to 7. The results are not exactly the same, but similiar enough. The Path column will include a comma delimited list of ID's it went through to reach the other member.

DECLARE @.MaxDepthint

DECLARE @.IDMemberint

DECLARE @.IDOtherMemberint

SET @.MaxDepth=10

SET @.IDMember=3700

SET @.IDOtherMember=3726

WITH DirectReports(IDMember, IDOtherMember,Path, Depth)

AS

(

-- Anchor member definition

SELECT IDMember, IDOtherMember,','+CAST(IDMemberasvarchar(MAX))+','+CAST(IDOtherMemberASvarchar(MAX))+','ASPath, 1AS Depth

FROM ValidRelationship

WHERE IDMember=@.IDMember

UNIONALL

-- Recursive member definition

SELECT t2.IDMember, t1.IDOtherMember, t2.Path+CAST(t1.IDOtherMemberASvarchar(MAX))+','ASPath, Depth+ 1

FROM ValidRelationship t1

INNERJOIN DirectReportsAS t2ON t2.IDOtherMember= t1.IDMember

WHERECHARINDEX(','+CAST(t1.IDOtherMemberASVARCHAR(MAX))+',',t2.Path,0)=0

AND t2.Depth<@.MaxDepth

)

-- Statement that executes the CTE

SELECTTOP 1 IDMember,IDOtherMember,SUBSTRING(Path,2,LEN(path)-2),Depth

FROM DirectReports

WHERE IDOtherMember=@.IDOtherMember

ORDERBY DepthDESC

|||

I changed the query from what I described above so that it won't allow circular paths (3700,3726,3700, etc) because they will never be "shortest", and also to reduce processing time.

I also implemented the MaxDepth differently. Instead of allowing the CTE to recurse completely and then filtering based on the depth, I terminate the recursion at the MaxDepth.

|||

hello to motly,

Thank you very much. With your help i have resolved this problem. Now i can quickly get a result. Thanks

Best Regard

Shasha

No comments:

Post a Comment