Communities

Writing
Writing
Codidact Meta
Codidact Meta
The Great Outdoors
The Great Outdoors
Photography & Video
Photography & Video
Scientific Speculation
Scientific Speculation
Cooking
Cooking
Electrical Engineering
Electrical Engineering
Judaism
Judaism
Languages & Linguistics
Languages & Linguistics
Software Development
Software Development
Mathematics
Mathematics
Christianity
Christianity
Code Golf
Code Golf
Music
Music
Physics
Physics
Linux Systems
Linux Systems
Power Users
Power Users

Dashboard
Notifications
Mark all as read
Q&A

How to break infinite loop in CTE

+4
−0

I have a parent child relation in my table, with possibly circular cases. Is it possible to break the infinite recursivity in CTE checking values of all previous rows? I would need something like this:

with my_cte(childId, parentId, col1)
AS (
SELECT r.childId, r.parentId, r.col1
FROM My_Table r WHERE r.childId = 'x' AND (r.col1 = 1 or r.col1 = 2)
UNION ALL 
SELECT rel.childId, rel.parentId, rel.col1
	FROM My_Table rel INNER JOIN my_cte sd ON rel.childId = sd.parentId where (rel.col1 = 1 OR rel.col1 = 2) AND ...
-- check that rel.childId, rel.parentId not in previous rows
)
SELECT DISTINCT * from shared_documents

How can I achieve this? Or how to approach this?

Example table with records would be:

childId | parentId | col1

1 | 3 | 1

1 | 4 | 1

3 | 5 | 1

5 | 1 | 1 -- circular case

6 | 2 | 1

5 | 7 | 1

If I ask for hierarcy related to 1, I should get

childId | parentId | col1

1 | 3 | 1

1 | 4 | 1

3 | 5 | 1

5 | 1 | 1 -- circular case

5 | 7 | 1

  • 1 has two parents: 3 and 4.
  • 3 has 1 parent: 5
  • 5 has 2 parents: 1 and 7. But 1 brings the infinite loop, so the query should ignore this. And continue to the next row.
Why does this post require moderator attention?
You might want to add some details to your flag.
Why should this post be closed?

4 comments

id isn't relevant. you have to look for childId and parentId. artaxerxe‭ 25 days ago

@Estela have a look now at my edited post. artaxerxe‭ 25 days ago

Ok, so a line with id=X childId=C and parentId=P means that C is the child of P. I was understanding that C was the child of X and P the parent of X. Another question. Is the relationship transitive? That is, in your example, since 5 is the child of 1 and 1 is the child of 3. Does it mean that 5 should be considered as a child of 3? Estela‭ 25 days ago

@Estela the relationship is transitive. I would say that in my case 5 should be considered as a grandchild of 3. But let's see where we get. artaxerxe‭ 25 days ago

3 answers

+3
−0

I have not managed to make an answer with standard SQL. I've had to resort to using arrays. And I don't have a Microsoft SQL Server RDBMS at hand. So I've done it instead in PostgreSQL. You'll just need to use whatever equivalent array functions sql-server has.

  • ARRAY(SELECT XX) : constructs an array with a single element XX
  • ARRAY_CAT( a1, a2) : concatenates two arrays
  • ARRAY_POSITION(a, e) : returns the position of e in array a. NULL if e is not in array a.

This is half the answer: how to get the hierarchy for all descendants (children, grandchildren, etc).

For the full answer you just need to do the same for ascendants and then perform a union.

WITH RECURSIVE cte AS (
    SELECT
        childid,
        parentid,
        ARRAY(SELECT XX) a
    FROM my_table
    WHERE parentid=XX

    UNION ALL
    
    SELECT 
        m.childid,
        m.parentid,
        ARRAY_CAT( cte.a, ARRAY(SELECT m.parentid))
    FROM 
        my_table m 
        INNER JOIN cte 
            ON cte.childid = m.parentid
    WHERE ARRAY_POSITION(cte.a, m.parentid) is null
)
SELECT DISTINCT childid, parentid FROM cte;

Replace XX by the id for which you are querying its descendants.

This works by storing in an extra colum an array with all parent ids which have been processed. When doing the recursive CTE we only select rows with a parentid which is not contained in such array. This way we break the loop.

The code as such is hard to understand. I suggest to run the following code and check the result. Which will help in understanding the previous code.

WITH RECURSIVE cte AS (
    SELECT
        1 as iteration,
        childid,
        parentid,
        ARRAY(SELECT XX) a
    FROM my_table
    WHERE parentid=XX

    UNION ALL
    
    SELECT 
        iteration+1,
        m.childid,
        m.parentid,
        ARRAY_CAT( cte.a, ARRAY(SELECT m.parentid))
    FROM 
        my_table m 
        INNER JOIN cte 
            ON cte.childid = m.parentid
    WHERE ARRAY_POSITION(cte.a, m.parentid) is null
            
)
SELECT iteration, childid, parentid, a FROM cte ORDER BY iteration;

Which, for XX=1, returns the following table:

iteration | childid | parentid | a
--------------------------------------------
     1    |    5    |    1     | {1}
     2    |    3    |    5     | {1,5}
     3    |    1    |    3     | {1,5,3}

Which are the descendants of 1, as expected. With their immediate parents and the array of checked parents as it grows.

Why does this post require moderator attention?
You might want to add some details to your flag.

2 comments

Nope. ON clause is good. See my edit. artaxerxe‭ 25 days ago

@artaxerxe I've fully reworked my answer. Estela‭ 25 days ago

+2
−0

Estela's answer provides great insight about how to do it also in SQL Server. Unfortunately, there does not seem to be a build-in array functionality, so one way is to rely on strings as shown here.

Basically, instead of accumulating values in an array, a string does this (way less efficiently for many records, I would say).

In this case, a query that seems to provide the results you are looking for is the following (some security check does not allow to post the query, so I am using SQL fiddle for sharing it):

http://sqlfiddle.com/#!18/a50101/1

Note: considering how complex things might get (edge cases might add up to the logic), this type of query might be better be done in a language/framework like C++, Java or .NET, unless there is a very strong reason to remain in SQL. These allow for much more flexibility, considering the iterative nature of the algorithm (not so suitable for set based operations) and also provide way better-debugging capabilities.

Why does this post require moderator attention?
You might want to add some details to your flag.

5 comments

Stored procedures might be a sensible compromise. I'm not aware of step-through debuggers, but they allow you to do the iterative part on the SQL server without network overhead each time round the loop. Peter Taylor‭ 24 days ago

@PeterTaylor - While I have experience writing stored procedures, I am also aware of the many disadvantages that they might bring in the project. AFAIK, unless there is a really serious reason to incorporate business logic in a stored procedure (e.g. avoid fetching huge amounts of data in the application layer), having the business logic in the application layer is the recommended way nowadays. Alexei‭ 24 days ago

@Alexei 'this type of query might be better be done in a language/framework like C++, Java or .NET'. Yes, I'm forced to remain to SQL. But do you mean to use an ORM or do you mean to make the logic from such a framework? I ask because second option seems to me not efficient. artaxerxe‭ 18 days ago

@artaxerxe‭ If you don't need to compute this too often and the data volume is pretty low (<100K records, also depends on the frequency), you can fetch all the required data in the application layer and implement the algorithm there. Languages such as Java or C# are better equipped for recursivity, tree traversal algorithms or similar. Alexei‭ 17 days ago

@Alexei In my case, I think this wouldn't be recommendable, even if I shouldn't stick with SQL. Anyway, taking 100K rows form DB to parse them on the middleware seems to me not a good idea. I think this should be done on the DB side. Your SQL is quite useful for me except the recursivity limitation. I think because of this I have to move the logic in a stored procedure. Although maximum recursivity number would be quite enough I should consider edge cases. artaxerxe‭ 17 days ago

+2
−0

For small tables you can bound the recursion depth:

WITH my_cte(childId, parentId, depth, max_depth)
AS (
    SELECT r.childId, r.parentId, 1, (SELECT COUNT(*) FROM My_Table)
    FROM My_Table r
    WHERE r.childId = 1

UNION ALL

    SELECT rel.childId, rel.parentId, sd.depth + 1, sd.max_depth
    FROM My_Table rel INNER JOIN my_cte sd ON rel.childId = sd.parentId
    WHERE sd.depth < sd.max_depth
)
SELECT DISTINCT childId, parentId
FROM my_cte

This approach is limited by the bound that MAXRECURSION <= 32767.

Why does this post require moderator attention?
You might want to add some details to your flag.

1 comment

@Peter, the only problem with this is that if you get cycles, you will do unnecessary work. artaxerxe‭ 17 days ago

Sign up to answer this question »