Opened 4 weeks ago

Last modified 3 weeks ago

#37000 assigned Bug

cursor_iter relies on GC for server-side cursor cleanup, causing transaction abort after savepoint rollback

Reported by: Ratskó László Owned by: Vidhi Singh
Component: Database layer (models, ORM) Version: 4.2
Severity: Normal Keywords: iterator, server-side-cursor, savepoint, psycopg3
Cc: Huwaiza, Simon Charette Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Description

When QuerySet.iterator() is used inside transaction.atomic() and an exception interrupts the iteration, the server-side cursor opened by cursor_iter is not closed eagerly. The generator stays alive (referenced from the exception's traceback), and the cursor is only closed when GC collects the generator — by which time the savepoint has already been rolled back, destroying the cursor on the PostgreSQL side.

The delayed cursor.close() sends a CLOSE command for a non-existent cursor, which raises InvalidCursorName and aborts the entire transaction. Any subsequent DB operations fail with InFailedSqlTransaction.

This was discussed with the psycopg maintainer in https://github.com/psycopg/psycopg/discussions/1282, who suggested that we open a discussion about this here is well.

Reproduction

from django.db import transaction
from myapp.models import Item

def export_items():
    try:
        with transaction.atomic():  # creates a savepoint
            for item in Item.objects.iterator():  # opens server-side cursor
                if item.value == "bad":
                    raise ValueError("Export failed")
                process(item)
    except ValueError:
        # Savepoint was rolled back, cursor destroyed on PostgreSQL side.
        # But cursor_iter generator is still alive in the traceback.
        # GC will eventually close() the cursor → CLOSE fails → transaction aborted.
        Item.objects.create(value="error logged")  # ← fails with InFailedSqlTransaction

What happens step by step

  1. Item.objects.iterator() opens a server-side cursor via cursor_iter
  2. Exception is raised mid-iteration
  3. transaction.atomic().__exit__ rolls back the savepoint → PostgreSQL destroys the cursor
  4. The cursor_iter generator is not closed — still referenced from the exception traceback
  5. GC collects the generator → finally: cursor.close() runs
  6. psycopg sends CLOSE "cursor_name"InvalidCursorName
  7. The failed SQL aborts the transaction
  8. Subsequent queries fail with InFailedSqlTransaction

The problem in cursor_iter

# django/db/models/sql/compiler.py
def cursor_iter(cursor, sentinel, col_count, itersize):
    try:
        for rows in iter((lambda: cursor.fetchmany(itersize)), sentinel):
            yield rows if col_count is None else [r[:col_count] for r in rows]
    finally:
        cursor.close()  # only runs when generator is closed or GC'd

The finally block is correct for normal completion, but when an exception interrupts the generator mid-yield, the close is deferred to GC. By that time, a savepoint rollback may have already destroyed the cursor.

Change History (8)

comment:1 by Simon Charette, 4 weeks ago

Triage Stage: UnreviewedAccepted

Accepting but I'd like to clarify a specific point.

In your report you mention that entering the transaction.atomic context will create a savepoint but in practice that only happens if the atomic block is nested which is not included in your example.

Can you also reproduce without using a nested transaction as the example you provided catches an exception inside atomic which is a documented anti-pattern?

We should make named cursor closing more resilient but it'd be good to know if this can be triggered with proper usage of transaction.atomic.

comment:2 by Ratskó László, 4 weeks ago

Thanks for accepting the ticket!

To clarify: yes, this is specifically a nested transaction.atomic() issue (savepoints). We have ATOMIC_REQUESTS = True, so every request is already in a transaction, and any transaction.atomic() in our code creates a savepoint. Sorry for not mentioning that in the report.

Our pattern follows the documented approach — catching exceptions outside the atomic block:

# ATOMIC_REQUESTS = True → request is already in a transaction

def export_view(request):
    try:
        with transaction.atomic():  # this creates a savepoint (nested)
            for item in queryset.iterator():  # server-side cursor
                process(item)  # may raise
    except SomeError:
        save_error_status()  # ← fails with InFailedSqlTransaction

This matches the pattern from the docs:

@transaction.atomic
def viewfunc(request):
    create_parent()

    try:
        with transaction.atomic():
            generate_relationships()
    except IntegrityError:
        handle_exception()

    add_children()

We tested both scenarios:

Nested (savepoint) — FAILS:

  with transaction.atomic():          # outer (ATOMIC_REQUESTS)
      try:
          with transaction.atomic():  # inner → savepoint
              for item in queryset.iterator():
                  raise ValueError()
      except ValueError:
          Model.objects.create(...)    # ← InFailedSqlTransaction

Non-nested (full rollback) — WORKS:

  try:
      with transaction.atomic():      # not nested → full transaction
          for item in queryset.iterator():
              raise ValueError()
  except ValueError:
      Model.objects.create(...)        # ← OK, transaction is IDLE after full rollback

Let me know if we can help somehow by providing more information. Thanks again!

comment:3 by Huwaiza, 4 weeks ago

Cc: Huwaiza added
Owner: set to Huwaiza
Status: newassigned
Triage Stage: AcceptedReady for checkin

comment:4 by Simon Charette, 4 weeks ago

Cc: Simon Charette added
Owner: Huwaiza removed
Status: assignednew
Triage Stage: Ready for checkinAccepted

Huwaiza please refrain from assigning ticket to yourself mid conversation with reports and changing ticket status -- RFC is a status meant that the patch is ready to be checked in and there's no patch here. On top of that you can't mark your own patch as RFC.

Refer to the triaging ticket documentation.


Thanks for confirming Ratskó, I'll see if I can write a regression test that reproduces the issue as I've been unable to do so yet. #28062 is a related ticket, savepoint rollback might have been a cause all along.

Last edited 4 weeks ago by Simon Charette (previous) (diff)

comment:5 by Simon Charette, 4 weeks ago

I was able to reproduce with the following test

  • tests/backends/postgresql/test_server_side_cursors.py

    diff --git a/tests/backends/postgresql/test_server_side_cursors.py b/tests/backends/postgresql/test_server_side_cursors.py
    index 9a6457cce6..900a2dc229 100644
    a b  
    11import operator
     2import gc
    23import unittest
    34from collections import namedtuple
    45from contextlib import contextmanager
    56
    6 from django.db import connection, models
     7from django.db import connection, models, transaction
    78from django.db.utils import ProgrammingError
    89from django.test import TestCase
    910from django.test.utils import garbage_collect
    class ServerSideCursorsPostgres(TestCase):  
    154155            # most likely need to be adapted.
    155156            with self.assertRaises(ProgrammingError):
    156157                perform_query()
     158
     159    def test_transaction_cursor_closing(self):
     160        with self.assertRaises(ValueError), transaction.atomic():
     161            persons = Person.objects.iterator()
     162            next(persons)
     163            raise ValueError
     164        del persons
     165        gc.collect()
     166        list(Person.objects.all())

Without the last query we still get presented with a resource warning

Exception ignored while closing generator <generator object cursor_iter at 0x7ffb1b97fab0>:
Traceback (most recent call last):
  File "/django/source/django/db/models/sql/compiler.py", line 2266, in cursor_iter
    cursor.close()
  File "/usr/local/lib/python3.14/site-packages/psycopg/_server_cursor.py", line 73, in close
    self._conn.wait(self._close_gen())
  File "/usr/local/lib/python3.14/site-packages/psycopg/connection.py", line 484, in wait
    return waiting.wait(gen, self.pgconn.socket, interval=interval)
  File "psycopg_binary/_psycopg/waiting.pyx", line 241, in psycopg_binary._psycopg.wait_c
  File "/usr/local/lib/python3.14/site-packages/psycopg/_server_cursor_base.py", line 163, in _close_gen
    yield from self._conn._exec_command(query)
  File "/usr/local/lib/python3.14/site-packages/psycopg/_connection_base.py", line 483, in _exec_command
    raise e.error_from_result(result, encoding=self.pgconn._encoding)
psycopg.errors.InvalidCursorName: cursor "_django_curs_140716552528704_sync_1" does not exist
Exception ignored while calling deallocator <function ServerCursorMixin.__del__ at 0x7ffb1c113060>:
Traceback (most recent call last):
  File "/usr/local/lib/python3.14/site-packages/psycopg/_server_cursor_base.py", line 55, in __del__
    __warn(
ResourceWarning: <django.db.backends.postgresql.base.ServerSideCursor object at 0x55b16299aac0> was deleted while still open. Please use 'with' or '.close()' to close the cursor properly

The approach in the Claude generated MR has merit but it's too naive in the sense that it tries to close all server side cursors on each savepoint rollbacks but these can be nested and we likely want to keep weakrefs to cursors as to avoid memory leaks.

An ideal solution would weak track cursors by savepoint ID and only do so in nested transactions. I'll note that even if we do so failing to consume iterators returned by QuerySet.iterator entirely will always incur a ResourceWarning (transaction or not) so maybe we should also (or instead) adjust the documentation to mention that the following pattern should be used instead?

import contextlib

def export_items():
    try:
        with (
            transaction.atomic(),
            contextlib.closing(Item.objects.iterator()) as items
         ):
            for item in items:
                if item.value == "bad":
                    raise ValueError("Export failed")
                process(item)
    except ValueError:
        Item.objects.create(value="error logged")

to enforce explicit closing of iterators? Maybe we should even have the object returned by QuerySet.iterator be a context closing of itself so we can document

def export_items():
    try:
        with (
            transaction.atomic(),
            Item.objects.iterator() as items,
         ):
            for item in items:
                if item.value == "bad":
                    raise ValueError("Export failed")
                process(item)
    except ValueError:
        Item.objects.create(value="error logged")

comment:6 by Vidhi Singh, 4 weeks ago

May I work on this?

Last edited 4 weeks ago by Vidhi Singh (previous) (diff)

comment:7 by Vidhi Singh, 4 weeks ago

Owner: set to Vidhi Singh
Status: newassigned

comment:8 by Vidhi Singh, 3 weeks ago

I’ve worked on this issue and opened a PR to make server-side cursor closing more
resilient when iterator() is interrupted inside transaction.atomic().

The fix ensures that delayed cursor.close() calls do not raise InvalidCursorName
and break the transaction.

PR -> https://github.com/django/django/pull/20991

Last edited 3 weeks ago by Vidhi Singh (previous) (diff)
Note: See TracTickets for help on using tickets.
Back to Top