Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#13073 closed (invalid)

Duplicate rows when checking ID

Reported by: jnadro52
Owned by: nobody
Component: Database layer (models, ORM)
Version: 1.1
Severity:
Keywords:
Cc:
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no



I hope that this is a legitimate bug and I'm not wasting your time with something that is a slight misunderstanding of how the Django database layer works. Here we go...

I have two tables... a table that holds football matchups (texans vs. giants, for example) and a table that caches the statistics for the game. The table that caches the statistics would have a row per team, so this is what the data would look like:

ID       team               opponent              matchup_id
1        texans             giants                1
2        giants             texans                1

So now I want to retrieve these two rows by matchup ID. Let's say the model names are "matchup" and "cache". I attempted to retrieve them like so:

caches = cache.objects.filter(matchup=1)

When I do so, I do get two objects returned, but both are the same cache row. That is, I get a QuerySet containing two objects that have the same primary key.

What really confused me is that if I add a .distinct() call to the end of that query, I get the expected results. If a join were occurring I would understand getting duplicates, although I would think there should be no join in this call, since I am just checking a foreign key. The kicker is that, without the distinct call, I always get row id 2 and never row id 1. If this were a cross join, I should get multiple duplicates of all rows that meet the criteria, but I do not.

I hope I explained this well enough, please feel free to contact me at my given email if you have any questions.

Change History (4)

comment:1 Changed 9 years ago by Russell Keith-Magee

Resolution: worksforme
Status: new → closed

Closing worksforme. I can't reproduce your problem. If what you are describing were happening as you describe.

A minimal example (that is - actual models, sample data and queries) that demonstrates the problem is essential if you want to report a problem like this.

comment:2 Changed 9 years ago by jnadro52

Resolution: worksforme
Status: closed → reopened

I will try to provide more information to allow you to reproduce this bug. I'm surprised that I am seeing this behavior and you are not, as it seems to be a very simple case. I see this in Django 1.1 final.


from django.db import models

class Team(models.Model):
    name = models.CharField(max_length=255, help_text='Team Name')

# PlayPickObject is a plain (non-Model) mixin; its definition is not shown.
class Matchup(models.Model, PlayPickObject):
    league = models.ForeignKey('SportBettingApp.League', help_text='Matchup\'s League')

class TeamVsTeamStatCache(models.Model):
    snapshotDate = models.DateTimeField(help_text='The date this cache is for')
    matchup = models.ForeignKey('SportBettingApp.Matchup', help_text='The matchup object this cache refers to', null=True, default=None)
    team = models.ForeignKey(Team, help_text='Team these stats refer to', editable=False)
    opponent = models.ForeignKey(Team, related_name='opponent', help_text='The opponent team this team is playing', editable=False)

2> import SportBettingApp.models
3> sMod = SportBettingApp.models
4> tvt = sMod.TeamVsTeamStatCache
5> caches = tvt.objects.filter(matchup=9598)
6> caches[0].id
<6> 2
7> caches[1].id
<7> 2

As you can see, this is a very simple model structure. If you are curious about the PlayPickObject that Matchup inherits from, let me know. It is not a Model class, so it should have no bearing on the ORM features; it only adds some methods and properties.

I am reopening this. I hope you don't mind; I just don't want it to fall by the wayside. Thanks.

comment:3 Changed 9 years ago by Karen Tracey

Resolution: invalid
Status: reopened → closed

It appears you are indexing into an unordered QuerySet. Each caches[0].id style of access results in a separate database query that uses OFFSET and LIMIT to retrieve exactly one result. If you have not done anything to require a unique ordering for the query results, the database is free to return any matching result for any particular OFFSET (and PostgreSQL in particular is known to take advantage of this).
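The shape of those per-index queries can be sketched without Django, using the standard-library sqlite3 module. The table layout and the helper row_at below are hypothetical stand-ins, not Django's generated SQL. SQLite happens to return rows in a stable order for this tiny table, so the sketch only illustrates the queries and the ORDER BY fix; it does not reproduce the PostgreSQL behaviour.

```python
import sqlite3

# Hypothetical stand-in for the stat cache table described in the ticket.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cache ("
    "id INTEGER PRIMARY KEY, team TEXT, opponent TEXT, matchup_id INTEGER)"
)
conn.executemany(
    "INSERT INTO cache VALUES (?, ?, ?, ?)",
    [(1, "texans", "giants", 1), (2, "giants", "texans", 1)],
)

def row_at(offset, ordered):
    """Fetch one id the way an index access would: LIMIT 1 OFFSET n.

    Without ORDER BY the database may pick any matching row for a given
    offset; with ORDER BY id the offset maps to exactly one row.
    """
    order_clause = "ORDER BY id " if ordered else ""
    sql = (
        "SELECT id FROM cache WHERE matchup_id = ? "
        + order_clause
        + "LIMIT 1 OFFSET ?"
    )
    return conn.execute(sql, (1, offset)).fetchone()[0]

# With an explicit, unique ordering the two offsets cannot collide.
ordered_ids = [row_at(0, ordered=True), row_at(1, ordered=True)]
print(ordered_ids)
```

In Django terms, the corresponding fix would be to give the QuerySet a unique ordering before indexing it, for example tvt.objects.filter(matchup=9598).order_by('id').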

comment:4 Changed 9 years ago by jnadro52


I feel so stupid. I forgot the simple fact that all of these querysets are lazily evaluated. For some reason, I expected an array of items from the database, rather than a queryset that runs a new query every time it is accessed by index. For example, if I enumerate the items, they certainly DO get emitted correctly.
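The difference between indexing and enumerating can be sketched with a toy class. LazyRows below is a hypothetical illustration written against sqlite3; it is not Django's QuerySet implementation, and the table layout is assumed. It only models the point made above: each index access is its own query, while iterating fetches all rows in a single query.

```python
import sqlite3

class LazyRows:
    """Toy stand-in for an unevaluated queryset (not Django's real class)."""

    def __init__(self, conn, matchup_id):
        self.conn = conn
        self.matchup_id = matchup_id
        self.queries = 0  # number of SQL statements issued so far

    def __getitem__(self, index):
        # Every index access issues a fresh LIMIT 1 OFFSET n query.
        self.queries += 1
        row = self.conn.execute(
            "SELECT id FROM cache WHERE matchup_id = ? "
            "ORDER BY id LIMIT 1 OFFSET ?",
            (self.matchup_id, index),
        ).fetchone()
        if row is None:
            raise IndexError(index)
        return row[0]

    def __iter__(self):
        # Iterating fetches every matching row in one query.
        self.queries += 1
        rows = self.conn.execute(
            "SELECT id FROM cache WHERE matchup_id = ? ORDER BY id",
            (self.matchup_id,),
        ).fetchall()
        return iter([r[0] for r in rows])

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache (id INTEGER PRIMARY KEY, matchup_id INTEGER)")
conn.executemany("INSERT INTO cache VALUES (?, ?)", [(1, 1), (2, 1)])

lazy = LazyRows(conn, 1)
by_index = [lazy[0], lazy[1]]         # two separate queries
queries_for_indexing = lazy.queries
snapshot = list(lazy)                 # one query, one consistent result set
print(queries_for_indexing, snapshot)
```

With a real QuerySet, calling list(caches) once and indexing into the resulting list similarly avoids issuing one query per index.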

I'm sorry if I've wasted your time. Thank you.
