Postgres anti join

9/13/2023

From the thread "Wrong results with postgres_fdw and merge anti join from RHEL 7.9 to RHEL 8.7" (follow-ups from Daniel Westermann (DWE) and Jim Mlodgenski):

> I am not sure if this qualifies as bug, but anyway:
> Source instance: PostgreSQL 13.7 on RHEL 7.9
> Target instance PostgreSQL 13.7 on RHEL 8.7
> SELECT * FROM "rsu_adm"."data_2d_clb_global_product" f2
> LEFT OUTER JOIN "rsu_adm"."clb_global_product" f1 on f1.cprd=f2.cprd
> Per default we see a merge anti join, and this gives results, which is wrong:

You didn't provide anything useful like the table schemas, but correctness of a merge join depends on the servers having the same ideas about sort ordering, and if "cprd" is a text-type column then inconsistent collations could break that. The given plan is at hazard for that because it intends to do ...

> Merge Cond: ((f2.cprd)::text = (clb_global_product.cprd)::text)
> -> Index Scan using data_2d_clb_global_product_pkey on rsu_adm.data_2d_clb_global_product f2 (cost=.56 rows=101426 width=34)
> Output: f2.cprd, f2.xtc_id, f2.rprd, f2.prdgalsts_id, f2.dlz_last_transaction_ts
> -> Foreign Scan on ro_dlz.clb_global_product (cost=16.16 rows=923613 width=34)
> Output: clb_global_product.cprd, clb_global_product.xtc_id, clb_global_product.rprd, clb_global_product.prdgalsts_id, clb_global_product.dlz_last_transaction_ts
> Remote SQL: SELECT cprd, xtc_id, rprd, prdgalsts_id, dlz_last_transaction_ts FROM ro_rsu.clb_global_product ORDER BY cprd ASC NULLS LAST

> I am aware that the version of glibc is not the same between those red hats.

That's certainly a hazard, but do the servers even have the same ...

The same kind of query shows up in Django. Basically, in a many-to-many mapping between models Student and Course, if I want to find all instances of Students that aren't registered for classes, I would issue the following Django query:

    Student.objects.filter(course__isnull=True)

Django translates this into a query that joins through the intermediate table and on to the target table:

    ... ON ("student"."id" = "course_students"."student_id")
    ... ON ("course_students"."course_id" = "course"."id")

The problem is that the way the WHERE clause is generated is very inefficient (at least when used with Postgres). Changing WHERE to "course_students"."student_id" IS NULL yields a query plan that is orders of magnitude better. I'm attaching a sample project with the model already set up. To see the generated SQL query, simply run "python manage.py anti-join."

The query is asking for items with no m2m assigned items. Thus a join against only the m2m intermediate table should suffice. The last join in that chain is not necessary at all. Foreign keys ensure that if there is a row in the course_students table, then there will be a matching row in the course table. This means that if the first join yields any rows, so must the second join. Naturally, if there are no rows in the first join, then there will be no rows in the second join, either. The last join is not needed, as the null/not null status of the first join already tells us whether the filter condition holds.

There is a slight complication because of the deferred foreign keys used in Django: it is possible to insert a row into the course_students table without there being a matching row in the course table. I don't know if that is something Django should guard against. If the user ends up in a situation where they have a row in the m2m table, but not in the target table, they have done something wrong: m2m targets should always be saved into the DB first before adding them to m2m collections. This complication of deferred foreign keys also makes this a hard optimization target for databases. Django has an advantage in that even if we are using deferred foreign keys, we know that in normal usage the defer should not have any effect.

Sidenote: why exactly are we using deferred foreign keys, rather than foreign keys declared DEFERRABLE INITIALLY IMMEDIATE that are set to deferred only when needed in fixture loading? Or am I missing some other valid use case for deferred foreign keys in Django? If so, that would naturally also make the above join-removal optimization invalid.

I realized over the weekend that the same issue exists for one-to-many relationships:

    Parents = models.ForeignKey(to=Parent, db_column="parent_id")

    SELECT "demo_parent"."id", "demo_parent"."name"
    ... ON ("demo_parent"."id" = "demo_child"."parent_id")

So it is the same issue as with many-to-many, where this query is orders of magnitude slower than a slightly modified one.
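The join-removal argument for the Student/Course case can be checked on a toy database. The sketch below is a rough illustration, not Django's actual output: it uses SQLite from the Python standard library rather than Postgres, the full text of the two queries (the SELECT list and the WHERE on "course"."id") is an assumed reconstruction around the join conditions quoted in the discussion, and the sample rows are made up. The tables deliberately carry no foreign key constraints so that the orphan-row case, which a deferred foreign key would normally reject at commit time, can be shown.

```python
import sqlite3

# Assumed shape of the generated query: both joins, filter on the last table.
TWO_JOINS = """
SELECT student.id
FROM student
LEFT OUTER JOIN course_students ON (student.id = course_students.student_id)
LEFT OUTER JOIN course ON (course_students.course_id = course.id)
WHERE course.id IS NULL
ORDER BY student.id
"""

# Proposed rewrite: join only the intermediate table and filter on it.
ONE_JOIN = """
SELECT student.id
FROM student
LEFT OUTER JOIN course_students ON (student.id = course_students.student_id)
WHERE course_students.student_id IS NULL
ORDER BY student.id
"""


def make_db():
    """Build an in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE student (id INTEGER PRIMARY KEY);
        CREATE TABLE course (id INTEGER PRIMARY KEY);
        -- No FK constraints on purpose, so an orphan row can be inserted later.
        CREATE TABLE course_students (student_id INTEGER, course_id INTEGER);
        INSERT INTO student (id) VALUES (1), (2), (3), (4);
        INSERT INTO course (id) VALUES (10), (20);
        INSERT INTO course_students (student_id, course_id)
            VALUES (1, 10), (1, 20), (2, 10);
        """
    )
    return conn


def student_ids(conn, sql):
    """Run one of the queries and return the student ids as a list."""
    return [row[0] for row in conn.execute(sql)]


if __name__ == "__main__":
    conn = make_db()
    # With consistent data both forms agree.
    print(student_ids(conn, TWO_JOINS), student_ids(conn, ONE_JOIN))

    # Orphan row: student 4 points at a course that does not exist.
    conn.execute(
        "INSERT INTO course_students (student_id, course_id) VALUES (4, 99)"
    )
    # Now the two forms disagree about student 4.
    print(student_ids(conn, TWO_JOINS), student_ids(conn, ONE_JOIN))
```

With consistent data the two queries return the same students. Once the orphan row exists they differ, which is the deferred foreign key complication raised in the discussion: the one-join form is only equivalent when every row in course_students has a matching course.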
