Ticket #5137 (closed defect: fixed)

Opened 1 year ago

Last modified 1 year ago

Taxonomy intersection queries are inefficient

Reported by: ryan Assigned to: anonymous
Priority: normal Milestone: 2.3.1
Component: General Version: 2.3
Severity: normal Keywords: taxonomy performance
Cc:

Description

Performing intersection queries by equating like attributes is very inefficient when intersecting more than a few terms. Let's use the method where we perform a union query, count the number of terms for each post, and fetch the posts that have all terms.

Attachments

intersection.diff (1.4 kB) - added by ryan on 10/03/07 18:59:12.
New query for tag_slugand from Alexander Concha
intersection.2.diff (3.0 kB) - added by xknown on 10/03/07 21:49:34.
New query for tag and category intersections

Change History

10/03/07 18:59:12 changed by ryan

  • attachment intersection.diff added.

New query for tag_slugand from Alexander Concha

10/03/07 21:49:34 changed by xknown

  • attachment intersection.2.diff added.

New query for tag and category intersections

10/03/07 21:56:54 changed by xknown

I improved a bit my first suggestion since category__and, tag__and and tag_slug__and should use similar queries, see intersection.2.diff.

10/04/07 08:09:57 changed by Otto42

Few problems:

1. The resulting SQL these patches make doesn't work. Add spaces to the beginning of your quoted strings so you don't shove words together by accident like that.

2. What in the heck is the point of the $where = "AND 0 = 1"; bits?

10/04/07 17:49:13 changed by ryan

Without 0 = 1, if the intersection queries don't match anything, we fall through and do the regular front page query. We need to prevent that query from matching matching anything.

10/04/07 18:10:04 changed by ryan

  • status changed from new to closed.
  • resolution set to fixed.

(In [6186]) New taxonomy intersection queries from xknown. fixes #5137

10/04/07 18:11:03 changed by ryan

(In [6187]) New taxonomy intersection queries from xknown. fixes #5137 for 2.3

10/05/07 07:48:15 changed by Otto42

  • status changed from closed to reopened.
  • resolution deleted.

As I pointed out, these won't work.

$tsql = "SELECT p.ID FROM ... blah ... ON (tt.term_id = t.term_id)"; 
$tsql .= "WHERE tt.taxonomy = '$taxonomy' AND t.$taxonomy_field IN ('" . implode("', '", $q[$item]) . "')"; 
$tsql .= "GROUP BY p.ID HAVING count(p.ID) = " . count($q[$item]);

You end up with no spaces between the ) and the WHERE or the ) and the GROUP. The query should fail unless MySQL is a lot smarter than any other database I've ever used.

10/05/07 10:18:04 changed by DD32

You end up with no spaces between the ) and the WHERE or the ) and the GROUP. The query should fail unless MySQL is a lot smarter than any other database I've ever used.

MySQL must be smarter than any other database you've used then. MySQL handles that SQL perfectly (It doesnt need the space between the bracket and WHERE/GROUP statement; If the brackets were left out of that statement however, MySQL wouldnt handle too well).

Mind you, I would suggest that the spaces be added at least for those who want to try and use other databases, as they're probably not as forgiving.

10/05/07 16:27:48 changed by ryan

  • status changed from reopened to closed.
  • resolution set to fixed.

(In [6189]) Add some spacing. fixes #5137 for trunk

10/05/07 16:28:42 changed by ryan

(In [6190]) Add some spacing. fixes #5137 for 2.3

(follow-up: ↓ 11 ) 10/08/07 14:56:01 changed by Denis-de-Bernardy

  • status changed from closed to reopened.
  • resolution deleted.

why not use a subquery?

D.

(in reply to: ↑ 10 ) 10/08/07 18:49:25 changed by foolswisdom

  • status changed from reopened to closed.
  • resolution set to fixed.

Replying to Denis-de-Bernardy:

why not use a subquery?

Why not attach a patch.

10/10/07 20:17:07 changed by Denis-de-Bernardy

  • status changed from closed to reopened.
  • resolution deleted.

because if you've the code in there already... just replace:

$post_ids = $wpdb->get_col($tsql);

(and the stuff that follows)

with:

$where .= " AND $wpdb->posts.ID IN ( $tsql ) ";

it save a return trip to the sql server and some needless array processing

D.

(follow-up: ↓ 14 ) 10/10/07 20:25:46 changed by Denis-de-Bernardy

Also, some testing will highlight that inner joins are a lot faster than left joins.

but if it's just to answer "attach a patch" like you did in your last post, just close the ticket. As much as I like WP, I've absolutely no time to waste writing patches that almost always get discarded.

(in reply to: ↑ 13 ) 10/10/07 21:47:39 changed by foolswisdom

Replying to Denis-de-Bernardy:

but if it's just to answer "attach a patch" like you did in your last post, just close the ticket. As much as I like WP, I've absolutely no time to waste writing patches that almost always get discarded.

Denis, you know the routine. If you are finding your issues and patches aren't getting the attention they desire then bring it up on wp-hackers. Some of your tickets aren't good reports because they lack the necessary details, and don't allow myself or anyone else to advocate the problem's correction.

10/11/07 01:24:20 changed by ryan

  • status changed from reopened to closed.
  • resolution set to fixed.

I think subqueries were introduced in 4.1 and we still have a 4.0 or greater dependency. When we bump the dependency we can move to subqueries.

I'll open another ticket about using INNER JOINs rather than LEFT JOINs. Closing this ticket.