Ticket #3985 (closed defect: fixed)

Opened 1 year ago

Last modified 1 year ago

_get_cat_children is very inefficient

Reported by: ryan Assigned to:
Priority: high Milestone: 2.2
Component: Administration Version: 2.1.2
Severity: major Keywords: 2nd-opnion category optimization hit-list
Cc: charleshooper

Description

_get_cat_children is slow and stupid. Doing DB queries to get the children of each cat is faster than using _get_cat_children.

One proposal is to store an array that contains the hierarchy. array(parent => child, child => grandchild). Categories that have no children would not be present in the array. The drawback is that this would require keeping the array in sync with category changes. Any other suggestions to make creating the hierarchy more efficient while minimizing DB queries?

Attachments

3985.diff (2.0 kB) - added by ryan on 04/03/07 23:47:57.
First cut at caching category children

Change History

03/17/07 02:54:48 changed by ryan

See also #3614

03/17/07 08:33:09 changed by JeremyVisser

Please forgive my ignorance, as your PHP skills would be way better than mine, but when I do hierachies in arrays, I generally use matrixes (matrices?):

array(
    parent => array(
        child => array(
            grandchild => 'value'
        ),
    ),
)

Is this what you meant, or is your method better performance-wise? Or should I just stay away from Trac? ;)

03/17/07 08:39:28 changed by ryan

Acutally, we should probably do it with left and right pointers.

parent => (root, child), child => (parent, grandchild)

03/22/07 05:52:43 changed by charleshooper

  • cc set to charleshooper.

I agree.

If we use the hierarchy method mentioned above we would recursively have to search through the array whereas with the left and right pointer-method it wouldn't matter how "deep" the array in question was, the search is simple.

I have some free time tonight, I'd love to get some work done on this.

03/22/07 06:52:11 changed by charleshooper

  • keywords changed from category optimization to 2nd-opnion category optimization.
  • owner changed from anonymous to charleshooper.
  • status changed from new to assigned.

Just clarifying some things here -- feel free to correct me if I'm wrong in these

- Syncing the array should be done only when categories are created, edited, or deleted - A "sanity check" should be done upon access to the Manage Categories page. This is how it's done for plugins anyways, so I'm just trying to keep our methods consistent.

On second thought, perhaps instead of a sanity check this could be a good time to rebuild the array completely from the database.

I would also like this array to be storable using Wordpress' built-in caching functions.

03/22/07 07:08:50 changed by charleshooper

For quickly and efficiently building the array I recommend we add a table to the schema: wp_cat_hierarchy

wp_cat_hierarchy would consist of 2 columns: parent_id and child_id. There would be a unique key on the pair. This would allow us to quickly and efficiently build our hierarchy array.

03/27/07 17:42:35 changed by foolswisdom

  • milestone changed from 2.2 to 2.3.

03/27/07 18:45:43 changed by ryan

  • keywords changed from 2nd-opnion category optimization to 2nd-opnion category optimization hit-list.
  • milestone changed from 2.3 to 2.2.

Actually, I'd like to try to get this into 2.2. WP sucks with a large number of categories because of this bug.

03/27/07 18:46:58 changed by ryan

For 2.2, I think an array approach would suffice. I'd rather not muck with new tables right now.

03/27/07 19:14:53 changed by foolswisdom

  • priority changed from low to high.
  • severity changed from normal to major.

03/27/07 22:41:52 changed by charleshooper

  • owner deleted.
  • status changed from assigned to new.

04/03/07 23:47:57 changed by ryan

  • attachment 3985.diff added.

First cut at caching category children

04/03/07 23:50:03 changed by ryan

Patch creates array of categories and their children and sticks it in the options DB. Array is consulted to avoid descending through _get_cat_children() when there are no children. Next pass will remove the recursion.

04/04/07 20:44:48 changed by ryan

(In [5178]) Cache category hierarchy to make cat listing faster. Phase 1. see #3985

04/04/07 20:49:09 changed by ryan

That change avoids walking the category list for cats that have no children. With a categories table containing 3100 cats arranged in a flat hierarchy, time need to list categories went from minutes to seconds. Next I need to eliminate recursion altogether to sped up listing for those who use lots of hierarchy.

04/04/07 22:32:26 changed by ryan

(In [5179]) Category listing speedups. see #3985

04/05/07 04:00:21 changed by yskin

What about replace

delete_option('category_children'); 

in clean_category_cache() to

update_option('category_children', ""); 

clean_category_cache() will be called when publishing or editing posts. Category_children option will be deleted and added very often.

An empty string can work well, because _get_category_hierarchy() use is_array() to check category_children option. If it is not an array, it will be identified as invalid cache.

04/11/07 03:28:11 changed by rob1n

Or change it to update_option('category_children', array());

So it gets serialized and returns a blank array, so it's a valid cache.

04/23/07 20:28:38 changed by ryan

(In [5295]) Fix variable collission in _get_cat_children. see #3985

04/23/07 20:31:48 changed by ryan

(In [5296]) Fix variable collission in _get_cat_children. For 2.2. see #3985

05/12/07 22:08:01 changed by ryan

  • status changed from new to closed.
  • resolution set to fixed.

Let's call this good enough for now.