
[#6982] Optimized Tags
Summary Optimized Tags
Queue Ansel
Type Enhancement
State Resolved
Priority 1. Low
Owners mrubinsk (at) horde (dot) org
Requester duck (at) obala (dot) net
Created 06/26/2008 (6218 days ago)
Due
Updated 12/14/2008 (6047 days ago)
Assigned 12/14/2008 (6047 days ago)
Resolved 12/14/2008 (6047 days ago)
Milestone
Patch No

History
12/14/2008 05:45:24 PM Michael Rubinsky Comment #19
State ⇒ Resolved
Agreed. This will be taken care of once Ansel is ported to use the 
Tagger code.
12/14/2008 03:56:12 PM Chuck Hagenbuch Comment #18
State ⇒ Assigned
I think that this can probably be resolved for now, though I'll leave 
it up to Michael. A lot of different options are now available from 
the Rampage Tagger code in the incubator, including stat tables that 
take care of caching counts for tags and tags + users (and this adds 
user_id to the current Ansel structure, allowing folksonomy-type stuff 
which Ansel doesn't currently have).



On HAVING vs. the JOIN - HAVING might be quicker in some cases but 
using a JOIN also lets us exclude tags from the result set, and it 
also avoids some filesorting most of the time. So the rampage tagger 
goes with the JOIN solution.
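
A rough sketch of the two query shapes being compared, run against an in-memory SQLite table. The column names (image_id, tag_id) and the sample rows are assumptions for illustration, not the actual Ansel or Rampage Tagger schema, and "exclude tags" is read here as "drop objects carrying a given tag".

```python
import sqlite3

# Hypothetical, simplified schema: one row per (image, tag) pair.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ansel_images_tags (image_id INTEGER, tag_id INTEGER);
INSERT INTO ansel_images_tags VALUES
  (1, 10), (1, 20),
  (2, 10),
  (3, 10), (3, 20), (3, 30);
""")

# GROUP BY / HAVING: images that carry both tag 10 and tag 20.
having_sql = """
SELECT image_id
FROM ansel_images_tags
WHERE tag_id IN (10, 20)
GROUP BY image_id
HAVING COUNT(DISTINCT tag_id) = 2
ORDER BY image_id
"""

# JOIN: one self-join per required tag, plus a LEFT JOIN that
# excludes any image also carrying tag 30.
join_sql = """
SELECT t1.image_id
FROM ansel_images_tags t1
JOIN ansel_images_tags t2
  ON t2.image_id = t1.image_id AND t2.tag_id = 20
LEFT JOIN ansel_images_tags x
  ON x.image_id = t1.image_id AND x.tag_id = 30
WHERE t1.tag_id = 10 AND x.image_id IS NULL
ORDER BY t1.image_id
"""

having_ids = [row[0] for row in conn.execute(having_sql)]
join_ids = [row[0] for row in conn.execute(join_sql)]
print(having_ids, join_ids)
```

The HAVING form has no natural place for the exclusion, which is the flexibility argument made above; which form is faster depends on the database and its indexes.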
11/13/2008 01:48:01 AM Michael Rubinsky Comment #17
In the meantime, I changed the search queries to use GROUP BY / HAVING 
so we can start comparing them with the queries that we had...any 
feedback on performance is welcome. I'm going to port these changes 
over to my production server data and see what it gets us.
11/13/2008 01:46:14 AM CVS Commit Comment #16
Changes have been made in CVS for this ticket:

http://cvs.horde.org/diff.php/ansel/lib/Tags.php?r1=1.87&r2=1.88&ty=u
11/13/2008 12:55:41 AM Michael Rubinsky Comment #15
State ⇒ Feedback
1) This patch does nothing to update the count when a tag is removed. 
You are only updating the tags that are passed to 
Ansel_Tags::writeTags(). This function *always* overwrites the 
resource's tags, so you're not capturing when a tag is removed.

2) Was there any special reason you removed the charset conversion in 
some of the queries?

3) The variable $id is not always set; that is why I was using the 
$results['tag_id'] value - any special reason you didn't want to use 
that?
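
To illustrate point 1, here is a minimal sketch of an overwrite that adjusts the stored count for removed tags as well as added ones. The write_tags function, the column names, and the sample data are hypothetical stand-ins, not the real Ansel_Tags::writeTags() code.

```python
import sqlite3

def write_tags(conn, image_id, new_tag_ids):
    """Overwrite an image's tags, adjusting tag_count for added AND removed tags."""
    old = {row[0] for row in conn.execute(
        "SELECT tag_id FROM ansel_images_tags WHERE image_id = ?", (image_id,))}
    new = set(new_tag_ids)
    # Tags dropped by the overwrite: remove the link and decrement the count.
    for tag_id in old - new:
        conn.execute(
            "DELETE FROM ansel_images_tags WHERE image_id = ? AND tag_id = ?",
            (image_id, tag_id))
        conn.execute(
            "UPDATE ansel_tags SET tag_count = tag_count - 1 WHERE tag_id = ?",
            (tag_id,))
    # Newly added tags: add the link and increment the count.
    for tag_id in new - old:
        conn.execute(
            "INSERT INTO ansel_images_tags (image_id, tag_id) VALUES (?, ?)",
            (image_id, tag_id))
        conn.execute(
            "UPDATE ansel_tags SET tag_count = tag_count + 1 WHERE tag_id = ?",
            (tag_id,))
    conn.commit()

# Assumed sample data: image 1 starts with tags 10 and 20.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ansel_tags (tag_id INTEGER PRIMARY KEY, tag_count INTEGER NOT NULL);
CREATE TABLE ansel_images_tags (image_id INTEGER, tag_id INTEGER);
INSERT INTO ansel_tags VALUES (10, 1), (20, 1), (30, 0);
INSERT INTO ansel_images_tags VALUES (1, 10), (1, 20);
""")

write_tags(conn, 1, [20, 30])  # drops tag 10, keeps 20, adds 30
counts = dict(conn.execute("SELECT tag_id, tag_count FROM ansel_tags"))
print(counts)
```

Comparing the old and new tag sets is what lets the removed tag (10 here) be decremented; a version that only looks at the tags passed in would miss it.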




10/31/2008 04:21:38 PM Duck Comment #14
New Attachment: Tags.diff
> Removing as a showstopper, tags work, this would be a nice
> performance enhancement though, once the gallery vs image tag count
> issue is resolved.
And here it is solved. When we fill the ansel_tags table, we count 
both the images and the galleries tables. To keep it simple, I reduced 
the changes to just introducing the tag_count column and avoiding the 
double UNION SELECT. Other performance patches will come after this 
one is committed.


10/21/2008 09:02:17 PM Chuck Hagenbuch Comment #13
10/20/2008 06:56:34 PM Michael Rubinsky Comment #12
State ⇒ Assigned
An extremely simplistic idea for better tag-related queries:



http://blogs.sun.com/dups/entry/having_a_grand_old_time



Going to try to refactor Ansel's tag code to see what this gets us, 
what other issues it might bring up etc....
07/16/2008 03:30:58 AM Michael Rubinsky Deleted Original Message
 
07/16/2008 03:29:49 AM Michael Rubinsky Comment #11
Milestone ⇒
Removing as a showstopper, tags work, this would be a nice performance 
enhancement though, once the gallery vs image tag count issue is 
resolved.
07/02/2008 08:50:17 PM Michael Rubinsky Comment #10
> Huh?  You're saving the actual number of tags in ansel_tags.tag_count
Sorry, that should be "the actual number of times that a tag is used in..."


07/02/2008 08:49:01 PM Michael Rubinsky Comment #9
> Actually, we don't need to know the actual images/galleries tag
> count. It is used only as a flag for whether the gallery/image has
> tags, to save later queries. So this can be a boolean value.
Huh?  You're saving the actual number of tags in ansel_tags.tag_count 
and then you're using the value of that field directly in 
Ansel_Tags::readTags() and Ansel_Tags::listTagInfo().  With your 
changes, once any tags are added/removed from an image or gallery, the 
tag counts will never be correct because the value in the 
ansel_tags.tag_count field is only taking into account the number of 
times that tag is used in _either_ the images table *or* the galleries 
table, when it needs to be both.
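
A small sketch of what "both" would mean for the stored value: tag_count is recomputed as the sum of the tag's uses in the image and gallery link tables. Table names follow the ones mentioned in this ticket; the column names and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ansel_tags (
    tag_id INTEGER PRIMARY KEY, tag_name TEXT, tag_count INTEGER DEFAULT 0);
CREATE TABLE ansel_images_tags (image_id INTEGER, tag_id INTEGER);
CREATE TABLE ansel_galleries_tags (gallery_id INTEGER, tag_id INTEGER);
INSERT INTO ansel_tags (tag_id, tag_name) VALUES (1, 'beach'), (2, 'snow');
INSERT INTO ansel_images_tags VALUES (100, 1), (101, 1), (102, 2);
INSERT INTO ansel_galleries_tags VALUES (7, 1);
""")

# Recompute every tag's count from BOTH link tables, not just one of them.
conn.execute("""
UPDATE ansel_tags SET tag_count =
    (SELECT COUNT(*) FROM ansel_images_tags i
      WHERE i.tag_id = ansel_tags.tag_id)
  + (SELECT COUNT(*) FROM ansel_galleries_tags g
      WHERE g.tag_id = ansel_tags.tag_id)
""")
conn.commit()

totals = dict(conn.execute("SELECT tag_name, tag_count FROM ansel_tags"))
print(totals)
```

Here 'beach' is used by two images and one gallery, so a count taken from either table alone would be wrong.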
07/02/2008 07:56:22 AM Duck Comment #8
> It looks like these changes will cause the tag_count field to be
> overwritten with the count of tags in *either* the ansel_images_tags
> or ansel_galleries_tags table...when it should be both.
Actually, we don't need to know the actual images/galleries tag 
count. It is used only as a flag for whether the gallery/image has 
tags, to save later queries. So this can be a boolean value.
06/30/2008 07:22:37 PM Michael Rubinsky State ⇒ Feedback
 
06/30/2008 03:54:13 PM Michael Rubinsky Comment #7
It looks like these changes will cause the tag_count field to be 
overwritten with the count of tags in *either* the ansel_images_tags 
or ansel_galleries_tags table...when it should be both.
06/30/2008 01:56:46 PM Chuck Hagenbuch Comment #6
>> and why not, a tag cloud page...
> To be honest, I feel that an entire page containing only a tag cloud
> is a bit, well, boring and wasteful considering that the "Browse"
> page can be configured with a cloud of whatever size the user wants.
I agree.
06/30/2008 01:54:29 PM Michael Rubinsky Milestone ⇒ 1
 
06/30/2008 01:38:51 PM Michael Rubinsky Assigned to Michael Rubinsky
State ⇒ Assigned
 
06/30/2008 01:38:25 PM Michael Rubinsky Comment #5
> and why not, a tag cloud page...
To be honest, I feel that an entire page containing only a tag cloud 
is a bit, well, boring and wasteful considering that the "Browse" page 
can be configured with a cloud of whatever size the user wants.   I 
suppose I can be convinced differently if there is a compelling reason 
that I am missing or the Cloud page contained some other widgets/info 
etc...
06/30/2008 01:37:33 PM Duck Comment #4
> I'm still looking over the changes here, but why do the
> iteration/concatenation in gallery.php instead of using implode()?
> Is there a big performance difference?
Because getTags() no longer returns a one-dimensional array; it now 
returns the tag count as well.
06/30/2008 01:33:40 PM Michael Rubinsky Comment #3
I'm still looking over the changes here, but why do the 
iteration/concatenation in gallery.php instead of using implode()?   
Is there a big performance difference?
06/26/2008 02:51:08 PM Duck Comment #2
New Attachment: cloud.php
and why not, a tag cloud page...
06/26/2008 02:49:39 PM Duck Comment #1
Priority ⇒ 1. Low
State ⇒ New
New Attachment: ansel-tags.diff
Patch ⇒ No
Milestone ⇒
Queue ⇒ Ansel
Summary ⇒ Optimized Tags
Type ⇒ Enhancement
The UNION ALL SELECT query is a very wasteful SQL command, especially 
with the kind of table that tags are implemented with. It takes 
seconds on my installations with a lot of images, because the server 
must read all the table data to perform the counting and select the 
proper data. So we should help the server by performing the count 
during the rare update operations, so that the common selections are 
fast.



The patch:

Reorganizes the Ansel_Tags object and stores the counts of used tags 
in the tags table, so they are not counted on the fly.

Marks galleries and images as having tags or not, so we can avoid 
unneeded queries against galleries and images that have no tags.

Adds limit functionality to the tag cloud block.
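
As a rough illustration of the trade-off described in this comment, the sketch below compares counting tag usage on the fly through a UNION ALL with reading a stored tag_count column, both limited to the top N tags for a cloud. The exact query Ansel ran is not shown in this ticket, so the SQL, the column names, and the sample data are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ansel_tags (
    tag_id INTEGER PRIMARY KEY, tag_name TEXT, tag_count INTEGER DEFAULT 0);
CREATE TABLE ansel_images_tags (image_id INTEGER, tag_id INTEGER);
CREATE TABLE ansel_galleries_tags (gallery_id INTEGER, tag_id INTEGER);
INSERT INTO ansel_tags VALUES (1, 'beach', 3), (2, 'snow', 1), (3, 'city', 2);
INSERT INTO ansel_images_tags VALUES (100, 1), (101, 1), (102, 2), (103, 3);
INSERT INTO ansel_galleries_tags VALUES (7, 1), (8, 3);
""")

# Counting on the fly: every link row in both tables is read for each request.
on_the_fly_sql = """
SELECT t.tag_name, COUNT(*) AS total
FROM ansel_tags t
JOIN (SELECT tag_id FROM ansel_images_tags
      UNION ALL
      SELECT tag_id FROM ansel_galleries_tags) u ON u.tag_id = t.tag_id
GROUP BY t.tag_id, t.tag_name
ORDER BY total DESC, t.tag_name
LIMIT ?
"""

# Stored count: only the (much smaller) tags table is read.
stored_sql = """
SELECT tag_name, tag_count
FROM ansel_tags
WHERE tag_count > 0
ORDER BY tag_count DESC, tag_name
LIMIT ?
"""

limit = 2  # hypothetical tag cloud size
on_the_fly = conn.execute(on_the_fly_sql, (limit,)).fetchall()
stored = conn.execute(stored_sql, (limit,)).fetchall()
print(on_the_fly, stored)
```

Both queries return the same top tags as long as the stored counts are kept in step on every write, which is the cost moved onto the rarer update path.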
