Summary | Face similarity indexing is incorrect/broken |
Queue | Ansel |
Queue Version | Git master |
Type | Bug |
State | Resolved |
Priority | 1. Low |
Owners | mrubinsk (at) horde (dot) org |
Requester | thpo+horde (at) dotrc (dot) de |
Created | 08/10/2011 (5136 days ago) |
Due | |
Updated | 10/05/2011 (5080 days ago) |
Assigned | 09/03/2011 (5112 days ago) |
Resolved | 10/05/2011 (5080 days ago) |
Github Issue Link | |
Github Pull Request | |
Milestone | 2 |
Patch | No |
Need to include image_id in GROUP BY clause.
Bug: 10420
Signed-off-by: Michael J Rubinsky <mrubinsk@horde.org>
1 files changed, 1 insertions(+), 1 deletions(-)
http://git.horde.org/horde-git/-/commit/912b151ce197158a757a3fc231cf9f0828ff7cda
the downside is that the image has to be just about identical to get a
hit. This indexing technique is really designed for quickly detecting
duplicates/similar images, not facial recognition...but at least it's
better than nothing I guess.
The alternative, loading each known face image for the user, then
comparing it with a known face image using the puzzle routines (which
would give a similarity ranking between the two) is not practical.
Plus, that, too, is really designed to detect similar, entire, images,
not facial recognition, i.e., if the same person is in two images, but
has a different facial expression, or is looking in a slightly
different direction, there will be no match.
thanks for pointing it out. I didn't have the query fixed yet, so
thanks for that patch :)
git. Great! Just a hint: when you create the face indexes you will
miss out the last one (str_len - word_len == 0).
Furthermore, there is a call to a function "updatec" that does not exist.
And in getSignatureMatches() the f.image_id has to be part of the "GROUP BY".
Here is my diff for these issues:
--- a/ansel/lib/Faces/Base.php
+++ b/ansel/lib/Faces/Base.php
@@ -555,7 +555,7 @@ class Ansel_Faces_Base
try {
$GLOBALS['ansel_db']->update('UPDATE ansel_images SET image_faces = '
. count($fids) . ' WHERE image_id = ' . $image->id);
- $GLOBALS['ansel_db']->updatec('UPDATE ansel_shares '
+ $GLOBALS['ansel_db']->update('UPDATE ansel_shares '
. 'SET attribute_faces = attribute_faces + ' . count($fids)
. ' WHERE share_id = ' . $image->gallery);
} catch (Horde_Db_Exception $e) {
@@ -654,7 +654,7 @@ class Ansel_Faces_Base
$str_len = strlen($signature);
$GLOBALS['ansel_db']->delete('DELETE FROM ansel_faces_index WHERE face_id = ' . $face_id);
$q = 'INSERT INTO ansel_faces_index (face_id, index_position, index_part) VALUES (?, ?, ?)';
- for ($i = 0; $i < $str_len - $word_len; $i++) {
+ for ($i = 0; $i <= $str_len - $word_len; $i++) {
$data = array(
$face_id,
$i,
@@ -807,7 +807,7 @@ class Ansel_Faces_Base
$word_len = $GLOBALS['conf']['faces']['search'];
$str_len = strlen($signature);
$indexes = array();
- for ($i = 0; $i < $str_len - $word_len; $i++) {
+ for ($i = 0; $i <= $str_len - $word_len; $i++) {
$sig = new Horde_Db_Value_Binary(substr($signature, $i, $word_len));
$indexes[] = '(index_position = ' . $i . ' AND index_part = ' . $sig->quote($GLOBALS['ansel_db']) . ')';
}
@@ -822,7 +822,7 @@ class Ansel_Faces_Base
if ($indexes) {
$sql .= ' AND (' . implode(' OR ', $indexes) . ')';
}
- $sql .= ' GROUP BY i.face_id, f.face_name HAVING count(i.face_id) > 0 '
+ $sql .= ' GROUP BY i.face_id, f.face_name, f.image_id HAVING count(i.face_id) > 0 '
. 'ORDER BY count(i.face_id) DESC';
$sql = $GLOBALS['ansel_db']->addLimitOffset(
$sql,
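The off-by-one in the two indexing loops can be reproduced in isolation. The following is a minimal sketch in Python rather than Ansel's PHP; the helper name window_starts and the sample signature are hypothetical and only mirror the loop bounds shown in the diff.

```python
def window_starts(str_len, word_len, inclusive):
    """Return the start offsets the indexing loop visits.

    inclusive=False mirrors the old bound  ($i <  $str_len - $word_len),
    inclusive=True  mirrors the fixed bound ($i <= $str_len - $word_len).
    """
    stop = str_len - word_len + (1 if inclusive else 0)
    return list(range(max(stop, 0)))


signature = "abcdefgh"  # stand-in for a libpuzzle signature string
word_len = 2

old_words = [signature[i:i + word_len]
             for i in window_starts(len(signature), word_len, inclusive=False)]
new_words = [signature[i:i + word_len]
             for i in window_starts(len(signature), word_len, inclusive=True)]

print(old_words)  # the final word "gh" is missing
print(new_words)  # includes "gh"
```

With the old bound the word starting at offset str_len - word_len is never indexed, which is the case described as (str_len - word_len == 0) above.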
Milestone ⇒ 2
State ⇒ Assigned
Summary ⇒ Face similarity indexing is incorrect/broken
Assigned to Michael Rubinsky
However, the query dealing with the vector indexes I have left alone
for now. The person that originally wrote this part of the code has
been gone for a while now, but it looks like the way the index is
created is completely incorrect e.g.,
Given a vector such as [abcdefgh] with a word length of 2, the
*correct* words should be:
[ab] [bc] [cd] [de] [ef] [fg] [gh]
not
[ab] [cd] [ef] [gh]
Also, these words need to be stored composed with the index position,
in the same field (See
http://download.pureftpd.org/pub/pure-ftpd/misc/libpuzzle/doc/README
for more technical detail if interested).
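A rough sketch of the indexing scheme described here, written in Python for illustration only. The function names, the 2-byte big-endian position prefix, and the sample vectors are assumptions made for this example; they are not taken from Ansel or from libpuzzle, whose README only describes the general idea of storing each word together with its position.

```python
def compound_words(signature: bytes, word_len: int) -> set:
    """Build one compound key per overlapping word: position + word.

    The position is encoded as a 2-byte big-endian prefix (an assumption
    for this sketch) so that position and word live in a single field.
    """
    return {
        i.to_bytes(2, "big") + signature[i:i + word_len]
        for i in range(len(signature) - word_len + 1)
    }


def shared_word_count(sig_a: bytes, sig_b: bytes, word_len: int) -> int:
    """Count the compound keys two signatures have in common."""
    return len(compound_words(sig_a, word_len) & compound_words(sig_b, word_len))


words = compound_words(b"abcdefgh", 2)
print(len(words))  # 7 overlapping words: ab bc cd de ef fg gh

# One differing byte in the middle invalidates only the words that cover it.
print(shared_word_count(b"abcdefgh", b"abcdxfgh", 2))
```

Because every key carries its position, a lookup for candidate matches can be a plain equality test on one column, and the number of shared keys gives a cheap first ranking before any full signature comparison.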
Combine this with the fact that the images that are being compared are
very small (just the face portion), and the current query really has
very little chance of finding a similar face, unless it is almost the
exact same image - after that, it's just pure luck.
This should be fixed before Ansel 2.0
Some clean up of Ansel_Faces_Base
The face signature index stuff still looks broken. The index that
is calculated is done so incorrectly (The table should contain a single,
compound index field composed of the word and the position - not to
mention the words that are calculated are not correct).
See Bug: 10420
2 files changed, 24 insertions(+), 15 deletions(-)
http://git.horde.org/horde-git/-/commit/49c02e64b89fcf72cce97246fd76b274ffc2c31b
Ensure this binary value is quoted correctly by Horde_Db.
Partially fixes Bug: 10420
1 files changed, 1 insertions(+), 1 deletions(-)
http://git.horde.org/horde-git/-/commit/3b6afef1f9e6a90247acd865e9012a6a1d8fa327
The fields you removed are needed further down in the method. Of
course, that's another bug - the variable is misnamed (the query
results should be assigned to $results, not $faces).
actually trying to get from the query, as the result was overwritten
just a few steps later.
signature? It's a bound parameter, so the quoting should be taken
care of by the Horde_Db library. If it's not, something is wrong in
that library.
repertoire: 7 ERROR: invalid byte sequence for encoding "UTF8": 0x80
TIP: This error can also happen if the byte sequence does not match
the encoding expected by the server, which is controlled by
"client_encoding".
UPDATE ansel_faces SET face_signature =
'<80>6srh{Z^_^_Eyu1|^Bxz ^Z^T' WHERE
face_id = 1 [pid 14759 on line 808 of
"/usr/share/php/Horde/Db/Adapter/Base.php"]
The fields you removed are needed further down in the method. Of
course, that's another bug - the variable is misnamed (the query
results should be assigned to $results, not $faces).
State ⇒ Feedback
signature? It's a bound parameter, so the quoting should be taken care
of by the Horde_Db library. If it's not, something is wrong in that
library.
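To illustrate the point about bound parameters, here is a small sketch using Python's built-in sqlite3 module as a stand-in. It is not Horde_Db or PostgreSQL, and the reduced table layout and sample bytes are assumptions for the example. It only shows that a driver which binds the value as binary stores arbitrary bytes, including 0x80, without any manual quoting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ansel_faces (face_id INTEGER PRIMARY KEY, face_signature BLOB)"
)
conn.execute("INSERT INTO ansel_faces (face_id) VALUES (1)")

# Arbitrary binary data: not valid UTF-8, contains a NUL byte and a quote.
signature = bytes([0x80, 0x36, 0x73, 0x72, 0x68, 0x00, 0x27])

# The value is passed as a parameter, never interpolated into the SQL text.
conn.execute(
    "UPDATE ansel_faces SET face_signature = ? WHERE face_id = ?",
    (signature, 1),
)

stored = conn.execute(
    "SELECT face_signature FROM ansel_faces WHERE face_id = 1"
).fetchone()[0]
print(stored == signature)
```

The error quoted above shows the raw signature bytes inside the SQL string itself, which suggests that in that code path the value reached the server as text rather than as a binary parameter.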
New Attachment: ansel.diff
State ⇒ Unconfirmed
Patch ⇒ No
Milestone ⇒
Queue ⇒ Ansel
Summary ⇒ wrong column names and non-standard compliant queries
Type ⇒ Bug
Priority ⇒ 1. Low
- unquoted signature from libpuzzle creates error in update statement
- GROUP BY and HAVING have to create an unambiguous result set
not yet fixed in the patch:
- ansel gallery object ids in _fetchFaces and _countFaces can no
longer be accessed with array_keys()
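The GROUP BY point in the list above can be sketched as follows, again in Python with sqlite3 as a stand-in. The table and column names come from the diff, but the SELECT list, the join, and the sample rows are assumptions for this example. Note that SQLite itself is lenient and would also accept the shorter GROUP BY; stricter databases such as PostgreSQL reject a selected column that is neither grouped nor aggregated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ansel_faces (
    face_id INTEGER PRIMARY KEY, face_name TEXT, image_id INTEGER);
CREATE TABLE ansel_faces_index (
    face_id INTEGER, index_position INTEGER, index_part BLOB);
INSERT INTO ansel_faces VALUES (1, 'alice', 10), (2, 'bob', 11);
INSERT INTO ansel_faces_index VALUES
    (1, 0, x'6162'), (1, 1, x'6263'), (2, 0, x'6162');
""")

# Every non-aggregated column in the SELECT list also appears in GROUP BY.
sql = """
SELECT i.face_id, f.face_name, f.image_id, COUNT(i.face_id) AS hits
FROM ansel_faces_index i
JOIN ansel_faces f ON f.face_id = i.face_id
WHERE (i.index_position = 0 AND i.index_part = ?)
   OR (i.index_position = 1 AND i.index_part = ?)
GROUP BY i.face_id, f.face_name, f.image_id
HAVING COUNT(i.face_id) > 0
ORDER BY COUNT(i.face_id) DESC
"""

rows = conn.execute(sql, (b"ab", b"bc")).fetchall()
for row in rows:
    print(row)
```

The face with the most matching index words sorts first, which is the ranking the ORDER BY count(i.face_id) DESC clause in the diff is after.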