#1 pre compile and normalize search filter
Closed: wontfix. Opened 12 years ago by rmeggins.

When a large search filter is applied to every entry in the search result set, the filter is normalized anew each time an entry is tested. For substring filters, a regular expression must also be created, compiled, and freed on every test, in addition to normalizing the values. For example, if the search filter contains 1000 substring sub-filters, each entry tested requires 1000 filter normalizations followed by 1000 regex creations, compilations, and cleanups. With 1000 entries in the search result set, that is a million such operations.


Reviewed the patch: 0001-pre-normalize-filter-and-pre-compile-substring-regex.patch
ack+

To ssh://git.fedorahosted.org/git/389/ds.git
681b22b..62e93bc master -> master
commit changeset:62e93bc/389-ds-base
Author: Rich Megginson <rmeggins@redhat.com>
Date: Mon Dec 12 21:07:59 2011 -0700
When processing large search filters which are applied to every entry in
the search result set, the filter is normalized anew each time a new entry
is tested. For substring filters, a regular expression must be created,
compiled, and freed each time the substring filter is tested, in addition
to normalizing the values. For example, if the search filter contains
1000 substring sub-filters, each entry tested with the filter requires
1000 filter normalizations followed by 1000 regex creations,
compilations, and cleanups. If there are 1000 entries in the search result
set, this requires a million such operations.

The solution is to "pre-compile" the search filter - perform all necessary
normalizations and compiling of the regular expressions used in the
filter once we know the search will go through.
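
The pre-compile idea can be sketched as follows. This is only an
illustration under stated assumptions: it uses POSIX regcomp/regexec as a
stand-in for the server's own regex layer, and struct sub_filter and every
function name below are hypothetical, not the actual 389-ds-base code.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical stand-in for a substring sub-filter (not the real struct subfilt). */
struct sub_filter {
    const char *pattern;  /* regex text derived from the substring filter */
    regex_t compiled;     /* compiled once, then reused for every entry */
    int is_compiled;      /* nonzero once 'compiled' is valid */
};

/* Compile the pattern once, before any entry is tested. Returns 0 on success. */
int sub_filter_precompile(struct sub_filter *sf)
{
    int rc;

    if (sf->is_compiled) {
        return 0; /* already done, nothing to repeat */
    }
    rc = regcomp(&sf->compiled, sf->pattern,
                 REG_EXTENDED | REG_ICASE | REG_NOSUB);
    if (rc == 0) {
        sf->is_compiled = 1;
    }
    return rc;
}

/* Test one value against the pre-compiled regex. Returns 1 on match, else 0. */
int sub_filter_match(const struct sub_filter *sf, const char *value)
{
    if (!sf->is_compiled) {
        return 0;
    }
    return regexec(&sf->compiled, value, 0, NULL, 0) == 0;
}

/* Release the compiled regex once the search is finished. */
void sub_filter_free(struct sub_filter *sf)
{
    if (sf->is_compiled) {
        regfree(&sf->compiled);
        sf->is_compiled = 0;
    }
}

/* Apply the filter to n values: one compile, n matches, one free. */
size_t count_matches(struct sub_filter *sf, const char *const *values, size_t n)
{
    size_t hits = 0;

    if (sub_filter_precompile(sf) != 0) {
        return 0;
    }
    for (size_t i = 0; i < n; i++) {
        hits += (size_t)sub_filter_match(sf, values[i]);
    }
    sub_filter_free(sf);
    return hits;
}
```

With this shape, the compile and free cost is paid once per search rather
than once per entry tested.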

struct subfilt and struct ava have "private" members which weren't being
used for anything.  For subfilt, the private field is used to store the
pre-compiled regex to pass to the syntax filter code.  For ava, the
private field is used to store the flags to specify if the type and/or
value is normalized.

Try to avoid normalization wherever possible.  slapi_value has a v_flags
field which is used to specify if the value is normalized.  Check this
before we attempt to normalize a value.  If we are creating a new
slapi_value, set the normalized flag if the new value is already
normalized.  Have to make sure that Slapi_Value structures are always
initialized correctly.
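
A rough sketch of the "check the flag before normalizing" pattern. The
struct, flag name, and functions here are hypothetical stand-ins, not the
real Slapi_Value definition, and lower-casing stands in for whatever
normalization the attribute syntax would really apply.

```c
#include <ctype.h>
#include <string.h>

#define VALUE_FLAG_NORMALIZED 0x1u /* hypothetical flag bit */

/* Hypothetical value type carrying a flags field, in the spirit of v_flags. */
struct value {
    char buf[64];
    unsigned flags;
};

/* Always initialize every field, so the flag is never read uninitialized. */
void value_init(struct value *v, const char *s, int already_normalized)
{
    strncpy(v->buf, s, sizeof v->buf - 1);
    v->buf[sizeof v->buf - 1] = '\0';
    v->flags = already_normalized ? VALUE_FLAG_NORMALIZED : 0u;
}

/* Normalize in place unless the flag says it is already done.
 * Returns 1 if work was performed, 0 if it was skipped. */
int value_normalize(struct value *v)
{
    if (v->flags & VALUE_FLAG_NORMALIZED) {
        return 0; /* skip redundant work */
    }
    for (char *p = v->buf; *p != '\0'; p++) {
        *p = (char)tolower((unsigned char)*p);
    }
    v->flags |= VALUE_FLAG_NORMALIZED;
    return 1;
}
```

The initializer matters as much as the check: a value whose flags field
holds garbage could be wrongly treated as normalized.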

When examining the filter string, do not convert it to lower case first -
just use strcasestr - note that even though the string may be utf8,
strcasestr will still work, because we are searching for ascii characters.
Use PL_strcasestr because the system strcasestr causes valgrind to
print uninitialized memory errors.
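
Why an ASCII-only, case-insensitive search is safe on a UTF-8 string can be
shown with a small portable sketch. ascii_casestr below is a hypothetical
helper written for illustration; it is not PL_strcasestr and not the code
from the patch. Every byte of a multi-byte UTF-8 sequence has its high bit
set, so such a byte can never compare equal to an ASCII needle byte.

```c
#include <stddef.h>

/* Fold only ASCII upper-case letters; all other bytes pass through unchanged. */
static int ascii_lower(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Find an ASCII needle in a (possibly UTF-8) haystack, ignoring ASCII case.
 * Returns a pointer into hay, or NULL if there is no match.
 * No lower-cased copy of the haystack is made. */
const char *ascii_casestr(const char *hay, const char *needle)
{
    if (*needle == '\0') {
        return hay;
    }
    for (; *hay != '\0'; hay++) {
        const char *h = hay;
        const char *n = needle;

        while (*n != '\0' &&
               ascii_lower((unsigned char)*h) == ascii_lower((unsigned char)*n)) {
            h++;
            n++;
        }
        if (*n == '\0') {
            return hay; /* whole needle matched starting at hay */
        }
    }
    return NULL;
}
```

For example, searching a filter string containing a non-ASCII name for
"objectclass=" still finds "ObjectClass=", because the UTF-8 bytes are
simply skipped over as non-matching.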

Eliminate some uses of sprintf where a simple char assignment will suffice.
Reviewed by: nhosoi (Thanks!)
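
As a small illustration of the sprintf point in the commit message above,
the two hypothetical helpers below produce the same one-character string;
the second avoids parsing a format string. The function names are made up
for this sketch and do not come from the patch.

```c
#include <stdio.h>

/* Before: goes through the format machinery to store a single character. */
void put_char_sprintf(char *buf, char c)
{
    sprintf(buf, "%c", c);
}

/* After: a plain char assignment plus the terminator does the same job. */
void put_char_direct(char *buf, char c)
{
    buf[0] = c;
    buf[1] = '\0';
}
```

Both helpers assume buf has room for at least two bytes.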


Metadata Update from @nhosoi:
- Issue assigned to rmeggins
- Issue set to the milestone: 1.2.10

7 years ago

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: Fixed)

3 years ago

389-ds-base is moving from Pagure to GitHub. This means that new issues and pull requests
will be accepted only in 389-ds-base's GitHub repository.

This issue has been cloned to GitHub and is available here:
- https://github.com/389ds/389-ds-base/issues/1

If you want to receive further updates on the issue, please navigate to the GitHub issue
and click the subscribe button.

Thank you for understanding. We apologize for any inconvenience.
