Unifilters.pm


NAME

Unifilters - package for filters parsing and preparing.


SYNOPSIS

  use Communiware::Unifilters;
  $sth = prepare_filter($context, $filterspec, @fields)
  ($sth, $types, $sqltext) = prepare_filter($context,$filterspec,@fields)
  $sth = Communiware::Unifilters::prepare_unifilter($ctx, $filter_source, @fields);
  $sth = Communiware::Unifilters::prepare_unifilter($ctx, $filter_source, $hints, @fields);
  $res = $sth->fetchall_hashref();
  $query = Communiware::Unifilters::translate_to_sql($ctx, $filter_source, @fields);
  $query = Communiware::Unifilters::translate_to_sql($ctx, $filter_source, $hints, @fields);
  ($query, $values, $attrs) =
      Communiware::Unifilters::translate_to_sql($ctx, $filter_source, @fields);
  ($query, $values, $attrs) =
      Communiware::Unifilters::translate_to_sql($ctx, $filter_source, $hints, @fields);
  if (Communiware::Unifilters::is_unifilter_query($filter_source)) {...}
  Communiware::Unifilters::cache_init(MINEL => 100, MAXEL => 200);
  print Communiware::Unifilters::cache_report();
  Communiware::Unifilters::is_aggregate($name)
  Communiware::Unifilters::aggregate_type($aggrop, $type, $fragment)
  Communiware::Unifilters::validate_filter($filter_spec, $template_fragment)
  filter_lister($storable_filter_name)


DESCRIPTION

This module implements 'Unifyed filters' (below - filters) concept, which are non-sql method for selecting set of items. See detailed syntax reference in FILTER GRAMMAR section below. It provides some functions for ``filters in general'' used in compile- and runtime interfaces.


EXTERNAL FUNCTION

prepare_filter

   $sth = prepare_filter($context, $filterspec, [\%parameters,] @fields)
   ($sth, $types, $sqltext) = prepare_filter($context,$filterspec,[\%parameters,]@fields)

where filterspec is string, containing filter specification as described above, optional \%parameters is hash reference to extra parameters, @fields is list of attribute names, which should be extracted by this filter. Optionally, they can be prefixed by + or -, indicating that these fields should be used for ordering results of the filter in accending or descending order.

Defined parameters and their meaning:

delayed-rights-check
If true, rights check in unifilter is not applied in hope that caller will check rights itself. Used typically with delayed_check=1 parameters of DE Loop and List. For prepare_filter is equivalent to specifying include-forbidden hint, but intended for use from perl code and cannot be specified in template.

Returns reference to Communiware::Filter object or to SQL statement handle, ready to fetch data from.

If filter rewriting is disabled, but sort order is specified, all the data are fetched, then sorted inside Perl, and returned as Communiware::Filter object. In this case, only attributes which are directly returned by filter can be used as sort keys.

Side effect If filter have positional parameters, their values are added to current context.

If filterspec is immediate list of item_ids, no sorting allowed and no additional attributes will be taken into account.

In list context the function returns the same object as in scalar context and hash reference describing names and data types of data returned by filter. Keys in this hash are column names such as ITEM_ID, UIN etc and values are array references. Each array contains only one element - name of Communiware data types of corresponding column, such as STRING, DATE, RICHTEXT, NUMBER.

Example of types data structure:

 {
   ITEM_ID => ['STRING'],
   TEXT => ['RICHTEXT'],
   PUBLISHED => ['DATE'],
 }

Third returned value in list context is SQL text, to which filter is converted, if determined. This value is not null for unifilters and rewritable SQL-filters only.

reload_filters

Clear up_to_date hash to force reload already loaded filters.

load_filter

  load_filter ($filter_name)

Gets the named filter from database and caches it. It here means filter source text as well as information about filtera arguments, rewritable field etc. All this info is stored in some %Communiware::Unifilters::* haches.

do_sort

Recieves filtes data as it is returned by fetchall_arrayref({}) and list of key attributes. Returns same data structure sorted by specified attributes.

validate_filter

  validate_filter($filter_spec, $fragment)

Checks if given filter is valid. Returns storable filter name (or undef if the filter is not storable) and list of context attributes used by this filter are known in the compile time.

The function throws compile error if errors occurs.

filter_lister

   filter_lister($storable_filter_name)

Gets caconical name of storable filter and returns list of attributes used by this filter. Is called from containing template lister function at runtime.

prepare_unifilter

  $sth = Communiware::Unifilters::prepare_unifilter($ctx, $source, @fields)
  $sth = Communiware::Unifilters::prepare_unifilter($ctx, $source, $hints, @fields)

Function has interface similar to prepare_filter. First argument is source text of (uni)filter, and the rest of arguments are extra fields (item attributes) to extract. They may be prepended by '+' or '-' sign, which sets sort order. Optional third argument may be hache reference with filters hints (see below).

Special syntax is used in @fields for fetching group attributes, such as sum or min: aggrfunc(attrname). Aggrfunc here is one of max, min, sum, avg, count - the same, as in SQL. Attrname is the name of item attribute. Mixing group and 'usual' attributes in @fields causes error.

Optional second attribute $hints may be hash ref, containing hints for filter translation.

This module recognises following hints (keys in %$hints hash):

extract
Value is space separated list of fields names, as in @fields argument. Effect of this hint is addig these fields to end of @fields argument.

noextattr
If true then no extended attributes are extracted in translated SQL-query.

naturaljoin
If true then extended attributes table is joined together item table using natural (not outer left) join regardless any other conditions.

delayed-rights-check
See prepare_filter

All other hints are counted as ``database hints'' and are passed to resulting SQL-query server-specific way. It means they are fully discarded for PostgreSQL and written in /*+ ... */ syntax for Oracle.

On success function returns object of DBI::st class - statement handle, which is ready to fetch data from it - i.e with fetchall_hashref method. When non-group attibutes are fetched names of columns in result are attibute names: ITEM_ID, TITLE, EMAIL. On case of group atrributes names in result are constructed from aggregate function and attribute names: MAX_PUBLISHED, COUNT_ITEM_ID etc. In list context the function returns resulting SQL text as well as statement handle.

On error function returns undef, and puts error message to $Communiware::dbh::errstr variable.

translate_to_sql

 ($query, $values, $attrs) =
   Communiware::Unifilters::translate_to_sql($ctx, $source, @fields);
 ($query, $values, $attrs) =
   Communiware::Unifilters::translate_to_sql($ctx, $source, $hints, @fields);

This function takes the same arguments as prepare_unifilter - current context, source text of filter, possible hints and list of extra attributes. But it returns not 'ready to use' statement handle, but result of translation filter to sql representation.

On success function returns 3-el list:

  • translated sql-text, which (maybe) contains placeholders, '?' signs;

  • reference to array of values, which substitute placeholders in sql-query;

  • reference to array of context attributes names, which values are kept in second element.

This function is useful when some analysis of filter is needed, i.e. in 'filter' script for filter testing.

is_unifilter_query

This simple function returns true if its argument may be filter source text. Really it checks if text contains curly brackets on beginning and end, nothing else.

cache_init

 cache_init(MINEL => $mmm, MAXEL => $nnn)

This module provides two-level cache for increasing speed of prepare_unifilter function. Upper level contains parsed conditions (after parse method), and lower level contains DBI statement handles for different descendants of one source filter.

Low-lewel cache has two main parametes: maximum number of elements in it (MAXEL) and number of elements after 'cleaning' cache (MINEL).

Function cache_init re-initialises cashe and (if arguments are present) sets new cache parameters. Defaults are MINEL => 80, MAXEL => 100.

cache_report

 print cache_report($lev);

Returns text with some information about current cache state. Argument defines details level.

  1. Only common information - MINEL and MAXEL parameters, current cache volume and number of cleaning.

  2. The same plus information about each cache item - source text, number of calls, subcache etc.

  3. As 1, but outputs translated sql-texts of filters.


INTERNAL IMPLEMENTATION

The following section describes some internals of filters. All work is encapsulated into Communiware::Unifilters object. Its methods implement various stages of filter to sql translation.

This object is in reality hash reference. Names and meanind fields in this cache are described later (see MEMBERS OF FILTER OBJECT).

new

  $f = new Communiware::Unifilters($ctx);
  $f = new Communiware::Unifilters($ctx, $srctext);
  $f = new Communiware::Unifilters($ctx, $srctext, $hints);

This package method creates new Communiware::Unifilters object. The only required argument is current context, an object of Communiware::Context class. Optional argument sets filter source text. The text can be set later with text method. See prepare_unifilter for hints describing, see MEMBERS OF FILTER OBJECT for describing of object members.

text

 $oldtext = $f->text();
 $oldtext = $f->text($newtext);

Method gets current filter source text. Optional argument sets new filter source.

errlev

 $f->errlev(2);
 $f->parse;

Method sets level of parse errors details when subsequent parse method is called.

  1. On fail parse returns undef, none of error messages are available.

  2. (default)
  3. On fail parse method dies, only first and last of error message stack are kept in $@. As a rule we are not interested of other messages.

  4. The same as 1, but all the error messages stack are kept in $@.

parse

 eval {$f->parse()};
 eval {$f->parse($sourcetext)};

Method parses previously stored or explicitly given filter source text and stores parsed information. Method dies if sorce text is sintax-incorrect. Error message is kept in $@ (see errlev).

Method stores parsed information into PARSED_CNDS element of object.

check_itemattrs_and_links

Method checks each item attribute and linktype name in all parsed filter conditions. These objects must exist in Communiware tables. In other words: all of them must be real attribute or linktype names.

attrs_to_extract

 $f->attrs_to_extract(@fields)

Method takes @fields argument, as it passed to prerare_filter, parses it and stores information about needed attributes and thier sort order in EXTRACT_ATTRS.

is_aggregate

 is_aggregate($name)

Returns true if given name denotes known aggregate function and undef if not. Function names are case-insensitive

check_aggrop

 $f->check_aggrop($aggrop, $attr);

Returns true if this aggregate function is applicable for $attr attribute.

aggrop_type

 $f->aggrop_type($aggrop, $attr)

Returns data type of aggregate function $aggrop applicated to attribute $attr.

aggregate_type

 aggregate_type($aggrop, $type, $fragment)

This function (not a method!) returns data type of $aggrop applicated to attribute of $type. $type can be undef (for example for 'count' aggregate function). Function raises compile error if it cannot detremine needed data type and third argument (Communiware::Template::Fragment) is given.

rewrite_qualified_query

Transforms qualified attribute parameter and query specification, which looks like linkop function, for example

        <-- max(PUBLISHED)

into pair (unifilter,fieldspec), which could be directly passed to prepare_unifilter

rewrite_cnds

 $f->rewrite_cnds();

Method takes parsed conditions (see parse) and rewrites them to data with (almost) the same structure, but some different content. Rewritting changes ['ctxattr', 'ATTRNAME'] values to ['placeholder', '?'] and push 'ATTRNAME' to USED_CTXATTRS if attribute ATTRNAME is defined. Otherwise the condition with such attribute are eliminated.

Besides this method replaces argument of contains condition parsed search expression.

Checking if given items attribute names are correct. Correct means (a) real Communiware attribute name which is present in attribute DB table and (b) attribute must have non-empty value in attribute.table_name column. Condition with incorrect in this meaning item attribute names are eliminated too.

After executing method stores rewritten conditions in REWRITTEN_CNDS and list of used context attribute names in USED_CTXATTRS.

get_func_attr

Syntax:

        get_func_attr($attr)

Returns function flag & attribute name. Supported functions: '^' (uppercase) & '_' (lowercase).

rewrite_cnd

 $rewritten_cnd = $f->rewrite_cnd($cndref);

Method rewrites single condition from parsed condition set. For details of rewritting see rewrite_cnds method.

rewrite_value

 $newval = $f->rewrite_value($val);

Method changes values of attribute and dateexpr types to pair ['placeholder', '?'] or returns unchanged argument on other value types (stringliteral etc). If attribute is not defined then method returns undef.

to_sql

 $f->to_sql

Method produces ready to use sql text from the infomation collected on previous stages. Sql text is returned as a method result and stored into SQLTEXT element of object.

dump

 $f->dump();
 $f->dump('USED_CTXATTRS');

Method dumps to STDOUT whole filter object or one of its parts, if argument is given.


FILTER GRAMMAR

This module uses the Parse::RecDescent manpage module for syntax parsing. There is a module variable $Communiware::Unifilters::parser which is parser object. It implements three grammars: for unifilters parsing, for search expression parsing and for ``general'' filter expression recognising.

Unifilters grammar

Unifilter contains zero or more conditions enclosed in curly brackets. Conditions are separated by comma sign. There are several types of conditions.

Relation conditions

Set some restriction to arributes of selected items.

 itemattr relop value
 [func]itemattr relop value

or

 itemattr relop (value,value,...)
 [func]itemattr relop (value,value,...)
itemattr
Name of item attribute, such as PUBLISHED, EMAIL etc.

[func]
Optional function abbreviation. Supported functions: '^' (uppercase) and '_' (lowercase).

relop
Relationship operation, one of =, !=, <>, >, <, >=, <=, like. like has the same meaning as in SQL language.

value
Can be one of several types.
name of context attribute
Name of context attribute (surprise?).

number
Sequence of digits with possible decimal point and fractal part: 12, 98765, 12.24.

string
Sequence of any characters enclosed in single apostrophs: 'qwerty', 'Here I Am'.

date
Date in Communiware format, YYYY.MM.DD HH.MI.SS, enclosed in single apostrophs: '2002.01.04 12.23.44'. All parts of data are needed.

date expression
Context attribite +/- number of days

Here context attribute must be of type DATE. Number of days may be not integer: SYSDATE - 10.

There may be several values in the right part of condition. In such case they are enclosed in brackets and splitted by commas:

 AUTHOR = ('William', 'Bill', ATTRNAME)

This means we select items where AUTHOR attribute is 'William' or 'Bill' or current value of ATTRNAME context attribute.

Link conditions

These conditions allow to select items linked with some given items set.

 linktypes linkop dstitems

We select items, which are linked with dstitems by one of linktypes. linkop tells how are they linked.

<-
Selects items immediately linked as passive with dstitems.

->
Selects items immediately linked as active with dstitems.

<--
Selects items recursive linked as passive with dstitems.

-->
Selects items recursive linked as active with dstitems.

linktypes may be one link name or several link names enclosed in brackets: TOPIC, (TOPIC, PART).

dstitems is (possible empty) list of item ids to which selected items are linked. It may have one of forms:

<empty>
If dstitems are omitted by default ITEM_ID, i.e. current item, is implied.

context attribute name
 SOMEATTR
string in apostrophs
 'someitemid'

Both these variants set single item id.

virtual page
 *top

Means ``item to which this virtual page points''.

list of several values enclosed in brackets
 (SOMEATTR, 'someitemid', *top)

Gives set of item ids to any of which selected items must be linked.

the word ``all'' and the same list
 all(SOMEATTR, 'someitemid', *top)

This means that selected items must be linked with all of listed item ids.

Predefined conditions

These conditions consist of one word only:

sameserver
All selected items must belong to the current server.

allservers
Selected items may belong to different servers.

allowed-only
Filter selects only items to which current user is allowed to read.

include-forbidden
Filter selects protected to read items as well.

Search condition

 contains 'what to search'
 contains CTXATTR

This condition selects items text of which satisfies given search expression. Search expression may be set literally, as string, or placed in context attribute value. Search expression must correspond to Seach expression grammar.

Seach expression grammar

This grammar is used to parse search expression given as an argument of search condition. Search expression consists of several (non zero!) searched words possibly joined by logical operation and enclosed in brackets.

Words here are natural language words to search in item text.

Logigal operations are:

and (&, <none>)
The item text must contain all the words joined by and operation:
 communiware administrator
 communiware & administrator
 communiware and administrator

These three search expressions are equal, and search items with text contains both of words communiware and administrator.

or (|)
The item text must contain at least one of the words joined by or operation:
 communiware or administrator
 communiware | administrator

These two search expressions are equal, and search items with text contains the word communiware or the word administrator.

not (^)
Inverts search criteria of following word - filter selects items which text does not contain this word.
 not software
 ^ software

These two search expressions search items which text does not contain the word software.

Logical operation may be present in other used language, for example in Russian.

not has the highest priority, then goes and and then or. Brackets can be used for changing this rule and logical operations can be applied to expression in brackets as well as to single word:

 (bill or gates) and not (robert plant)

Filter expression grammar

This grammar is used ror parsing ``general filter expression'', which may be implicit filters, storable filters, perl filters or unifilters. Two previous grammars are used implicitely, from functions of this package. Parsing of filter expression must be called as this:

 $fp = $Communiware::Unifilters::parser->filter_expression_top($filter);

$filter here is the text of filter as it is decriben in the Communiware::Unifilters manpage. On case of error (bad filter expression) parser returns scalar, error message. If parsing is successfull it returns hash reference with some information about this filter. The keys and thier meaning depend on type of filter, but key type is always present. It takes one of values explicit, perl, uni and storable. Below returned data for each of these types are explained.

Explicit filters

 (id,id,...)
 (id,id,...) colname

Explicitely given list of item ids.

item_ids
Reference to array with listed item ids.

column_name
Name of ``column'', default is ITEM_ID.

Perl filters

 ::filtername
 ::filtername(param,param,...)

filtername is module name in Communiware::Filter hierachy.

name
Name of perl filter - with leading ::.

params
Reference to array with filter params.

Unifilters

 { condition,condition,...}
 { /*+ hint,hint,... */ condition,condition,...}

Zero or more conditions in curly brackets. Details see above.

text
Full text of unifilter but without hints.

hints
Reference to hash with hints, given to parsed filter or to empty hash if no hints.

Storable filters

 /*+ hints */ filtername(param,param,...)
 { /*+ hints */ filtername(param,param,...) | cnd,cnd,... }

The second format is used for unifilter ``refining'' and contains so called ``extra conditions''. Regardless of format the result of storable filter parsing contains following fields:

name
Name of storable filter.

params
Reference to array with filter params.

hints
Reference to hash with filter hints.

addcnd
Extra conditions ``as are''.


MEMBERS OF FILTER OBJECT

This section descriebes all the fielsd are contained in Unifilters object. In other word 'internal representation' of such filter. This information is used first of all in Communiware::SQL::* modules for building SQL-query, corresponding to filter.

Text in brackets shows the stage, which defines these fields. 'parse' means function from this module and 'to_sql' means function from Communiware::SQL::* module.

CTX
Object of Communiware::Context class given as a parameter to filter constructor.

SOURCE_TEXT (parse)
Source text of filter \'as is\' without any changes. Is set by \'new\' or \'text\' or 'parse' methods.

HINTS (parse, to_sql)
$hints argument of prepare_unifilter or translate_to_sql functions ``as is''.

ERRLEV (parse)
Integer, sets level of parse error message details. 0 means no error messages at all, no 'die', parse returns reference to empty array. 1 - first and last messages from error messages stack are showed. 2 - all messages from stack are showed.

PARSED_CNDS (parse)
Array reference, result of 'parse' method. Items of this array are conditions from source form of filter translated into internal representation.

WAS_VALIDATE_PARSE
True if 'parse' method was called with true second argument. This used only with filter validation, and source text may contain @-expressions.

EXTATTRTABLE (parse)
Name (single!) of used extended attribute table, if exists.

USED_CTXATTRS_VALUES (parse)
Hash reference. Contains information about defined context attibutes and their values are used in translated filter. I.e. {ITEM_ID => 'testname1', WRITTEN => '2000.04.01 21.00.00'}. This item is set by rewrite_cnds method.

USED_CTXATTRS (parse)
Array reference, contains list of context attribute names which values must be used as bind values on executing prepared DBI statement. Items may be non unique: ['ITEM_ID', 'ITEM_ID', 'CURRENT_SERVER', ...].

REWRITTEN_CNDS (parse)
Result of rewrite_cnds method. Data in the same format as PARSED_CNDS, but contains conditions with defined context attributes only. Besides ['ctxattrs', 'name'] constructs are replaced to ['placeholder', 'type'].

PREDEFS (parse)
Hash contains info about which predefined conditions were present in source filter, such as include-forbidden, allowed-only, sameserver, allservers.

CNDS_INFO (parse)
Hash reference which contains different info, related to access rights. In particularry it may contain such elements:
mode
Always defined after conditions rewritting. Takes values of 'none', 'anon' and 'base' - result of get_rights_mode call.

wasbelongs
True if 'BELONGS linkop ITEM_ID' condition is present.

wasauthlink
True if '<AuthLinkType> linkop AUTHOR_ID' condition is present.

HAS_GROUP_ATTRS (parse)
True, if extracted fields contain group attributes, such as max(PUBLISHED) or count(ITEM_ID).

EXTRACT_ATTRS (parse)
Array reference, result of attrs_to_extract methods executing. Contains pairs of item attribute names to extract (besides ITEM_ID) and their sort order. Sort order may be '+', '-' or '' - no sort.

ITEMATTRS_FULLNAMES (parse, to_sql)
Hash reference. Keys are column names in result of translated 'select' statement, values are full names of these columns. I.e. {ITEM_ID => 'ITEM.ITEM_ID', EMAIL => 'AUTHOR.EMAIL'}. Some of attributes are added on 'to_sql' stage.

LINKTABLES (to_sql)
Array reference. Contains pairs 'ITEM_LINK' (or 'ITEM_REL') and alias for this table ('ITEM_LINK_1' by example). This array is formed by linkop_to_sql method, and is used in to_sql method.

SEARCH_TABLES
Array reference, may contain table names are needed to add to 'from' clause on case 'contains' condition.

PARENT_ID_TABLES (to_sql)
Reference to empty or 2-el array. If array is not empty, then elements are aliases for ITEM_REL and ITEM_LINK tables when PARENT_ID attribute is needed.

SQLTEXT (to_sql)
Result of to_sql method, full text of sql select statement which represents translated filter. This text contains placeholders (?), which order must be taken from USED_CTXATTRS and values - from USED_CTXATTRS_VALUES members.

NNN (to_sql)
Integer count, is used for generating unique names.

16 октябрь 2007 13:45