CWB

regopt.h File Reference

#include "globals.h"
#include <pcre.h>


Define Documentation

#define MAX_GRAINS   12

Maximum number of grains of optimisation.

There's no point in scanning for too many grains, but regexps can be bloody inefficient.

Referenced by read_disjunction().
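
As a rough illustration of why a fixed bound such as MAX_GRAINS is convenient, the sketch below splits a list of literal alternatives into at most MAX_GRAINS grains and gives up if there are more. The helper split_alternatives() and its conventions are hypothetical. They are not taken from the CWB sources, and the real read_disjunction() may work quite differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_GRAINS 12   /* same value as the documented define */

/* Hypothetical helper (not part of CWB): splits a '|'-separated list of
 * literal alternatives, e.g. "abc|def|ghi", into grains.
 * Records the start and length of each alternative in the caller's arrays.
 * Returns the number of grains found, or 0 if an alternative is empty or
 * there are more than MAX_GRAINS alternatives. */
static int split_alternatives(const char *alts,
                              const char *grain[MAX_GRAINS],
                              size_t grain_len[MAX_GRAINS])
{
    int n = 0;
    const char *p = alts;

    assert(alts != NULL);
    for (;;) {
        size_t len = strcspn(p, "|");   /* length up to next '|' or end */
        if (len == 0 || n == MAX_GRAINS)
            return 0;                   /* empty alternative or too many */
        grain[n] = p;
        grain_len[n] = len;
        n++;
        if (p[len] == '\0')
            return n;                   /* reached the end of the list */
        p += len + 1;                   /* skip the '|' separator */
    }
}
```

Because the arrays have a fixed size, the caller can keep them on the stack or in static storage, and a pattern with too many alternatives is simply treated as not optimisable.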


Function Documentation

int cl_regopt_analyse ( char *  regex)

Analyses a regular expression and tries to find the best set of grains.

Part of the regex optimiser. Given the regular expression regex, this function tries to extract a set of grains from it. The CL regex matcher and cl_regex2id() then use these grains for faster regular expression search.

If successful, this function returns True and stores the grains in the optimiser's global variables (from which they should be copied to the corresponding members of a CL_Regex object).

Usage: optimised = cl_regopt_analyse(regex_string);

This is a non-exported function.

Parameters:
regex    String containing the regex to optimise.
Returns:
Boolean: true = ok, false = couldn't optimise regex.

References buf, cl_debug, cl_regopt_anchor_end, cl_regopt_anchor_start, cl_regopt_grain, cl_regopt_grain_len, cl_regopt_grains, grain_buffer, grain_buffer_grains, local_grain_data, make_jump_table(), read_disjunction(), read_grain(), read_kleene(), read_wildcard(), and update_grain_buffer().

Referenced by cl_new_regex().
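
The following sketch shows one way grains can speed up a search, assuming that grains are literal substrings of which at least one must occur in every string the regex matches. Under that assumption, a candidate that contains none of the grains can be rejected without running the regex engine at all. The function grain_prefilter() is hypothetical and only illustrates the idea. It is not the CL matcher, and it ignores anchors and jump tables.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical prefilter (not part of CWB).
 * Returns 1 if the candidate contains at least one grain, so the full
 * regex still has to be run; returns 0 if no grain occurs, in which case
 * the regex cannot match under the assumption stated above. */
static int grain_prefilter(const char *candidate,
                           const char *const grains[], int n_grains)
{
    int i;

    assert(candidate != NULL && n_grains > 0);
    for (i = 0; i < n_grains; i++) {
        if (strstr(candidate, grains[i]) != NULL)
            return 1;   /* possible match: hand over to the regex engine */
    }
    return 0;           /* no grain present: skip the regex engine */
}
```

For a pattern such as `.*(abc|def).*` with grains "abc" and "def", the string "xxdefyy" passes the prefilter and would be checked by the regex engine, while "xyz" is rejected immediately.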

void regopt_data_copy_to_regex_object ( CL_Regex  rx)

Internal regopt function: copies optimiser data from internal global variables to the member variables of argument CL_Regex object.

References _CL_Regex::anchor_end, _CL_Regex::anchor_start, cl_debug, cl_regopt_anchor_end, cl_regopt_anchor_start, cl_regopt_grain, cl_regopt_grain_len, cl_regopt_grains, cl_regopt_jumptable, cl_strdup(), _CL_Regex::grain, _CL_Regex::grain_len, _CL_Regex::grains, and _CL_Regex::jumptable.

Referenced by cl_new_regex().
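
The copy step can be pictured roughly as follows. This is a minimal sketch with stand-in names: the opt_* globals, the ToyRegex struct, and dup_string() are all hypothetical and only mirror the members listed in the references. The real CL_Regex layout and the real cl_strdup() are not reproduced here, and the jump table is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_GRAINS 12

/* Stand-ins for the optimiser's global state (hypothetical names). */
static int   opt_grains = 0;
static int   opt_grain_len = 0;
static char *opt_grain[MAX_GRAINS];
static int   opt_anchor_start = 0;
static int   opt_anchor_end = 0;

/* Toy regex object; not the real CL_Regex layout. */
typedef struct {
    int   grains;
    int   grain_len;
    char *grain[MAX_GRAINS];
    int   anchor_start;
    int   anchor_end;
} ToyRegex;

/* Portable string duplication, used here in place of cl_strdup(). */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);

    assert(copy != NULL);
    memcpy(copy, s, n);
    return copy;
}

/* Copies the optimiser state into the object. Grain strings are
 * duplicated so the object stays valid after the globals are reused. */
static void copy_optimiser_state(ToyRegex *rx)
{
    int i;

    assert(rx != NULL && opt_grains <= MAX_GRAINS);
    rx->grains = opt_grains;
    rx->grain_len = opt_grain_len;
    rx->anchor_start = opt_anchor_start;
    rx->anchor_end = opt_anchor_end;
    for (i = 0; i < opt_grains; i++)
        rx->grain[i] = dup_string(opt_grain[i]);
}
```

Duplicating the strings matters because the globals are overwritten by the next analysis; the object then owns its copies and must free them when it is destroyed.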