xapian-core  1.5.0
Xapian::Enquire Class Reference

Querying session.

Public Types

enum  docid_order { ASCENDING = 1, DESCENDING = 0, DONT_CARE = 2 }
 Ordering of docids.
 

Public Member Functions

 Enquire (const Enquire &o)
 Copying is allowed.

Enquire & operator= (const Enquire &o)
 Copying is allowed.

 Enquire (Enquire &&o)
 Move constructor.

Enquire & operator= (Enquire &&o)
 Move assignment operator.

 Enquire (const Database &db)
 Constructor.

 ~Enquire ()
 Destructor.

void set_query (const Query &query, termcount query_length=0)
 Set the query.

const Query & get_query () const
 Get the currently set query.
 
void set_weighting_scheme (const Weight &weight)
 Set the weighting scheme to use.

void set_docid_order (docid_order order)
 Set sort order for document IDs.

void set_sort_by_relevance ()
 Set the sorting to be by relevance only.

void set_sort_by_value (valueno sort_key, bool reverse)
 Set the sorting to be by value only.

void set_sort_by_key (KeyMaker *sorter, bool reverse) XAPIAN_NONNULL()
 Set the sorting to be by key generated from values only.

void set_sort_by_value_then_relevance (valueno sort_key, bool reverse)
 Set the sorting to be by value, then by relevance for documents with the same value.

void set_sort_by_key_then_relevance (KeyMaker *sorter, bool reverse) XAPIAN_NONNULL()
 Set the sorting to be by keys generated from values, then by relevance for documents with identical keys.

void set_sort_by_relevance_then_value (valueno sort_key, bool reverse)
 Set the sorting to be by relevance then value.

void set_sort_by_relevance_then_key (KeyMaker *sorter, bool reverse) XAPIAN_NONNULL()
 Set the sorting to be by relevance, then by keys generated from values.

void set_collapse_key (valueno collapse_key, doccount collapse_max=1)
 Control collapsing of results.

void set_cutoff (int percent_threshold, double weight_threshold=0)
 Set lower bounds on percentage and/or weight.

void add_matchspy (MatchSpy *spy) XAPIAN_NONNULL()
 Add a matchspy.

void clear_matchspies ()
 Remove all the matchspies.

void set_time_limit (double time_limit)
 Set a time limit for the match.

MSet get_mset (doccount first, doccount maxitems, doccount checkatleast=0, const RSet *rset=NULL, const MatchDecider *mdecider=NULL) const
 Run the query.

MSet get_mset (doccount first, doccount maxitems, const RSet *rset, const MatchDecider *mdecider=NULL) const
 Run the query.

TermIterator get_matching_terms_begin (docid did) const
 Iterate query terms matching a document.

TermIterator get_matching_terms_begin (const MSetIterator &it) const
 Iterate query terms matching a document.

TermIterator get_matching_terms_end (docid) const noexcept
 End iterator corresponding to get_matching_terms_begin().

TermIterator get_matching_terms_end (const MSetIterator &) const noexcept
 End iterator corresponding to get_matching_terms_begin().

void set_expansion_scheme (const std::string &eweightname, double expand_k=1.0) const
 Set the weighting scheme to use for expansion.

ESet get_eset (termcount maxitems, const RSet &rset, int flags=0, const ExpandDecider *edecider=NULL, double min_weight=0.0) const
 Perform query expansion.

ESet get_eset (termcount maxitems, const RSet &rset, const ExpandDecider *edecider) const
 Perform query expansion.
 
std::string get_description () const
 Return a string describing this object.
 

Static Public Attributes

static const int INCLUDE_QUERY_TERMS = 1
 Flag telling get_eset() to allow query terms in Xapian::ESet.
 
static const int USE_EXACT_TERMFREQ = 2
 Flag telling get_eset() to always use the exact term frequency.
 

Detailed Description

Querying session.

An Enquire object represents a querying session - most of the options for running a query can be set on it, and the query is run via Enquire::get_mset().

Member Enumeration Documentation

◆ docid_order

Ordering of docids.

Parameter to Enquire::set_docid_order().

Enumerator
ASCENDING 

docids sort in ascending order (default)

DESCENDING 

docids sort in descending order.

DONT_CARE 

docids sort in whatever order is most efficient for the backend.

Constructor & Destructor Documentation

◆ Enquire() [1/2]

Xapian::Enquire::Enquire ( const Enquire &  o)

Copying is allowed.

The internals are reference counted, so copying is cheap.

◆ Enquire() [2/2]

Xapian::Enquire::Enquire ( const Database &  db)
explicit

Constructor.

Parameters
db  The database (or databases) to query.
Since
1.5.0 If db has no subdatabases, it's handled like any other empty database. In earlier versions, Xapian::InvalidArgumentError was thrown in this case.

Member Function Documentation

◆ add_matchspy()

void Xapian::Enquire::add_matchspy ( MatchSpy *  spy)

Add a matchspy.

This matchspy will be called with some of the documents which match the query, during the match process. Exactly which of the matching documents are passed to it depends on exactly when certain optimisations occur during the match process, but it can be controlled to some extent by setting the checkatleast parameter to get_mset().

In particular, if there are enough matching documents, at least the number specified by checkatleast will be passed to the matchspy. This means that you can force the matchspy to be shown all matching documents by setting checkatleast to the number of documents in the database.

Parameters
spy  The MatchSpy subclass to add. The caller must ensure that this remains valid while the Enquire object remains active, or until clear_matchspies() is called, or else disown the MatchSpy object by calling spy->release() before passing it in.

◆ get_eset() [1/2]

ESet Xapian::Enquire::get_eset ( termcount  maxitems,
const RSet &  rset,
const ExpandDecider *  edecider 
) const
inline

Perform query expansion.

Perform query expansion using a Xapian::RSet indicating some documents which are relevant (typically based on the user marking results or similar).

Parameters
maxitems  The maximum number of terms to return.
rset  Documents marked as relevant.
edecider  Xapian::ExpandDecider object - this acts as a yes/no filter on terms which are being considered.
Returns
Xapian::ESet object containing a list of terms with weights.

◆ get_eset() [2/2]

ESet Xapian::Enquire::get_eset ( termcount  maxitems,
const RSet &  rset,
int  flags = 0,
const ExpandDecider *  edecider = NULL,
double  min_weight = 0.0 
) const

Perform query expansion.

Perform query expansion using a Xapian::RSet indicating some documents which are relevant (typically based on the user marking results or similar).

Parameters
maxitems  The maximum number of terms to return.
rset  Documents marked as relevant.
flags  Bitwise-or combination of INCLUDE_QUERY_TERMS and USE_EXACT_TERMFREQ flags (default: 0).
edecider  Xapian::ExpandDecider object - this acts as a yes/no filter on terms which are being considered. (default: no Xapian::ExpandDecider)
min_weight  Lower bound on weight of acceptable terms (default: 0.0)
Returns
Xapian::ESet object containing a list of terms with weights.
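
As a rough illustration of how the flags argument is built up, the sketch below mirrors the two documented flag values in local constants so that it compiles without Xapian. The names kIncludeQueryTerms, kUseExactTermfreq, has_flag and both_flags are hypothetical stand-ins and are not part of the Xapian API.

```cpp
// Local mirrors of the documented flag values (assumption: in real code you
// would use Xapian::Enquire::INCLUDE_QUERY_TERMS and
// Xapian::Enquire::USE_EXACT_TERMFREQ directly).
constexpr int kIncludeQueryTerms = 1;
constexpr int kUseExactTermfreq = 2;

// Hypothetical helper: true if `flag` is set in `flags`.
constexpr bool has_flag(int flags, int flag) {
    return (flags & flag) != 0;
}

// Combine both flags with bitwise-or, as the flags parameter expects.
constexpr int both_flags() {
    return kIncludeQueryTerms | kUseExactTermfreq;
}
```

With Xapian itself, the combined value would be passed as the third argument, for example enquire.get_eset(10, rset, Xapian::Enquire::INCLUDE_QUERY_TERMS | Xapian::Enquire::USE_EXACT_TERMFREQ).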

◆ get_matching_terms_begin() [1/2]

TermIterator Xapian::Enquire::get_matching_terms_begin ( const MSetIterator &  it) const
inline

Iterate query terms matching a document.

Convenience overloaded form, taking a Xapian::MSetIterator instead of a Xapian::docid.

Parameters
it  MSetIterator to return matching terms for

◆ get_matching_terms_begin() [2/2]

TermIterator Xapian::Enquire::get_matching_terms_begin ( docid  did) const

Iterate query terms matching a document.

Takes terms from the query set by set_query() and from the document with document ID did in the database set in the constructor, and returns terms which are in both, ordered by ascending query position. Terms which occur more than once in the query are only returned once, at the lowest term position they occur at.

Parameters
did  Document ID in the database set in the constructor
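
The ordering and de-duplication rule described above can be sketched with standard containers. This is a stand-alone model under stated assumptions, not Xapian's implementation: the function matching_terms is hypothetical, the query is represented as a vector whose index is the query position, and the document as a set of its terms.

```cpp
#include <set>
#include <string>
#include <vector>

// Stand-alone model (not Xapian's implementation): return the query terms
// that also occur in the document, in ascending query position, reporting a
// repeated query term only once, at its lowest position.
std::vector<std::string> matching_terms(
    const std::vector<std::string>& query_terms,  // index = query position
    const std::set<std::string>& doc_terms) {
    std::vector<std::string> result;
    std::set<std::string> seen;
    for (const std::string& term : query_terms) {
        // insert().second is false if the term was already reported.
        if (doc_terms.count(term) != 0 && seen.insert(term).second) {
            result.push_back(term);
        }
    }
    return result;
}
```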

◆ get_mset() [1/2]

MSet Xapian::Enquire::get_mset ( doccount  first,
doccount  maxitems,
const RSet *  rset,
const MatchDecider *  mdecider = NULL 
) const
inline

Run the query.

Run the query using the settings in this Enquire object and those passed as parameters to the method, and return a Xapian::MSet object.

Parameters
first  Zero-based index of the first result to return (which supports retrieving pages of results).
maxitems  The maximum number of documents to return.
rset  Documents marked as relevant (default: no documents have been marked as relevant)
mdecider  Xapian::MatchDecider object - this acts as a yes/no filter on documents which match the query. See also Xapian::PostingSource. (default: no Xapian::MatchDecider)

◆ get_mset() [2/2]

MSet Xapian::Enquire::get_mset ( doccount  first,
doccount  maxitems,
doccount  checkatleast = 0,
const RSet *  rset = NULL,
const MatchDecider *  mdecider = NULL 
) const

Run the query.

Run the query using the settings in this Enquire object and those passed as parameters to the method, and return a Xapian::MSet object.

Parameters
first  Zero-based index of the first result to return (which supports retrieving pages of results).
maxitems  The maximum number of documents to return.
checkatleast  Check at least this many documents. By default Xapian will avoid considering documents which it can prove can't match, which is faster but can result in loose bounds on, and a poor estimate of, the total number of matches. Setting checkatleast higher allows trading off speed for tighter bounds and a more accurate estimate. (default: 0)
rset  Documents marked as relevant (default: no documents have been marked as relevant)
mdecider  Xapian::MatchDecider object - this acts as a yes/no filter on documents which match the query. See also Xapian::PostingSource. (default: no Xapian::MatchDecider)
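
Because first is a zero-based result index, paging comes down to simple arithmetic. The helper below is a minimal sketch: first_for_page is a hypothetical name, and a 32-bit unsigned type is assumed as a stand-in for Xapian::doccount.

```cpp
#include <cstdint>

// Hypothetical helper: zero-based index of the first result on a page,
// suitable for the `first` parameter when `maxitems` is the page size.
// Pages are themselves numbered from zero.
constexpr std::uint32_t first_for_page(std::uint32_t page,
                                       std::uint32_t page_size) {
    return page * page_size;
}
```

With Xapian, the third page of ten results would then be requested as enquire.get_mset(first_for_page(2, 10), 10).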

◆ get_query()

const Query& Xapian::Enquire::get_query ( ) const

Get the currently set query.

If set_query() is not called before calling get_query(), then the default query Xapian::MatchNothing will be returned.

◆ operator=()

Enquire& Xapian::Enquire::operator= ( const Enquire &  o)

Copying is allowed.

The internals are reference counted, so assignment is cheap.

◆ set_collapse_key()

void Xapian::Enquire::set_collapse_key ( valueno  collapse_key,
doccount  collapse_max = 1 
)

Control collapsing of results.

The MSet returned by get_mset() will have only the "best" (at most) collapse_max documents with each particular non-empty value in slot collapse_key ("best" being highest ranked - i.e. highest weight or highest sorting key).

An example use might be to create a value for each document containing an MD5 hash of the document contents. Then duplicate documents from different sources can be eliminated at search time by collapsing with collapse_max = 1 (it's better to eliminate duplicates at index time, but this may not always be possible - for example the search may be over more than one Xapian database).

Another use is to group matches in a particular category (e.g. you might collapse a mailing list search on the Subject: so that there's only one result per discussion thread). In this case you can use get_collapse_count() to give the user some idea how many other results there are. And if you index the Subject: as a boolean term as well as putting it in a value, you can offer a link to a non-collapsed search restricted to that thread using a boolean filter.

Parameters
collapse_key  Value slot to collapse on (default is Xapian::BAD_VALUENO which means no collapsing).
collapse_max  Maximum number of documents with the same key to allow (default: 1).
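
The collapsing rule can be modelled with standard containers. The sketch below is not Xapian's implementation: RankedDoc and collapse are hypothetical names, the input is assumed to be already in best-first rank order, and documents with an empty value in the collapse slot are left uncollapsed, following the "non-empty value" wording above.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a ranked match: a docid plus the value stored in
// the collapse slot (empty if the document has no value there).
struct RankedDoc {
    unsigned docid;
    std::string collapse_value;
};

// Stand-alone model (not Xapian's implementation): `ranked` is in best-first
// order; keep at most `collapse_max` documents per non-empty collapse value.
// Documents with an empty value are never collapsed.
std::vector<RankedDoc> collapse(const std::vector<RankedDoc>& ranked,
                                std::size_t collapse_max) {
    std::vector<RankedDoc> kept;
    std::map<std::string, std::size_t> seen;  // value -> documents seen so far
    for (const RankedDoc& doc : ranked) {
        if (doc.collapse_value.empty() ||
            seen[doc.collapse_value]++ < collapse_max) {
            kept.push_back(doc);
        }
    }
    return kept;
}
```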

◆ set_cutoff()

void Xapian::Enquire::set_cutoff ( int  percent_threshold,
double  weight_threshold = 0 
)

Set lower bounds on percentage and/or weight.

Parameters
percent_threshold  Lower bound on percentage score
weight_threshold  Lower bound on weight (default: 0)

No thresholds are applied by default, and if either threshold is set to 0, then that threshold is disabled.
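
A minimal model of the threshold rule, assuming each threshold acts as an inclusive lower bound (the inclusiveness is an assumption, not something stated above). The function passes_cutoff is hypothetical and is not part of the Xapian API.

```cpp
// Stand-alone model (not Xapian's implementation) of the documented rule:
// a threshold of 0 is disabled; otherwise it is treated here as an
// inclusive lower bound (assumption).
bool passes_cutoff(int percent, double weight,
                   int percent_threshold, double weight_threshold) {
    bool percent_ok = percent_threshold == 0 || percent >= percent_threshold;
    bool weight_ok = weight_threshold == 0 || weight >= weight_threshold;
    return percent_ok && weight_ok;
}
```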

◆ set_docid_order()

void Xapian::Enquire::set_docid_order ( docid_order  order)

Set sort order for document IDs.

This order only has an effect on documents which would otherwise have equal rank. When ordering by relevance without a sort key, this means documents with equal weight. For a boolean match with no sort key, this means all documents. And if a sort key is used, this means documents with the same sort key (and also equal weight if ordering on relevance before or after the sort key).

Parameters
order  This can be:
  • Xapian::Enquire::ASCENDING docids sort in ascending order (default)
  • Xapian::Enquire::DESCENDING docids sort in descending order
  • Xapian::Enquire::DONT_CARE docids sort in whatever order is most efficient for the backend

    Note: If you add documents in strict date order, then a boolean search - i.e. set_weighting_scheme(Xapian::BoolWeight()) - with set_docid_order(Xapian::Enquire::DESCENDING) is an efficient way to perform "sort by date, newest first", and with set_docid_order(Xapian::Enquire::ASCENDING) a very efficient way to perform "sort by date, oldest first".
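
The tie-breaking role of the docid order can be sketched as a comparator. This is a stand-alone model rather than Xapian's matcher: Match and rank_matches are hypothetical names, and only the relevance-ordered case without a sort key is modelled.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a match: docid and relevance weight.
struct Match {
    unsigned docid;
    double weight;
};

// Stand-alone model (not Xapian's implementation): rank by descending
// weight and use the docid only to break ties between equal weights.
void rank_matches(std::vector<Match>& matches, bool docid_ascending) {
    std::sort(matches.begin(), matches.end(),
              [docid_ascending](const Match& a, const Match& b) {
                  if (a.weight != b.weight) return a.weight > b.weight;
                  return docid_ascending ? a.docid < b.docid
                                         : a.docid > b.docid;
              });
}
```

When every weight is equal, as in a boolean match, the model reduces to plain docid order, which is why the date-ordered indexing trick in the note works.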

◆ set_expansion_scheme()

void Xapian::Enquire::set_expansion_scheme ( const std::string &  eweightname,
double  expand_k = 1.0 
) const

Set the weighting scheme to use for expansion.

If you don't call this method, the default is as if you'd used:

set_expansion_scheme("trad");

Parameters
eweightname  A string in lowercase specifying the name of the scheme to be used. The following schemes are currently available:
  • "bo1": The Bo1 scheme for query expansion.
  • "trad": The TradWeight scheme for query expansion.
expand_k  The parameter required for TradWeight query expansion. A default value of 1.0 is used if none is specified.

◆ set_query()

void Xapian::Enquire::set_query ( const Query &  query,
termcount  query_length = 0 
)

Set the query.

If set_query() is not called before calling get_mset(), the default query used will be Xapian::MatchNothing.

Parameters
query  The Xapian::Query object
query_length  The query length to use (default: query.get_length())

◆ set_sort_by_key()

void Xapian::Enquire::set_sort_by_key ( KeyMaker *  sorter,
bool  reverse 
)

Set the sorting to be by key generated from values only.

Parameters
sorter  The functor to use for generating keys.
reverse  If true, reverses the sort order.
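
To give a feel for what a key-generating functor does, here is a stand-alone model that does not depend on Xapian. Doc and LowercaseKeyMaker are hypothetical; a real sorter would derive from Xapian::KeyMaker and read values from a Xapian::Document. The model builds a key by lowercasing one value slot, so that ordering by key becomes case-insensitive.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical stand-in for a document: value slot number -> stored value.
struct Doc {
    std::map<unsigned, std::string> values;
};

// Stand-alone model of a key-generating functor (the real base class is
// Xapian::KeyMaker): build a sort key by lowercasing the value in one slot.
class LowercaseKeyMaker {
    unsigned slot_;

  public:
    explicit LowercaseKeyMaker(unsigned slot) : slot_(slot) {}

    std::string operator()(const Doc& doc) const {
        auto it = doc.values.find(slot_);
        if (it == doc.values.end()) return std::string();  // no value: empty key
        std::string key = it->second;
        std::transform(key.begin(), key.end(), key.begin(),
                       [](unsigned char c) {
                           return static_cast<char>(std::tolower(c));
                       });
        return key;
    }
};
```

Comparing the raw values would place "Banana" before "apple" because uppercase ASCII letters sort first; comparing the generated keys gives the alphabetical order instead.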

◆ set_sort_by_key_then_relevance()

void Xapian::Enquire::set_sort_by_key_then_relevance ( KeyMaker *  sorter,
bool  reverse 
)

Set the sorting to be by keys generated from values, then by relevance for documents with identical keys.

Parameters
sorter  The functor to use for generating keys.
reverse  If true, reverses the sort order.

◆ set_sort_by_relevance()

void Xapian::Enquire::set_sort_by_relevance ( )

Set the sorting to be by relevance only.

This is the default.

◆ set_sort_by_relevance_then_key()

void Xapian::Enquire::set_sort_by_relevance_then_key ( KeyMaker *  sorter,
bool  reverse 
)

Set the sorting to be by relevance, then by keys generated from values.

Note that with the default BM25 weighting scheme parameters, non-identical documents will rarely have the same weight, so this setting will give very similar results to set_sort_by_relevance(). It becomes more useful with particular BM25 parameter settings (e.g. BM25Weight(1,0,1,0,0)) or custom weighting schemes.

Parameters
sorter  The functor to use for generating keys.
reverse  If true, reverses the sort order of the generated keys. Beware that in 1.2.16 and earlier, the sense of this parameter was incorrectly inverted and inconsistent with the other set_sort_by_... methods. This was fixed in 1.2.17, so make that version a minimum requirement if this detail matters to your application.

◆ set_sort_by_relevance_then_value()

void Xapian::Enquire::set_sort_by_relevance_then_value ( valueno  sort_key,
bool  reverse 
)

Set the sorting to be by relevance then value.

Note that sorting by values uses a string comparison, so to use this to sort by a numeric value you'll need to store the numeric values in a manner which sorts appropriately. For example, you could use Xapian::sortable_serialise() (which works for floating point numbers as well as integers), or store numbers padded with leading zeros or spaces, or with the number of digits prepended.

Note that with the default BM25 weighting scheme parameters, non-identical documents will rarely have the same weight, so this setting will give very similar results to set_sort_by_relevance(). It becomes more useful with particular BM25 parameter settings (e.g. BM25Weight(1,0,1,0,0)) or custom weighting schemes.

Parameters
sort_key  Value number to sort on.
reverse  If true, reverses the sort order of sort_key. Beware that in 1.2.16 and earlier, the sense of this parameter was incorrectly inverted and inconsistent with the other set_sort_by_... methods. This was fixed in 1.2.17, so make that version a minimum requirement if this detail matters to your application.

◆ set_sort_by_value()

void Xapian::Enquire::set_sort_by_value ( valueno  sort_key,
bool  reverse 
)

Set the sorting to be by value only.

Note that sorting by values uses a string comparison, so to use this to sort by a numeric value you'll need to store the numeric values in a manner which sorts appropriately. For example, you could use Xapian::sortable_serialise() (which works for floating point numbers as well as integers), or store numbers padded with leading zeros or spaces, or with the number of digits prepended.

Parameters
sort_key  Value number to sort on.
reverse  If true, reverses the sort order.
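
The string-comparison caveat above is easy to trip over, so here is a minimal sketch of the zero-padding option. The helper zero_pad is hypothetical and only covers non-negative integers that fit in the chosen width; Xapian::sortable_serialise(), mentioned above, is the library's own alternative and also handles floating point numbers.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: render n as decimal, left-padded with zeros to
// `width` digits, so that string order matches numeric order for any
// values that fit in `width` digits.
std::string zero_pad(unsigned long n, std::size_t width) {
    std::string s = std::to_string(n);
    if (s.size() < width) s = std::string(width - s.size(), '0') + s;
    return s;
}
```

Unpadded, "10" compares less than "9" as a string; padded to four digits, "0009" compares less than "0010", matching the numeric order.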

◆ set_sort_by_value_then_relevance()

void Xapian::Enquire::set_sort_by_value_then_relevance ( valueno  sort_key,
bool  reverse 
)

Set the sorting to be by value, then by relevance for documents with the same value.

Note that sorting by values uses a string comparison, so to use this to sort by a numeric value you'll need to store the numeric values in a manner which sorts appropriately. For example, you could use Xapian::sortable_serialise() (which works for floating point numbers as well as integers), or store numbers padded with leading zeros or spaces, or with the number of digits prepended.

Parameters
sort_key  Value number to sort on.
reverse  If true, reverses the sort order.

◆ set_time_limit()

void Xapian::Enquire::set_time_limit ( double  time_limit)

Set a time limit for the match.

Matches with checkatleast set high can take a long time in some cases. You can set a time limit on this, after which checkatleast will be turned off.

Parameters
time_limit  Time in seconds after which to disable checkatleast (default: 0.0 which means no time limit)

Limitations:

This feature is currently supported on platforms which support POSIX interval timers. Interaction with the remote backend when using multiple databases may have bugs. There's not currently a way to force the match to end after a certain time.

◆ set_weighting_scheme()

void Xapian::Enquire::set_weighting_scheme ( const Weight &  weight)

Set the weighting scheme to use.

The Xapian::Weight object passed is cloned by calling weight.clone(), so doesn't need to remain valid after the call.

If set_weighting_scheme() is not called before calling get_mset(), the default weighting scheme is Xapian::BM25Weight().

Parameters
weight  Xapian::Weight object

Member Data Documentation

◆ INCLUDE_QUERY_TERMS

const int Xapian::Enquire::INCLUDE_QUERY_TERMS = 1
static

Flag telling get_eset() to allow query terms in Xapian::ESet.

By default, query terms are excluded. This is appropriate when using get_eset() to generate terms for query expansion, but for some other uses query terms are also interesting.

◆ USE_EXACT_TERMFREQ

const int Xapian::Enquire::USE_EXACT_TERMFREQ = 2
static

Flag telling get_eset() to always use the exact term frequency.

By default, get_eset() approximates the term frequency in some cases (currently when we're expanding from more than one database and there are sub-databases which don't contain any documents marked as relevant). This is faster and should still return good results, but this flag allows the exact term frequency to always be used.

