Public interfaces for the Xapian library.
TermIterator() noexcept
Default constructor.
Definition: termiterator.h:79
const FieldProcessor * release() const
Start reference counting this object.
Definition: queryparser.h:480
void clear()
Clear the cluster weights.
std::string short_name() const
Return the short name of the weighting scheme.
Cluster(const Centroid &centroid)
Constructor.
TermIterator termlist_begin() const
Return a TermIterator to the beginning of the termlist.
void need_stat(stat_flags flag)
Tell Xapian that your subclass will want a particular statistic.
Definition: weight.h:94
virtual std::string get_description() const
Return a string describing this object.
ExpandDeciderFilterPrefix(const std::string &prefix_)
The parameter specifies the prefix of terms to be retained.
Definition: expanddecider.h:150
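As a rough illustration of what "retaining terms with a given prefix" amounts to, the check can be sketched in standalone C++. The helper name `has_prefix` is hypothetical and is not part of the Xapian API; this is a sketch of the idea, not the library's implementation.

```cpp
#include <string>

// Hypothetical helper (not a Xapian function): returns true when `term`
// begins with `prefix`. An empty prefix matches every term.
bool has_prefix(const std::string& term, const std::string& prefix) {
    // compare() clamps the substring length to the size of `term`, so a
    // term shorter than the prefix simply compares unequal.
    return term.compare(0, prefix.size(), prefix) == 0;
}
```

A decider built on this idea would keep a term such as "XAsmith" for the prefix "XA" and reject terms that carry any other prefix.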
ClusterSet(ClusterSet &&other)
Move constructor.
~TermIterator()
Destructor.
Definition: termiterator.h:83
FixedWeightPostingSource * unserialise(const std::string &serialised) const
Create object given string serialisation returned by serialise().
Xapian::termcount get_query_length() const
The length of the query.
Definition: weight.h:411
double get_sumpart(Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqterm, Xapian::termcount wdfdocmax) const
Calculate the weight contribution for this object's term to a document.
BoolWeight * create_from_parameters(const char *params) const
Return the parameterised weighting scheme object.
virtual doccount get_doccount() const =0
Return the number of documents within the MSet.
void unserialise(const std::string &serialised)
Unserialise a string and set this object to the coordinates in it.
const int DB_RETRY_LOCK
If the database is already locked, retry the lock.
Definition: constants.h:145
Registry for user subclasses.
Definition: registry.h:48
InvalidArgumentError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:245
const FreqSource * release() const
Start reference counting this object.
Definition: cluster.h:177
virtual std::string get_description() const =0
Returns a string describing the similarity metric being used.
void set_max_expansion(Xapian::termcount max_expansion, int max_type=Xapian::Query::WILDCARD_LIMIT_ERROR, unsigned flags=FLAG_WILDCARD|FLAG_PARTIAL|FLAG_FUZZY)
Specify the maximum expansion of a wildcard and/or partial and/or fuzzy term.
void increase_termpos(Xapian::termpos delta=100)
Increase the term position used by index_text.
void set_sort_by_relevance()
Set the sorting to be by relevance only.
void set_docid_order(docid_order order)
Set sort order for document IDs.
double get_maxpart() const
Return an upper bound on what get_sumpart() can return for any document.
virtual std::string serialise_results() const
Serialise the results of this match spy.
@ NON_SPACING_MARK
Mark, nonspacing (Mn)
Definition: unicode.h:227
DerefWrapper_< std::string > operator++(int)
Advance the iterator to the next position (postfix version).
Definition: valueiterator.h:95
DatabaseModifiedError indicates a database was modified.
Definition: error.h:527
ESetIterator operator[](Xapian::doccount i) const
Return iterator pointing to the i-th object in this ESet.
Definition: eset.h:337
void register_posting_source(const Xapian::PostingSource &source)
Register a user-defined posting source class.
Abstract base class for match spies.
Definition: matchspy.h:50
Xapian::termcount get_wdf() const
Return the wdf for the document at the current position.
void recalculate()
Recalculate the centroid of the Cluster after each iteration of the KMeans algorithm by taking the me...
Xapian::doccount get_termfreq_est() const
An estimate of the number of documents this object can return.
std::string get_description() const
Return a string describing this object.
Class for storing the results returned by the Clusterer.
Definition: cluster.h:445
void keep_alive()
Send a keep-alive message.
std::string serialise() const
Return this object's parameters serialised as a single string.
Handle a byte unit range.
Definition: queryparser.h:401
@ INITIAL_QUOTE_PUNCTUATION
Punctuation, initial quote (Pi)
Definition: unicode.h:244
PositionIterator positionlist_begin() const
Return a PositionIterator for the current document.
std::string get_description() const
Return a string describing this object.
void set_default_op(Query::op default_op)
Set the default operator.
const valueno BAD_VALUENO
Reserved value to indicate "no valueno".
Definition: types.h:100
virtual PostingSource * clone() const
Clone the posting source.
WritableDatabase & operator=(const WritableDatabase &o)
Assignment operator.
Definition: database.h:1033
flags set_flags(flags toggle, flags mask=flags(0))
Set flags.
DerefWrapper_< std::string > operator++(int)
Advance the iterator to the next position (postfix version).
Definition: termiterator.h:111
const Xapian::PostingSource * get_posting_source(const std::string &name) const
Get a posting source given a name.
double get_maxpart() const
Return an upper bound on what get_sumpart() can return for any document.
const Query operator|=(const Query &o)
Combine with another Xapian::Query object using OP_OR.
Definition: query.h:908
StemStopper(const Xapian::Stem &stemmer, stem_strategy strategy=STEM_SOME)
Constructor.
MSet get_mset(doccount first, doccount maxitems, doccount checkatleast=0, const RSet *rset=NULL, const MatchDecider *mdecider=NULL) const
Run the query.
Xapian::TermIterator spellings_begin() const
An iterator which returns all the spelling correction targets.
TfIdfWeight * unserialise(const std::string &serialised) const
Unserialise parameters.
Database(const Database &o)
Copy constructor.
Class for calculating the cosine distance between two documents.
Definition: cluster.h:534
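For readers unfamiliar with the cosine measure, here is a minimal standalone sketch. It assumes each document is represented as a map from term to weight; the `TermWeights` alias and the `cosine_similarity` function are hypothetical names chosen for illustration, not Xapian classes or methods.

```cpp
#include <cmath>
#include <map>
#include <string>

// Assumed representation: term -> weight (for example a TF-IDF weight).
using TermWeights = std::map<std::string, double>;

// Cosine of the angle between two sparse term-weight vectors.
// Returns 0 when either vector has zero length.
double cosine_similarity(const TermWeights& a, const TermWeights& b) {
    double dot = 0.0, norm_a = 0.0, norm_b = 0.0;
    for (const auto& kv : a) {
        norm_a += kv.second * kv.second;
        auto it = b.find(kv.first);
        if (it != b.end()) dot += kv.second * it->second;  // shared term
    }
    for (const auto& kv : b) norm_b += kv.second * kv.second;
    if (norm_a == 0.0 || norm_b == 0.0) return 0.0;
    return dot / (std::sqrt(norm_a) * std::sqrt(norm_b));
}
```

Documents with identical weights score 1, and documents with no terms in common score 0.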
DatabaseLockError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:497
@ TITLECASE_LETTER
Letter, titlecase (Lt)
Definition: unicode.h:224
void remove_term(const std::string &term)
Remove a term from this document.
ValueCountMatchSpy(Xapian::valueno slot_)
Construct a MatchSpy which counts the values in a particular slot.
Definition: matchspy.h:237
std::string serialise() const
Return this object's parameters serialised as a single string.
double get_sumpart(Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the weight contribution for this object's term to a document.
double get_maxextra() const
Return an upper bound on what get_sumextra() can return for any document.
bool operator<(const LatLongCoord &other) const noexcept
Compare with another LatLongCoord.
Definition: geospatial.h:150
A latitude-longitude coordinate.
Definition: geospatial.h:81
double get_sumextra(Xapian::termcount, Xapian::termcount, Xapian::termcount) const
Calculate the term-independent weight component for a document.
Set of documents judged as relevant.
double get_weight() const
Return the weight contribution for the current document.
ValueIterator & operator=(ValueIterator &&o)
Move assignment operator.
Definition: valueiterator.h:66
~QueryParser()
Destructor.
Class representing a Cluster, which contains Points and the Centroid of the Cluster.
Definition: cluster.h:360
TermIterator unstem_end(const std::string &) const noexcept
End iterator over unstemmed forms of the given stemmed query term.
Definition: queryparser.h:1028
virtual LatLongMetric * clone() const =0
Clone the metric.
@ MATH_SYMBOL
Symbol, math (Sm)
Definition: unicode.h:247
const TermIterator get_terms_end() const noexcept
End iterator for terms in the query object.
Definition: query.h:581
This class implements the BB2 weighting scheme.
Definition: weight.h:1363
KMeans(unsigned int k_, unsigned int max_iters_=0)
Constructor specifying number of clusters and maximum iterations.
RangeError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:983
std::string get_value_upper_bound(Xapian::valueno slot) const
Get an upper bound on the values stored in the given value slot.
NetworkError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:807
@ OTHER_PUNCTUATION
Punctuation, other (Po)
Definition: unicode.h:246
void set_data(const std::string &data)
Set the document data.
const Xapian::KeyMaker * get_key_maker(const std::string &name) const
Get a KeyMaker given a name.
Weight()
Default constructor, needed by subclass constructors.
Definition: weight.h:155
DatabaseVersionError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:636
double get_weight() const
Return the weight contribution for the current document.
std::string get_description() const
Return a string describing this object.
@ FINAL_QUOTE_PUNCTUATION
Punctuation, final quote (Pf)
Definition: unicode.h:245
category get_category(unsigned ch)
Return the category which a given Unicode character falls into.
Definition: unicode.h:342
virtual PostingSource * unserialise_with_registry(const std::string &serialised, const Registry &registry) const
Create object given string serialisation returned by serialise().
virtual ~StemImplementation()
Virtual destructor.
Enquire(const Enquire &o)
Copying is allowed.
const int DBCOMPACT_SINGLE_FILE
Produce a single-file database.
Definition: constants.h:275
@ SURROGATE
Other, surrogate (Cs)
Definition: unicode.h:239
TermIterator unstem_begin(const std::string &term) const
Begin iterator over unstemmed forms of the given stemmed query term.
std::string get_description() const
Return a string describing this object.
DocumentSet(const DocumentSet &other)
Copying is allowed.
MSetIterator()
Create an unpositioned MSetIterator.
Definition: mset.h:455
double get_sumpart(Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the weight contribution for this object's term to a document.
virtual void operator()(const Xapian::Document &doc, double wt)=0
Register a document with the match spy.
void skip_to(Xapian::docid min_docid, double min_wt)
Advance to the specified docid.
double get_maxpart() const
Return an upper bound on what get_sumpart() can return for any document.
void clear_terms()
Clear all terms from the document.
~ValueIterator()
Destructor.
Definition: valueiterator.h:84
std::string serialise() const
Return this object's parameters serialised as a single string.
MSetIterator operator[](Xapian::doccount i) const
Return iterator pointing to the i-th object in this MSet.
Definition: mset.h:687
void add_to_cluster(const Point &point, unsigned int index)
Add the point to the cluster at position 'index'.
Stopper()
Default constructor.
Definition: queryparser.h:55
WildcardError indicates an error expanding a wildcarded query.
Definition: error.h:1001
void set_collapse_key(valueno collapse_key, doccount collapse_max=1)
Control collapsing of results.
double get_maxpart() const
Return an upper bound on what get_sumpart() can return for any document.
std::string serialise() const
Serialise document into a string.
std::string get_description() const
Return a string describing this object.
void add_boolean_prefix(const std::string &field, const std::string &prefix, const std::string *grouping=NULL)
Add a boolean term prefix allowing the user to restrict a search with a boolean filter specified in t...
Xapian::TermIterator synonym_keys_end(const std::string &=std::string()) const noexcept
End iterator corresponding to synonym_keys_begin(prefix).
Definition: database.h:504
PositionIterator & operator=(PositionIterator &&o)
Move assignment operator.
Definition: positioniterator.h:66
Cluster(const Cluster &other)
Copying is allowed.
Xapian::totallength get_total_length() const
Get the total length of all the documents in the database.
Database open(const std::string &host, unsigned int port, unsigned timeout=10000, unsigned connect_timeout=10000)
Construct a Database object for read-only access to a remote database accessed via a TCP connection.
Xapian::Document get_document(Xapian::docid did, unsigned flags=0) const
Get a document from the database.
void fetch() const
Hint that the whole MSet should be prefetched.
Definition: mset.h:363
void set_stopper(const Stopper *stop=NULL)
Set the stopper.
virtual double get_weight() const
Return the weight contribution for the current document.
std::string get_description() const
Return a string describing this object.
double longitude
A longitude, as decimal degrees.
Definition: geospatial.h:98
ExpandDeciderAnd(const ExpandDecider &first_, const ExpandDecider &second_)
Terms will be checked with first, and if accepted, then checked with second.
Definition: expanddecider.h:98
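The "first, then second" behaviour described above is a short-circuit conjunction of two predicates. The sketch below shows that pattern in standalone C++; `TermPredicate`, `both`, and the two sample predicates are hypothetical names used only for illustration and do not come from Xapian.

```cpp
#include <functional>
#include <string>

// Assumed stand-in for a decider: a callable that accepts or rejects a term.
using TermPredicate = std::function<bool(const std::string&)>;

// Combine two predicates: `second` is consulted only if `first` accepts.
TermPredicate both(TermPredicate first, TermPredicate second) {
    return [first, second](const std::string& term) {
        return first(term) && second(term);  // && short-circuits
    };
}

// Sample predicates, purely for demonstration.
bool longer_than_two(const std::string& t) { return t.size() > 2; }
bool lacks_z_prefix(const std::string& t) { return t.empty() || t[0] != 'Z'; }
```

With these samples, `both(longer_than_two, lacks_z_prefix)` accepts "apple" but rejects both "ab" (too short) and "Zapple" (unwanted prefix).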
WritableDatabase(WritableDatabase &&o)
Move constructor.
Definition: database.h:1039
ClusterSet & operator=(ClusterSet &&other)
Move assignment operator.
LatLongCoordsIterator()
Default constructor. Produces an uninitialised iterator.
Definition: geospatial.h:177
DatabaseCreateError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:455
const char * version_string()
Report the version string of the library which the program is linked with.
Definition: xapian.h:122
TermIterator(const TermIterator &o)
Copy constructor.
std::string short_name() const
Return the short name of the weighting scheme.
Xapian::doccount size() const
Return size of the cluster.
unsigned flags
Flags.
Definition: queryparser.h:160
Xapian::doccount get_termfreq_est() const
An estimate of the number of documents this object can return.
std::string name() const
Return the name of this weighting scheme.
const Xapian::Document & get_document() const
Get the current document.
double get_maxpart() const
Return an upper bound on what get_sumpart() can return for any document.
virtual ~Similarity()
Destructor.
Xapian::termcount get_length() const noexcept
Return the length of this query object.
Xapian::Weight subclass implementing the traditional probabilistic formula.
Definition: weight.h:1057
virtual double similarity(const PointType &a, const PointType &b) const =0
Calculates the similarity between the two documents.
void add_value(Xapian::valueno slot, bool reverse=false, const std::string &defvalue=std::string())
Add a value slot to the list to build a key from.
Definition: keymaker.h:194
PostingSource() noexcept
Allow subclasses to be instantiated.
Definition: postingsource.h:62
Xapian::valueno get_valueno() const
Return the value slot number for the current position.
PostingIterator(const PostingIterator &o)
Copy constructor.
Enquire & operator=(const Enquire &o)
Copying is allowed.
Document()
Default constructor.
std::string get_uuid() const
Get a UUID for the database.
const int DB_CREATE_OR_OVERWRITE
Create database if it doesn't already exist, or overwrite if it does.
Definition: constants.h:38
virtual ~MatchDecider()
Virtual destructor, because we have virtual methods.
Definition: matchdecider.h:50
double get_sumpart(Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqterm, Xapian::termcount wdfdocmax) const
Calculate the weight contribution for this object's term to a document.
Build key strings for MSet ordering or collapsing.
std::string get_description() const
Return a string describing this object.
void skip_to(Xapian::docid docid_or_slot)
Advance the iterator to document id or value slot docid_or_slot.
double get_sumpart(Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the weight contribution for this object's term to a document.
void unserialise(const std::string &serialised)
Unserialise a string and set this object to its coordinate.
bool operator()(const Xapian::Document &doc) const
Decide whether we want a particular document to be in the MSet.
double get_sumextra(Xapian::termcount doclen, Xapian::termcount, Xapian::termcount) const
Calculate the term-independent weight component for a document.
Class for diversifying an MSet using GLS-MPT as given in the paper: Scalable and Efficient Web Search...
Definition: diversify.h:45
TermIterator termlist_end() const noexcept
Return a TermIterator to the end of the termlist.
Definition: cluster.h:251
std::string str
The prefix (or suffix with RP_SUFFIX) string to look for.
Definition: queryparser.h:147
Xapian::valueno values_count() const
Count the value slots used in this document.
void add_cluster(const Cluster &cluster)
Add a cluster to the ClusterSet.
std::string serialise() const
Return this object's parameters serialised as a single string.
Registry & operator=(const Registry &other)
Assignment operator.
const RangeProcessor * release() const
Start reference counting this object.
Definition: queryparser.h:234
TermIterator termlist_begin() const
Start iterating the terms in this document.
TermIterator allterms_end(const std::string &=std::string()) const noexcept
End iterator corresponding to allterms_begin(prefix).
Definition: database.h:299
ExpandDecider()
Default constructor.
Definition: expanddecider.h:47
Class for iterating over document values.
double get_sumextra(Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the term-independent weight component for a document.
TermGenerator(const TermGenerator &o)
Copy constructor.
const LatLongCoord & operator*() const
Get the LatLongCoord for the current position.
Definition: geospatial.h:180
XAPIAN_REVISION_TYPE rev
Revision number of a database.
Definition: types.h:108
void init(double factor_)
Allow the subclass to perform any initialisation it needs to.
std::input_iterator_tag iterator_category
We implement the semantics of an STL input_iterator.
Definition: unicode.h:204
unsigned to_utf8(unsigned ch, char *buf)
Convert a single Unicode character to UTF-8.
Definition: unicode.h:325
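To make the conversion concrete, here is a standalone sketch of encoding one code point as UTF-8. The names `encode_utf8` and `utf8_of` are hypothetical; the sketch mirrors the shape of the signature above (write into a caller-supplied buffer, return the byte count) but is not Xapian's code, and it does not reject surrogate code points.

```cpp
#include <cassert>
#include <string>

// Write the UTF-8 encoding of `ch` into `buf` (at least 4 bytes) and
// return the number of bytes written.
unsigned encode_utf8(unsigned ch, char* buf) {
    auto byte = [](unsigned v) { return static_cast<char>(static_cast<unsigned char>(v)); };
    if (ch < 0x80) {            // 1 byte: 0xxxxxxx
        buf[0] = byte(ch);
        return 1;
    }
    if (ch < 0x800) {           // 2 bytes: 110xxxxx 10xxxxxx
        buf[0] = byte(0xC0 | (ch >> 6));
        buf[1] = byte(0x80 | (ch & 0x3F));
        return 2;
    }
    if (ch < 0x10000) {         // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        buf[0] = byte(0xE0 | (ch >> 12));
        buf[1] = byte(0x80 | ((ch >> 6) & 0x3F));
        buf[2] = byte(0x80 | (ch & 0x3F));
        return 3;
    }
    assert(ch <= 0x10FFFF);     // 4 bytes: 11110xxx then three 10xxxxxx
    buf[0] = byte(0xF0 | (ch >> 18));
    buf[1] = byte(0x80 | ((ch >> 12) & 0x3F));
    buf[2] = byte(0x80 | ((ch >> 6) & 0x3F));
    buf[3] = byte(0x80 | (ch & 0x3F));
    return 4;
}

// Convenience wrapper returning the encoded bytes as a std::string.
std::string utf8_of(unsigned ch) {
    char buf[4];
    unsigned len = encode_utf8(ch, buf);
    return std::string(buf, len);
}
```

For example, U+0041 encodes to the single byte 0x41, U+00E9 to the two bytes 0xC3 0xA9, and U+20AC to the three bytes 0xE2 0x82 0xAC.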
Parsing a user query string to build a Xapian::Query object.
void begin_transaction(bool flushed=true)
Begin a transaction.
const KeyMaker * release() const
Start reference counting this object.
Definition: keymaker.h:134
double miles_to_metres(double miles) noexcept
Convert from miles to metres.
Definition: geospatial.h:54
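The conversion itself is a single multiplication. The sketch below assumes the international mile of exactly 1609.344 metres; `METRES_PER_MILE` and `to_metres` are hypothetical names, and whether the library uses the same constant should be checked against geospatial.h.

```cpp
// Assumption: the international mile, defined as exactly 1609.344 metres.
constexpr double METRES_PER_MILE = 1609.344;

// Hypothetical stand-in for a miles-to-metres conversion.
constexpr double to_metres(double miles) noexcept {
    return miles * METRES_PER_MILE;
}
```

This kind of helper is handy when a distance limit is given in miles but the metric reports distances in metres.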
Xapian::DocumentSet get_dmset(const MSet &mset)
Implements diversification.
double get_weight(const std::string &term) const
Return the TF-IDF weight associated with a certain term.
virtual void merge_results(const std::string &serialised)
Unserialise some results, and merge them into this matchspy.
virtual std::string name() const
Return the name of this match spy.
ValueWeightPostingSource * unserialise(const std::string &serialised) const
Create object given string serialisation returned by serialise().
Xapian::Query operator()(const std::string &begin, const std::string &end)
Check for a valid byte value range.
double get_maxpart() const
Return an upper bound on what get_sumpart() can return for any document.
Factory functions for constructing Database and WritableDatabase objects.
MSet()
Default constructor.
DocumentSet & operator=(DocumentSet &&other)
Move assignment operator.
NetworkTimeoutError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:849
void set_stopper(const Xapian::Stopper *stop=NULL)
Set the Xapian::Stopper object to be used for identifying stopwords.
ESet get_eset(termcount maxitems, const RSet &rset, int flags=0, const ExpandDecider *edecider=NULL, double min_weight=0.0) const
Perform query expansion.
DatabaseCreateError indicates a failure to create a database.
Definition: error.h:439
Point(const FreqSource &freqsource, const Document &document)
Constructor. Initialise the point with terms and corresponding TF-IDF weights.
TermGenerator(TermGenerator &&o)
Move constructor.
stem_strategy
Stemming strategies, for use with set_stemming_strategy().
Definition: queryparser.h:688
A posting source which generates weights from a value slot.
Definition: postingsource.h:428
const int DB_FULL_SYNC
Try to ensure changes are really written to disk.
Definition: constants.h:83
Class representing a document.
bool empty() const
Return true if this ESet object is empty.
Definition: eset.h:83
Enquire(const Database &db)
Constructor.
Query(Query::op op_)
Construct with just an operator.
Definition: query.h:692
A posting source which reads weights from a value slot.
Definition: postingsource.h:556
BM25Weight * create_from_parameters(const char *params) const
Return the parameterised weighting scheme object.
TermIterator & operator=(const TermIterator &o)
Assignment.
Abstract base class for clusterer implementations.
Definition: cluster.h:548
virtual std::string name() const
Return the name of this match spy.
@ OPEN_PUNCTUATION
Punctuation, open (Ps)
Definition: unicode.h:242
double get_sumextra(Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the term-independent weight component for a document.
double curr_weight
Weight at current position.
Definition: postingsource.h:610
Xapian::docid get_docid() const
Get the document ID this document came from.
GreatCircleMetric(double radius_)
Construct a GreatCircleMetric using a specified radius.
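To show how the radius enters a great-circle calculation, here is a standalone sketch using the haversine formula, one common way to compute distance on a sphere. The function name `great_circle_distance` and its parameter layout are assumptions made for illustration; this is not presented as the library's implementation.

```cpp
#include <cmath>

// Distance along the surface of a sphere of the given radius between two
// points given as (latitude, longitude) in decimal degrees. The result is
// in the same unit as `radius`.
double great_circle_distance(double lat1, double lon1,
                             double lat2, double lon2,
                             double radius) {
    const double deg2rad = std::acos(-1.0) / 180.0;
    const double phi1 = lat1 * deg2rad;
    const double phi2 = lat2 * deg2rad;
    const double sin_dphi = std::sin((phi2 - phi1) / 2.0);
    const double sin_dlambda = std::sin((lon2 - lon1) * deg2rad / 2.0);
    // Haversine of the central angle between the two points.
    double h = sin_dphi * sin_dphi +
               std::cos(phi1) * std::cos(phi2) * sin_dlambda * sin_dlambda;
    if (h > 1.0) h = 1.0;  // guard against rounding just above 1
    return 2.0 * radius * std::asin(std::sqrt(h));
}
```

Passing a radius in metres yields a distance in metres, so a different radius simply rescales the result for a different sphere.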
std::string get_metadata(const std::string &key) const
Get the user-specified metadata associated with a given key.
MSet(MSet &&o)
Move constructor.
void add_database(const WritableDatabase &other)
Add shards from another WritableDatabase.
Definition: database.h:951
Cluster & operator=(const Cluster &other)
Assignment is allowed.
const Xapian::MatchSpy * get_match_spy(const std::string &name) const
Get a match spy given a name.
int get_percent() const
Convert the weight of the current iterator position to a percentage.
Definition: mset.h:600
Utf8Iterator(const char *p_, size_t len)
Create an iterator given a pointer and a length.
Definition: unicode.h:114
#define XAPIAN_DOCID_BASE_TYPE
Base (signed) type for Xapian::docid and related types.
Definition: version.h:71
virtual Xapian::doccount get_termfreq_est() const =0
An estimate of the number of documents this object can return.
unsigned toupper(unsigned ch)
Convert a Unicode character to uppercase.
Definition: unicode.h:388
void set_stemmer(const Xapian::Stem &stemmer)
Set the Xapian::Stem object to be used for generating stemmed terms.
Xapian::termcount get_doclength_upper_bound() const
Get an upper bound on the length of a document in this DB.
Indicates an attempt to access a document not present in the database.
Definition: error.h:662
int minor_version()
Report the minor version of the library which the program is linked with.
Definition: xapian.h:140
Class implementing a "boolean" weighting scheme.
Definition: weight.h:452
Xapian::Query operator()(const std::string &begin, const std::string &end)
Check for a valid date range.
std::string get_collapse_key() const
Return the collapse key for the current position.
void replace_document(Xapian::docid did, const Xapian::Document &document)
Replace a document in the database.
PostingIterator() noexcept
Default constructor.
Definition: postingiterator.h:79
bool items_at_end
Flag, set to true if there are docs after the end of the range.
Definition: postingsource.h:613
Xapian::doccount size() const
Return number of items in this MSet object.
std::string short_name() const
Return the short name of the weighting scheme.
TermIterator get_matching_terms_begin(docid did) const
Iterate query terms matching a document.
KMeans clusterer: implements the K-Means clustering algorithm.
Definition: cluster.h:592
void register_lat_long_metric(const Xapian::LatLongMetric &metric)
Register a user-defined lat-long metric class.
PositionIterator() noexcept
Default constructor.
Definition: positioniterator.h:80
virtual doccount get_termfreq(const std::string &tname) const =0
Return the term frequency of a particular term 'tname'.
@ PARAGRAPH_SEPARATOR
Separator, paragraph (Zp)
Definition: unicode.h:235
std::string name() const
Return the name of this weighting scheme.
ValuePostingSource(Xapian::valueno slot_) noexcept
Construct a ValuePostingSource.
Definition: postingsource.h:448
MSetIterator begin() const
Return iterator pointing to the first item in this MSet.
Definition: mset.h:675
void set_weight(const std::string &term, double weight)
Set the weight mapped to a term to 'weight'.
type_smoothing
Type of smoothing to use with the Language Model Weighting scheme.
Definition: weight.h:161
std::string serialise() const
Return this object's parameters serialised as a single string.
void set_sort_by_value_then_relevance(valueno sort_key, bool reverse)
Set the sorting to be by value, then by relevance for documents with the same value.
std::string get_description() const
Return a string describing this object.
unsigned XAPIAN_TERMCOUNT_BASE_TYPE termcount
A count of terms.
Definition: types.h:64
void next(double min_wt)
Advance the current position to the next matching document.
ValueMapPostingSource * clone() const
Clone the posting source.
virtual std::string serialise() const
Return this object's parameters serialised as a single string.
Utf8Iterator(const std::string &s)
Create an iterator given a string.
Definition: unicode.h:125
virtual void init(double factor)=0
Allow the subclass to perform any initialisation it needs to.
DatabaseOpeningError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:593
Base class for calculating the similarity between documents.
Definition: cluster.h:516
void assign(const char *p_, size_t len)
Assign a new string to the iterator.
Definition: unicode.h:72
Class for iterating over term positions.
Definition: positioniterator.h:40
DocumentSet & operator=(const DocumentSet &other)
Assignment is allowed.
void init(const Database &db_)
Older method which did the same job as reset().
LatLongCoords(const LatLongCoord &coord)
Construct a container holding one coordinate.
Definition: geospatial.h:268
double get_weight() const
Get the weight for the current position.
Query(Query &&)=default
Move constructor.
MSetIterator operator++(int)
Advance the iterator to the next position (postfix version).
Definition: mset.h:467
IneB2Weight * unserialise(const std::string &serialised) const
Unserialise parameters.
virtual PostingSource * unserialise(const std::string &serialised) const
Create object given string serialisation returned by serialise().
PositionIterator positionlist_end() const noexcept
Return an end PositionIterator for the current term.
Definition: termiterator.h:103
op
Query operators.
Definition: query.h:76
std::string serialise() const
Return a serialised representation of the coordinate.
double get_maxpart() const
Return an upper bound on what get_sumpart() can return for any document.
DatabaseVersionError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:644
LatLongCoordsIterator begin() const
Get a begin iterator for the coordinates.
Definition: geospatial.h:237
Cluster & operator=(Cluster &&other)
Move assignment operator.
Xapian::totallength get_total_length() const
Total length of all documents in the collection.
Definition: weight.h:443
difference_type operator-(const MSetIterator &o) const
Return the number of positions between o and this iterator.
Definition: mset.h:538
virtual double get_maxextra() const =0
Return an upper bound on what get_sumextra() can return for any document.
void add_value(Xapian::valueno slot, const std::string &value)
Add a value to a slot in this document.
void append(const LatLongCoord &coord)
Append a coordinate to the end of the sequence.
Definition: geospatial.h:259
This class implements the IfB2 weighting scheme.
Definition: weight.h:1208
Xapian::termcount get_doclength() const
Return the length of the document at the current position.
ESetIterator & operator-=(difference_type n)
Move the iterator back by n positions.
Definition: eset.h:237
TermIterator get_matching_terms_end(const MSetIterator &) const noexcept
End iterator corresponding to get_matching_terms_begin().
Definition: enquire.h:441
IneB2Weight * create_from_parameters(const char *params) const
Return the parameterised weighting scheme object.
const int DOC_ASSUME_VALID
Assume document id is valid.
Definition: constants.h:287
Xapian::Weight subclass implementing Coordinate Matching.
Definition: weight.h:1832
TermGenerator & operator=(const TermGenerator &o)
Assignment.
std::string get_description() const
Return a string describing this object.
Xapian::termpos remove_postings(const std::string &term, Xapian::termpos term_pos_first, Xapian::termpos term_pos_last, Xapian::termcount wdf_dec=1)
Remove a range of postings for a term.
LatLongMetric * unserialise(const std::string &serialised) const
Create object given string serialisation returned by serialise().
Build a Xapian::Query object from a user query string.
Definition: queryparser.h:487
Xapian::doccount get_rank() const
Return the MSet rank for the current position.
Definition: mset.h:546
void set_maxweight(double max_weight)
Specify an upper bound on what get_weight() will return from now on.
Definition: postingsource.h:125
void register_match_spy(const Xapian::MatchSpy &spy)
Register a user-defined match spy class.
idf_norm
Idf normalizations.
Definition: weight.h:556
static size_t check(const std::string &path, int opts=0, std::ostream *out=NULL)
Check the integrity of a database or database table.
Definition: database.h:637
Class to represent a document as a point in the Vector Space Model.
Definition: cluster.h:310
std::string short_name() const
Return the short name of the weighting scheme.
static size_t check(int fd, int opts=0, std::ostream *out=NULL)
Check the integrity of a single file database.
Definition: database.h:653
@ CONNECTOR_PUNCTUATION
Punctuation, connector (Pc)
Definition: unicode.h:240
@ CLOSE_PUNCTUATION
Punctuation, close (Pe)
Definition: unicode.h:243
std::string name() const
Return the name of this weighting scheme.
Stem(StemImplementation *p)
Construct a Xapian::Stem object with a user-provided stemming algorithm.
Definition: stem.h:150
LatLongCoordsIterator & operator++()
Advance the iterator to the next position.
Definition: geospatial.h:185
This class implements the DLH weighting scheme, which is a representative scheme of the Divergence fr...
Definition: weight.h:1444
double get_maxextra() const
Return an upper bound on what get_sumextra() can return for any document.
double get_sumpart(Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqterm, Xapian::termcount wdfdocmax) const
Calculate the weight contribution for this object's term to a document.
Xapian::termcount get_wdf() const
Return the wdf for the term at the current position.
DiceCoeffWeight * create_from_parameters(const char *params) const
Return the parameterised weighting scheme object.
@ OP_OR
Match documents which at least one subquery matches.
Definition: query.h:90
SimpleStopper()
Default constructor.
Definition: queryparser.h:100
ValueMapPostingSource * unserialise(const std::string &serialised) const
Create object given string serialisation returned by serialise().
CoordWeight()
Construct a CoordWeight.
Definition: weight.h:1842
@ MODIFIER_LETTER
Letter, modifier (Lm)
Definition: unicode.h:225
TermIterator termlist_begin(Xapian::docid did) const
Start iterating the terms in a document.
const int DB_CREATE
Create a new database.
Definition: constants.h:44
const Query operator/=(double factor)
Inverse scale using OP_SCALE_WEIGHT.
Definition: query.h:681
Registry()
Default constructor.
double get_maxpart() const
Return an upper bound on what get_sumpart() can return for any document.
Allow rejection of terms during ESet generation.
std::string serialise() const
Return this object's parameters serialised as a single string.
ClusterSet()
Default constructor.
double get_maxextra() const
Return an upper bound on what get_sumextra() can return for any document.
Xapian::termpos operator*() const
Return the term position at the current iterator position.
bool operator<(const ESetIterator &a, const ESetIterator &b) noexcept
Less-than comparison for ESetIterator objects.
Definition: eset.h:286
double operator()(const LatLongCoords &a, const LatLongCoords &b) const
Return the distance between two coordinate lists, in metres.
PL2PlusWeight * create_from_parameters(const char *params) const
Return the parameterised weighting scheme object.
Xapian::doccount get_collection_size() const
The number of documents in the collection.
Definition: weight.h:393
bool check(Xapian::docid min_docid, double min_wt)
Check if the specified docid occurs.
InvalidArgumentError indicates an invalid parameter value was passed to the API.
Definition: error.h:229
Query(const std::string &term, Xapian::termcount wqf=1, Xapian::termpos pos=0)
Construct a Query object for a term.
void set_termpos(Xapian::termpos termpos)
Set the current term position.
Xapian::doccount get_termfreq_max() const
An upper bound on the number of documents this object can return.
std::string serialise() const
Return this object's parameters serialised as a single string.
void delete_document(const std::string &unique_term)
Delete any documents indexed by a term from the database.
unsigned nonascii_to_utf8(unsigned ch, char *buf)
Convert a single non-ASCII Unicode character to UTF-8.
Define preprocessor symbols for the library version.
DatabaseOpeningError indicates failure to open a database.
Definition: error.h:569
Database(const std::string &path, int flags=0)
Open a Database.
MatchDecider filtering results based on whether document values are in a user-defined set.
Definition: valuesetmatchdecider.h:44
const int DB_OPEN
Open an existing database.
Definition: constants.h:50
BoolWeight()
Construct a BoolWeight.
Definition: weight.h:459
std::string serialise() const
Return this object's parameters serialised as a single string.
const int DB_CREATE_OR_OPEN
Create database if it doesn't already exist.
Definition: constants.h:35
DecreasingValueWeightPostingSource(Xapian::valueno slot_, Xapian::docid range_start_=0, Xapian::docid range_end_=0)
Construct a DecreasingValueWeightPostingSource.
int valueno_diff
A signed difference between two value slot numbers.
Definition: types.h:97
Stem(const std::string &language, bool fallback=false)
Construct a Xapian::Stem object for a particular language.
ESetIterator & operator++()
Advance the iterator to the next position.
Definition: eset.h:182
Indicates an attempt to access a database not present.
Definition: error.h:1043
virtual std::string get_description() const =0
Returns a string describing the clusterer being used.
std::string serialise() const
Serialise object parameters into a string.
double get_sumpart(Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqterm, Xapian::termcount wdfdocmax) const
Calculate the weight contribution for this object's term to a document.
PositionIterator positionlist_begin() const
Return a PositionIterator for the current term.
Stem()
Construct a Xapian::Stem object which doesn't change terms.
Definition: stem.h:86
@ CONTROL
Other, control (Cc)
Definition: unicode.h:236
ValueCountMatchSpy()
Construct an empty ValueCountMatchSpy.
Definition: matchspy.h:234
std::string name() const
Name of the posting source class.
InternalError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:765
KeyMaker subclass which combines several values.
Definition: keymaker.h:155
std::string short_name() const
Return the short name of the weighting scheme.
Base class for calculating distances between two lat/long coordinates.
Definition: geospatial.h:302
A posting source which returns a fixed weight for all documents.
Definition: postingsource.h:704
bool operator!=(const ESetIterator &a, const ESetIterator &b) noexcept
Inequality test for ESetIterator objects.
Definition: eset.h:279
ExpandDecider * release()
Start reference counting this object.
Definition: expanddecider.h:65
Base class for stop-word decision functor.
Definition: queryparser.h:46
void set_min_wildcard_prefix(unsigned min_prefix_len, unsigned flags=FLAG_WILDCARD|FLAG_PARTIAL)
Specify minimum length for fixed initial portion in wildcard patterns.
void add_database(const Database &other)
Add shards from another Database.
Definition: database.h:109
std::string short_name() const
Return the short name of the weighting scheme.
virtual std::string operator()(const Xapian::Document &doc) const =0
Build a key string for a Document.
Query() noexcept
Construct a query matching no documents.
Definition: query.h:342
QueryParser & operator=(const QueryParser &o)
Assignment.
virtual std::string serialise() const
Return this object's parameters serialised as a single string.
virtual double get_sumextra(Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const =0
Calculate the term-independent weight component for a document.
@ UNASSIGNED
Other, not assigned (Cn)
Definition: unicode.h:221
SerialisationError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:941
void compact(const std::string &output, unsigned flags, int block_size, Xapian::Compactor &compactor)
Produce a compact version of this database.
Definition: database.h:813
RSet(const RSet &o)
Copying is allowed.
std::string get_description() const
Return a string describing this object.
double get_sumextra(Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the term-independent weight component for a document.
@ COMBINING_SPACING_MARK
Mark, spacing combining (Mc)
Definition: unicode.h:229
std::string name() const
Return the name of this weighting scheme.
PL2PlusWeight(double c, double delta)
Construct a PL2PlusWeight.
const Query operator&(const Query &a, const Query &b)
Combine two Xapian::Query objects using OP_AND.
Definition: query.h:730
void index_text(const Xapian::Utf8Iterator &itor, Xapian::termcount wdf_inc=1, const std::string &prefix=std::string())
Index some text.
PositionIterator positionlist_begin(Xapian::docid did, const std::string &term) const
Start iterating positions for a term in a document.
FixedWeightPostingSource * clone() const
Clone the posting source.
WritableDatabase(const std::string &path, int flags=0, int block_size=0)
Create or open a Xapian database for both reading and writing.
void recalculate_centroids()
Recalculate the centroid for all the clusters in the ClusterSet.
void add(const std::string &word)
Add a single stop word.
Definition: queryparser.h:115
virtual Xapian::Query operator()(const std::string &str)=0
Convert a field-prefixed string to a Query object.
std::string get_description() const
Return a string describing this object.
const Stopper * release() const
Start reference counting this object.
Definition: queryparser.h:88
MSetIterator & operator--()
Move the iterator to the previous position.
Definition: mset.h:474
CoordWeight * unserialise(const std::string &serialised) const
Unserialise parameters.
double get_sumextra(Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the term-independent weight component for a document.
std::string serialise() const
Serialise object parameters into a string.
WritableDatabase(const WritableDatabase &o)
Copy constructor.
Definition: database.h:1027
const char * get_error_string() const
Returns any system error string associated with this exception.
PostingIterator & operator=(const PostingIterator &o)
Assignment.
Xapian::Weight subclass implementing Dice Coefficient.
Definition: weight.h:1872
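The entry above names the Dice coefficient without stating it. As a rough illustration only, the textbook set-based form is 2|A ∩ B| / (|A| + |B|). The sketch below computes that for two term sets. The function name dice_coefficient_sketch is hypothetical and is not part of Xapian, and how DiceCoeffWeight maps query and document statistics onto this formula is not shown in this listing.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical helper, not part of Xapian: textbook Dice coefficient of two
// term sets, 2 * |A intersect B| / (|A| + |B|).
inline double dice_coefficient_sketch(const std::set<std::string> &a,
                                      const std::set<std::string> &b) {
    // Convention chosen for this sketch: two empty sets score 0.
    if (a.empty() && b.empty()) return 0.0;

    std::size_t common = 0;
    for (const std::string &term : a) {
        if (b.count(term) != 0) ++common;  // term occurs in both sets
    }
    return 2.0 * static_cast<double>(common) /
           static_cast<double>(a.size() + b.size());
}
```

Identical non-empty sets score 1 and disjoint sets score 0, so the value can be read as a normalised overlap.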
~DocumentSet()
Destructor.
Enquire & operator=(Enquire &&o)
Move assignment operator.
Xapian::Database get_database() const
The database we're reading values from.
Definition: postingsource.h:469
virtual std::string serialise_results() const
Serialise the results of this match spy.
Xapian::docid get_docid() const
Return the current docid.
Xapian::doccount get_termfreq_min() const
A lower bound on the number of documents this object can return.
std::string short_name() const
Return the short name of the weighting scheme.
PostingIterator(PostingIterator &&o)
Move constructor.
Definition: postingiterator.h:59
Xapian::doccount get_termfreq() const
The number of documents which this term indexes.
Definition: weight.h:402
Xapian::docid get_lastdocid() const
Get the highest document id which has been used in the database.
ValueIterator(const ValueIterator &o)
Copy constructor.
Xapian::docid replace_document(const std::string &unique_term, const Xapian::Document &document)
Replace any documents matching a term.
RangeProcessor(Xapian::valueno slot_, const std::string &str_=std::string(), unsigned flags_=0)
Constructor.
Definition: queryparser.h:182
Class representing a list of query expansion terms.
ClusterSet & operator=(const ClusterSet &other)
Assignment is allowed.
Xapian::Database unlock()
Release a database write lock.
std::string name() const
Return the name of this weighting scheme.
Query & operator=(const Query &o)
Copying is allowed.
Definition: query.h:357
virtual bool operator()(const std::string &term) const
Is term a stop-word?
Definition: queryparser.h:117
DatabaseNotFoundError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:1067
TermIterator stoplist_end() const noexcept
End iterator over terms omitted from the query as stopwords.
Definition: queryparser.h:1020
std::string get_value_lower_bound(Xapian::valueno slot) const
Get a lower bound on the values stored in the given value slot.
difference_type operator-(const ESetIterator &o) const
Return the number of positions between o and this iterator.
Definition: eset.h:259
double pointwise_distance(const LatLongCoord &a, const LatLongCoord &b) const
Return the great-circle distance between points on the sphere.
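A great-circle distance of the kind described above can be sketched with the haversine formula. This is a minimal illustration under stated assumptions, not the library's code: LatLongSketch and great_circle_distance_sketch are hypothetical names, coordinates are taken to be in degrees, and the default radius of 6371000 metres is a commonly quoted mean Earth radius that may differ from whatever value the library uses.

```cpp
#include <cmath>

// Hypothetical stand-in for a latitude/longitude pair, in degrees.
struct LatLongSketch {
    double latitude;
    double longitude;
};

// Haversine great-circle distance in metres between two points on a sphere.
// radius_m is an assumed mean Earth radius, not a value taken from Xapian.
inline double great_circle_distance_sketch(const LatLongSketch &a,
                                           const LatLongSketch &b,
                                           double radius_m = 6371000.0) {
    const double to_rad = std::acos(-1.0) / 180.0;  // pi / 180
    const double lat_a = a.latitude * to_rad;
    const double lat_b = b.latitude * to_rad;
    const double sin_dlat = std::sin((lat_b - lat_a) / 2.0);
    const double sin_dlon = std::sin((b.longitude - a.longitude) * to_rad / 2.0);

    double h = sin_dlat * sin_dlat +
               std::cos(lat_a) * std::cos(lat_b) * sin_dlon * sin_dlon;
    if (h > 1.0) h = 1.0;  // guard asin() against rounding just above 1

    return 2.0 * radius_m * std::asin(std::sqrt(h));
}
```

Passing the radius as a parameter keeps the sketch usable for other spheres or for a different Earth radius convention.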
virtual ~FieldProcessor()
Destructor.
const Query operator^=(const Query &o)
Combine with another Xapian::Query object using OP_XOR.
Definition: query.h:925
Virtual base class for key making functors.
Definition: keymaker.h:43
std::string get_description() const
Return a string describing this object.
virtual bool operator()(const std::string &term) const
Do we want this term in the ESet?
void sort_by_relevance()
Sorts the list of documents in MSet according to their weights.
void init(const Xapian::Database &db_)
Older method which did the same job as reset().
TermListGroup(const MSet &docs, const Stopper *stopper=NULL)
Constructor.
LatLongDistanceKeyMaker(Xapian::valueno slot_, const LatLongCoords &centre_, const LatLongMetric &metric_, double defdistance)
Construct a LatLongDistanceKeyMaker.
Definition: geospatial.h:573
void index_text_without_positions(const Xapian::Utf8Iterator &itor, Xapian::termcount wdf_inc=1, const std::string &prefix=std::string())
Index some text without positional information.
InternalError indicates a runtime problem of some sort.
Definition: error.h:749
bool contains(const Xapian::MSetIterator &it) const
Check if a document is marked as relevant.
Definition: rset.h:119
double get_sumextra(Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the term-independent weight component for a document.
InL2Weight * unserialise(const std::string &serialised) const
Unserialise parameters.
Query(op op_, const std::string &pattern, Xapian::termcount max_expansion=0, int flags=WILDCARD_LIMIT_ERROR, op combiner=OP_SYNONYM)
Query constructor for OP_EDIT_DISTANCE and OP_WILDCARD queries.
const int DBCOMPACT_NO_RENUMBER
Use the same document ids in the output as in the input(s).
Definition: constants.h:263
Xapian::doccount get_termfreq_max() const
An upper bound on the number of documents this object can return.
Document & operator=(Document &&o)
Move assignment operator.
Xapian::doccount size() const
Return number of items in this ESet object.
FreqSource()
Default constructor.
Definition: cluster.h:144
double get_maxpart() const
Return an upper bound on what get_sumpart() can return for any document.
SimpleStopper(Iterator begin, Iterator end)
Initialise from a pair of iterators.
Definition: queryparser.h:112
MSet & operator=(const MSet &o)
Copying is allowed.
TradWeight(double k=1.0)
Construct a TradWeight.
Definition: weight.h:1079
category
Each Unicode character is in exactly one of these categories.
Definition: unicode.h:220
void add_value(const std::string &value)
Add a value to the test set.
Definition: valuesetmatchdecider.h:75
Xapian::doccount get_termfreq() const
Return the term frequency for the term at the current position.
bool is_none() const
Return true if this is a no-op stemmer.
Definition: stem.h:163
bool reopen()
Reopen the database at the latest available revision.
Query(op op_, const Xapian::Query &a, const Xapian::Query &b)
Construct a Query object by combining two others.
Definition: query.h:405
Xapian::WritableDatabase lock(int flags=0)
Lock a read-only database for writing.
Diversify(Diversify &&other)
Move constructor.
LMWeight(double param_log_=0.0, type_smoothing select_smoothing_=TWO_STAGE_SMOOTHING, double param_smoothing1_=-1.0, double param_smoothing2_=-1.0)
Construct a LMWeight.
Definition: weight.h:1778
virtual std::string name() const
Name of the posting source class.
bool has_positions() const
Does this database have any positional information?
const PostingSource * release() const
Start reference counting this object.
Definition: postingsource.h:411
@ OTHER_NUMBER
Number, other (No)
Definition: unicode.h:232
static const Xapian::Query MatchAll
A query matching all documents.
Definition: query.h:73
Class representing a list of query expansion terms.
Definition: eset.h:43
MSet & operator=(MSet &&o)
Move assignment operator.
Xapian::doccount get_matches_lower_bound() const
Lower bound on the total number of matching documents.
Xapian::TermIterator metadata_keys_end(const std::string &=std::string()) const noexcept
End iterator corresponding to metadata_keys_begin().
Definition: database.h:553
Xapian::termcount get_wqf() const
The within-query-frequency of this term.
Definition: weight.h:414
FieldProcessor * release()
Start reference counting this object.
Definition: queryparser.h:468
InvalidOperationError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:287
op get_type() const noexcept
Get the type of the top level of the query.
std::string name() const
Name of the posting source class.
std::string short_name() const
Return the short name of the weighting scheme.
bool is_currency(unsigned ch)
Test if a given Unicode character is a currency symbol.
Definition: unicode.h:375
RSet()
Default constructor.
Xapian::doccount get_matches_estimated() const
Estimate of the total number of matching documents.
PostingIterator postlist_end(const std::string &) const noexcept
End iterator corresponding to postlist_begin().
Definition: database.h:253
IfB2Weight * unserialise(const std::string &serialised) const
Unserialise parameters.
Cluster & operator[](Xapian::doccount i)
Return the cluster at index 'i'.
virtual bool operator()(const std::string &term) const
Do we want this term in the ESet?
void set_metadata(const std::string &key, const std::string &metadata)
Set the user-specified metadata associated with a given key.
MSetIterator & operator+=(difference_type n)
Move the iterator forwards by n positions.
Definition: mset.h:510
double get_max_possible() const
The maximum possible weight any document could achieve.
std::string serialise() const
Serialise object parameters into a string.
Xapian::Document & operator[](Xapian::doccount i)
Return the Document in the DocumentSet at index i.
double get_sumextra(Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the term-independent weight component for a document.
Enquire(Enquire &&o)
Move constructor.
std::string serialise() const
Return this object's parameters serialised as a single string.
std::string name() const
Return the name of this weighting scheme.
Stopper * release()
Start reference counting this object.
Definition: queryparser.h:76
std::unordered_map< std::string, double > weights
Implement a map to store the terms within a document and their pre-computed TF-IDF weights.
Definition: cluster.h:229
TermGenerator & operator=(TermGenerator &&o)
Move assignment operator.
KeyMaker * release()
Start reference counting this object.
Definition: keymaker.h:122
TfIdfWeight * create_from_parameters(const char *params) const
Return the parameterised weighting scheme object.
Query(op op_, const Xapian::Query &subquery, double factor)
Scale using OP_SCALE_WEIGHT.
@ OTHER_LETTER
Letter, other (Lo)
Definition: unicode.h:226
std::string get_sort_key() const
Return the sort key for the current position.
DocumentSet get_documents() const
Return the documents that are contained within the cluster.
Xapian::termcount positionlist_count() const
Return the length of the position list for the current position.
void set_stopper_strategy(stop_strategy strategy)
Set the stopper strategy.
const int DB_BACKEND_GLASS
Use the glass backend.
Definition: constants.h:158
@ OP_AND
Match only documents which all subqueries match.
Definition: query.h:82
ExpandDeciderFilterTerms(Iterator reject_begin, Iterator reject_end)
The two iterators specify a list of terms to be rejected.
Definition: expanddecider.h:132
void clear_synonyms(const std::string &term) const
Remove all synonyms for a term.
Abstract base class for match deciders.
std::string serialise() const
Return this object's parameters serialised as a single string.
void swap(ESet &o)
Efficiently swap this ESet object with another.
Definition: eset.h:93
FixedWeightPostingSource(double wt)
Construct a FixedWeightPostingSource.
Registry & operator=(Registry &&other)
Move assignment operator.
XAPIAN_DOCID_BASE_TYPE doccount_diff
A signed difference between two counts of documents.
Definition: types.h:44
BoolWeight * unserialise(const std::string &serialised) const
Unserialise parameters.
const char * raw() const
Return the raw const char* pointer for the current position.
Definition: unicode.h:54
Xapian::doccount size() const
Return the size of the DocumentSet.
Xapian::termcount get_unique_terms_upper_bound() const
Get an upper bound on the unique terms size of a document in this DB.
double get_sumextra(Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the term-independent weight component for a document.
Xapian::doccount size() const
Return number of documents in this RSet object.
TermIterator & operator=(TermIterator &&o)
Move assignment operator.
Definition: termiterator.h:65
std::string get_value(Xapian::valueno slot) const
Read a value slot in this document.
InvalidOperationError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:295
std::string operator*() const
Return the term at the current position.
TermIterator allterms_begin(const std::string &prefix=std::string()) const
Start iterating all terms in the database with a given prefix.
bool operator!=(const Utf8Iterator &other) const noexcept
Test two Utf8Iterators for inequality.
Definition: unicode.h:198
DiceCoeffWeight * clone() const
Clone this object.
Querying session.
Definition: enquire.h:58
Xapian::docid range_end
End of range of docids for which weights are known to be decreasing.
Definition: postingsource.h:607
void set_stopper(const Xapian::Stopper *stop=NULL)
Set the Xapian::Stopper object to be used for identifying stopwords.
@ OP_INVALID
Construct an invalid query.
Definition: query.h:266
void append_utf8(std::string &s, unsigned ch)
Append the UTF-8 representation of a single Unicode character to a std::string.
Definition: unicode.h:336
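For readers unfamiliar with the encoding, appending a code point as UTF-8 can be sketched as follows. This is an illustrative stand-in and not Xapian's implementation: append_utf8_sketch is a hypothetical name, and the sketch does not reject surrogate code points (U+D800 to U+DFFF).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not Xapian's implementation: append the UTF-8
// encoding of Unicode code point `ch` to `s`.
inline void append_utf8_sketch(std::string &s, unsigned ch) {
    assert(ch <= 0x10FFFF);  // Unicode code points end at U+10FFFF.
    if (ch < 0x80) {
        // 1 byte: 0xxxxxxx
        s += static_cast<char>(ch);
    } else if (ch < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        s += static_cast<char>(0xC0 | (ch >> 6));
        s += static_cast<char>(0x80 | (ch & 0x3F));
    } else if (ch < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        s += static_cast<char>(0xE0 | (ch >> 12));
        s += static_cast<char>(0x80 | ((ch >> 6) & 0x3F));
        s += static_cast<char>(0x80 | (ch & 0x3F));
    } else {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        s += static_cast<char>(0xF0 | (ch >> 18));
        s += static_cast<char>(0x80 | ((ch >> 12) & 0x3F));
        s += static_cast<char>(0x80 | ((ch >> 6) & 0x3F));
        s += static_cast<char>(0x80 | (ch & 0x3F));
    }
}
```

The leading byte encodes the sequence length in its high bits, and every continuation byte carries six payload bits behind a 10 prefix.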
std::string name() const
Return the name of this weighting scheme.
virtual bool operator()(const std::string &term) const =0
Is term a stop-word?
DLHWeight * unserialise(const std::string &serialised) const
Unserialise parameters.
Point & operator[](Xapian::doccount i)
Return the point at the given index in the cluster.
virtual ~KeyMaker()
Virtual destructor, because we have virtual methods.
Diversify(Xapian::doccount k_, Xapian::doccount r_, double lambda_=0.5, double b_=5.0, double sigma_sqr_=1e-3)
Constructor specifying the number of diversified search results.
unsigned operator*() const noexcept
Get the current Unicode character value pointed to by the iterator.
XAPIAN_TOTALLENGTH_TYPE totallength
The total length of all documents in a database.
Definition: types.h:114
ESetIterator end() const
Return iterator pointing to just after the last item in this ESet.
Definition: eset.h:330
virtual bool at_end() const =0
Return true if the current position is past the last entry in this list.
void skip_to(Xapian::docid min_docid, double min_wt)
Advance to the specified docid.
double get_sumpart(Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the weight contribution for this object's term to a document.
void register_weighting_scheme(const Xapian::Weight &wt)
Register a weighting scheme.
@ OP_XOR
Match documents which an odd number of subqueries match.
Definition: query.h:105
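The "odd number of subqueries" rule for OP_XOR is easy to misread when there are more than two subqueries. The sketch below models each subquery as the set of document ids it matches and keeps the ids matched an odd number of times. DocIdSketch and match_odd_count_sketch are hypothetical names used for illustration only; this does not reflect how the matcher is implemented.

```cpp
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-in for a document id type.
using DocIdSketch = unsigned;

// Return the ids matched by an odd number of the given subqueries, where
// each subquery is represented by the set of ids it matches.
inline std::set<DocIdSketch> match_odd_count_sketch(
        const std::vector<std::set<DocIdSketch>> &subquery_matches) {
    // Count how many subqueries match each document id.
    std::map<DocIdSketch, unsigned> hits;
    for (const auto &matches : subquery_matches) {
        for (DocIdSketch did : matches) ++hits[did];
    }

    // Keep only the ids with an odd count.
    std::set<DocIdSketch> result;
    for (const auto &entry : hits) {
        if (entry.second % 2 == 1) result.insert(entry.first);
    }
    return result;
}
```

With three subqueries matching {1, 2, 3}, {2, 3} and {3}, document 2 is matched twice and drops out, while document 3 is matched three times and stays in.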
Diversify & operator=(const Diversify &other)
Assignment is allowed.
TermIterator top_values_begin(size_t maxvalues) const
Get an iterator over the most frequent values seen in the slot.
void clear_mappings()
Clear all mappings.
ValueSetMatchDecider(Xapian::valueno slot, bool inclusive_)
Construct a ValueSetMatchDecider.
Definition: valuesetmatchdecider.h:68
XAPIAN_TERMCOUNT_BASE_TYPE termcount_diff
A signed difference between two counts of terms.
Definition: types.h:71
AssertionError is thrown if a logical assertion inside Xapian fails.
Definition: error.h:187
@ DECIMAL_DIGIT_NUMBER
Number, decimal digit (Nd)
Definition: unicode.h:230
XAPIAN_TERMPOS_BASE_TYPE termpos_diff
A signed difference between two term positions.
Definition: types.h:82
@ LOWERCASE_LETTER
Letter, lowercase (Ll)
Definition: unicode.h:223
stop_strategy
Stopper strategies, for use with set_stopper_strategy().
Definition: termgenerator.h:135
void skip_to(const std::string &term)
Advance the iterator to the given term.
DatabaseModifiedError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:551
@ LEAF_POSTING_SOURCE
Value returned by get_type() for a PostingSource.
Definition: query.h:272
Clusterer * release()
Start reference counting this object.
Definition: cluster.h:571
ValueWeightPostingSource * clone() const
Clone the posting source.
virtual Xapian::Query operator()(const std::string &begin, const std::string &end)
Check for a valid range of this type.
static Document unserialise(const std::string &serialised)
Unserialise a document from a string produced by serialise().
Class representing a set of documents in a cluster.
Definition: cluster.h:75
This class implements the DPH weighting scheme.
Definition: weight.h:1673
void set_sort_by_relevance_then_key(KeyMaker *sorter, bool reverse) XAPIAN_NONNULL()
Set the sorting to be by relevance, then by keys generated from values.
bool is_whitespace(unsigned ch)
Test if a given Unicode character is a whitespace character.
Definition: unicode.h:365
QueryParser(const QueryParser &o)
Copy constructor.
void set_database(const Database &db)
Specify the database being searched.
void add_mapping(const std::string &key, double wt)
Add a mapping.
Base class for field processors.
Definition: queryparser.h:439
double get_maxpart() const
Return an upper bound on what get_sumpart() can return for any document.
This class implements the PL2 weighting scheme.
Definition: weight.h:1506
stat_flags
Stats which the weighting scheme can use (see need_stat()).
Definition: weight.h:39
Xapian::termcount get_unique_terms() const
Return the number of unique terms in the current document.
PointType * release()
Start reference counting this object.
Definition: cluster.h:289
std::string reconstruct_text(Xapian::docid did, size_t length=0, const std::string &prefix=std::string(), Xapian::termpos start_pos=0, Xapian::termpos end_pos=0) const
Reconstruct document text.
virtual ~LatLongMetric()
Destructor.
void set_time_limit(double time_limit)
Set a time limit for the match.
ValueIterator() noexcept
Default constructor.
Definition: valueiterator.h:80
void remove_value(Xapian::valueno slot)
Remove any value from the specified slot.
Definition: document.h:231
void set_sort_by_relevance_then_value(valueno sort_key, bool reverse)
Set the sorting to be by relevance then value.
DiceCoeffWeight()
Construct a DiceCoeffWeight.
Definition: weight.h:1885
const Query operator*(double factor, const Query &q)
Scale a Xapian::Query object using OP_SCALE_WEIGHT.
Definition: query.h:755
RSet(RSet &&o)
Move constructor.
double magnitude
Store the squared magnitude of the PointType.
Definition: cluster.h:232
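One common reason to keep a squared magnitude alongside a term-to-weight map is to make cosine similarity cheap to evaluate. The sketch below shows that idea under stated assumptions: PointSketch, set_weight and cosine_similarity_sketch are hypothetical names, and this is not a description of how the library's clustering code uses these members.

```cpp
#include <cmath>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a point in the vector space model.
struct PointSketch {
    std::unordered_map<std::string, double> weights;  // term -> weight
    double magnitude = 0.0;                           // squared magnitude

    // Set a term's weight and keep the squared magnitude in step with it.
    void set_weight(const std::string &term, double w) {
        double &slot = weights[term];
        magnitude += w * w - slot * slot;
        slot = w;
    }
};

// Cosine similarity: dot(a, b) / (|a| * |b|), or 0 if either point is empty.
inline double cosine_similarity_sketch(const PointSketch &a,
                                       const PointSketch &b) {
    if (a.magnitude == 0.0 || b.magnitude == 0.0) return 0.0;

    // Iterate over the point with fewer terms and probe the other one.
    const PointSketch &fewer = a.weights.size() <= b.weights.size() ? a : b;
    const PointSketch &more = (&fewer == &a) ? b : a;

    double dot = 0.0;
    for (const auto &entry : fewer.weights) {
        auto it = more.weights.find(entry.first);
        if (it != more.weights.end()) dot += entry.second * it->second;
    }
    return dot / std::sqrt(a.magnitude * b.magnitude);
}
```

Storing the squared value avoids a square root per update. Only the final similarity needs one.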
std::string short_name() const
Return the short name of the weighting scheme.
InternalError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:773
void add_document(Xapian::docid did)
Mark a document as relevant.
TfIdfWeight(wdf_norm wdf_normalization, idf_norm idf_normalization, wt_norm wt_normalization)
Construct a TfIdfWeight.
Definition: weight.h:757
ESetIterator operator+(ESetIterator::difference_type n, const ESetIterator &it)
Return ESetIterator it incremented by n positions.
Definition: eset.h:317
std::string get_description() const
Return a string describing this object.
virtual void merge_results(const std::string &serialised)
Unserialise some results, and merge them into this matchspy.
Xapian::TermIterator synonym_keys_begin(const std::string &prefix=std::string()) const
An iterator which returns all terms which have synonyms.
virtual LatLongMetric * unserialise(const std::string &serialised) const =0
Create object given string serialisation returned by serialise().
DatabaseClosedError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:1101
Query & operator=(Query &&)=default
Move assignment operator.
Abstract class representing a point in the VSM.
Definition: cluster.h:224
void skip_to(Xapian::docid min_docid, double min_wt)
Advance to the specified docid.
PositionIterator positionlist_end() const noexcept
Return an end PositionIterator for the current document.
Definition: postingiterator.h:111
double get_sumextra(Xapian::termcount, Xapian::termcount, Xapian::termcount) const
Calculate the term-independent weight component for a document.
void skip_if_in_range(double min_wt)
Skip the iterator forward if in the decreasing range, and weight is low.
void add_term(const std::string &term, Xapian::termcount wdf_inc=1)
Add a term to this document.
Xapian::rev get_revision() const
Get the revision of the database.
bool term_exists(const std::string &term) const
Test if a particular term is present in any document.
Class for iterating over a list of document ids.
WritableDatabase & operator=(WritableDatabase &&o)
Move assignment operator.
Definition: database.h:1042
An iterator which returns Unicode character values from a UTF-8 encoded string.
Definition: unicode.h:38
virtual Xapian::doccount get_termfreq_max() const =0
An upper bound on the number of documents this object can return.
std::string short_name() const
Return the short name of the weighting scheme.
QueryParserError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:891
ValueIterator valuestream_begin(Xapian::valueno slot) const
Return an iterator over the value in the given slot for each document.
LatLongCoord() noexcept
Construct an uninitialised coordinate.
Definition: geospatial.h:102
void remove_synonym(const std::string &term, const std::string &synonym) const
Remove a synonym for a term.
stem_strategy
Stemming strategies, for use with set_stemming_strategy().
Definition: termgenerator.h:130
void set_default_weight(double wt)
Set a default weight for document values not in the map.
void compact(const std::string &output, unsigned flags=0, int block_size=0)
Produce a compact version of this database.
Definition: database.h:704
~Query()
Destructor.
Definition: query.h:345
Xapian::Query operator()(const std::string &begin, const std::string &end)
Check for a valid numeric range.
virtual void reset(const Database &db, Xapian::doccount shard_index)
Set this PostingSource to the start of the list of postings.
Base class which provides an "external" source of postings.
Definition: postingsource.h:47
PositionIterator & operator++()
Advance the iterator to the next position.
const int DBCHECK_SHOW_FREELIST
Show the bitmap for the B-tree.
Definition: constants.h:229
bool check(Xapian::docid min_docid, double min_wt)
Check if the specified docid occurs.
ValueMapPostingSource(Xapian::valueno slot_)
Construct a ValueMapPostingSource.
ESet & operator=(const ESet &o)
Copying is allowed.
Base class for range processors.
Definition: queryparser.h:132
ExpandDecider subclass which restricts terms to a particular prefix.
Definition: expanddecider.h:143
Document(const Document &o)
Copy constructor.
Xapian::doccount get_doccount() const
Get the number of documents in the database.
Query(const Query &o)
Copying is allowed.
Definition: query.h:351
unsigned tolower(unsigned ch)
Convert a Unicode character to lowercase.
Definition: unicode.h:380
const Clusterer * release() const
Start reference counting this object.
Definition: cluster.h:583
QueryParser & operator=(QueryParser &&o)
Move assignment operator.
@ PRIVATE_USE
Other, private use (Co)
Definition: unicode.h:238
Xapian::Weight subclass implementing the BM25 probabilistic formula.
Definition: weight.h:813
AssertionError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:203
std::string serialise() const
Return this object's parameters serialised as a single string.
std::string get_description() const
Return a string describing this object.
NumberRangeProcessor(Xapian::valueno slot_, const std::string &str_=std::string(), unsigned flags_=0)
Constructor.
Definition: queryparser.h:374
virtual void skip_to(Xapian::docid did, double min_wt)
Advance to the specified docid.
Class representing a set of documents judged as relevant.
Definition: rset.h:40
ESet()
Default constructor.
void commit_transaction()
Complete the transaction currently in progress.
Definition: database.h:1143
virtual KeyMaker * unserialise(const std::string &serialised, const Registry &context) const
Unserialise parameters.
Xapian::doccount size() const
Return the number of clusters.
ValueIterator valuestream_end(Xapian::valueno) const noexcept
Return end iterator corresponding to valuestream_begin().
Definition: database.h:407
virtual Weight * unserialise(const std::string &serialised) const
Unserialise parameters.
ESetIterator back() const
Return iterator pointing to the last object in this ESet.
Definition: eset.h:342
Indicates an attempt to use a feature which is unavailable.
Definition: error.h:707
Xapian::doccount get_termfreq(const std::string &term) const
Get the termfreq of a term.
virtual ~Weight()
Virtual destructor, because we have virtual methods.
double get_maxextra() const
Return an upper bound on what get_sumextra() can return for any document.
const Query operator&=(Query &a, const InvertedQuery_ &b)
Combine two Xapian::Query objects using OP_AND_NOT with result in the first.
Definition: query.h:820
int revision()
Report the revision of the library which the program is linked with.
Definition: xapian.h:149
bool contains(Xapian::docid did) const
Check if a document is marked as relevant.
void set_weighting_scheme(const Weight &weight)
Set the weighting scheme to use.
void set_document(const Xapian::Document &doc)
Set the current document.
std::string sortable_serialise(double value)
Convert a floating point number to a string, preserving sort order.
Definition: queryparser.h:1079
int convert_to_percent(double weight) const
Convert a weight to a percentage.
Hierarchy of classes which Xapian can throw as exceptions.
const Point & operator[](Xapian::doccount i) const
Return the point at the given index in the cluster.
double get_sumextra(Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the term-independent weight component for a document.
bool is_wordchar(unsigned ch)
Test if a given Unicode character is a "word character".
Definition: unicode.h:347
Handle a number range.
Definition: queryparser.h:334
std::string get_corrected_query_string() const
Get the spelling-corrected query string.
virtual std::string get_description() const
Return a string describing this object.
Xapian::termcount get_doclength_lower_bound() const
A lower bound on the minimum length of any document in the database.
Definition: weight.h:430
LatLongMetric * clone() const
Clone the metric.
MSetIterator operator+(difference_type n) const
Return the iterator incremented by n positions.
Definition: mset.h:525
const Xapian::LatLongMetric * get_lat_long_metric(const std::string &name) const
Get a lat-long metric given a name.
BB2Weight * unserialise(const std::string &serialised) const
Unserialise parameters.
const Query operator|(const Query &a, const Query &b)
Combine two Xapian::Query objects using OP_OR.
Definition: query.h:737
void next(double min_wt)
Advance the current position to the next matching document.
Class to represent cluster centroids in the vector space.
Definition: cluster.h:332
std::string operator()(const Xapian::Document &doc) const
Build a key string for a Document.
Stem & operator=(Stem &&)=default
Move assignment operator.
Class for iterating over document values.
Definition: valueiterator.h:40
WritableDatabase open_writable(const std::string &host, unsigned int port, unsigned timeout=0, unsigned connect_timeout=10000, int flags=0)
Construct a WritableDatabase object for update access to a remote database accessed via a TCP connect...
double get_sumpart(Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the weight contribution for this object's term to a document.
Utf8Iterator() noexcept
Create an iterator which is at the end of its iteration.
Definition: unicode.h:132
int flags
For backward compatibility with Xapian 1.2.
Definition: termgenerator.h:96
std::string name() const
Return the name of this weighting scheme.
void set_sort_by_key_then_relevance(KeyMaker *sorter, bool reverse) XAPIAN_NONNULL()
Set the sorting to be by keys generated from values, then by relevance for documents with identical k...
void index_text(const std::string &text, Xapian::termcount wdf_inc=1, const std::string &prefix=std::string())
Index some text in a std::string.
Definition: termgenerator.h:226
std::string get_description() const
Return a string describing this object.
double get_maxextra() const
Return an upper bound on what get_sumextra() can return for any document.
void compact(int fd, unsigned flags, int block_size, Xapian::Compactor &compactor)
Produce a compact version of this database.
Definition: database.h:874
std::string get_description() const
Return a string describing this object.
void set_cutoff(int percent_threshold, double weight_threshold=0)
Set lower bounds on percentage and/or weight.
Class for counting the frequencies of values in the matching documents.
Definition: matchspy.h:205
RangeProcessor * release()
Start reference counting this object.
Definition: queryparser.h:222
void unserialise(const char **ptr, const char *end)
Unserialise a buffer and set this object to its coordinate.
std::string serialise() const
Serialise this object into a string.
void clear_matchspies()
Remove all the matchspies.
const ExpandDecider * release() const
Start reference counting this object.
Definition: expanddecider.h:77
virtual bool operator()(const std::string &term) const =0
Do we want this term in the ESet?
Cluster(Cluster &&other)
Move constructor.
Xapian::doccount get_matches_upper_bound() const
Upper bound on the total number of matching documents.
const Query get_subquery(size_t n) const
Read a top level subquery.
InvalidArgumentError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:253
Xapian::termcount get_wdfdocmax() const
Return the max_wdf in the current document.
ValueIterator & operator=(const ValueIterator &o)
Assignment.
DatabaseError indicates some sort of database related error.
Definition: error.h:355
TfIdfWeight(const std::string &normalizations, double slope, double delta)
Construct a TfIdfWeight.
const Query operator/(const Query &q, double factor)
Inverse-scale a Xapian::Query object using OP_SCALE_WEIGHT.
Definition: query.h:777
void init(const Database &db_)
Older method which did the same job as reset().
FeatureUnavailableError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:723
The base class for exceptions indicating errors in the program logic.
Definition: error.h:142
@ LINE_SEPARATOR
Separator, line (Zl)
Definition: unicode.h:234
TermGenerator()
Default constructor.
Registry(const Registry &other)
Copy constructor.
MatchSpy() noexcept
Default constructor, needed by subclass constructors.
Definition: matchspy.h:60
double get_maxweight() const noexcept
Return the currently set upper bound on what get_weight() can return.
Definition: postingsource.h:133
virtual MatchSpy * unserialise(const std::string &serialised, const Registry &context) const
Unserialise parameters.
A sequence of latitude-longitude coordinates.
Definition: geospatial.h:231
double get_average_length() const
Get the mean document length in the database.
Indicates an attempt to access a closed database.
Definition: error.h:1085
const TermIterator get_terms_begin() const
Begin iterator for terms in the query object.
InL2Weight * create_from_parameters(const char *params) const
Return the parameterised weighting scheme object.
DatabaseError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:379
void set_max_word_length(unsigned max_word_length)
Set the maximum length word to index.
double get_magnitude() const
Return the pre-computed squared magnitude.
Class for iterating over a list of terms.
Definition: termiterator.h:41
@ MODIFIER_SYMBOL
Symbol, modifier (Sk)
Definition: unicode.h:249
LatLongCoords()
Construct an empty container.
Definition: geospatial.h:265
DatabaseModifiedError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:543
bool operator>=(const ESetIterator &a, const ESetIterator &b) noexcept
Inequality test for ESetIterator objects.
Definition: eset.h:300
void replace_weights(Iterator first, Iterator last)
Assigns new weights and updates MSet.
Definition: mset.h:117
Xapian::termcount get_doclength_upper_bound() const
An upper bound on the maximum length of any document in the database.
Definition: weight.h:420
virtual double get_maxpart() const =0
Return an upper bound on what get_sumpart() can return for any document.
LatLongDistancePostingSource * clone() const
Clone the posting source.
PositionIterator & operator=(const PositionIterator &o)
Assignment.
Class representing a stemming algorithm.
Definition: stem.h:62
double get_maxextra() const
Return an upper bound on what get_sumextra() can return for any document.
@ OP_AND_NOT
Match documents which the first subquery matches but no others do.
Definition: query.h:97
const Xapian::Weight * get_weighting_scheme(const std::string &name) const
Get the weighting scheme given a name.
Query(op op_, I begin, I end, Xapian::termcount window=0)
Construct a Query object from a begin/end iterator pair.
Definition: query.h:546
DatabaseCorruptError indicates database corruption was detected.
Definition: error.h:397
void clear_values()
Clear all value slots in this document.
void remove_value(const std::string &value)
Remove a value from the test set.
Definition: valuesetmatchdecider.h:84
TermIterator termlist_end(Xapian::docid) const noexcept
End iterator corresponding to termlist_begin().
Definition: database.h:266
double operator()(const LatLongCoords &a, const std::string &b) const
Return the distance between two coordinate lists, in metres.
Definition: geospatial.h:336
An iterator across the values in a LatLongCoords object.
Definition: geospatial.h:164
void clear_clusters()
Clear all the clusters in the ClusterSet.
void divide(double cluster_size)
Divide the weight of terms in the centroid by 'cluster_size' and recalculate the magnitude.
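The divide step described above can be sketched without the library. This is a minimal illustration, not Xapian's implementation: `CentroidSketch`, its `weights` map and its `magnitude` field are hypothetical names, and the cached magnitude is assumed to be the squared magnitude (the sum of squared term weights).

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a centroid (not a Xapian type):
// per-term weights plus a cached squared magnitude.
struct CentroidSketch {
    std::map<std::string, double> weights;
    double magnitude = 0.0;  // assumed to hold the sum of squared weights

    // Divide every term weight by cluster_size, then recompute the
    // cached squared magnitude from the scaled weights.
    void divide(double cluster_size) {
        assert(cluster_size > 0.0);
        magnitude = 0.0;
        for (auto &entry : weights) {
            entry.second /= cluster_size;
            magnitude += entry.second * entry.second;
        }
    }
};
```

Recomputing the magnitude in the same pass keeps the cached value consistent with the weights, which is why the two operations are described together.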
double get_sumpart(Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqterm, Xapian::termcount wdfdocmax) const
Calculate the weight contribution for this object's term to a document.
TermIterator top_values_end(size_t) const noexcept
End iterator corresponding to top_values_begin().
Definition: matchspy.h:272
virtual std::string serialise() const =0
Serialise object parameters into a string.
Indicates an error in the std::string serialisation of an object.
Definition: error.h:917
Xapian::termcount get_collection_freq(const std::string &term) const
Get the total number of occurrences of a specified term.
const int DBCHECK_FIX
Fix problems.
Definition: constants.h:250
double get_sumextra(Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the term-independent weight component for a document.
virtual std::string get_description() const
Return a string describing this object.
std::string name() const
Return the name of this weighting scheme.
DLHWeight * create_from_parameters(const char *params) const
Return the parameterised weighting scheme object.
LCD clusterer: This clusterer implements the LCD clustering algorithm adapted from Modelling efficien...
Definition: cluster.h:658
int major_version()
Report the major version of the library which the program is linked with.
Definition: xapian.h:131
Xapian::TermIterator metadata_keys_begin(const std::string &prefix=std::string()) const
An iterator which returns all user-specified metadata keys.
Xapian::termcount termlist_size() const
Return the size of the termlist.
std::string name() const
Return the name of this weighting scheme.
void set_query(const Query &query, termcount query_length=0)
Set the query.
termcount remove_spelling(const std::string &word, termcount freqdec=1) const
Remove a word from the spelling dictionary.
double get_sumpart(Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqueterms, Xapian::termcount wdfdocmax) const
Calculate the weight contribution for this object's term to a document.
#define XAPIAN_TOTALLENGTH_TYPE
Type for returning total document length.
Definition: version.h:80
TermIterator get_matching_terms_end(docid) const noexcept
End iterator corresponding to get_matching_terms_begin().
Definition: enquire.h:436
bool operator>(const ESetIterator &a, const ESetIterator &b) noexcept
Inequality test for ESetIterator objects.
Definition: eset.h:293
double similarity(const PointType &a, const PointType &b) const
Calculates and returns the cosine similarity using the formula cos(theta) = a.b/(|a|*|b|)
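As a rough illustration of the formula cos(theta) = a.b/(|a|*|b|), the sketch below computes it over sparse term-weight vectors. It does not use the library's PointType. `SparseVector`, `magnitude` and `cosine_similarity` are hypothetical names, and returning 0 for a zero-magnitude vector is an assumption made here to avoid dividing by zero.

```cpp
#include <cmath>
#include <map>
#include <string>

// Hypothetical sparse vector: term -> weight (not a Xapian type).
using SparseVector = std::map<std::string, double>;

// Euclidean norm |v|.
double magnitude(const SparseVector &v) {
    double sum = 0.0;
    for (const auto &entry : v) sum += entry.second * entry.second;
    return std::sqrt(sum);
}

// cos(theta) = a.b / (|a| * |b|).
// Assumption: returns 0 when either vector has zero magnitude.
double cosine_similarity(const SparseVector &a, const SparseVector &b) {
    double dot = 0.0;
    for (const auto &entry : a) {
        auto it = b.find(entry.first);
        // Only terms present in both vectors contribute to the dot product.
        if (it != b.end()) dot += entry.second * it->second;
    }
    double denom = magnitude(a) * magnitude(b);
    return denom == 0.0 ? 0.0 : dot / denom;
}
```

For example, the vectors {x: 3, y: 4} and {x: 4, y: 3} have dot product 24 and magnitudes of 5 each, so their similarity is 24/25.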
Class for looking up user subclasses during unserialisation.
LMWeight * create_from_parameters(const char *params) const
Return the parameterised weighting scheme object.
void set_termfreq_min(Xapian::doccount termfreq_min_)
Set a lower bound on the term frequency.
Definition: postingsource.h:507
std::string short_name() const
Return the short name of the weighting scheme.
IfB2Weight * create_from_parameters(const char *params) const
Return the parameterised weighting scheme object.
virtual std::string operator()(const std::string &word)=0
Stem the specified word.
virtual std::string serialise() const
Return this object's parameters serialised as a single string.
void add_document(const Xapian::MSetIterator &it)
Mark a document as relevant.
Definition: rset.h:97
double get_sumpart(Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the weight contribution for this object's term to a document.
double get_maxpart() const
Return an upper bound on what get_sumpart() can return for any document.
Xapian::valueno get_slot() const
The slot we're reading values from.
Definition: postingsource.h:475
BB2Weight * create_from_parameters(const char *params) const
Return the parameterised weighting scheme object.
std::string serialise() const
Return a serialised form of the coordinate list.
TfIdfWeight(wdf_norm wdf_norm_, idf_norm idf_norm_, wt_norm wt_norm_, double slope, double delta)
Construct a TfIdfWeight.
ClusterSet cluster(const MSet &mset)
Implements the LCD clustering algorithm.
DPHWeight * unserialise(const std::string &serialised) const
Unserialise parameters.
const Xapian::Document & operator[](Xapian::doccount i) const
Return the Document in the DocumentSet at index i.
void set_stemming_strategy(stem_strategy strategy)
Set the stemming strategy.
bool check(Xapian::docid docid)
Check if the specified docid occurs.
DateRangeProcessor(Xapian::valueno slot_, const std::string &str_, unsigned flags_=0, int epoch_year_=1970)
Constructor.
Definition: queryparser.h:308
DatabaseNotFoundError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:1059
std::string serialise() const
Serialise object parameters into a string.
LatLongCoord(double latitude_, double longitude_)
Construct a coordinate.
Xapian::doccount get_termfreq_min() const
A lower bound on the number of documents this object can return.
std::string snippet(const std::string &text, size_t length=500, const Xapian::Stem &stemmer=Xapian::Stem(), unsigned flags=SNIPPET_BACKGROUND_MODEL|SNIPPET_EXHAUSTIVE, const std::string &hi_start="<b>", const std::string &hi_end="</b>", const std::string &omit="...") const
Generate a snippet.
LCDClusterer(unsigned int k_)
Constructor specifying number of clusters.
Compact a database, or merge and compact several.
Definition: compactor.h:40
Handle a date range.
Definition: queryparser.h:244
std::string serialise() const
Return this object's parameters serialised as a single string.
TermIterator(TermIterator &&o)
Move constructor.
Definition: termiterator.h:59
KeyMaker * unserialise(const std::string &serialised, const Registry &context) const
Unserialise parameters.
Document get_document() const
Returns the document corresponding to this Point.
CoordWeight * clone() const
Clone this object.
const int DB_NO_SYNC
Don't attempt to ensure changes have hit disk.
Definition: constants.h:66
virtual ~FreqSource()
Destructor.
Class for iterating over a list of terms.
bool operator==(const ESetIterator &a, const ESetIterator &b) noexcept
Equality test for ESetIterator objects.
Definition: eset.h:272
void commit()
Commit pending modifications.
std::string name() const
Name of the posting source class.
void set_database(const Xapian::WritableDatabase &db)
Set the database to index spelling data to.
This class implements the IneB2 weighting scheme.
Definition: weight.h:1286
double get_maxextra() const
Return an upper bound on what get_sumextra() can return for any document.
Query::op get_default_op() const
Get the current default operator.
BB2Weight(double c)
Construct a BB2Weight.
Database(int fd, int flags=0)
Open a single-file Database.
bool operator()(const std::string &term) const
Is term a stop-word?
Definition: cluster.h:60
static const Xapian::Query MatchNothing
A query matching no documents.
Definition: query.h:63
virtual std::string get_description() const
Return a string describing this object.
DiceCoeffWeight * unserialise(const std::string &serialised) const
Unserialise parameters.
doccount get_termfreq(const std::string &tname) const
Return the number of documents that the term 'tname' exists in.
UnimplementedError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:337
void add_document(const Document &document)
Add a new Document to the DocumentSet.
void swap(RSet &o)
Efficiently swap this RSet object with another.
Definition: rset.h:85
size_t size() const
Get the number of coordinates in the container.
Definition: geospatial.h:247
~Stem()
Destructor.
Definition: stem.h:153
@ OTHER_SYMBOL
Symbol, other (So)
Definition: unicode.h:250
PositionIterator(PositionIterator &&o)
Move constructor.
Definition: positioniterator.h:60
virtual Weight * clone() const =0
Clone this object.
AssertionError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:211
bool empty() const noexcept
Check if this query is Xapian::Query::MatchNothing.
Definition: query.h:603
bool locked() const
Test if this database is currently locked for writing.
PL2Weight(double c)
Construct a PL2Weight.
Iterator over a Xapian::MSet.
Definition: mset.h:437
bool check(Xapian::docid min_docid, double min_wt)
Check if the specified docid occurs.
TradWeight * create_from_parameters(const char *params) const
Return the parameterised weighting scheme object.
PostingSource * release()
Start reference counting this object.
Definition: postingsource.h:399
const Centroid & get_centroid() const
Return the current centroid of the cluster.
ExpandDecider subclass which rejects terms using two ExpandDeciders.
Definition: expanddecider.h:88
Centroid()
Default constructor.
void remove_posting(const std::string &term, Xapian::termpos term_pos, Xapian::termcount wdf_dec=1)
Remove posting for a term.
std::string serialise() const
Return this object's parameters serialised as a single string.
const char * get_type() const noexcept
The type of this error (e.g. "DocNotFoundError").
Definition: error.h:106
void init(const Database &db_)
Older method which did the same job as reset().
const TermIterator get_unique_terms_end() const noexcept
End iterator for unique terms in the query object.
Definition: query.h:595
doccount get_doccount() const
Return the number of documents within the MSet.
DatabaseVersionError indicates that a database is in an unsupported format.
Definition: error.h:620
wdf_norm
Wdf normalizations.
Definition: weight.h:485
DocumentSet()
Default constructor.
std::string short_name() const
Return the short name of the weighting scheme.
LMWeight * unserialise(const std::string &serialised) const
Unserialise parameters.
const int DB_DANGEROUS
Update the database in-place.
Definition: constants.h:103
Xapian::termcount get_unique_terms(Xapian::docid did) const
Get the number of unique terms in a document.
DocNotFoundError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:678
BM25Weight(double k1, double k2, double k3, double b, double min_normlen)
Construct a BM25Weight.
Definition: weight.h:858
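To give a feel for how parameters such as k1, b and min_normlen interact, here is a sketch of the textbook Okapi BM25 per-term weight. It is not Xapian's implementation: it leaves out k2 and k3 entirely, and `Bm25Params`, `bm25_idf`, `bm25_term_weight` and the default values shown are illustrative assumptions.

```cpp
#include <algorithm>
#include <cmath>

// Illustrative parameter bundle (hypothetical; the values are placeholders).
struct Bm25Params {
    double k1 = 1.0;           // controls how quickly repeated terms saturate
    double b = 0.5;            // strength of document length normalisation
    double min_normlen = 0.5;  // floor applied to the normalised length
};

// Textbook inverse document frequency for a term that occurs in
// `termfreq` of `num_docs` documents.
double bm25_idf(double num_docs, double termfreq) {
    return std::log((num_docs - termfreq + 0.5) / (termfreq + 0.5));
}

// Per-term contribution for a document of length `doclen` in which the
// term occurs `wdf` times; `avg_doclen` is the mean document length.
double bm25_term_weight(double wdf, double doclen, double avg_doclen,
                        double idf, const Bm25Params &p) {
    double normlen = std::max(doclen / avg_doclen, p.min_normlen);
    double denom = p.k1 * ((1.0 - p.b) + p.b * normlen) + wdf;
    return idf * (p.k1 + 1.0) * wdf / denom;
}
```

In this form the weight grows with wdf but never reaches idf * (k1 + 1), and longer-than-average documents score lower for the same wdf.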
const int DBCHECK_FULL_TREE
Show a full display of the B-tree contents.
Definition: constants.h:223
virtual MatchSpy * unserialise(const std::string &serialised, const Registry &context) const
Unserialise parameters.
BM25PlusWeight * create_from_parameters(const char *params) const
Return the parameterised weighting scheme object.
@ SPACE_SEPARATOR
Separator, space (Zs)
Definition: unicode.h:233
void set_expansion_scheme(const std::string &eweightname, double expand_k=1.0) const
Set the weighting scheme to use for expansion.
double metres_to_miles(double metres) noexcept
Convert from metres to miles.
Definition: geospatial.h:67
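The conversion itself is a single division by the number of metres in an international mile (exactly 1609.344). The sketch below is a standalone illustration under that assumption. `METRES_PER_MILE`, `metres_to_miles_sketch` and `miles_to_metres_sketch` are hypothetical names, not library symbols.

```cpp
// One international mile is exactly 1609.344 metres.
constexpr double METRES_PER_MILE = 1609.344;

// Convert a distance in metres to miles.
double metres_to_miles_sketch(double metres) noexcept {
    return metres / METRES_PER_MILE;
}

// Inverse conversion, from miles to metres.
double miles_to_metres_sketch(double miles) noexcept {
    return miles * METRES_PER_MILE;
}
```

Such a helper is useful when a distance metric reports metres but results are to be displayed in miles.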
const int DBCHECK_SHORT_TREE
Show a short-format display of the B-tree contents.
Definition: constants.h:217
void skip_to(Xapian::termpos termpos)
Advance the iterator to term position termpos.
BM25PlusWeight * unserialise(const std::string &serialised) const
Unserialise parameters.
static const Weight * create(const std::string &scheme, const Registry &reg=Registry())
Return the appropriate weighting scheme object.
MSetIterator & operator-=(difference_type n)
Move the iterator back by n positions.
Definition: mset.h:516
Class for iterating over a list of postings.
Definition: postingiterator.h:41
Indicates a query string can't be parsed.
Definition: error.h:875
QueryParser()
Default constructor.
void clear()
Clear the terms and corresponding values of the centroid.
Cluster()
Default constructor.
void done()
End the iteration.
Definition: postingsource.h:489
std::string get_description() const
Return a string describing this object.
~TermGenerator()
Destructor.
@ DASH_PUNCTUATION
Punctuation, dash (Pd)
Definition: unicode.h:241
virtual MatchSpy * clone() const
Clone the match spy.
virtual std::string get_description() const =0
Return a string describing this object.
void add(const std::string &term)
Add a single stop word and its stemmed equivalent.
Xapian::doccount get_termfreq(const std::string &term) const
Get the number of documents indexed by a specified term.
ESetIterator & operator+=(difference_type n)
Move the iterator forwards by n positions.
Definition: eset.h:231
MultiValueKeyMaker(Iterator begin, Iterator end)
Construct a MultiValueKeyMaker from a pair of iterators.
Definition: keymaker.h:176
DatabaseError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:371
DatabaseCorruptError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:413
RSet & operator=(RSet &&o)
Move assignment operator.
const Query operator^(const Query &a, const Query &b)
Combine two Xapian::Query objects using OP_XOR.
Definition: query.h:744
wt_norm
Weight normalizations.
Definition: weight.h:621
ESetIterator operator--(int)
Move the iterator to the previous position (postfix version).
Definition: eset.h:201
Xapian::termcount get_doclength(Xapian::docid did) const
Get the length of a document.
virtual std::string resolve_duplicate_metadata(const std::string &key, size_t num_tags, const std::string tags[])
Resolve multiple user metadata entries with the same key.
Utf8Iterator & operator++()
Move forward to the next Unicode character.
Definition: unicode.h:176
void register_key_maker(Xapian::KeyMaker *keymaker)
Register a user-defined KeyMaker subclass.
bool contains(const std::string &term) const
Validate whether a certain term exists in the termlist or not by performing a lookup operation in the...
virtual ~Database()
Destructor.
void add_prefix(const std::string &field, const std::string &prefix)
Add a free-text field term prefix.
LatLongDistanceKeyMaker(Xapian::valueno slot_, const LatLongCoord &centre_, const LatLongMetric &metric_)
Construct a LatLongDistanceKeyMaker.
Definition: geospatial.h:649
bool empty() const
Return true if this RSet object is empty.
Definition: rset.h:82
double get_sumpart(Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the weight contribution for this object's term to a document.
TermIterator get_matching_terms_begin(const MSetIterator &it) const
Iterate query terms matching a document.
Definition: enquire.h:431
Query(op op_, const std::string &a, const std::string &b)
Construct a Query object by combining two terms.
Definition: query.h:420
@ WILDCARD_LIMIT_ERROR
Throw an error if OP_WILDCARD exceeds its expansion limit.
Definition: query.h:294
MSetIterator end() const
Return iterator pointing to just after the last item in this MSet.
Definition: mset.h:680
void close()
Close the database.
Xapian::termcount get_wdf_upper_bound() const
An upper bound on the wdf of this term.
Definition: weight.h:438
PostingIterator & operator++()
Advance the iterator to the next position.
unsigned valueno
The number for a value slot in a document.
Definition: types.h:90
SerialisationError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:933
Compact a database, or merge and compact several.
std::string name() const
Return the name of this weighting scheme.
DecreasingValueWeightPostingSource * unserialise(const std::string &serialised) const
Create object given string serialisation returned by serialise().
Xapian::docid add_document(const Xapian::Document &doc)
Add a document to the database.
std::string name() const
Return the full name of the metric.
double doclength
A normalised document length.
Definition: types.h:58
QueryParserError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:899
std::string short_name() const
Return the short name of the weighting scheme.
Xapian::doccount get_value_freq(Xapian::valueno slot) const
Return the frequency of a given value slot.
void set_centroid(const Centroid &centroid)
Set the centroid of the Cluster to 'centroid'.
Weight(const Weight &)
Don't allow copying.
double get_maxextra() const
Return an upper bound on what get_sumextra() can return for any document.
The base class for exceptions indicating errors only detectable at runtime.
Definition: error.h:164
const int DB_BACKEND_INMEMORY
Use the "in memory" backend.
Definition: constants.h:189
UnitRangeProcessor(Xapian::valueno slot_, const std::string &str_=std::string())
Constructor.
Definition: queryparser.h:417
An indexed database of documents.
DPHWeight()
Construct a DPHWeight.
Definition: weight.h:1687
std::string short_name() const
Return the short name of the weighting scheme.
Constants in the Xapian namespace.
All exceptions thrown by Xapian are subclasses of Xapian::Error.
Definition: error.h:41
IneB2Weight(double c)
Construct an IneB2Weight.
double get_maxpart() const
Return an upper bound on what get_sumpart() can return for any document.
virtual std::string name() const =0
Return the full name of the metric.
Xapian::TermIterator synonyms_begin(const std::string &term) const
An iterator which returns all the synonyms for a given term.
virtual MatchSpy * clone() const
Clone the match spy.
External sources of posting information.
LatLongDistancePostingSource(Xapian::valueno slot_, const LatLongCoords &centre_, const LatLongMetric &metric_, double max_range_=0.0, double k1_=1000.0, double k2_=1.0)
Construct a new posting source which returns only documents within range of one of the central coordi...
bool at_end() const
Return true if the current position is past the last entry in this list.
void set_sort_by_key(KeyMaker *sorter, bool reverse) XAPIAN_NONNULL()
Set the sorting to be by key generated from values only.
Class for iterating over term positions.
ExpandDecider subclass which rejects terms in a specified list.
Definition: expanddecider.h:119
double get_weight() const
Return the weight contribution for the current document.
void add_rangeprocessor(Xapian::RangeProcessor *range_proc, const std::string *grouping=NULL)
Register a RangeProcessor.
virtual std::string name() const
Return the name of this weighting scheme.
std::string short_name() const
Return the short name of the weighting scheme.
Xapian::Document get_document() const
Get the Document object for the current position.
ValueIterator(ValueIterator &&o)
Move constructor.
Definition: valueiterator.h:60
LatLongDistancePostingSource(Xapian::valueno slot_, const LatLongCoords &centre_, double max_range_=0.0, double k1_=1000.0, double k2_=1.0)
Construct a new posting source which returns only documents within range of one of the central coordi...
virtual void set_status(const std::string &table, const std::string &status)
Update progress.
Xapian::doccount get_firstitem() const
Rank of first item in this MSet.
void set_stemming_strategy(stem_strategy strategy)
Set the stemming strategy.
bool empty() const
Return true if this MSet object is empty.
Definition: mset.h:369
IfB2Weight(double c)
Construct an IfB2Weight.
Query(op op_, const std::string &pattern, Xapian::termcount max_expansion, int flags, op combiner, unsigned edit_distance, size_t min_prefix_len=0)
Query constructor for OP_EDIT_DISTANCE queries.
BM25PlusWeight(double k1, double k2, double k3, double b, double min_normlen, double delta)
Construct a BM25PlusWeight.
Definition: weight.h:980
Simple implementation of Stopper class - this will suit most users.
Definition: queryparser.h:95
virtual Weight * create_from_parameters(const char *params) const
Return the parameterised weighting scheme object.
unsigned XAPIAN_TERMPOS_BASE_TYPE termpos
A term position within a document or query.
Definition: types.h:75
MSet(const MSet &o)
Copying is allowed.
A posting source which looks up weights in a map using values as the key.
Definition: postingsource.h:655
UnimplementedError indicates an attempt to use an unimplemented feature.
Definition: error.h:313
void next(double min_wt)
Advance the current position to the next matching document.
ValueWeightPostingSource(Xapian::valueno slot_)
Construct a ValueWeightPostingSource.
std::string serialise() const
Return this object's parameters serialised as a single string.
double get_maxextra() const
Return an upper bound on what get_sumextra() can return for any document.
Virtual base class for expand decider functor.
Definition: expanddecider.h:38
virtual ~Stopper()
Class has virtual methods, so provide a virtual destructor.
Definition: queryparser.h:64
size_t left() const
Return the number of bytes left in the iterator's buffer.
Definition: unicode.h:59
LatLongDistanceKeyMaker(Xapian::valueno slot_, const LatLongCoord &centre_, const LatLongMetric &metric_, double defdistance)
Construct a LatLongDistanceKeyMaker.
Definition: geospatial.h:628
std::string get_description() const
Return a string describing this object.
std::string get_value() const
Read current value.
Definition: postingsource.h:481
virtual Xapian::doccount get_termfreq_min() const =0
A lower bound on the number of documents this object can return.
Xapian::docid get_docid() const
Return the docid at the current position.
Class representing a list of search results.
Definition: mset.h:45
ESetIterator operator-(difference_type n) const
Return the iterator decremented by n positions.
Definition: eset.h:254
const std::string & get_context() const noexcept
Optional context information.
Definition: error.h:121
unsigned XAPIAN_DOCID_BASE_TYPE doccount
A count of documents.
Definition: types.h:37
DatabaseLockError indicates failure to lock a database.
Definition: error.h:481
Abstract base class for weighting schemes.
Definition: weight.h:36
@ ENCLOSING_MARK
Mark, enclosing (Me)
Definition: unicode.h:228
@ LETTER_NUMBER
Number, letter (Nl)
Definition: unicode.h:231
void set_termfreq_max(Xapian::doccount termfreq_max_)
An upper bound on the term frequency.
Definition: postingsource.h:529
MSet get_mset(doccount first, doccount maxitems, const RSet *rset, const MatchDecider *mdecider=NULL) const
Run the query.
Definition: enquire.h:405
DPHWeight * create_from_parameters(const char *params) const
Return the parameterised weighting scheme object.
PL2Weight * create_from_parameters(const char *params) const
Return the parameterised weighting scheme object.
size_t get_num_subqueries() const noexcept
Get the number of subqueries of the top level query.
DatabaseCorruptError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:421
double get_maxpart() const
Return an upper bound on what get_sumpart() can return for any document.
Xapian::doccount get_uncollapsed_matches_lower_bound() const
Lower bound on the total number of matching documents before collapsing.
RSet & operator=(const RSet &o)
Copying is allowed.
LatLongDistancePostingSource * unserialise_with_registry(const std::string &serialised, const Registry &registry) const
Create object given string serialisation returned by serialise().
ESet(const ESet &o)
Copying is allowed.
This class implements the InL2 weighting scheme.
Definition: weight.h:1132
~PostingIterator()
Destructor.
Definition: postingiterator.h:83
RangeError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:975
Xapian::Weight subclass implementing the PL2+ probabilistic formula.
Definition: weight.h:1572
double get_sumpart(Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the weight contribution for this object's term to a document.
virtual std::string get_description() const
Return a string describing this object.
Stem(Stem &&)=default
Move constructor.
virtual std::string operator()(const Xapian::Document &doc) const
Build a key string for a Document.
FeatureUnavailableError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:731
void add_point(const Point &point)
Add a document to the Cluster.
virtual std::string get_description() const
Return a string describing this object.
double get_weight() const
Return the weight contribution for the current document.
Database & operator=(const Database &o)
Assignment operator.
Query parse_query(const std::string &query_string, unsigned flags=FLAG_DEFAULT, const std::string &default_prefix=std::string())
Parse a query.
MSetIterator operator--(int)
Move the iterator to the previous position (postfix version).
Definition: mset.h:480
const Query & get_query() const
Get the currently set query.
TermIterator stoplist_begin() const
Begin iterator over terms omitted from the query as stopwords.
Unicode and UTF-8 related classes and functions.
#define XAPIAN_TERMPOS_BASE_TYPE
Base (signed) type for Xapian::termpos.
Definition: version.h:77
void set_termfreq_est(Xapian::doccount termfreq_est_)
An estimate of the term frequency.
Definition: postingsource.h:518
double get_max_attained() const
The maximum weight attained by any document.
Document & operator=(const Document &o)
Assignment operator.
Xapian::docid get_docid() const
Return the current docid.
const int DB_BACKEND_STUB
Open a stub database file.
Definition: constants.h:173
void compact(int fd, unsigned flags=0, int block_size=0)
Produce a compact version of this database.
Definition: database.h:758
std::string get_data() const
Get the document data.
std::string serialise() const
Serialise object parameters into a string.
BM25Weight * unserialise(const std::string &serialised) const
Unserialise parameters.
ExpandDeciderAnd(const ExpandDecider *first_, const ExpandDecider *second_)
Compatibility method.
Definition: expanddecider.h:107
double get_maxextra() const
Return an upper bound on what get_sumextra() can return for any document.
ESet(ESet &&o)
Move constructor.
std::string operator*() const
Return the value at the current position.
virtual double pointwise_distance(const LatLongCoord &a, const LatLongCoord &b) const =0
Return the distance between two coordinates, in metres.
virtual bool operator()(const Xapian::Document &doc) const =0
Decide whether to accept a document.
PL2Weight * unserialise(const std::string &serialised) const
Unserialise parameters.
DateRangeProcessor(Xapian::valueno slot_, unsigned flags_=0, int epoch_year_=1970)
Constructor.
Definition: queryparser.h:261
virtual std::string name() const
Return the name of this KeyMaker.
LatLongDistanceKeyMaker(Xapian::valueno slot_, const LatLongCoord &centre_)
Construct a LatLongDistanceKeyMaker.
Definition: geospatial.h:670
InvalidOperationError indicates the API was used in an invalid way.
Definition: error.h:271
bool check(Xapian::docid min_docid, double min_wt)
Check if the specified docid occurs.
NetworkError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:815
std::string serialise() const
Return this object's parameters serialised as a single string.
double get_sumextra(Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the term-independent weight component for a document.
CoordWeight * create_from_parameters(const char *params) const
Return the parameterised weighting scheme object.
size_t size() const
Return number of shards in this Database object.
void init(const Database &db_)
Older method which did the same job as reset().
MSetIterator back() const
Return iterator pointing to the last object in this MSet.
Definition: mset.h:692
virtual void init(const Database &db)
Older method which did the same job as reset().
#define XAPIAN_TERMCOUNT_BASE_TYPE
Base (signed) type for Xapian::termcount and related types.
Definition: version.h:74
const MatchSpy * release() const
Start reference counting this object.
Definition: matchspy.h:196
double get_sumextra(Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the term-independent weight component for a document.
std::string get_description() const
Return a string describing this object.
ESet & operator=(ESet &&o)
Move assignment operator.
Xapian::doccount get_uncollapsed_matches_estimated() const
Estimate of the total number of matching documents before collapsing.
WritableDatabase()
Create a WritableDatabase with no subdatabases.
Definition: database.h:939
const Cluster & operator[](Xapian::doccount i) const
Return the cluster at index 'i'.
std::string name() const
Return the name of this weighting scheme.
double get_maxextra() const
Return an upper bound on what get_sumextra() can return for any document.
bool operator==(const Utf8Iterator &other) const noexcept
Test two Utf8Iterators for equality.
Definition: unicode.h:189
Indicates a timeout expired while communicating with a remote database.
Definition: error.h:833
Abstract base class for match deciders.
Definition: matchdecider.h:37
@ UPPERCASE_LETTER
Letter, uppercase (Lu)
Definition: unicode.h:222
Xapian::termcount get_wdf_upper_bound(const std::string &term) const
Get an upper bound on the wdf of term term.
std::string get_description() const
Return a string describing this object.
Xapian::docid operator*() const
Return the document id at the current position.
FreqSource * release()
Start reference counting this object.
Definition: cluster.h:165
Database & operator=(Database &&o)
Move assignment operator.
WildcardError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:1025
Class representing a stemming algorithm implementation.
Definition: stem.h:40
const int DBCHECK_SHOW_STATS
Show statistics for the B-tree.
Definition: constants.h:235
std::string name() const
Return the name of this KeyMaker.
Parses a piece of text and generates terms.
Definition: termgenerator.h:48
double get_maxpart() const
Return an upper bound on what get_sumpart() can return for any document.
#define XAPIAN_REVISION_TYPE
Underlying type for Xapian::rev.
Definition: version.h:83
virtual double get_sumpart(Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const =0
Calculate the weight contribution for this object's term to a document.
Iterator over a Xapian::ESet.
Definition: eset.h:158
MSetIterator & operator++()
Advance the iterator to the next position.
Definition: mset.h:461
DocumentSet(DocumentSet &&other)
Move constructor.
stem_strategy
Stemming strategies.
Definition: cluster.h:47
double get_maxextra() const
Return an upper bound on what get_sumextra() can return for any document.
std::string name() const
Name of the posting source class.
Document(Document &&o)
Move constructor.
InL2Weight(double c)
Construct an InL2Weight.
void add_matchspy(MatchSpy *spy) XAPIAN_NONNULL()
Add a matchspy.
Base class for TermListGroup. Stores and provides terms that are contained in a document and their res...
Definition: cluster.h:135
Query(op op_, Xapian::valueno slot, const std::string &range_limit)
Construct a Query object for a single-ended value range.
Xapian::doccount get_uncollapsed_matches_upper_bound() const
Upper bound on the total number of matching documents before collapsing.
Centroid(const Point &point)
Constructor with Point argument.
bool operator==(const LatLongCoordsIterator &other) const
Equality test for LatLongCoordsIterator objects.
Definition: geospatial.h:198
std::string name() const
Return the name of this weighting scheme.
Xapian::doclength get_average_length() const
The average length of a document in the collection.
Definition: weight.h:399
Xapian::termpos get_termpos() const
Get the current term position.
KeyMaker subclass which sorts by distance from a latitude/longitude.
Definition: geospatial.h:550
std::string serialise() const
Return this object's parameters serialised as a single string.
DatabaseOpeningError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:585
ESet get_eset(termcount maxitems, const RSet &rset, const ExpandDecider *edecider) const
Perform query expansion.
Definition: enquire.h:520
PositionIterator positionlist_end(Xapian::docid, const std::string &) const noexcept
End iterator corresponding to positionlist_begin().
Definition: database.h:286
Posting source which returns a weight based on geospatial distance.
Definition: geospatial.h:454
Xapian::doccount get_reltermfreq() const
The number of relevant documents which this term indexes.
Definition: weight.h:405
StemImplementation()
Default constructor.
Definition: stem.h:49
void swap(Document &o)
Efficiently swap this Document object with another.
Definition: document.h:253
DerefWrapper_< LatLongCoord > operator++(int)
Advance the iterator to the next position (postfix version).
Definition: geospatial.h:191
double latitude
A latitude, as decimal degrees.
Definition: geospatial.h:88
TermIterator termlist_end() const noexcept
End iterator corresponding to termlist_begin().
Definition: document.h:208
virtual ClusterSet cluster(const MSet &mset)=0
Implement the required clustering algorithm in the subclass and return clustered output as Cluste...
Diversify & operator=(Diversify &&other)
Move assignment operator.
std::string name() const
Return the name of this weighting scheme.
virtual std::string short_name() const
Return the short name of the weighting scheme.
Xapian::doccount get_collapse_count() const
Return a count of the number of collapses done onto the current key.
double get_avlength() const
Old name for get_average_length() for backward compatibility.
Definition: database.h:314
GreatCircleMetric()
Construct a GreatCircleMetric.
std::string get_description() const
Return a string describing this object.
Xapian::docid range_start
Start of range of docids for which weights are known to be decreasing.
Definition: postingsource.h:601
TermIterator & operator++()
Advance the iterator to the next position.
double sortable_unserialise(const std::string &serialised) noexcept
Convert a string encoded using sortable_serialise back to a floating point number.
std::string operator()(const std::string &word) const
Stem a word.
virtual std::string serialise() const
Return this object's parameters serialised as a single string.
MatchDecider subclass for filtering results by value.
const TermIterator get_unique_terms_begin() const
Begin iterator for unique terms in the query object.
std::string name() const
Return the name of this weighting scheme.
std::string get_description() const
Return a string describing this object.
void cancel_transaction()
Abort the transaction currently in progress.
Definition: database.h:1156
static const Query unserialise(const std::string &serialised, const Registry &reg=Registry())
Unserialise a string and return a Query object.
double get_weight() const
Get the weight for the current position.
double get_maxpart() const
Return an upper bound on what get_sumpart() can return for any document.
TermIterator values_end() const noexcept
End iterator corresponding to values_begin().
Definition: matchspy.h:255
Xapian::Weight subclass implementing the Language Model formula.
Definition: weight.h:1728
Xapian::termcount get_collection_freq() const
The collection frequency of the term.
Definition: weight.h:408
std::string get_description() const
Return a string describing this object.
ESetIterator & operator--()
Move the iterator to the previous position.
Definition: eset.h:195
std::string get_description() const
Return a string describing this object.
double get_maxextra() const
Return an upper bound on what get_sumextra() can return for any document.
Xapian::termcount get_ebound() const
Return a bound on the full size of this ESet object.
std::string get_description() const
Return a string describing this object.
ValueIterator values_end() const noexcept
End iterator corresponding to values_begin().
Definition: document.h:248
@ FORMAT
Other, format (Cf)
Definition: unicode.h:237
double get_maxextra() const
Return an upper bound on what get_sumextra() can return for any document.
void add_boolean_prefix(const std::string &field, const std::string &prefix, bool exclusive)
Add a boolean term prefix allowing the user to restrict a search with a boolean filter specified in t...
Definition: queryparser.h:986
Xapian::doccount get_rset_size() const
The number of documents marked as relevant.
Definition: weight.h:396
bool at_end() const
Return true if the current position is past the last entry in this list.
unsigned XAPIAN_DOCID_BASE_TYPE docid
A unique identifier for a document.
Definition: types.h:51
Read weights from a value which is known to decrease as docid increases.
Definition: postingsource.h:595
DerefWrapper_< Xapian::termpos > operator++(int)
Advance the iterator to the next position (postfix version).
Definition: positioniterator.h:95
LatLongDistanceKeyMaker(Xapian::valueno slot_, const LatLongCoords &centre_)
Construct a LatLongDistanceKeyMaker.
Definition: geospatial.h:613
LatLongDistanceKeyMaker(Xapian::valueno slot_, const LatLongCoords &centre_, const LatLongMetric &metric_)
Construct a LatLongDistanceKeyMaker.
Definition: geospatial.h:593
bool get_started() const
Flag indicating if we've started (true if we have).
Definition: postingsource.h:498
parse free text and generate terms
This class provides read/write access to a database.
Definition: database.h:925
PL2PlusWeight * unserialise(const std::string &serialised) const
Unserialise parameters.
const int DBCOMPACT_MULTIPASS
If merging more than 3 databases, merge the postlists in multiple passes.
Definition: constants.h:269
Error(const Error &)=default
Default copy constructor.
RangeProcessor()
Default constructor.
Definition: queryparser.h:164
bool operator<=(const ESetIterator &a, const ESetIterator &b) noexcept
Inequality test for ESetIterator objects.
Definition: eset.h:307
DocNotFoundError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:686
TradWeight * unserialise(const std::string &serialised) const
Unserialise parameters.
TfIdfWeight()
Construct a TfIdfWeight using the default normalizations ("ntn").
Definition: weight.h:780
double get_maxpart() const
Return an upper bound on what get_sumpart() can return for any document.
void add_weight(const std::string &term, double weight)
Add the weight 'weight' to the mapping of a term.
MatchSpy * release()
Start reference counting this object.
Definition: matchspy.h:184
void skip_to(Xapian::docid min_docid, double min_wt)
Advance to the specified docid.
virtual Xapian::docid get_docid() const =0
Return the current docid.
The Xapian namespace contains public interfaces for the Xapian library.
Definition: error.h:34
DatabaseLockError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:505
Xapian::TermIterator spellings_end() const noexcept
End iterator corresponding to spellings_begin().
Definition: database.h:481
WildcardError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:1017
Calculate the great-circle distance between two coordinates on a sphere.
Definition: geospatial.h:398
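As a rough illustration of what a great-circle distance between two coordinates on a sphere involves, the following standalone sketch uses the haversine formula. It does not call Xapian: the `Coord` struct and `great_circle_metres()` function are hypothetical names invented for this example, and the default radius (a mean Earth radius of 6371000 metres) is an assumption that may differ from the value the library uses.

```cpp
#include <cmath>

// Hypothetical stand-in for a latitude/longitude pair in decimal degrees.
// This is NOT Xapian's LatLongCoord; it exists only for this sketch.
struct Coord {
    double latitude;
    double longitude;
};

// Haversine great-circle distance in metres between two points on a sphere.
// The default radius is an assumed mean Earth radius, not a Xapian constant.
double great_circle_metres(const Coord& a, const Coord& b,
                           double radius = 6371000.0) {
    const double to_rad = std::acos(-1.0) / 180.0;  // degrees -> radians
    const double lat1 = a.latitude * to_rad;
    const double lat2 = b.latitude * to_rad;
    const double dlat = lat2 - lat1;
    const double dlon = (b.longitude - a.longitude) * to_rad;

    const double s1 = std::sin(dlat / 2.0);
    const double s2 = std::sin(dlon / 2.0);
    double h = s1 * s1 + std::cos(lat1) * std::cos(lat2) * s2 * s2;
    if (h > 1.0) h = 1.0;  // guard against rounding pushing h past 1

    return 2.0 * radius * std::asin(std::sqrt(h));
}
```

Calling `great_circle_metres({0.0, 0.0}, {90.0, 0.0})` gives a quarter of the sphere's circumference, which is a convenient sanity check for any implementation of this kind.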
@ CURRENCY_SYMBOL
Symbol, currency (Sc)
Definition: unicode.h:248
double get_sumpart(Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the weight contribution for this object's term to a document.
void assign(const std::string &s)
Assign a new string to the iterator.
Definition: unicode.h:92
NetworkTimeoutError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:857
Stem(const Stem &o)
Copy constructor.
Definition: stem.h:68
std::string get_description() const
Return a string describing this object.
std::string name() const
Return the name of this weighting scheme.
virtual ~ExpandDecider()
Virtual destructor, because we have virtual methods.
void add_spelling(const std::string &word, Xapian::termcount freqinc=1) const
Add a word to the spelling dictionary.
void operator()(const Xapian::Document &doc, double wt)
Implementation of virtual operator().
double get_sumextra(Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the term-independent weight component for a document.
void set_stemmer(const Xapian::Stem &stemmer)
Set the stemmer.
DatabaseCreateError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:463
void add_posting(const std::string &term, Xapian::termpos term_pos, Xapian::termcount wdf_inc=1)
Add a posting for a term.
TermIterator values_begin() const
Get an iterator over the values seen in the slot.
Utf8Iterator(const char *p_)
Create an iterator given a pointer to a null terminated string.
DatabaseClosedError(const std::string &msg_, int errno_)
Construct from message and errno value.
Definition: error.h:1109
PostingIterator postlist_begin(const std::string &term) const
Start iterating the postings of a term.
LatLongCoordsIterator end() const
Get an end iterator for the coordinates.
Definition: geospatial.h:242
virtual ~MatchSpy()
Virtual destructor, because we have virtual methods.
Xapian::Weight subclass implementing the tf-idf weighting scheme.
Definition: weight.h:482
Class representing a query.
Definition: query.h:44
void skip_to(Xapian::docid did)
Advance the iterator to document did.
std::string serialise() const
Return this object's parameters serialised as a single string.
virtual ~Clusterer()
Destructor.
Xapian::docid operator*() const
Get the numeric document id for the current position.
ESetIterator operator+(difference_type n) const
Return the iterator incremented by n positions.
Definition: eset.h:246
Database(Database &&o)
Move constructor.
ESetIterator operator++(int)
Advance the iterator to the next position (postfix version).
Definition: eset.h:188
size_t get_total() const noexcept
Return the total number of documents tallied.
Definition: matchspy.h:241
bool empty() const
Return true if and only if there are no coordinates in the container.
Definition: geospatial.h:253
void swap(MSet &o)
Efficiently swap this MSet object with another.
Definition: mset.h:372
TfIdfWeight(const std::string &normalizations)
Construct a TfIdfWeight.
Definition: weight.h:701
std::string serialise() const
Serialise object parameters into a string.
const int DB_BACKEND_HONEY
Use the honey backend.
Definition: constants.h:202
Mechanism for accessing a struct of constant information.
double get_maxextra() const
Return an upper bound on what get_sumextra() can return for any document.
RangeError indicates an attempt to access outside the bounds of a container.
Definition: error.h:959
std::string name() const
Name of the posting source class.
virtual bool check(Xapian::docid did, double min_wt)
Check if the specified docid occurs.
Diversify(const Diversify &other)
Copying is allowed.
virtual ~RangeProcessor()
Destructor.
void index_text_without_positions(const std::string &text, Xapian::termcount wdf_inc=1, const std::string &prefix=std::string())
Index some text in a std::string without positional information.
Definition: termgenerator.h:256
A class for construction of termlists which store the terms for a document along with the number of d...
Definition: cluster.h:187
void add_boolean_prefix(const std::string &field, Xapian::FieldProcessor *proc, bool exclusive)
Register a FieldProcessor for a boolean prefix.
Definition: queryparser.h:1006
double operator()(const LatLongCoords &a, const char *b_ptr, size_t b_len) const
Return the distance between two coordinate lists, in metres.
DerefWrapper_< Xapian::docid > operator++(int)
Advance the iterator to the next position (postfix version).
Definition: postingiterator.h:119
virtual void next(double min_wt)=0
Advance the current position to the next matching document.
Xapian::valueno slot
The value slot to process.
Definition: queryparser.h:144
Query(Xapian::PostingSource *source)
Construct a Query object for a PostingSource.
PointType()
Default constructor.
Definition: cluster.h:245
MSetIterator operator-(difference_type n) const
Return the iterator decremented by n positions.
Definition: mset.h:533
void next(double min_wt)
Advance the current position to the next matching document.
ESetIterator begin() const
Return iterator pointing to the first item in this ESet.
Definition: eset.h:325
void delete_document(Xapian::docid did)
Delete a document from the database.
Xapian::termcount termlist_count() const
Return the number of distinct terms in this document.
QueryParser(QueryParser &&o)
Move constructor.
double get_weight() const
Return the weight contribution for the current document.
Stem & operator=(const Stem &o)
Assignment.
Definition: stem.h:71
const PointType * release() const
Start reference counting this object.
Definition: cluster.h:301
Xapian::Query check_range(const std::string &b, const std::string &e)
Check prefix/suffix on range.
ESetIterator()
Create an unpositioned ESetIterator.
Definition: eset.h:176
Xapian::termcount get_doclength_lower_bound() const
Get a lower bound on the length of a document in this DB.
Xapian::TermIterator synonyms_end(const std::string &) const noexcept
End iterator corresponding to synonyms_begin(term).
Definition: database.h:492
Stopper subclass which checks for both stemmed and unstemmed stopwords.
Definition: cluster.h:44
void remove_document(Xapian::docid did)
Unmark a document as relevant.
void add_prefix(const std::string &field, Xapian::FieldProcessor *proc)
Register a FieldProcessor.
Query(double factor, const Xapian::Query &subquery)
Scale using OP_SCALE_WEIGHT.
void init(const Database &db_)
Older method which did the same job as reset().
FieldProcessor()
Default constructor.
Definition: queryparser.h:448
std::string get_spelling_suggestion(const std::string &word, unsigned max_edit_distance=2) const
Suggest a spelling correction.
Geospatial search support routines.
const int DB_NO_TERMLIST
When creating a database, don't create a termlist table.
Definition: constants.h:136
Class representing a document.
Definition: document.h:64
ValueIterator values_begin() const
Start iterating the values in this document.
MatchDecider() noexcept
Default constructor, needed by subclass constructors.
Definition: matchdecider.h:47
Xapian::Weight subclass implementing the BM25+ probabilistic formula.
Definition: weight.h:925
const int DB_BACKEND_CHERT
Use the chert backend.
Definition: constants.h:164
virtual std::string serialise() const
Serialise object parameters into a string.
double get_termweight(const std::string &term) const
Get the term weight of a term.
ValueIterator & operator++()
Advance the iterator to the next position.
An indexed database of documents.
Definition: database.h:75
std::string operator*() const
Get the term at the current position.
const std::string & get_msg() const noexcept
Message giving details of the error, intended for human consumption.
Definition: error.h:111
ClusterSet(const ClusterSet &other)
Copying is allowed.
Xapian::termcount get_unique_terms_lower_bound() const
Get a lower bound on the unique terms size of a document in this DB.
ClusterSet cluster(const MSet &mset)
Implements the KMeans clustering algorithm.
std::string get_description() const
Return a string describing this object.
docid_order
Ordering of docids.
Definition: enquire.h:131
DecreasingValueWeightPostingSource * clone() const
Clone the posting source.
Database()
Construct a Database containing no shards.
double get_maxextra() const
Return an upper bound on what get_sumextra() can return for any document.
Class representing a list of search results.
Query(op op_, Xapian::valueno slot, const std::string &range_lower, const std::string &range_upper)
Construct a Query object for a value range.
UnimplementedError(const std::string &msg_, const std::string &context_=std::string(), int errno_=0)
General purpose constructor.
Definition: error.h:329
void remove_document(const Xapian::MSetIterator &it)
Unmark a document as relevant.
Definition: rset.h:111
double get_sumpart(Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the weight contribution for this object's term to a document.
~PositionIterator()
Destructor.
Definition: positioniterator.h:84
PostingIterator & operator=(PostingIterator &&o)
Move assignment operator.
Definition: postingiterator.h:65
@ LEAF_MATCH_ALL
Value returned by get_type() for MatchAll or equivalent.
Definition: query.h:279
std::string get_description() const
Return a string describing this object.
std::string short_name() const
Return the short name of the weighting scheme.
void add_synonym(const std::string &term, const std::string &synonym) const
Add a synonym for a term.
KeyMaker()
Default constructor.
Definition: keymaker.h:52
PositionIterator(const PositionIterator &o)
Copy constructor.
void set_sort_by_value(valueno sort_key, bool reverse)
Set the sorting to be by value only.
void add_boolean_term(const std::string &term)
Add a boolean filter term to the document.
Definition: document.h:140
Registry(Registry &&other)
Move constructor.
void add_boolean_prefix(const std::string &field, Xapian::FieldProcessor *proc, const std::string *grouping=NULL)
Register a FieldProcessor for a boolean prefix.
Indicates a problem communicating with a remote database.
Definition: error.h:791
static std::string get_available_languages()
Return a list of available languages.
Definition: stem.h:178
std::string get_description() const
Return a string describing this object.
double get_sumextra(Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const
Calculate the term-independent weight component for a document.
std::string get_description() const
Return a string describing this object.