xapian-core
1.5.0
This class provides read/write access to a database.
Public Member Functions | |
WritableDatabase () | |
Create a WritableDatabase with no subdatabases. | |
void | add_database (const WritableDatabase &other) |
Add shards from another WritableDatabase. | |
WritableDatabase (const std::string &path, int flags=0, int block_size=0) | |
Create or open a Xapian database for both reading and writing. | |
WritableDatabase (const WritableDatabase &o) | |
Copy constructor. | |
WritableDatabase & | operator= (const WritableDatabase &o) |
Assignment operator. | |
WritableDatabase (WritableDatabase &&o) | |
Move constructor. | |
WritableDatabase & | operator= (WritableDatabase &&o) |
Move assignment operator. | |
void | commit () |
Commit pending modifications. | |
void | begin_transaction (bool flushed=true) |
Begin a transaction. | |
void | commit_transaction () |
Complete the transaction currently in progress. | |
void | cancel_transaction () |
Abort the transaction currently in progress. | |
Xapian::docid | add_document (const Xapian::Document &doc) |
Add a document to the database. | |
void | delete_document (Xapian::docid did) |
Delete a document from the database. | |
void | delete_document (const std::string &unique_term) |
Delete any documents indexed by a term from the database. | |
void | replace_document (Xapian::docid did, const Xapian::Document &document) |
Replace a document in the database. | |
Xapian::docid | replace_document (const std::string &unique_term, const Xapian::Document &document) |
Replace any documents matching a term. | |
void | add_spelling (const std::string &word, Xapian::termcount freqinc=1) const |
Add a word to the spelling dictionary. | |
termcount | remove_spelling (const std::string &word, termcount freqdec=1) const |
Remove a word from the spelling dictionary. | |
void | add_synonym (const std::string &term, const std::string &synonym) const |
Add a synonym for a term. | |
void | remove_synonym (const std::string &term, const std::string &synonym) const |
Remove a synonym for a term. | |
void | clear_synonyms (const std::string &term) const |
Remove all synonyms for a term. | |
void | set_metadata (const std::string &key, const std::string &metadata) |
Set the user-specified metadata associated with a given key. | |
std::string | get_description () const |
Return a string describing this object. | |
Public Member Functions inherited from Xapian::Database | |
void | add_database (const Database &other) |
Add shards from another Database. | |
size_t | size () const |
Return number of shards in this Database object. | |
Database () | |
Construct a Database containing no shards. | |
Database (const std::string &path, int flags=0) | |
Open a Database. | |
Database (int fd, int flags=0) | |
Open a single-file Database. | |
virtual | ~Database () |
Destructor. | |
Database (const Database &o) | |
Copy constructor. | |
Database & | operator= (const Database &o) |
Assignment operator. | |
Database (Database &&o) | |
Move constructor. | |
Database & | operator= (Database &&o) |
Move assignment operator. | |
bool | reopen () |
Reopen the database at the latest available revision. | |
void | close () |
Close the database. | |
PostingIterator | postlist_begin (const std::string &term) const |
Start iterating the postings of a term. | |
PostingIterator | postlist_end (const std::string &) const noexcept |
End iterator corresponding to postlist_begin(). | |
TermIterator | termlist_begin (Xapian::docid did) const |
Start iterating the terms in a document. | |
TermIterator | termlist_end (Xapian::docid) const noexcept |
End iterator corresponding to termlist_begin(). | |
bool | has_positions () const |
Does this database have any positional information? | |
PositionIterator | positionlist_begin (Xapian::docid did, const std::string &term) const |
Start iterating positions for a term in a document. | |
PositionIterator | positionlist_end (Xapian::docid, const std::string &) const noexcept |
End iterator corresponding to positionlist_begin(). | |
TermIterator | allterms_begin (const std::string &prefix=std::string()) const |
Start iterating all terms in the database with a given prefix. | |
TermIterator | allterms_end (const std::string &=std::string()) const noexcept |
End iterator corresponding to allterms_begin(prefix). | |
Xapian::doccount | get_doccount () const |
Get the number of documents in the database. | |
Xapian::docid | get_lastdocid () const |
Get the highest document id which has been used in the database. | |
double | get_average_length () const |
Get the mean document length in the database. | |
double | get_avlength () const |
Old name for get_average_length() for backward compatibility. | |
Xapian::totallength | get_total_length () const |
Get the total length of all the documents in the database. | |
Xapian::doccount | get_termfreq (const std::string &term) const |
Get the number of documents indexed by a specified term. | |
bool | term_exists (const std::string &term) const |
Test if a particular term is present in any document. | |
Xapian::termcount | get_collection_freq (const std::string &term) const |
Get the total number of occurrences of a specified term. | |
Xapian::doccount | get_value_freq (Xapian::valueno slot) const |
Return the frequency of a given value slot. | |
std::string | get_value_lower_bound (Xapian::valueno slot) const |
Get a lower bound on the values stored in the given value slot. | |
std::string | get_value_upper_bound (Xapian::valueno slot) const |
Get an upper bound on the values stored in the given value slot. | |
Xapian::termcount | get_doclength_lower_bound () const |
Get a lower bound on the length of a document in this DB. | |
Xapian::termcount | get_doclength_upper_bound () const |
Get an upper bound on the length of a document in this DB. | |
Xapian::termcount | get_wdf_upper_bound (const std::string &term) const |
Get an upper bound on the wdf of term term. | |
Xapian::termcount | get_unique_terms_lower_bound () const |
Get a lower bound on the unique terms size of a document in this DB. | |
Xapian::termcount | get_unique_terms_upper_bound () const |
Get an upper bound on the unique terms size of a document in this DB. | |
ValueIterator | valuestream_begin (Xapian::valueno slot) const |
Return an iterator over the value in slot slot for each document. | |
ValueIterator | valuestream_end (Xapian::valueno) const noexcept |
Return end iterator corresponding to valuestream_begin(). | |
Xapian::termcount | get_doclength (Xapian::docid did) const |
Get the length of a document. | |
Xapian::termcount | get_unique_terms (Xapian::docid did) const |
Get the number of unique terms in a document. | |
Xapian::termcount | get_wdfdocmax (Xapian::docid did) const |
void | keep_alive () |
Send a keep-alive message. | |
Xapian::Document | get_document (Xapian::docid did, unsigned flags=0) const |
Get a document from the database. | |
std::string | get_spelling_suggestion (const std::string &word, unsigned max_edit_distance=2) const |
Suggest a spelling correction. | |
Xapian::TermIterator | spellings_begin () const |
An iterator which returns all the spelling correction targets. | |
Xapian::TermIterator | spellings_end () const noexcept |
End iterator corresponding to spellings_begin(). | |
Xapian::TermIterator | synonyms_begin (const std::string &term) const |
An iterator which returns all the synonyms for a given term. | |
Xapian::TermIterator | synonyms_end (const std::string &) const noexcept |
End iterator corresponding to synonyms_begin(term). | |
Xapian::TermIterator | synonym_keys_begin (const std::string &prefix=std::string()) const |
An iterator which returns all terms which have synonyms. | |
Xapian::TermIterator | synonym_keys_end (const std::string &=std::string()) const noexcept |
End iterator corresponding to synonym_keys_begin(prefix). | |
std::string | get_metadata (const std::string &key) const |
Get the user-specified metadata associated with a given key. | |
Xapian::TermIterator | metadata_keys_begin (const std::string &prefix=std::string()) const |
An iterator which returns all user-specified metadata keys. | |
Xapian::TermIterator | metadata_keys_end (const std::string &=std::string()) const noexcept |
End iterator corresponding to metadata_keys_begin(). | |
std::string | get_uuid () const |
Get a UUID for the database. | |
bool | locked () const |
Test if this database is currently locked for writing. | |
Xapian::WritableDatabase | lock (int flags=0) |
Lock a read-only database for writing. | |
Xapian::Database | unlock () |
Release a database write lock. | |
Xapian::rev | get_revision () const |
Get the revision of the database. | |
void | compact (const std::string &output, unsigned flags=0, int block_size=0) |
Produce a compact version of this database. | |
void | compact (int fd, unsigned flags=0, int block_size=0) |
Produce a compact version of this database. | |
void | compact (const std::string &output, unsigned flags, int block_size, Xapian::Compactor &compactor) |
Produce a compact version of this database. | |
void | compact (int fd, unsigned flags, int block_size, Xapian::Compactor &compactor) |
Produce a compact version of this database. | |
std::string | reconstruct_text (Xapian::docid did, size_t length=0, const std::string &prefix=std::string(), Xapian::termpos start_pos=0, Xapian::termpos end_pos=0) const |
Reconstruct document text. | |
Additional Inherited Members | |
Static Public Member Functions inherited from Xapian::Database | |
static size_t | check (const std::string &path, int opts=0, std::ostream *out=NULL) |
Check the integrity of a database or database table. | |
static size_t | check (int fd, int opts=0, std::ostream *out=NULL) |
Check the integrity of a single file database. | |
This class provides read/write access to a database.
A WritableDatabase object contains zero or more shards, and operations are performed across these shards. Documents added by add_document() are stored to the shards in a round-robin fashion.
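To make the round-robin placement concrete, the sketch below models how a combined document ID could map onto a shard. The exact arithmetic (document IDs start at 1 and are interleaved across the shards in order) is an assumption made for illustration and is not stated above; `ShardSlot` and `locate` are hypothetical names and are not part of the Xapian API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper type: where a combined docid would live.
struct ShardSlot {
    std::size_t shard;        // index of the shard, 0-based
    std::uint32_t local_did;  // document ID within that shard, 1-based
};

// Model of round-robin interleaving (an assumption, see the lead-in):
// docid 1 -> shard 0, docid 2 -> shard 1, ..., wrapping after n_shards.
inline ShardSlot locate(std::uint32_t did, std::size_t n_shards) {
    assert(did != 0 && n_shards != 0);  // docid 0 is never valid
    return ShardSlot{ (did - 1) % n_shards,
                      static_cast<std::uint32_t>((did - 1) / n_shards + 1) };
}
```

Under this model, with three shards the fourth document lands back on the first shard as that shard's second document.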
Most methods can throw:
Xapian::DatabaseCorruptError | if database corruption is detected |
Xapian::DatabaseError | in various situations (for example, calling methods after close() has been called) |
Xapian::NetworkError | when remote databases are in use |
Xapian::WritableDatabase::WritableDatabase () [inline]
Create a WritableDatabase with no subdatabases.
The created object isn't very useful in this state - it's intended as a placeholder value.
Xapian::WritableDatabase::WritableDatabase (const std::string &path, int flags=0, int block_size=0) [explicit]
Create or open a Xapian database for both reading and writing.
path | Filing system path for the database. If creating a new database with a backend which uses a directory of files (such as glass does by default) then Xapian will create a directory for path if necessary (but the parent directory must already exist). |
flags | A bitwise-or (| in C++) combination of: |
Constant | DB exists | DB doesn't exist |
---|---|---|
Xapian::DB_CREATE_OR_OPEN | open | create |
Xapian::DB_CREATE | fail | create |
Xapian::DB_CREATE_OR_OVERWRITE | overwrite | create |
Xapian::DB_OPEN | open | fail |
Constant | Meaning |
---|---|
Xapian::DB_BACKEND_GLASS | Create a glass database |
Xapian::DB_BACKEND_CHERT | Create a chert database |
Xapian::DB_BACKEND_INMEMORY | Create inmemory DB (ignores path) |
block_size | The block size in bytes to use when creating a new database. This is ignored when opening an existing database, and by backends which don't have the concept of a block size. The glass backend allows block sizes which are a power of 2 between 2048 and 65536 (inclusive) and its default (also used instead of an invalid value) is 8192 bytes. |
Xapian::DatabaseLockError | is thrown if the database's write lock could not be acquired. |
Xapian::DatabaseOpeningError | if the specified database cannot be opened |
Xapian::DatabaseVersionError | if the specified database has a format too old or too new to be supported. |
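The block_size rule for the glass backend can be restated as a small function. This is only a model of the rule as described above (a power of 2 between 2048 and 65536 inclusive, otherwise the 8192-byte default); `effective_glass_block_size` is a hypothetical name and does not exist in Xapian.

```cpp
// Model of the glass block-size rule described above: a power of 2 in
// [2048, 65536] is used as given; any other value falls back to 8192.
constexpr int effective_glass_block_size(int requested) {
    const bool in_range = requested >= 2048 && requested <= 65536;
    const bool power_of_two =
        requested > 0 && (requested & (requested - 1)) == 0;
    return (in_range && power_of_two) ? requested : 8192;
}

// The default argument block_size=0 is not a valid size, so it selects
// the default.
static_assert(effective_glass_block_size(0) == 8192, "0 selects the default");
```

A caller who wants a non-default size would pass one of 2048, 4096, 8192, 16384, 32768 or 65536 as the third constructor argument.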
Xapian::WritableDatabase::WritableDatabase (const WritableDatabase &o) [inline]
Copy constructor.
The internals are reference counted, so copying is cheap.
void Xapian::WritableDatabase::add_database (const WritableDatabase &other) [inline]
Add shards from another WritableDatabase.
Any shards in other are added to the list of shards in this object. The shards are reference counted and also remain in other.
other | Another WritableDatabase to add shards from |
Xapian::InvalidArgumentError | if other is the same object as this. |
Xapian::docid Xapian::WritableDatabase::add_document (const Xapian::Document &doc)
Add a document to the database.
The document is allocated document ID (get_lastdocid() + 1) - the next highest document ID which has never previously been used by this database (so docids from deleted documents won't be reused).
If you want to specify the document ID to be used, you should call replace_document() instead.
If a transaction is active, the document addition is added to the transaction; otherwise it is added to the current batch of changes. Either way, it won't be visible to readers right away (unless we're not in a transaction and the addition triggers an automatic commit).
doc | The Document object to be added. |
void Xapian::WritableDatabase::add_spelling (const std::string &word, Xapian::termcount freqinc=1) const
Add a word to the spelling dictionary.
If the word is already present, its frequency is increased.
word | The word to add. |
freqinc | How much to increase its frequency by (default 1). |
void Xapian::WritableDatabase::add_synonym (const std::string &term, const std::string &synonym) const
Add a synonym for a term.
term | The term to add a synonym for. |
synonym | The synonym to add. If this is already a synonym for term, then no action is taken. |
void Xapian::WritableDatabase::begin_transaction (bool flushed=true)
Begin a transaction.
A Xapian transaction is a set of consecutive modifications to be committed as an atomic unit - in any committed revision of the database either none are present or they all are.
A transaction is started with begin_transaction() and can either be completed by calling commit_transaction() or aborted by calling cancel_transaction().
Closing the database (by an explicit call to close() or by its destructor being called) when a transaction is active will implicitly call cancel_transaction() to abort the transaction and discard the changes in it.
By default, commit() is implicitly called by begin_transaction() and commit_transaction() so that the changes in the transaction are committed or not independent of changes before or after it.
The downside of these implicit calls to commit() is that small transactions can harm indexing performance in the same way that explicitly calling commit() frequently can.
If you're applying atomic groups of changes and only wish to ensure that each group is either applied or not applied, then you can prevent the automatic commit() before and after the transaction by starting the transaction with begin_transaction(false). However, if cancel_transaction() is called (or if commit_transaction() isn't called before the WritableDatabase object is destroyed) then any changes which were pending before the transaction began will also be discarded.
flushed | Is this a flushed transaction? By default transactions are "flushed", which means that committing a transaction will ensure those changes are permanently written to the database. By contrast, unflushed transactions only ensure that changes within the transaction are either all applied or all aren't. |
Xapian::UnimplementedError | is thrown if this is an InMemory database, which doesn't currently support transactions. |
Xapian::InvalidOperationError | will be thrown if a transaction is already active. |
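One way to pair begin_transaction() with exactly one of commit_transaction() or cancel_transaction() is a scope guard. The sketch below is an illustration, not part of Xapian: `TransactionGuard` is a hypothetical name, and it is written as a template so that it only relies on the three transaction methods documented on this page.

```cpp
// RAII sketch: begins a transaction on construction and cancels it on
// destruction unless commit() was called first. Db is any type exposing
// begin_transaction(bool), commit_transaction() and cancel_transaction(),
// for example Xapian::WritableDatabase.
template <typename Db>
class TransactionGuard {
public:
    explicit TransactionGuard(Db& db, bool flushed = true) : db_(db) {
        db_.begin_transaction(flushed);
    }
    TransactionGuard(const TransactionGuard&) = delete;
    TransactionGuard& operator=(const TransactionGuard&) = delete;

    // Complete the transaction. The guard is disarmed first because
    // commit_transaction() ends the transaction whether or not it throws.
    void commit() {
        active_ = false;
        db_.commit_transaction();
    }

    ~TransactionGuard() {
        if (!active_) return;
        try {
            db_.cancel_transaction();
        } catch (...) {
            // Destructors must not throw, so errors here are swallowed.
        }
    }

private:
    Db& db_;
    bool active_ = true;
};
```

With Xapian this might be used as `TransactionGuard<Xapian::WritableDatabase> txn(db);` followed by the modifications and a final `txn.commit();`. If an exception leaves the scope before that call, the guard cancels the transaction.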
void Xapian::WritableDatabase::cancel_transaction () [inline]
Abort the transaction currently in progress.
Changes made within the current transaction will be discarded (if the transaction was not begun as a flushed transaction, any changes made but not committed before begin_transaction() will also be discarded).
Xapian::UnimplementedError | is thrown if this is an InMemory database, which doesn't currently support transactions. |
Xapian::InvalidOperationError | is thrown if no transaction was active. |
void Xapian::WritableDatabase::clear_synonyms (const std::string &term) const
Remove all synonyms for a term.
term | The term to remove all synonyms for. If the term has no synonyms, no action is taken. |
void Xapian::WritableDatabase::commit ()
Commit pending modifications.
Updates to a Xapian database are more efficient when applied in bulk, so by default Xapian stores modifications in memory until a threshold is exceeded and then they are committed to disk.
When the database is closed (by an explicit call to close() or its destructor being called) then commit() is implicitly called unless a transaction is active.
You can force any such pending modifications to be committed by calling this method, but bear in mind that the batching happens for a reason and calling commit() a lot is likely to slow down indexing.
If the commit operation succeeds then the changes are reliably written to disk and available to readers. If the commit operation fails, then any pending modifications are discarded.
It's not valid to call commit() within a transaction - see begin_transaction() for more details of how transactions work in Xapian.
Currently batched modifications are automatically committed every 10000 documents added, deleted, or modified. This value is rather conservative, and if you have a machine with plenty of memory, you can improve indexing throughput dramatically by setting XAPIAN_FLUSH_THRESHOLD in the environment to a larger value.
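The advice about batching can be sketched as a helper that adds a whole range of documents and then calls commit() once, instead of once per document. `add_batch_and_commit` is a hypothetical name; it is written as a template that only assumes the `add_document()` and `commit()` methods documented on this page, so `Db` would typically be Xapian::WritableDatabase and the range would hold Xapian::Document objects.

```cpp
#include <cstddef>

// Add every document in [first, last), then commit once at the end rather
// than after each add_document() call. Returns how many were added.
// Must not be called while a transaction is active, since commit() is
// not valid inside a transaction.
template <typename Db, typename InputIt>
std::size_t add_batch_and_commit(Db& db, InputIt first, InputIt last) {
    std::size_t added = 0;
    for (; first != last; ++first) {
        db.add_document(*first);
        ++added;
    }
    db.commit();  // one explicit commit for the whole batch
    return added;
}
```

Automatic commits can still happen inside the loop once the flush threshold described above is exceeded. The helper only avoids adding further explicit ones.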
void Xapian::WritableDatabase::commit_transaction () [inline]
Complete the transaction currently in progress.
If the transaction was begun as a flushed transaction then the changes in it have been committed to the database upon successful completion of this method.
If an exception is thrown, then the changes in the transaction will be discarded (if the transaction was not begun as a flushed transaction, any changes made but not committed before begin_transaction() will also be discarded).
In all cases the transaction will no longer be in progress.
Xapian::UnimplementedError | is thrown if this is an InMemory database, which doesn't currently support transactions. |
Xapian::InvalidOperationError | is thrown if no transaction was active. |
void Xapian::WritableDatabase::delete_document (const std::string &unique_term)
Delete any documents indexed by a term from the database.
This method removes any documents indexed by the specified term from the database.
A major use is for convenience when UIDs from another system are mapped to terms in Xapian, although this method has other uses (for example, you could add a "deletion date" term to documents at index time and use this method to delete all documents due for deletion on a particular date).
unique_term | The term to remove references to. |
void Xapian::WritableDatabase::delete_document (Xapian::docid did)
Delete a document from the database.
This method removes the document with the specified document ID from the database.
If a transaction is active, the document removal is added to the transaction; otherwise it is added to the current batch of changes. Either way, it won't be visible to readers right away (unless we're not in a transaction and the removal triggers an automatic commit).
did | The document ID of the document to be removed. |
WritableDatabase & Xapian::WritableDatabase::operator= (const WritableDatabase &o) [inline]
Assignment operator.
The internals are reference counted, so assignment is cheap.
References Xapian::Database::operator=().
termcount Xapian::WritableDatabase::remove_spelling (const std::string &word, termcount freqdec=1) const
Remove a word from the spelling dictionary.
The word's frequency is decreased, and if it would become zero or less then the word is removed completely.
word | The word to remove. |
freqdec | How much to decrease its frequency by (default 1). |
void Xapian::WritableDatabase::remove_synonym (const std::string &term, const std::string &synonym) const
Remove a synonym for a term.
term | The term to remove a synonym for. |
synonym | The synonym to remove. If this isn't currently a synonym for term, then no action is taken. |
Xapian::docid Xapian::WritableDatabase::replace_document (const std::string &unique_term, const Xapian::Document &document)
Replace any documents matching a term.
This method replaces any documents indexed by the specified term with the specified document. If any documents are indexed by the term, the lowest document ID will be used for the document, otherwise a new document ID will be generated as for add_document.
One common use is to allow UIDs from another system to easily be mapped to terms in Xapian. Note that this method doesn't automatically add unique_term as a term, so you'll need to call document.add_term(unique_term) first when using replace_document() in this way.
Note that changes to the database won't be immediately committed to disk; see commit() for more details.
unique_term | The "unique" term. |
document | The new document. |
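The UID-mapping pattern described above (add the unique term to the document, then replace by that term) can be wrapped in a small helper. `upsert_by_uid` is a hypothetical name, and the "Q" prefix for unique-ID terms is an assumption chosen for the example. The template only relies on `add_term()` and `replace_document()`, so `Db` and `Doc` would typically be Xapian::WritableDatabase and Xapian::Document.

```cpp
#include <string>

// Replace (or add) the document identified by an external UID and return
// whatever replace_document() returns (the document ID in Xapian).
// Note that doc is modified: the unique term is added to it.
template <typename Db, typename Doc>
auto upsert_by_uid(Db& db, Doc& doc, const std::string& uid)
    -> decltype(db.replace_document(uid, doc)) {
    // "Q" is an assumed prefix; any prefix works if used consistently.
    const std::string unique_term = "Q" + uid;
    doc.add_term(unique_term);  // replace_document() does not add it for us
    return db.replace_document(unique_term, doc);
}
```

Calling the helper repeatedly with the same UID keeps a single document per UID, because each call replaces whatever is currently indexed by that term.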
void Xapian::WritableDatabase::replace_document (Xapian::docid did, const Xapian::Document &document)
Replace a document in the database.
This method replaces the document with the specified document ID. If document ID did isn't currently used, the document will be added with document ID did.
The monotonic counter used for automatically allocating document IDs is increased so that the next automatically allocated document ID will be did + 1. Be aware that if you use this method to specify a high document ID for a new document, and also use WritableDatabase::add_document(), Xapian may get to a state where this counter wraps around and will be unable to automatically allocate document IDs!
Note that changes to the database won't be immediately committed to disk; see commit() for more details.
did | The document ID of the document to be replaced. |
document | The new document. |
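The interaction between explicitly chosen document IDs and automatic allocation can be illustrated with a toy model of the counter. This is not Xapian code: `DocidCounter` is a hypothetical class, the 32-bit width of the document ID is an assumption, and the exception type is a stand-in for whatever Xapian reports when no ID can be allocated.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Toy model of the monotonic docid counter described above.
class DocidCounter {
public:
    // Models add_document(): allocate the last used docid + 1.
    std::uint32_t allocate() {
        if (last_ == std::numeric_limits<std::uint32_t>::max())
            throw std::overflow_error("no unused document IDs left");
        return ++last_;
    }

    // Models replace_document(did, ...): using a docid above the counter
    // moves the counter up; the counter never moves backwards.
    void use(std::uint32_t did) {
        if (did > last_) last_ = did;
    }

    std::uint32_t last() const { return last_; }

private:
    std::uint32_t last_ = 0;  // 0 means no docid has been used yet
};
```

In this model, a single `use()` of the largest representable ID leaves nothing for `allocate()`, which is the situation the warning above describes.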
void Xapian::WritableDatabase::set_metadata (const std::string &key, const std::string &metadata)
Set the user-specified metadata associated with a given key.
This method sets the metadata value associated with a given key. If there is already a metadata value stored in the database with the same key, the old value is replaced. If you want to delete an existing item of metadata, just set its value to the empty string.
User-specified metadata allows you to store arbitrary information in the form of (key, value) pairs.
There's no hard limit on the number of metadata items, or the size of the metadata values. Metadata keys have a limited length, which depends on the backend. We recommend limiting them to 200 bytes. Empty keys are not valid, and specifying one will cause an exception.
Metadata modifications are committed to disk in the same way as modifications to the documents in the database are: i.e., modifications are atomic, and won't be committed to disk immediately (see commit() for more details). This allows metadata to be used to link databases with versioned external resources by storing the appropriate version number in a metadata item.
You can also use the metadata to store arbitrary extra information associated with terms, documents, or postings by encoding the termname and/or document id into the metadata key.
key | The key of the metadata item to set. |
metadata | The value of the metadata item to set. |
Xapian::DatabaseError | will be thrown if a problem occurs while writing to the database. |
Xapian::DatabaseCorruptError | will be thrown if the database is in a corrupt state. |
Xapian::InvalidArgumentError | will be thrown if the key supplied is empty. |
Xapian::UnimplementedError | will be thrown if the database backend in use doesn't support user-specified metadata. |
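As an illustration of encoding a document ID into a metadata key, the sketch below builds keys of the form `<kind>:<docid>` and enforces the recommended 200-byte limit before the key reaches set_metadata(). The key scheme and the name `doc_metadata_key` are hypothetical; Xapian does not prescribe any particular key format.

```cpp
#include <stdexcept>
#include <string>

// Build a per-document metadata key such as "source-rev:42".
// The "<kind>:<docid>" layout is an assumed convention for this example.
inline std::string doc_metadata_key(const std::string& kind,
                                    unsigned long did) {
    if (kind.empty())
        throw std::invalid_argument("metadata key kind must not be empty");
    std::string key = kind + ":" + std::to_string(did);
    if (key.size() > 200)  // recommended limit from the text above
        throw std::length_error("metadata key longer than 200 bytes");
    return key;
}
```

A caller might then write `db.set_metadata(doc_metadata_key("source-rev", did), "17");` to store a value, and pass an empty string as the value to delete it again.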