HelpOnXapian

Xapian dramatically improves search performance in moin and unlocks additional features (see the search prefixes above) that are not possible with the legacy search engine.

1. Setting it up

1.1. Requirements

You must have Xapian itself and its Python bindings (xapian-core and xapian-bindings) from http://www.xapian.org/ installed, in version 1.0.0 or later.

To process attachment files, moin uses filter plugins - here is the list of filter plugins included:

|| File type || Dependency || Notes ||
|| Text files (.txt) || - || tries utf-8 and iso-8859-15 encodings (or forces ASCII if those do not work) ||
|| JPEG images (.jpg) || - || EXIF data is extracted ||
|| Open Office files (.sx?) || - || e.g. from older OpenOffice.org/StarOffice versions ||
|| Open Document files (.od?) || - || e.g. from recent OpenOffice.org/StarOffice versions ||
|| Binary files || - || moin uses a strings-like filter to process those, as well as a blacklist with stuff you don't want to search ||
|| MS Word files (.doc) || antiword || filter calls antiword ||
|| MS Excel files (.xls) || catdoc || filter calls xls2csv ||
|| PDF files (.pdf) || xpdf-utils || filter calls pdftotext ||

After installing additional filters (or their dependencies) you should (re)build your index. Xapian will find the new filters / support packages automagically. The next time you search, the results may include links that point directly to your attachments.
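Before rebuilding, it can help to check which of the external helper programs from the filter list are actually installed. The following is a minimal sketch, not part of moin: the names `FILTER_HELPERS` and `find_filter_helpers` are made up for illustration, and it assumes the helpers are called `antiword`, `xls2csv` and `pdftotext` and are looked up on your PATH.

```python
import shutil

# Helper programs called by the attachment filters, keyed by file extension.
# The mapping is an illustration, not something moin itself exposes.
FILTER_HELPERS = {
    ".doc": "antiword",
    ".xls": "xls2csv",
    ".pdf": "pdftotext",
}


def find_filter_helpers(helpers=FILTER_HELPERS):
    """Map each helper program to its full path, or None if it is not on PATH."""
    return {program: shutil.which(program) for program in helpers.values()}


if __name__ == "__main__":
    for program, path in sorted(find_filter_helpers().items()):
        print("%-10s %s" % (program, path or "NOT FOUND"))
```

A helper reported as NOT FOUND means the corresponding attachment type is probably not being indexed; install the package from the Dependency column and rebuild the index.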

1.2. Configuration

Several options in your wikiconfig control how Xapian is used:

|| Option || Default || Description ||
|| xapian_search || False || if True, enables Xapian search ||
|| xapian_index_dir || None || if set, a separate index directory is used for every wiki, distinguished by wikiname; useful for wikifarms to separate indices (note: needs rebuilding the index) ||
|| xapian_index_history || True || if True, instructs the indexer to index all revisions of a page so that users can search the page history (note: needs rebuilding the index) ||
|| xapian_stemming || False || if True, enables stemming of terms in Xapian (note: needs rebuilding the index) ||
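As an illustration, a wikiconfig that switches all four options on might look roughly like the sketch below. The import location of `DefaultConfig` is an assumption that depends on your moin version, the index path is a made-up placeholder, and the fallback class exists only so the fragment can be read and run without moin installed.

```python
# wikiconfig.py (sketch)
try:
    # Assumption: DefaultConfig lives here in your moin version.
    from MoinMoin.config.multiconfig import DefaultConfig
except ImportError:
    # Stand-in so this fragment also runs where moin is not installed.
    class DefaultConfig(object):
        pass


class Config(DefaultConfig):
    # Use Xapian instead of the legacy search engine.
    xapian_search = True

    # Hypothetical path; indices are kept separately per wiki.
    xapian_index_dir = '/where/your/xapian/indices/are'

    # Index all revisions of a page, not only the current one.
    xapian_index_history = True

    # Enable stemming of search terms.
    xapian_stemming = True
```

Changing any of the last three options on an existing wiki requires rebuilding the index.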

1.3. (Re-)Building an index

You can use the supplied command line tool moin to build an index initially, to rebuild it completely, or to update an existing one.

To build your index the first time, execute

moin --config-dir=/where/your/configdir/is --wiki-url=wiki-url/ index build --mode=add

on your command line. You can check the status of Xapian and its index on SystemInfo.
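If you want to trigger the build from a script (for example for a periodic job), the command above can be wrapped as in this sketch. The function names `index_build_command` and `build_index` are hypothetical; only the options shown in the command above are used, and the `mode` parameter simply passes its value through.

```python
import subprocess


def index_build_command(config_dir, wiki_url, mode="add"):
    """Return the argument list for the moin index build command."""
    return [
        "moin",
        "--config-dir=%s" % config_dir,
        "--wiki-url=%s" % wiki_url,
        "index", "build",
        "--mode=%s" % mode,
    ]


def build_index(config_dir, wiki_url, mode="add"):
    """Run the command; raises CalledProcessError on a non-zero exit status."""
    subprocess.check_call(index_build_command(config_dir, wiki_url, mode))


if __name__ == "__main__":
    # Placeholder values taken from the example command; replace with your own.
    print(" ".join(index_build_command("/where/your/configdir/is", "wiki-url/")))
```

Passing the arguments as a list avoids shell quoting problems when the config directory or wiki URL contains unusual characters.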

Moreover, the following modes can be passed to the command above to control the building of the index:

/!\ Please note that you must rebuild your index if you change at least one of xapian_index_history, xapian_index_dir or xapian_stemming configuration options!

1.4. Testing

You can test whether Xapian is enabled and whether an index is available by checking SystemInfo. To check whether searches are performed using Xapian, enable show_timings in your wikiconfig, perform a search and look for _xapianSearch at the bottom of the page.

2. Usage

Xapian is basically used the same way as all other search engines. Due to Xapian's advanced features, some new search term prefixes were introduced that are not available in the legacy search engine (commonly referred to as moin search). See HelpOnSearching for more information and/or use the new advanced search dialogue available on FindPage to see what's available and possible.