sphinx.search.SearchLanguage

class documentation

class SearchLanguage:

Known subclasses: sphinx.search.da.SearchDanish, sphinx.search.de.SearchGerman, sphinx.search.en.SearchEnglish, sphinx.search.es.SearchSpanish, sphinx.search.fi.SearchFinnish, sphinx.search.fr.SearchFrench, sphinx.search.hu.SearchHungarian, sphinx.search.it.SearchItalian, sphinx.search.ja.SearchJapanese, sphinx.search.nl.SearchDutch, sphinx.search.no.SearchNorwegian, sphinx.search.pt.SearchPortuguese, sphinx.search.ro.SearchRomanian, sphinx.search.ru.SearchRussian, sphinx.search.sv.SearchSwedish, sphinx.search.tr.SearchTurkish, sphinx.search.zh.SearchChinese

This class is the base class for search natural language preprocessors. If you want to add support for a new language, you should override the methods of this class.

You should also override the `lang` class attribute (e.g. 'en', 'fr' and so on).

.. attribute:: stopwords

   This is a set of stop words of the target language. The default `stopwords` is empty. These words are used for building the index and are embedded in the JS.

.. attribute:: js_splitter_code

   Return the JavaScript version of the splitter function. The function should be named ``splitQuery``. It should take a string and return a list of strings.

   .. versionadded:: 3.0

.. attribute:: js_stemmer_code

   Return the JavaScript version of the stemmer class. The class should be named ``Stemmer`` and must have a ``stemWord`` method. This string is embedded as-is in searchtools.js.

   This class is used to preprocess the search words that readers of the Sphinx HTML output type, before the index is searched. The default implementation does nothing.
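The shape of a new language class can be sketched as follows. To stay runnable without Sphinx installed, the sketch uses a hypothetical stand-in base class, `SearchLanguageStub`, that mirrors only the interface listed on this page. Its method bodies, and the assumption that `__init__` stores `options` and then calls `init`, are illustrative and are not taken from the Sphinx source. `SearchExample`, the language code `'xx'`, and the stop words are likewise made up.

```python
class SearchLanguageStub:
    """Hypothetical stand-in for SearchLanguage; bodies are assumptions."""

    lang = None
    stopwords = set()

    def __init__(self, options):
        # Assumption: the constructor keeps the options and delegates to init().
        self.options = options
        self.init(options)

    def init(self, options):
        """Hook for subclasses; does nothing here."""

    def split(self, input):
        return input.split()

    def stem(self, word):
        return word

    def word_filter(self, word):
        return word not in self.stopwords


class SearchExample(SearchLanguageStub):
    """Made-up language class showing which members a subclass overrides."""

    lang = "xx"  # hypothetical language code
    stopwords = {"the", "a", "of"}

    def stem(self, word):
        # Toy stemming rule: case folding only.
        return word.lower()


language = SearchExample({})
print(language.lang, language.stem("Indexes"), language.word_filter("the"))
```

A real subclass would derive from `sphinx.search.SearchLanguage` and would also need a JavaScript stemmer that matches its Python `stem`.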

Method	`__init__`	Undocumented
Method	`init`	Initialize the class with the options the user has given.
Method	`split`	Splits a sentence into words. The default splitter splits input at whitespace, which should be enough for most languages except CJK languages.
Method	`stem`	Implements the Python version of the stemming algorithm.
Method	`word_filter`	Return true if the target word should be registered in the search index. This method is called after stemming.
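Class Variable	`js_splitter_code`	JavaScript version of the splitter function, named ``splitQuery``.
Class Variable	`js_stemmer_code`	JavaScript version of the stemmer class, named ``Stemmer``.
Class Variable	`js_stemmer_rawcode`	Undocumented
Class Variable	`lang`	Language code (e.g. 'en', 'fr' and so on).
Class Variable	`language_name`	Undocumented
Class Variable	`stopwords`	Set of stop words of the target language.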
Instance Variable	`options`	Undocumented
Class Variable	`_word_re`	Undocumented

def __init__(self, options):

Undocumented

Parameters
options:`dict`	Undocumented

def init(self, options):

overridden in sphinx.search.da.SearchDanish, sphinx.search.de.SearchGerman, sphinx.search.en.SearchEnglish, sphinx.search.es.SearchSpanish, sphinx.search.fi.SearchFinnish, sphinx.search.fr.SearchFrench, sphinx.search.hu.SearchHungarian, sphinx.search.it.SearchItalian, sphinx.search.ja.SearchJapanese, sphinx.search.nl.SearchDutch, sphinx.search.no.SearchNorwegian, sphinx.search.pt.SearchPortuguese, sphinx.search.ro.SearchRomanian, sphinx.search.ru.SearchRussian, sphinx.search.sv.SearchSwedish, sphinx.search.tr.SearchTurkish, sphinx.search.zh.SearchChinese

Initialize the class with the options the user has given.

Parameters
options:`dict`	Undocumented

def split(self, input):

overridden in sphinx.search.ja.SearchJapanese, sphinx.search.zh.SearchChinese

This method splits a sentence into words. The default splitter splits input at whitespace, which should be enough for most languages except CJK languages.

Parameters
input:`str`	Undocumented
Returns
`list[str]`	Undocumented
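Why CJK languages need their own `split` can be seen with a small sketch. The two helper functions below are hypothetical and do not reproduce the default implementation. The `\w+` pattern in particular is an assumption and not necessarily what `_word_re` contains.

```python
import re

# Assumption: an illustrative pattern, not necessarily the one behind `_word_re`.
WORD_RE = re.compile(r"\w+")


def split_on_whitespace(text: str) -> list[str]:
    """Split at runs of whitespace; punctuation stays attached to words."""
    return text.split()


def split_on_word_chars(text: str) -> list[str]:
    """Collect runs of word characters; punctuation is dropped."""
    return WORD_RE.findall(text)


english = "build the index, then search"
print(split_on_whitespace(english))
print(split_on_word_chars(english))

# Japanese text has no spaces between words, so a whitespace splitter
# returns the whole phrase as a single token.
japanese = "全文検索エンジン"
print(split_on_whitespace(japanese))
```

Because the Japanese phrase comes back as one token, a language class for such text has to segment words by other means, which is why `SearchJapanese` and `SearchChinese` override this method.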

def stem(self, word):

This method implements the Python version of the stemming algorithm. The default implementation does nothing. You should implement this method if the language has any stemming rules. It is used to preprocess search words before they are registered in the search index. The stemming of the Python version and of the JS version (given in the js_stemmer_code attribute) must be compatible.

Parameters
word:`str`	Undocumented
Returns
`str`	Undocumented
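The compatibility requirement can be illustrated in Python alone. In the sketch below, `toy_stem`, `build_index` and `lookup` are hypothetical helpers, and the suffix rules are invented for the example and are not a real stemming algorithm. The query-side stemmer stands in for the JavaScript `Stemmer`.

```python
from typing import Callable


def toy_stem(word: str) -> str:
    """Toy stemmer: lowercase, then strip one common English suffix."""
    word = word.lower()
    for suffix in ("ing", "es", "s"):
        # Keep at least three characters of the word after stripping.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each stemmed word to the set of documents that contain it."""
    index: dict[str, set[str]] = {}
    for name, text in docs.items():
        for word in text.split():
            index.setdefault(toy_stem(word), set()).add(name)
    return index


def lookup(index: dict[str, set[str]], query: str,
           stemmer: Callable[[str], str]) -> set[str]:
    """Stem the query with the given stemmer and look it up."""
    return index.get(stemmer(query), set())


index = build_index({"a.rst": "Building indexes", "b.rst": "search builds"})

# Same stemmer on both sides: the query "builds" matches both documents.
print(lookup(index, "builds", toy_stem))

# A query-side stemmer that only lowercases looks up "builds",
# which was never stored, so nothing is found.
print(lookup(index, "builds", str.lower))
```

The second lookup fails even though both documents contain a form of "build", which is the kind of mismatch that incompatible Python and JS stemmers would produce.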

def word_filter(self, word):

overridden in sphinx.search.ja.SearchJapanese, sphinx.search.zh.SearchChinese

Return true if the target word should be registered in the search index. This method is called after stemming.

Parameters
word:`str`	Undocumented
Returns
`bool`	Undocumented
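Because the filter runs after stemming, it sees the stemmed form of each word. The sketch below uses made-up stop words, a case-folding `toy_stem`, and a hypothetical `word_filter` that only checks stop words; the default implementation may apply other rules.

```python
STOPWORDS = frozenset({"the", "and", "of"})  # made-up stop words


def toy_stem(word: str) -> str:
    """Toy stemming rule: case folding only."""
    return word.lower()


def word_filter(word: str) -> bool:
    """Hypothetical filter: keep every word that is not a stop word."""
    return word not in STOPWORDS


def terms_for_index(words: list[str]) -> list[str]:
    """Stem each word first, then keep only the stems that pass the filter."""
    kept = []
    for word in words:
        stemmed = toy_stem(word)
        if word_filter(stemmed):
            kept.append(stemmed)
    return kept


words = ["The", "Anatomy", "of", "Search"]
print(terms_for_index(words))

# Filtering before stemming would let "The" through, because the
# unstemmed form is not in the stop word set.
print([toy_stem(w) for w in words if word_filter(w)])
```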

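js_splitter_code: str

JavaScript version of the splitter function. The function should be named ``splitQuery``; it should take a string and return a list of strings.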

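js_stemmer_code: str

JavaScript version of the stemmer class. The class should be named ``Stemmer`` and must have a ``stemWord`` method. This string is embedded as-is in searchtools.js.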

js_stemmer_rawcode: str|None

overridden in sphinx.search.da.SearchDanish, sphinx.search.de.SearchGerman, sphinx.search.es.SearchSpanish, sphinx.search.fi.SearchFinnish, sphinx.search.fr.SearchFrench, sphinx.search.hu.SearchHungarian, sphinx.search.it.SearchItalian, sphinx.search.nl.SearchDutch, sphinx.search.no.SearchNorwegian, sphinx.search.pt.SearchPortuguese, sphinx.search.ro.SearchRomanian, sphinx.search.ru.SearchRussian, sphinx.search.sv.SearchSwedish, sphinx.search.tr.SearchTurkish

Undocumented

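stopwords: set[str]

overridden in sphinx.search.ro.SearchRomanian, sphinx.search.tr.SearchTurkish

Set of stop words of the target language. The default is empty. These words are used for building the index and are embedded in the JS.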

options

Undocumented

_word_re

Undocumented