django full text search not matching partial words


This is working on Django 1.11:

from django.contrib.postgres.search import SearchVector

tools = Tool.objects.annotate(
    search=SearchVector('name', 'description', 'expert__user__username'),
).filter(search__icontains=form.cleaned_data['query_string'])

@santiagopim's solution is correct, but to address Matt's comment: you may run into the following error:

ERROR: function replace(tsquery, unknown, unknown) does not exist at character 1603
HINT: No function matches the given name and argument types. You might need to add explicit type casts.

I know this doesn't address the underlying issue if you need to use SearchQuery, but if, like me, you just need a quick fix, you can try the following.

from django.contrib.postgres.search import SearchVector

vector = SearchVector('name') + SearchVector('author__username')

# NOTE: I commented out the line below and pass a plain string instead
# search = SearchQuery('Sa')
search = 'Sa'

Report.objects.exclude(visible=False).annotate(search=vector)\
    .filter(search__icontains=search)
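
It may help to see why the plain string works here. As far as I can tell, icontains is not a full text lookup at all: on PostgreSQL it boils down to a case-insensitive substring test, and in this query it runs against the text form of the search vector, which holds stemmed lexemes rather than the original words. The pure-Python sketch below mimics that behaviour. The vector_text value is a hand-written illustration of what a stemmed vector might look like, not real database output.

```python
# Hand-written stand-in for the text form of a tsvector (assumed, not real output).
vector_text = "'chees':1 'recip':4 'toast':3"


def icontains(haystack: str, needle: str) -> bool:
    """Case-insensitive substring test, roughly what icontains boils down to."""
    return needle.upper() in haystack.upper()


# A partial word is found inside the stemmed lexeme.
print(icontains(vector_text, "Chee"))
# The complete word is longer than its stemmed lexeme, so it is not found.
print(icontains(vector_text, "cheese"))
```

So the workaround does match partial words, but a complete word can miss when stemming has shortened it, and a substring scan like this probably cannot use a full text index.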

Suggestion : 2

We hope this has convinced you that you need not tie yourself to a rickety search infrastructure simply to keep easily implemented features like partial word search. Although Django's full text search feature does not allow for partial word search, the code below shows that it is not much of a heavy lift to implement the feature yourself. If you're looking to go down this route but think our short solution lacks features you want, we think this library looks promising.

Recently we ran into an issue with our search interface for a project we're working on here at Fusionbox: the service backing our Django Haystack implementation experienced a momentary hiccup, and all of a sudden our search index was out of sync with the state of our database. Luckily this project is still in the development phase and has not yet been released to users, but before we figured out that Solr had experienced downtime, we were seeing some strange bugs in our application.

Enter Django's Postgres-backed Full Text Search feature. I won't go into the details here (the docs do a pretty good job on that front), but suffice it to say, this is a great solution if you are worried about any kind of discrepancy between your application and search databases. Now, instead of relying on keeping another service's database in sync with your own application database, you can just use your own Postgres database to search over your models' text fields.

Finally, you simply have to place this custom filtering code wherever it makes sense for your application. In our case, we use Django Rest Framework views and serializers to provide our React frontend with JSON data, which it then renders. As such, we place this custom partial word search filtering in our serializer's filter method, which is responsible for filtering the serializer's queryset. Here's a basic template for what our partial word search filter looks like.

import re

from rest_framework import serializers


class MyModelSerializer(serializers.Serializer):
    def filter(self, qs):
        # <other custom queryset filtering>
        # Where 'search_query' is something like "toda", a partial word
        query = process_query(search_query)
        # 'query' is now "toda:*"
        qs = qs.extra(
            where=[
                '''
                to_tsvector('english', unaccent(concat_ws(' ',
                    <model_table>.<model column_1>,
                ))) @@ to_tsquery('english', unaccent(%s))
                '''
            ],
            params=[query],
        )
        # <more custom queryset filtering>
        return qs


def process_query(s):
    """
    Converts the user's search string into something suitable for passing to
    to_tsquery.
    """
    # Replace characters that have a special meaning in a tsquery with spaces.
    query = re.sub(r'[!\'()|&]', ' ', s).strip()
    if query:
        # AND the remaining words together.
        query = re.sub(r'\s+', ' & ', query)
        # Support prefix search on the last word. A tsquery of 'toda:*' will
        # match against any words that start with 'toda', which is good for
        # search-as-you-type.
        query += ':*'
    return query
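
To see what process_query produces, here is a standalone run of the same logic on a few inputs. The sample strings are made up for illustration, and the function is repeated only so that the snippet runs on its own.

```python
import re


def process_query(s):
    """Same logic as the function above, repeated so this snippet is self-contained."""
    query = re.sub(r'[!\'()|&]', ' ', s).strip()
    if query:
        query = re.sub(r'\s+', ' & ', query)
        query += ':*'
    return query


# Print each sample input next to the tsquery string built from it.
for text in ["toda", "red  tomato", "(cheese) & !meat", "   "]:
    print(repr(text), "->", repr(process_query(text)))
```

Only the last word receives the ':*' prefix marker, and blank input comes back as an empty string, so the caller probably wants to skip the search filter in that case. Characters such as ':' are left in place, so you may want to strip those as well.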

Suggestion : 3

The database functions in the django.contrib.postgres.search module ease the use of PostgreSQL's full text search engine.

Special database configuration isn't necessary to use any of these functions. However, if you're searching more than a few hundred records, you're likely to run into performance problems. Full text search is a more intensive process than comparing the size of an integer, for example.

SearchQuery translates the terms the user provides into a search query object that the database compares to a search vector. By default, all the words the user provides are passed through the stemming algorithms, and then it looks for matches for all of the resulting terms.

A common way to use full text search is to search a single term against a single column in the database. For example:

>>> Entry.objects.filter(body_text__search='Cheese')
[<Entry: Cheese on Toast recipes>, <Entry: Pizza Recipes>]

>>> from django.contrib.postgres.search import SearchVector
>>> Entry.objects.annotate(
...     search=SearchVector('body_text', 'blog__tagline'),
... ).filter(search='Cheese')
[<Entry: Cheese on Toast recipes>, <Entry: Pizza Recipes>]

>>> Entry.objects.annotate(
...     search=SearchVector('body_text') + SearchVector('blog__tagline'),
... ).filter(search='Cheese')
[<Entry: Cheese on Toast recipes>, <Entry: Pizza Recipes>]

>>> from django.contrib.postgres.search import SearchQuery
>>> SearchQuery('red tomato')  # two keywords
>>> SearchQuery('tomato red')  # same results as above
>>> SearchQuery('red tomato', search_type='phrase')  # a phrase
>>> SearchQuery('tomato red', search_type='phrase')  # a different phrase
>>> SearchQuery("'tomato' & ('red' | 'green')", search_type='raw')  # boolean operators
>>> SearchQuery("'tomato' ('red' OR 'green')", search_type='websearch')  # websearch operators

>>> from django.contrib.postgres.search import SearchQuery
>>> SearchQuery('meat') & SearchQuery('cheese')  # AND
>>> SearchQuery('meat') | SearchQuery('cheese')  # OR
>>> ~SearchQuery('meat')  # NOT

>>> from django.contrib.postgres.search import SearchQuery, SearchRank, SearchVector
>>> vector = SearchVector('body_text')
>>> query = SearchQuery('cheese')
>>> Entry.objects.annotate(rank=SearchRank(vector, query)).order_by('-rank')
[<Entry: Cheese on Toast recipes>, <Entry: Pizza recipes>]
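
None of the queries above match partial words. One option, sketched here under some assumptions, is to build a prefix expression by hand and pass it with search_type='raw', which hands the string to to_tsquery. The names raw_prefix_query and partial_search are hypothetical helpers, not part of Django, and the Django import is deferred so that the string-building part can be tried without a database.

```python
import re


def raw_prefix_query(text):
    """Turn 'red toma' into 'red:* & toma:*' (a prefix match on every word)."""
    words = re.findall(r"\w+", text)
    return " & ".join(f"{word}:*" for word in words)


def partial_search(queryset, text, *fields):
    """Keep rows where every typed word is a prefix of some lexeme in the fields."""
    # Imported lazily: this part needs Django with django.contrib.postgres.
    from django.contrib.postgres.search import SearchQuery, SearchVector

    raw = raw_prefix_query(text)
    if not raw:
        # Nothing searchable was typed, so return no rows (an assumed policy).
        return queryset.none()
    return queryset.annotate(search=SearchVector(*fields)).filter(
        search=SearchQuery(raw, search_type="raw")
    )


print(raw_prefix_query("red toma"))
```

With the Entry model from the examples above, a call might look like partial_search(Entry.objects.all(), 'chee', 'body_text', 'blog__tagline').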

Suggestion : 4

To implement full text searching there must be a function to create a tsvector from a document and a tsquery from a user query. Also, we need to return results in a useful order, so we need a function that compares documents with respect to their relevance to the query. It's also important to be able to display the results nicely. PostgreSQL provides support for all of these functions.

To present search results it is ideal to show a part of each document and how it is related to the query. Usually, search engines show fragments of the document with marked search terms. PostgreSQL provides a function ts_headline that implements this functionality.

PostgreSQL provides the function to_tsvector for converting a document to the tsvector data type.

to_tsvector([config regconfig, ] document text) returns tsvector

to_tsvector parses a textual document into tokens, reduces the tokens to lexemes, and returns a tsvector which lists the lexemes together with their positions in the document. The document is processed according to the specified or default text search configuration. Here is a simple example:

SELECT to_tsvector('english', 'a fat  cat sat on a mat - it ate a fat rats');
                  to_tsvector
-----------------------------------------------------
 'ate':9 'cat':3 'fat':2,11 'mat':7 'rat':12 'sat':4
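
The three steps described above (tokenize, reduce to lexemes, record positions) can be imitated in a few lines of Python. This is only a toy model for intuition: the stop-word list and the strip-a-trailing-s stemmer are assumptions chosen to fit this one sentence, and PostgreSQL's real parser and dictionaries are far more thorough.

```python
import re

# Tiny, assumed stop-word list; the real 'english' configuration has many more.
STOP_WORDS = {"a", "on", "it", "the"}


def toy_stem(word):
    """Crude stand-in for a stemmer: drop one trailing 's' from longer words."""
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def toy_tsvector(document):
    """Map each lexeme to the 1-based positions where it occurs."""
    positions = {}
    tokens = re.findall(r"[a-z0-9]+", document.lower())
    for pos, token in enumerate(tokens, start=1):
        if token in STOP_WORDS:
            continue  # stop words still consume a position but are not stored
        positions.setdefault(toy_stem(token), []).append(pos)
    return dict(sorted(positions.items()))


print(toy_tsvector("a fat  cat sat on a mat - it ate a fat rats"))
```

For this sentence the toy model yields the same lexemes and positions as the to_tsvector output shown above. Note that stop words are skipped but still counted, which is why 'fat' sits at positions 2 and 11.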

Because to_tsvector(NULL) will return NULL, it is recommended to use coalesce whenever a field might be null. Here is the recommended method for creating a tsvector from a structured document:

UPDATE tt SET ti =
   setweight(to_tsvector(coalesce(title, '')), 'A') ||
   setweight(to_tsvector(coalesce(keyword, '')), 'B') ||
   setweight(to_tsvector(coalesce(abstract, '')), 'C') ||
   setweight(to_tsvector(coalesce(body, '')), 'D');

to_tsquery([config regconfig, ] querytext text) returns tsquery

to_tsquery creates a tsquery value from querytext, which must consist of single tokens separated by the tsquery operators & (AND), | (OR), ! (NOT), and <-> (FOLLOWED BY), possibly grouped using parentheses. In other words, the input to to_tsquery must already follow the general rules for tsquery input, as described in Section 8.11.2. The difference is that while basic tsquery input takes the tokens at face value, to_tsquery normalizes each token into a lexeme using the specified or default configuration, and discards any tokens that are stop words according to the configuration. For example:

SELECT to_tsquery('english', 'The & Fat & Rats');
  to_tsquery
---------------
 'fat' & 'rat'

As in basic tsquery input, weight(s) can be attached to each lexeme to restrict it to match only tsvector lexemes of those weight(s). For example:

SELECT to_tsquery('english', 'Fat | Rats:AB');
    to_tsquery
------------------
 'fat' | 'rat':AB

to_tsquery can also accept single-quoted phrases. This is primarily useful when the configuration includes a thesaurus dictionary that may trigger on such phrases. In the example below, a thesaurus contains the rule supernovae stars : sn:

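SELECT to_tsquery('''supernovae stars'' & !crab');
  to_tsquery
---------------
 'sn' & !'crab'

plainto_tsquery([config regconfig, ] querytext text) returns tsquery

plainto_tsquery transforms the unformatted text querytext to a tsquery value. The text is parsed and normalized much as for to_tsvector, then the & (AND) operator is inserted between surviving words.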

For both ranking functions, ts_rank and ts_rank_cd, the optional weights argument offers the ability to weigh word instances more or less heavily depending on how they are labeled. The weight arrays specify how heavily to weigh each category of word, in the order:

   {D-weight, C-weight, B-weight, A-weight}

If no weights are provided, then these defaults are used:

   {0.1, 0.2, 0.4, 1.0}

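The ordering of the weights array is easy to get backwards, since it runs from D to A rather than A to D. The sketch below maps the default array {0.1, 0.2, 0.4, 1.0} onto its labels and computes a simple weighted count of hits. The helper names and the toy_score function are hypothetical illustrations, and the sum is not the formula ts_rank or ts_rank_cd actually uses.

```python
# Default weights in the documented order: D, C, B, A.
DEFAULT_WEIGHTS = [0.1, 0.2, 0.4, 1.0]


def label_weights(weights=DEFAULT_WEIGHTS):
    """Pair each weight with its label, following the D, C, B, A ordering."""
    return dict(zip("DCBA", weights))


def toy_score(hits, weights=DEFAULT_WEIGHTS):
    """Weighted count of matches per label (illustration only, not ts_rank)."""
    table = label_weights(weights)
    return sum(table[label] * count for label, count in hits.items())


print(label_weights())
# One match in an A-labeled part (e.g. a title) and three in a D-labeled part.
print(toy_score({"A": 1, "D": 3}))
```

Under the defaults, a single match labeled A counts for as much as ten matches labeled D in this toy sum, which gives a feel for how strongly the labels can tilt a ranking.
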
Here is an example that selects only the ten highest-ranked matches:

SELECT title, ts_rank_cd(textsearch, query) AS rank
FROM apod, to_tsquery('neutrino|(dark & matter)') query
WHERE query @@ textsearch
ORDER BY rank DESC
LIMIT 10;
                     title                     |   rank
-----------------------------------------------+----------
 Neutrinos in the Sun                          |      3.1
 The Sudbury Neutrino Detector                 |      2.4
 A MACHO View of Galactic Dark Matter          |  2.01317
 Hot Gas and Dark Matter                       |  1.91171
 The Virgo Cluster: Hot Plasma and Dark Matter |  1.90953
 Rafting for Solar Neutrinos                   |      1.9
 NGC 4650A: Strange Galaxy and Dark Matter     |  1.85774
 Hot Gas and Dark Matter                       |   1.6123
 Ice Fishing for Cosmic Neutrinos              |      1.6
 Weak Lensing Distorts the Universe            | 0.818218

ts_headline([config regconfig, ] document text, query tsquery[, options text]) returns text

For example:

SELECT ts_headline('english',
  'The most common type of search
is to find all documents containing given query terms
and return them in order of their similarity to the
query.',
  to_tsquery('english', 'query & similarity'));
                        ts_headline
------------------------------------------------------------
 containing given <b>query</b> terms                       +
 and return them in order of their <b>similarity</b> to the+
 <b>query</b>.

SELECT ts_headline('english',
  'Search terms may occur
many times in a document,
requiring ranking of the search matches to decide which
occurrences to display in the result.',
  to_tsquery('english', 'search & term'),
  'MaxFragments=10, MaxWords=7, MinWords=3, StartSel=<<, StopSel=>>');
                        ts_headline
------------------------------------------------------------
 <<Search>> <<terms>> may occur                            +
 many times ... ranking of the <<search>> matches to decide
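
The marking step of ts_headline can be imitated with a small Python sketch. It wraps every word whose crude stem matches a query term in the chosen StartSel and StopSel strings. The function name toy_headline and the strip-trailing-s stemming are assumptions made for illustration; the sketch does not select fragments or honor MaxWords, MinWords, or MaxFragments.

```python
import re


def toy_headline(document, terms, start_sel="<b>", stop_sel="</b>"):
    """Wrap words matching any query term in start_sel/stop_sel markers."""
    # Crude stand-in for stemming: ignore case and trailing 's' characters.
    stems = {term.lower().rstrip("s") for term in terms}

    def mark(match):
        word = match.group(0)
        if word.lower().rstrip("s") in stems:
            return f"{start_sel}{word}{stop_sel}"
        return word

    return re.sub(r"\w+", mark, document)


print(toy_headline("Search terms may occur many times", ["search", "term"], "<<", ">>"))
```

As in the second example above, both "Search" and "terms" are marked even though the query terms were 'search' and 'term', because matching happens on the normalized form rather than the literal word.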