======
Search
======
    
A common task for web applications is to search some data in the database with
user input. In a simple case, this could be filtering a list of objects by a
category. A more complex use case might require searching with weighting,
categorization, highlighting, multiple languages, and so on. This document
explains some of the possible use cases and the tools you can use.

We'll refer to the same models used in :doc:`/topics/db/queries`.
    
Use Cases
=========

Standard textual queries
------------------------
    
Text-based fields have a selection of matching operations. For example, you may
wish to allow looking up an author like so::

    >>> Author.objects.filter(name__contains='Terry')
    [<Author: Terry Gilliam>, <Author: Terry Jones>]
    
This is a very fragile solution as it requires the user to know an exact
substring of the author's name. A better approach could be a case-insensitive
match (:lookup:`icontains`), but this is only marginally better.
    
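To see why both lookups are fragile, it can help to model them outside the
database. The following is a minimal pure-Python sketch, not Django's
implementation: the helper names ``contains`` and ``icontains`` are
hypothetical stand-ins for the substring comparisons the lookups perform, and
the author names are the sample data from the example above.

```python
# Hypothetical stand-ins for the ``contains`` and ``icontains`` lookups.
# Real lookups run as SQL in the database; this only mimics the matching rule.
AUTHORS = ["Terry Gilliam", "Terry Jones"]


def contains(names, term):
    """Case-sensitive substring match."""
    return [name for name in names if term in name]


def icontains(names, term):
    """Case-insensitive substring match."""
    return [name for name in names if term.casefold() in name.casefold()]


exact = contains(AUTHORS, "Terry")      # both authors match
lowercase = contains(AUTHORS, "terry")  # wrong case: nothing matches
relaxed = icontains(AUTHORS, "terry")   # case no longer matters
typo = icontains(AUTHORS, "Terri")      # a one-letter slip still finds nothing
```

Ignoring case helps with capitalization, but any misspelling or alternative
spelling of the name still returns no results.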
    
A database's more advanced comparison functions
-----------------------------------------------

If you're using PostgreSQL, Django provides :doc:`a selection of
database-specific tools </ref/contrib/postgres/search>` to allow you to
leverage more complex querying options. Other databases have different
selections of tools, possibly via plugins or user-defined functions. Django
doesn't include any support for them at this time. We'll use some examples from
PostgreSQL to demonstrate the kind of functionality databases may have.
    
.. admonition:: Searching in other databases

    All of the searching tools provided by :mod:`django.contrib.postgres` are
    constructed entirely on public APIs such as :doc:`custom lookups
    </ref/models/lookups>` and :doc:`database functions
    </ref/models/database-functions>`. Depending on your database, you should
    be able to construct queries to allow similar APIs. If there are specific
    things which cannot be achieved this way, please open a ticket.
    
In the above example, we determined that a case-insensitive lookup would be
more useful. When dealing with non-English names, a further improvement is to
use :lookup:`unaccented comparison <unaccent>`::

    >>> Author.objects.filter(name__unaccent__icontains='Helen')
    [<Author: Helen Mirren>, <Author: Helena Bonham Carter>, <Author: Hélène Joy>]
    
This shows another issue, where we are matching against a different spelling of
the name. In this case we have an asymmetry, though: a search for ``Helen``
will pick up ``Helena`` or ``Hélène``, but not the reverse. Another option
would be to use a :lookup:`trigram_similar` comparison, which compares
sequences of letters.

For example::

    >>> Author.objects.filter(name__unaccent__lower__trigram_similar='Hélène')
    [<Author: Helen Mirren>, <Author: Hélène Joy>]
    
Now we have a different problem: "Helena Bonham Carter" doesn't show up because
the name is much longer. Trigram searches consider all sequences of three
consecutive characters and compare how many appear in both the search and
source strings. The longer name contains many more trigrams that don't appear
in the search string, so it is no longer considered a close match.
    
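The effect of name length can be reproduced with a rough pure-Python sketch. It
is only loosely modelled on how PostgreSQL's ``pg_trgm`` extension is
documented to behave (lower-casing, padding each word, comparing sets of
trigrams). The function names, the padding rule, and the ``0.3`` threshold are
assumptions for illustration, and real results depend on your database
configuration.

```python
import re
import unicodedata

THRESHOLD = 0.3  # assumed cut-off for a "similar" match


def unaccent(text):
    """Strip diacritics, so that "Hélène" becomes "Helene"."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def trigrams(text):
    """Return the set of three-character sequences in each padded word."""
    result = set()
    for word in re.findall(r"\w+", unaccent(text).lower()):
        padded = f"  {word} "  # assumed padding: two spaces before, one after
        result.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return result


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams (Jaccard index)."""
    first, second = trigrams(a), trigrams(b)
    if not first or not second:
        return 0.0
    return len(first & second) / len(first | second)


names = ["Helen Mirren", "Helena Bonham Carter", "Hélène Joy"]
scores = {name: similarity(name, "Hélène") for name in names}
matches = [name for name, score in scores.items() if score > THRESHOLD]
```

Under these assumptions, "Helena Bonham Carter" scores lowest: its three words
contribute many trigrams that are absent from the search term, which enlarges
the denominator while the number of shared trigrams stays the same.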
    
The correct choice of comparison functions here depends on your particular data
set, for example the language(s) used and the type of text being searched. All
of the examples we've seen are on short strings where the user is likely to
enter something close (by varying definitions) to the source data.
    
Document-based search
---------------------

Standard database operations stop being a useful approach when you start
considering large blocks of text. Whereas the examples above can be thought of
as operations on a string of characters, full text search looks at the actual
words. Depending on the system used, it's likely to use some of the following
ideas:
    
- Ignoring "stop words" such as "a", "the", "and".
- Stemming words, so that "pony" and "ponies" are considered similar.
- Weighting words based on different criteria such as how frequently they
  appear in the text, or the importance of the fields, such as the title or
  keywords, that they appear in.
    
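The three ideas in the list above can be sketched in a few lines of pure
Python. This is a toy illustration rather than how any particular search engine
works: the stop-word list, the naive ``stem()`` rule, and the field weights are
all assumptions chosen to keep the example short.

```python
import re
from collections import Counter

STOP_WORDS = {"a", "an", "and", "in", "of", "on", "the"}  # tiny sample list
FIELD_WEIGHTS = {"title": 2.0, "body": 1.0}  # assumed: title matters more


def stem(word):
    """Very naive stemmer: "ponies" -> "pony", "recipes" -> "recipe"."""
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"
    if word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def terms(text):
    """Lower-case, split into words, drop stop words, then stem."""
    words = re.findall(r"[a-z]+", text.lower())
    return [stem(word) for word in words if word not in STOP_WORDS]


def score(document, query):
    """Sum weighted counts of each query term across the document's fields."""
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        counts = Counter(terms(document.get(field, "")))
        total += weight * sum(counts[term] for term in terms(query))
    return total


document = {"title": "Feeding ponies", "body": "The pony and the ponies eat hay."}
pony_score = score(document, "pony")
```

Here a search for "pony" also counts the occurrences of "ponies", and the match
in the title contributes twice as much as each match in the body. Real engines
use language-aware stemmers and more refined ranking functions.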
    
There are many alternatives for search software; some of the most prominent
are Elastic_ and Solr_. These are full document-based search solutions. To use
them with data from Django models, you'll need a layer which translates your
data into a textual document, including back-references to the database ids.
When a search using the engine returns a certain document, you can then look it
up in the database. There are a variety of third-party libraries which are
designed to help with this process.
    
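The shape of such a translation layer can be sketched without any search
engine at all. In the following pure-Python illustration, the ``ENTRIES``
dictionary is a hypothetical stand-in for a database table (its field names
mirror the ``Entry`` model), and a small in-memory inverted index plays the
role of the external engine. None of the function names come from a real
library.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for rows in a database table, keyed by primary key.
ENTRIES = {
    1: {"headline": "Cheese on Toast recipes", "body_text": "Grill the cheese."},
    2: {"headline": "Pizza recipes", "body_text": "Top with cheese and basil."},
    3: {"headline": "Gardening tips", "body_text": "Water the basil daily."},
}


def to_document(pk, row):
    """Flatten a row into a text document that keeps a back-reference."""
    return {"id": pk, "text": f"{row['headline']} {row['body_text']}"}


def build_index(documents):
    """Map each lower-cased word to the ids of the documents containing it."""
    index = defaultdict(set)
    for document in documents:
        for word in re.findall(r"\w+", document["text"].lower()):
            index[word].add(document["id"])
    return index


def search(index, word):
    """Return matching ids; the caller then loads those rows from the database."""
    return sorted(index.get(word.lower(), set()))


index = build_index(to_document(pk, row) for pk, row in ENTRIES.items())
ids = search(index, "cheese")
headlines = [ENTRIES[pk]["headline"] for pk in ids]
```

In a real project the index would live in the search engine, and the final
step would typically be a database query such as
``Entry.objects.filter(pk__in=ids)`` rather than a dictionary lookup.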
    
.. _Elastic: https://www.elastic.co/
.. _Solr: https://solr.apache.org/
    
PostgreSQL support
~~~~~~~~~~~~~~~~~~

PostgreSQL has its own full text search implementation built in. While not as
powerful as some other search engines, it has the advantage of being inside
your database and so can easily be combined with other relational queries such
as categorization.
    
The :mod:`django.contrib.postgres` module provides some helpers to make these
queries. For example, a query might select all the blog entries which mention
"cheese"::

    >>> Entry.objects.filter(body_text__search='cheese')
    [<Entry: Cheese on Toast recipes>, <Entry: Pizza recipes>]
    
You can also filter on a combination of fields and on related models::

    >>> from django.contrib.postgres.search import SearchVector
    >>> Entry.objects.annotate(
    ...     search=SearchVector('blog__tagline', 'body_text'),
    ... ).filter(search='cheese')
    [
        <Entry: Cheese on Toast recipes>,
        <Entry: Pizza recipes>,
        <Entry: Dairy farming in Argentina>,
    ]
    
See the ``contrib.postgres`` :doc:`/ref/contrib/postgres/search` document for
complete details.