Search#

The GraphQL query search is used to search for resources. It is designed to cover the requirements of a search within a website, with a focus on full-text search, filters and facets.

Search in an index#

The search is performed using a full-text index. The CMS IES takes care of filling and updating the index. There is a separate index for each publication channel. For translated publication channels, there is a separate index for each language.

For each query via selectResources, the index to be searched must be specified via the input parameter index.

To find resources using a full-text search, the text is specified using the input parameter text. The index is searched for the text and the corresponding hits are returned. The search is performed word by word. If several words (separated by spaces) are entered, an AND search is carried out by default and the hits must contain all of the words. An OR search can be carried out instead. To do this, the input parameter queryDefaultOperator must be set to OR:

query {
  search(input: {
    text: "cacao coffee"
    queryDefaultOperator: OR
  }) {
    ...
  }
}

Example:

query {
  search(input: { text: "chocolate" }) {
    total
    offset
    queryTime
    results {
      id
    }
  }
}

Quotation marks can be used to search for related phrases:

query {
  search(input: {
    text: "cacao \"milk coffee\""
  }) {
    ...
  }
}

In order to force a word to be included in the search results, a + can be placed in front of the word:

query {
  search(input: {
    text: "cacao coffee +milk"
  }) {
    ...
  }
}

To exclude a word from the search results, a - can be placed in front of the word:

query {
  search(input: {
    text: "cacao coffee -milk"
  }) {
    ...
  }
}
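
The operators shown above can be combined in a single text value. The following Python sketch assembles such a value on the client side. build_search_text is a hypothetical helper and not part of the GraphQL API; json.dumps is only used to escape the inner quotation marks when the value is embedded in the query.

```python
import json


def build_search_text(words=(), phrases=(), required=(), excluded=()):
    """Assemble a value for the `text` input parameter (hypothetical helper)."""
    parts = list(words)
    parts += [f'"{phrase}"' for phrase in phrases]  # phrase search
    parts += [f"+{word}" for word in required]      # word must be contained
    parts += [f"-{word}" for word in excluded]      # word must not be contained
    return " ".join(parts)


text = build_search_text(
    words=["cacao"], phrases=["milk coffee"], excluded=["sugar"]
)

# json.dumps yields a double-quoted, escaped literal. For plain text like
# this, the result is also a valid GraphQL string literal.
query = "query { search(input: { text: %s }) { total } }" % json.dumps(text)
print(query)
```

Building the string in one place keeps the escaping of phrase quotation marks out of the calling code.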

The IES (Sitepark's content management system) supports multilingual resource channels. Editorial content is only ever written in one language and is automatically translated into the other languages by the CMS. A multilingual resource channel then contains several resources for an article, each of which is published in a different language. For the search, a separate full text index is created for each language, which also takes into account language-specific features such as stop words and stemming.

If the publication channel is multilingual, the search is limited to a specific language. The language is specified using the input parameter lang. If no lang is specified, the search is carried out in the base language of the channel.

Example:

query {
  search(input: { text: "chocolate", lang: "it" }) {
    total
    offset
    queryTime
    results {
      id
    }
  }
}

The search results are also returned in the respective language, regardless of whether a full-text search or a filter is carried out.

Sorting#

Sort criteria specify how the result is sorted. Multiple sort criteria can be specified, which are applied to the result one after the other: the second criterion only decides the order of hits that are equal under the first, and so on.

If no sorting criterion is specified, the result is sorted by relevance. The score is used here, which is higher the more precisely the hit matches the search.

The following sorting criteria are possible:

| Sort criterion | Description |
| --- | --- |
| name | Sorts by the name of the article. In some cases, the name is preceded by a numerical prefix to achieve the desired sorting in the CMS and is therefore not always identical to the headline. |
| headline | Sorts by the title of the article. |
| date | In many cases, an editorial date can be set for the article, which is used here. Otherwise it is the last modification date of the article. |
| natural | In most cases, a sort field is written to the index, which should describe the natural sorting of the entry. For normal articles, this is usually the heading. For news or events, however, it is the date, for example. This sort field is used in this case. |
| score | The score is determined during the search and describes how closely the individual hits match the search query. This sorting is useful for full-text searches in order to obtain the most accurate results first. Here it is sorted according to relevance. |
| spatialDist | The hits are sorted according to the distance to the reference point. The reference point is specified via an additional parameter spatialPoint. |
| custom | This sort criterion allows you to use your own fields from the search index for sorting. |

For each sort criterion, you must specify whether the sorting should be in ascending (ASC) or descending (DESC) order.

The sorting criteria are specified as a list in the following form:

sort: [ { name: ASC }, { date: DESC }, ... ]
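
The effect of several sort criteria can be illustrated locally. The following Python sketch reproduces the ordering of sort: [{ name: ASC }, { date: DESC }] on a few made-up hits. The sample data is hypothetical, and the actual sorting is done by the search index, not by client code.

```python
from operator import itemgetter

# Hypothetical hits; only the fields used for sorting are shown.
hits = [
    {"name": "b", "date": "2024-05-01"},
    {"name": "a", "date": "2024-01-10"},
    {"name": "a", "date": "2024-03-20"},
]

# Python's sort is stable, so sorting by the last criterion first and by
# the first criterion last gives the same order as a multi-level sort.
ordered = sorted(hits, key=itemgetter("date"), reverse=True)  # date: DESC
ordered = sorted(ordered, key=itemgetter("name"))             # name: ASC

print([(hit["name"], hit["date"]) for hit in ordered])
```

The two hits named "a" are equal under the first criterion, so their order is decided by the date.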

Examples:

query {
  search(input: { text: "chocolate", sort: [{ name: ASC }] }) {
    total
    offset
    queryTime
    results {
      id
    }
  }
}

Sort by distance to the reference point:

query {
  search(
    input: {
      text: "chocolate"
      sort: [
        { spatialDist: ASC, spatialPoint: { lng: 7.6286691, lat: 51.9651620 } }
      ]
    }
  ) {
    total
    offset
    queryTime
    results {
      id
    }
  }
}

When specifying a custom sort criterion, the name of the field to be used for sorting must also be specified. This field must be present in the index.

query {
  search(
    input: {
      text: "chocolate"
      sort: [{ custom: { field: "myfield", direction: ASC } }]
    }
  ) {
    total
    offset
    queryTime
    results {
      id
    }
  }
}

Warning

If the schema is changed, the specified sort field for this sorting may no longer work.

The indexed resources can be marked as "archived". This flag ensures that these resources are not normally included in the search. This can be used for news, for example, to include only the latest news in the general search. For a special search, such as a news archive search, the archive flag can be used to also find archived resources.

query {
  search(
    input: {
      text: "chocolate"
      filter: [{ objectTypes: ["news"] }]
      archive: true
    }
  ) {
    total
    offset
    queryTime
    results {
      id
    }
  }
}

Date ranges#

Date ranges can be used in the search to limit the search to a specific time period. Date range filters are used for this purpose.

Facets are another area of application for date ranges. These can be used to determine how many hits are contained in a specific time period or how the hits are distributed over the different days, for example. Date range facets are used for this purpose.

A date must always be specified in the UTC time zone and in ISO 8601 format (e.g. 2024-05-22T10:13:00Z).

Date ranges can be defined over an absolute period or relatively based on a specific date.

Absolute date range#

An absolute period is defined by two parameters:

  • the start date (from)
  • the end date (to)

Example of an absolute date range:

{
  absoluteDateRange: {
    from: "2024-05-21T22:00:00Z",
    to: "2024-05-22T21:59:59Z"
  }
}

At least the from or to date must be specified. If from is not specified, there is no lower limit. If to is not specified, there is no upper limit.
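
Because the dates must be given in UTC, a local calendar day has to be converted first. The following Python sketch computes from and to for one local day. day_range_utc is a hypothetical helper, and the fixed UTC+2 offset is an assumption chosen so that the result matches the example above.

```python
from datetime import date, datetime, time, timedelta, timezone, tzinfo


def day_range_utc(day: date, tz: tzinfo) -> dict:
    """Return from/to for one local calendar day as UTC ISO 8601 strings."""
    start_local = datetime.combine(day, time.min, tzinfo=tz)   # 00:00:00
    end_local = start_local + timedelta(days=1, seconds=-1)    # 23:59:59

    def to_utc(moment: datetime) -> str:
        return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    return {"from": to_utc(start_local), "to": to_utc(end_local)}


# Fixed UTC+2 offset, assumed here for illustration.
# zoneinfo.ZoneInfo("Europe/Berlin") could be passed instead if the
# time zone database is available on the system.
utc_plus_2 = timezone(timedelta(hours=2))
print(day_range_utc(date(2024, 5, 22), utc_plus_2))
```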

Note

A special case is the use of the absolute date range for facets. Here, a defined period must always be specified for which the facets are to be determined. from and to are mandatory here.

Relative date range#

A relative date range is specified using two intervals that are relative to a specific date (the base):

  • the before interval extends the range into the past, starting from the base date
  • the after interval extends the range into the future, starting from the base date

The interval must be specified in the format ISO-8601 Durations (e.g. P1D for one day).

Optionally, a base can also be specified. This date is used as the basis for calculating the relative date. If no base is specified, the current date is used.

Relative date ranges can only be exact to the day. Specifying a time such as "P1DT1H" will result in an error.

The period defined via the before and after intervals is exact to the day. Its start and end are therefore always rounded. See Round Date.

The following examples illustrate the relative date ranges:

Only everything from yesterday:

{
  relativeDateRange: {
    before: "P1D"
    roundStart : START_OF_DAY
    roundEnd: END_OF_PREVIOUS_DAY # default end-date is 'now'
  }
}

Yesterday, today and tomorrow:

{
  relativeDateRange: {
    before: "P1D"
    after: "P1D"
  }
}

Everything from the last 7 days and today:

{
  relativeDateRange: {
    before: "P7D"
    roundStart : START_OF_DAY
    roundEnd: END_OF_DAY # default end-date is 'now'
  }
}

Everything this month, past and future:

{
  relativeDateRange: {
    roundStart : START_OF_MONTH
    roundEnd: END_OF_MONTH # default end-date is 'now'
  }
}

All in the last month:

{
  relativeDateRange: {
    before: "P1M"
    roundStart : START_OF_MONTH
    roundEnd: END_OF_PREVIOUS_MONTH # default end-date is 'now'
  }
}

All in the seven days before Christmas Eve 2024 (Timezone Europe/Berlin):

{
  relativeDateRange: {
    base: "2024-12-23T23:00:00Z"
    before: "P7D"
    roundEnd: END_OF_PREVIOUS_DAY
  }
}

Round date#

The smallest unit in which ranges can be defined is a day. Each date is therefore rounded (in the current time zone). How the date is rounded can be defined using the roundStart and roundEnd parameters. The possible values are:

| Name | Description |
| --- | --- |
| START_OF_DAY | The start of the day. The time is set to 00:00:00. |
| START_OF_PREVIOUS_DAY | The start of the previous day. The time is set to 00:00:00 of the previous day. |
| END_OF_DAY | The end of the day. The time is set to 23:59:59. |
| END_OF_PREVIOUS_DAY | The end of the previous day. The time is set to 23:59:59 of the previous day. |
| START_OF_MONTH | The start of the month. The date is set to the first day of the month and the time is set to 00:00:00. |
| START_OF_PREVIOUS_MONTH | The start of the previous month. The date is set to the first day of the previous month and the time is set to 00:00:00. |
| END_OF_MONTH | The end of the month. The date is set to the last day of the month and the time is set to 23:59:59. |
| END_OF_PREVIOUS_MONTH | The end of the previous month. The date is set to the last day of the previous month and the time is set to 23:59:59. |
| START_OF_YEAR | The start of the year. The date is set to the first day of the year and the time is set to 00:00:00. |
| START_OF_PREVIOUS_YEAR | The start of the previous year. The date is set to the first day of the previous year and the time is set to 00:00:00. |
| END_OF_YEAR | The end of the year. The date is set to the last day of the year and the time is set to 23:59:59. |
| END_OF_PREVIOUS_YEAR | The end of the previous year. The date is set to the last day of the previous year and the time is set to 23:59:59. |

If no rounding parameter is specified, START_OF_DAY is used for roundStart and END_OF_DAY for roundEnd.
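
How before, after and the rounding parameters interact can be sketched in a few lines of Python. This is a reading of the examples in the previous section, not the server implementation. round_date and resolve_relative_range are hypothetical names, only day-based intervals and a subset of the rounding values are covered, and naive datetimes stand in for the time zone used for evaluation.

```python
from datetime import datetime, timedelta


def round_date(moment: datetime, mode: str) -> datetime:
    """Apply one of the rounding values to a date (subset only)."""
    start_of_day = moment.replace(hour=0, minute=0, second=0, microsecond=0)
    start_of_month = start_of_day.replace(day=1)
    if mode == "START_OF_DAY":
        return start_of_day
    if mode == "END_OF_DAY":
        return start_of_day + timedelta(days=1, seconds=-1)
    if mode == "END_OF_PREVIOUS_DAY":
        return start_of_day - timedelta(seconds=1)
    if mode == "START_OF_MONTH":
        return start_of_month
    if mode == "END_OF_PREVIOUS_MONTH":
        return start_of_month - timedelta(seconds=1)
    if mode == "END_OF_MONTH":
        # 32 days after the 1st is always in the next month.
        next_month = (start_of_month + timedelta(days=32)).replace(day=1)
        return next_month - timedelta(seconds=1)
    raise ValueError(f"rounding value not covered by this sketch: {mode}")


def resolve_relative_range(base, before_days=0, after_days=0,
                           round_start="START_OF_DAY", round_end="END_OF_DAY"):
    """Shift the base by the intervals, then round both ends."""
    start = round_date(base - timedelta(days=before_days), round_start)
    end = round_date(base + timedelta(days=after_days), round_end)
    return start, end


# Seven days before Christmas Eve 2024 (base given as local midnight).
start, end = resolve_relative_range(
    datetime(2024, 12, 24), before_days=7, round_end="END_OF_PREVIOUS_DAY"
)
print(start.isoformat(), end.isoformat())
```

Under these assumptions, the range runs from the start of 17 December to the end of 23 December.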

See also Timezone

Timezone#

By default, all date calculations are evaluated relative to the server time zone. The timeZone parameter overrides this behavior: all date-related additions and rounding are then performed relative to the specified time zone.

This is relevant for Date ranges.

Example:

query {
  search(input: { timeZone: "Europe/London", ... }) {
    ...
  }
}

Note

The time zone only affects the range of dates. The date specifications transferred in the GraphQL query and the date specifications returned in the results remain UTC.

A Spatial Search searches data based on spatial or geographical relationships rather than traditional text or number-based criteria. It takes into account coordinates and distances to find relevant results in a specific area or radius. This technique is often used in applications such as maps to filter and sort information according to its position in space.

To enable searches based on geodata, the corresponding fields must be available in the index. This requires the individual resources to be provided with geo coordinates via the CMS.

The GraphQL search supports the following features:

Example:

Query:

query search($geoPoint: InputGeoPoint!) {
  search(
    input: {
      distanceReferencePoint: $geoPoint
      sort: { spatialDist: ASC, spatialPoint: $geoPoint }
      filter: [
        {
          key: "geofilter"
          spatialOrbital: {
            distance: 20.0
            centerPoint: $geoPoint
            mode: BOUNDING_BOX
          }
        }
      ]
      facets: [
        {
          key: "geo"
          excludeFilter: ["geofilter"]
          spatialDistanceRange: { point: $geoPoint, from: 0, to: 10 }
        }
      ]
    }
  ) {
    total
    results {
      objectType
      geo {
        distance
        primary {
          lat
          lng
        }
      }
    }
    facetGroups {
      key
      facets {
        key
        hits
      }
    }
    queryTime
  }
}

Variable:

{
  "geoPoint": {
    "lng": 7.6286691,
    "lat": 51.965162
  }
}
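
A query with variables is typically sent as one JSON document. The following Python sketch builds such a request body for a shortened version of the query above. It assumes the common GraphQL-over-HTTP convention with the keys query and variables; the endpoint URL and the HTTP client are left out because they are not specified here. build_request_body is a hypothetical helper.

```python
import json

# Shortened version of the query above; $geoPoint is passed as a variable.
QUERY = """
query search($geoPoint: InputGeoPoint!) {
  search(
    input: {
      distanceReferencePoint: $geoPoint
      sort: { spatialDist: ASC, spatialPoint: $geoPoint }
    }
  ) {
    total
    results {
      geo {
        distance
      }
    }
  }
}
"""


def build_request_body(lng: float, lat: float) -> str:
    """Serialize query and variables into one JSON request body."""
    payload = {
        "query": QUERY,
        "variables": {"geoPoint": {"lng": lng, "lat": lat}},
    }
    return json.dumps(payload)


body = build_request_body(lng=7.6286691, lat=51.965162)
print(json.loads(body)["variables"])
```

Passing the reference point as a variable avoids repeating the coordinates in every place where the query uses them.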

Boosting#

Boosting makes it possible to increase the relevance of certain documents in the search results. This can be achieved by customizing query parameters, such as adding boosting factors to specific fields or applying custom functions. In this way, search results can be specifically influenced to place more relevant results at the top.

The following parameters can be used to influence the result:

| Name | Description |
| --- | --- |
| queryFields | This parameter specifies the fields to be searched and their relative importance. It is a list of fields, optionally with boost factors that indicate how heavily each field should be weighted when matching search terms. For example, qf=title^2.0 description means that the title field is twice as important as the description field. |
| phraseFields | This parameter increases the importance of whole phrases (word sequences) in the specified fields. It is used to increase the relevance of documents in which the search terms appear as phrases in these fields. For example, pf=title^1.5 content increases the relevance of documents in which the search terms appear as a phrase in the title field more than in the content field. |
| boostQueries | This parameter allows additional query clauses that increase the relevance score of documents that match these clauses. These clauses do not affect whether a document matches the main query, but increase the score of documents that match them. For example, contenttype:(text/html*)^10 would increase the relevance score of HTML documents. |
| boostFunctions | This parameter applies function-based boosts to the relevance score. These are mathematical functions that adjust the score based on field values or other criteria. For example, if(termfreq(sp_objecttype,'news'),scale(sp_date,0,12),scale(sp_date,10,11)) could be used to score older news articles less highly. |
| tie | Tie-breaker multiplier. This parameter controls how the scores from multiple fields are combined. The tie value determines how much the lower-scoring fields contribute to the overall score: a higher value gives them more influence. For example, tie=0.1 gives the secondary fields some influence on the score, so that the best-matching field does not dominate completely. |

Setting the boosting parameters requires in-depth knowledge of how the search index works and its schema. If no boosting is specified, the default values of Sitepark are used, which have already proven themselves in many projects.

Example:

query {
  search(input: {
     ...
     boosting: {
      queryFields: [
        "sp_title^1.4",
        "keywords^1.2",
        "description^1.0",
        "title^1.0",
        "url^0.9",
        "content^0.8"
      ]
      phraseFields: [
        "sp_title^1.5",
        "description^1",
        "content^0.8"
      ]
      boostQueries: [
        "sp_objecttype:searchTip^100",
        "contenttype:(text/html*)^10"
      ]
      boostFunctions: [
        "if(termfreq(sp_objecttype,'news'),scale(sp_date,0,12),scale(sp_date,10,11))"
      ]
      tie: 0.1
    }
  }) {
    ...
  }
}
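
The field^factor notation used in queryFields and phraseFields can be generated from a mapping of field names to weights. The following Python sketch shows this. format_boosted_fields is a hypothetical helper, and the weights are taken from the example above.

```python
def format_boosted_fields(weights: dict) -> list:
    """Turn {"field": weight} into ["field^weight", ...] (hypothetical helper)."""
    return [f"{field}^{weight}" for field, weight in weights.items()]


query_fields = format_boosted_fields(
    {"sp_title": 1.4, "keywords": 1.2, "content": 0.8}
)
print(query_fields)
```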

Warning

The boosting parameters are matched to the fields of the index. If changes are made to the index schema, the boosting parameters must be adjusted accordingly.

Search results#

The search results can be output using the SearchResult type. In this case, results returns a list of Resource objects. This can be used to query further data. See also:

Additional input parameters are available for extended search functionalities, which can be used with the search. These are described on the following pages.

Explain#

The explain parameter can be used to output detailed information about the search. This can be used to analyze the search and to identify possible sources of error.

It helps to understand why certain hits are returned and why they are returned in the corresponding order.

query {
  search(input: {
    text: "section"
    explain: true
  }) {
    results {
      name
      explain {
        score
        type
        description
        details {
          score
          type
          description
          details {
            ...
          }
        }
      }
    }
  }
}

The details field can be nested as many levels deep as the analysis requires.

A result can look like this, for example:

{
  "data": {
    "search": {
      "total": 26,
      "results": [
        {
          "name": "Section with Tabs",
          "explain": {
            "score": 24.873642,
            "type": "sum",
            "details": [
              {
                "score": 3.9798312,
                "type": "max",
                "description": "max plus 0.1 times others of:",
                "details": [
                  {
                    "score": 1.4506807,
                    "type": "weight",
                    "field": "sp_title",
                    "details": [
                      {
                        "score": 1.4506807,
                        "type": "score",
                        "description": "score(freq=1.0), computed as boost * idf * tf from:",
                        "details": [
                          {
                            "description": "boost"
                          },
                          {
                            "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:"
                          },
                          {
                            "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:"
                          }
                        ]
                      }
                    ]
                  },
                  {
                    "score": 1.0478733,
                    "type": "weight",
                    "field": "title",
                    "details": [
                      {
                        "score": 1.0478733,
                        "type": "score",
                        "description": "score(freq=1.0), computed as boost * idf * tf from:",
                        "details": [
                          {
                            "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:"
                          },
                          {
                            "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:"
                          }
                        ]
                      }
                    ]
                  },
                  {
                    "score": 3.7299757,
                    "type": "weight",
                    "field": "content",
                    "details": [
                      {
                        "score": 3.7299757,
                        "type": "score",
                        "description": "score(freq=3.0), computed as boost * idf * tf from:",
                        "details": [
                          {
                            "description": "boost"
                          },
                          {
                            "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:"
                          },
                          {
                            "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:"
                          }
                        ]
                      }
                    ]
                  }
                ]
              },
              {
                "score": 10,
                "type": "boosting",
                "description": "contenttype:text/html*^10.0",
                "details": []
              },
              {
                "score": 10.893811,
                "type": "function",
                "description": "FunctionQuery(if(termfreq(sp_objecttype,news),scale(date(sp_date),0.0,12.0),scale(date(sp_date),10.0,11.0))), product of:",
                "details": [
                  {
                    "score": 10.893811,
                    "type": "boosting",
                    "field": null,
                    "details": []
                  },
                  {
                    "score": 1,
                    "type": "boost",
                    "field": null,
                    "details": []
                  }
                ]
              }
            ]
          }
        },
        ...
      ]
    }
  }
}
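
The numbers in this output can be checked by hand. The following Python sketch recomputes the score of the first hit from the values shown above: the "max plus 0.1 times others" node combines the three field scores, and the top-level "sum" node adds the boost query and boost function scores. combine_field_scores is a hypothetical name, and the factor 0.1 is read from the description in the output.

```python
def combine_field_scores(field_scores, tie):
    """Best field score plus `tie` times the sum of the remaining scores."""
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)


# Values copied from the explain output above.
field_scores = [1.4506807, 1.0478733, 3.7299757]  # sp_title, title, content
text_score = combine_field_scores(field_scores, tie=0.1)

boost_query_score = 10            # contenttype:text/html*^10.0
boost_function_score = 10.893811  # FunctionQuery(...)

total = text_score + boost_query_score + boost_function_score
print(text_score, total)
```

Up to rounding in the last digit, the results agree with the reported scores 3.9798312 and 24.873642.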

TODO#

  • Spell Checking