Migrating Lucene collectors to Search API in Jira 11
Starting with Jira 11, the ability to iterate through search results using the Lucene-specific org.apache.lucene.search.Collector has been removed. The Search API now offers several platform-agnostic alternatives:
Search
Use com.atlassian.jira.bc.issue.search.SearchService to retrieve a list of matching issues, or com.atlassian.jira.search.issue.IssueDocumentSearchService to retrieve a list of matching documents, and process them locally.
When using OpenSearch, Search has a limit on the number of documents that can be returned in a single request. To ensure compatibility across platforms, use it for searches with a limited number of results or in conjunction with pagination. The default limit is 10,000, and it is configurable via the max_result_window setting in the OpenSearch index.
For details, see the OpenSearch documentation on index settings.
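The pagination advice above can be sketched as a loop that requests one bounded page at a time and stops at an overall cap. This is a minimal illustration, not the Jira API: `PagedSearchSketch`, `collectAll`, and the `fetchPage` callback are hypothetical names, and `fetchPage` stands in for whatever call actually runs the search with an offset and a page size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

/** Pages through a result set so that no single request exceeds a fixed size. */
final class PagedSearchSketch {

    /**
     * Collects up to maxResults hits, asking for at most pageSize hits per request.
     *
     * fetchPage is a hypothetical callback (offset, size) -> hits; it is assumed
     * to return fewer hits than requested only when the result set is exhausted.
     */
    static <T> List<T> collectAll(BiFunction<Integer, Integer, List<T>> fetchPage,
                                  int pageSize,
                                  int maxResults) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<T> all = new ArrayList<>();
        int offset = 0;
        while (all.size() < maxResults) {
            // Never ask for more than the remaining budget.
            int wanted = Math.min(pageSize, maxResults - all.size());
            List<T> page = fetchPage.apply(offset, wanted);
            all.addAll(page);
            if (page.size() < wanted) {
                break; // short page: nothing left to fetch
            }
            offset += page.size();
        }
        return all;
    }
}
```

Keeping `maxResults` at or below the configured result window keeps the total bounded on every platform. For result sets that may exceed that window, an aggregation or Search Stream is likely a better fit than offset-based paging.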
Aggregation
Many uses of Lucene Collectors involve iterating over search results to aggregate data.
Starting with Jira 10.6, a new API is available for performing aggregation operations such as count and sum. This API offers high-performance data processing for both Lucene (using a Lucene Collector internally) and OpenSearch (using Aggregations internally).
This page outlines the new aggregation methods and provides examples for migrating from Lucene-specific collectors to the aggregation API.
Aggregation methods
Jira 11 provides metric and bucket aggregations that cover common collector use cases.
Metric aggregations
Metric aggregations calculate values over a set of documents. In Jira 11, the following metric aggregations are available for numeric fields:
- AvgAggregation: Computes the average of numeric values for a field across all matching documents.
- CountAggregation: Computes the number of documents with each value for a field across all matching documents.
- MaxAggregation: Determines the maximum numeric value for a field across all matching documents.
- StatsAggregation: Computes the sum, count, and average of numeric values for a field across all matching documents.
- SumAggregation: Calculates the sum of numeric values for a field across all matching documents.
Bucket aggregations
Bucket aggregations group documents based on specified criteria. Unlike metric aggregations, they can include sub-aggregations. In Jira 11, the following bucket aggregations are available:
- DateHistogramAggregation: Groups documents into buckets based on date intervals.
- FilterAggregation: Applies a filter query to narrow down the documents before creating buckets.
- RangeAggregation: Groups documents into buckets based on numeric ranges.
- TermsAggregation: Groups documents into buckets based on unique terms within a specified field.
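To make the difference between the two families concrete, the sketch below computes a metric (a sum over all documents) and a bucket aggregation with a metric sub-aggregation (a sum per unique term) using plain Java collections. It only illustrates what the aggregations compute. It does not use the Jira aggregation API, and `Doc`, `status`, and `storyPoints` are made-up names for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Conceptual model of metric and bucket aggregations; not the Jira API. */
final class AggregationConceptSketch {

    /** Minimal stand-in for an indexed document (hypothetical fields). */
    static final class Doc {
        final String status;
        final long storyPoints;

        Doc(String status, long storyPoints) {
            this.status = status;
            this.storyPoints = storyPoints;
        }
    }

    /** Metric aggregation: a single value computed over all matching documents. */
    static long sumStoryPoints(List<Doc> docs) {
        return docs.stream().mapToLong(d -> d.storyPoints).sum();
    }

    /**
     * Bucket aggregation with a metric sub-aggregation: documents are grouped
     * by the unique values of one field, then a sum is computed per bucket.
     */
    static Map<String, Long> sumStoryPointsByStatus(List<Doc> docs) {
        return docs.stream().collect(
                Collectors.groupingBy(d -> d.status,
                        TreeMap::new,
                        Collectors.summingLong(d -> d.storyPoints)));
    }
}
```

With a collector, this grouping and summing happens in Jira while iterating over hits. With the aggregation API, the same result is computed by the search platform, so only the per-bucket values are returned.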
We plan to introduce more aggregations in the future. You can suggest new aggregations by filing a suggestion ticket on JAC with the Search - Search API component.
To perform aggregation on a field, ensure that the field is indexed with doc values enabled.
Migrating collectors to Aggregation API
See the following examples for how to migrate collectors.
Aggregation buckets limit
OpenSearch limits the number of aggregation buckets in a single response. The default limit is 65,535, but you can adjust it using the search.max_buckets setting. For more information, visit the OpenSearch configuration guide.
Search Stream
IssueDocumentSearchService#searchStream() returns search hits as a stream. Unlike the Search method, Search Stream:
- has no limit on the number of documents that are returned from a search
- doesn't support sorting. The consumer is responsible for sorting the results; any sort clause provided with the query is dropped.
- retrieves results incrementally when running on OpenSearch, so the entire result set doesn't need to be held in memory within Jira.
This method can return an unlimited amount of data, which might lead to performance or scalability issues when used with large datasets and loose filters. We recommend using other mechanisms where possible to avoid these issues.
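Because Search Stream drops any sort clause, a consumer that needs ordered results has to sort them itself. One way to do that without holding the whole result set in memory is to keep only the first N hits under the desired order while consuming the stream. The following is a minimal sketch under that assumption: `StreamTopNSketch` and `topN` are hypothetical names, and the stream passed in can be any `java.util.stream.Stream`, not specifically the one returned by searchStream().

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.stream.Stream;

/** Consumer-side sorting for an unsorted stream of hits, bounded to N elements. */
final class StreamTopNSketch {

    /**
     * Returns the first n elements of the stream under the given order,
     * keeping at most n + 1 elements in memory at any time.
     */
    static <T> List<T> topN(Stream<T> hits, int n, Comparator<T> order) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        // The heap uses the reversed order, so its head is always the element
        // that sorts last among those kept so far.
        PriorityQueue<T> heap = new PriorityQueue<>(order.reversed());
        // Closing is harmless for in-memory streams and releases resources
        // for streams backed by I/O.
        try (Stream<T> s = hits) {
            s.forEach(hit -> {
                heap.add(hit);
                if (heap.size() > n) {
                    heap.poll(); // evict the element that sorts last
                }
            });
        }
        List<T> result = new ArrayList<>(heap);
        result.sort(order);
        return result;
    }
}
```

For example, `StreamTopNSketch.topN(Stream.of(5, 1, 4, 2, 3), 3, Comparator.naturalOrder())` yields the three smallest values in ascending order. Memory use depends on N rather than on the number of hits, which matters when the filter is loose.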