Migrating Lucene collectors to Search API in Jira 11

Starting with Jira 11, the ability to iterate through search results using the Lucene-specific org.apache.lucene.search.Collector has been removed. The Search API now offers several platform-agnostic alternatives:

Search

Use com.atlassian.jira.bc.issue.search.SearchService to retrieve a list of matching issues, or com.atlassian.jira.search.issue.IssueDocumentSearchService to retrieve a list of matching documents, and process them locally.

When using OpenSearch, Search has a limit on the number of documents that can be returned in a single request. To ensure compatibility across platforms, use it for searches with a limited number of results or in conjunction with pagination. The default limit is 10,000, and it is configurable via the max_result_window setting in the OpenSearch index.
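The pagination pattern mentioned above can be sketched in plain Java. This is only an illustration under stated assumptions: PageFetcher is a hypothetical stand-in for whatever paged search call you use (it is not a Jira interface), and maxWindow models the OpenSearch rule that the start offset plus the page size must not exceed the result window.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Sketch: page through results so that no request reaches past a window limit. */
final class PagedSearchSketch {

    /** Hypothetical stand-in for a paged search call; not part of the Jira API. */
    interface PageFetcher<T> {
        List<T> fetch(int start, int pageSize);
    }

    /**
     * Fetches pages until a short page signals the end of the results, or until
     * the next page would cross maxWindow. Returns the number of items processed.
     */
    static <T> int forEachPaged(PageFetcher<T> fetcher, int pageSize, int maxWindow, Consumer<T> consumer) {
        int processed = 0;
        for (int start = 0; start + pageSize <= maxWindow; start += pageSize) {
            List<T> page = fetcher.fetch(start, pageSize);
            page.forEach(consumer);
            processed += page.size();
            if (page.size() < pageSize) {
                break; // a short page means there are no more results
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        // Fake backend holding 25 issue keys, used only for this demonstration.
        List<String> backend = new ArrayList<>();
        for (int i = 1; i <= 25; i++) {
            backend.add("TEST-" + i);
        }
        PageFetcher<String> fetcher = (start, size) ->
                backend.subList(Math.min(start, backend.size()), Math.min(start + size, backend.size()));

        int processed = forEachPaged(fetcher, 10, 10_000, key -> { /* process one issue key */ });
        System.out.println("Processed " + processed + " results");
    }
}
```

Note that the loop stops at the window boundary rather than failing, so results beyond the window are silently skipped in this sketch. If a query can match more documents than the window allows, aggregation or Search Stream is likely a better fit.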

For details on this limit, see the OpenSearch documentation for the index.max_result_window index setting.

Aggregation

Many uses of Lucene Collectors involve iterating over search results to aggregate data.

Starting with Jira 10.6, a new API is available for performing aggregation operations such as count and sum. This API offers high-performance data processing for both Lucene (using a Lucene Collector internally) and OpenSearch (using Aggregations internally).

This page outlines the new aggregation methods and provides examples for migrating from Lucene-specific collectors to the aggregation API.

Aggregation methods

Jira 11 contains metric and bucket aggregation types to replace common use cases for collectors.

Metric aggregations

Metric aggregations calculate values over a set of documents. In Jira 11, the following metric aggregations are available for numeric fields:

  • AvgAggregation: Computes the average of numeric values for a field across all matching documents.

  • CountAggregation: Computes the number of documents with each value for a field across all matching documents.

  • MaxAggregation: Determines the maximum numeric value for a field across all matching documents.

  • StatsAggregation: Computes the sum, count, and average of numeric values for a field across all matching documents.

  • SumAggregation: Calculates the sum of numeric values for a field across all matching documents.
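To make the meaning of these metrics concrete, the following plain-Java sketch computes the same kinds of values over an in-memory list. It does not call the Jira API: the class and method names are hypothetical, and in Jira the computation is performed by the search index rather than in your code.

```java
import java.util.DoubleSummaryStatistics;
import java.util.List;

/** Sketch: what average, max, sum, and count mean for one numeric field. */
final class MetricSemanticsSketch {

    /** Summarizes the values of a numeric field across all matching documents. */
    static DoubleSummaryStatistics summarize(List<Double> fieldValues) {
        return fieldValues.stream()
                .mapToDouble(Double::doubleValue)
                .summaryStatistics();
    }

    public static void main(String[] args) {
        // Hypothetical values of a number custom field on four matching issues.
        DoubleSummaryStatistics stats = summarize(List.of(3.0, 5.0, 8.0, 4.0));

        System.out.println("average = " + stats.getAverage()); // the kind of value AvgAggregation reports
        System.out.println("max     = " + stats.getMax());     // the kind of value MaxAggregation reports
        System.out.println("sum     = " + stats.getSum());     // the kind of value SumAggregation reports
        System.out.println("count   = " + stats.getCount());   // number of values that were summarized
    }
}
```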

Bucket aggregations

Bucket aggregations group documents based on specified criteria and can include sub-aggregations, unlike metric aggregations. In Jira 11, the following bucket aggregations are available:

  • DateHistogramAggregation: Groups documents into buckets based on date intervals.

  • FilterAggregation: Applies a filter query to narrow down the documents before creating buckets.

  • RangeAggregation: Groups documents into buckets based on numeric ranges.

  • TermsAggregation: Groups documents into buckets based on unique terms within a specified field.
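As a rough illustration of how bucketing differs between these types, the plain-Java sketch below groups in-memory values by term, by numeric range, and by calendar month. It only models the concept: bucketCounts and pointsRange are hypothetical helpers, not Jira API methods, and the sample data is invented.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Sketch: bucketing documents by a key and counting the documents per bucket. */
final class BucketSemanticsSketch {

    /** Assigns each document to the bucket chosen by bucketKey and counts per bucket. */
    static <T, K> Map<K, Long> bucketCounts(List<T> docs, Function<T, K> bucketKey) {
        return docs.stream()
                .collect(Collectors.groupingBy(bucketKey, TreeMap::new, Collectors.counting()));
    }

    /** Hypothetical fixed numeric ranges, as a range-style aggregation would define them. */
    static String pointsRange(int points) {
        if (points < 5) {
            return "0-4";
        }
        return points < 10 ? "5-9" : "10+";
    }

    public static void main(String[] args) {
        // Terms-style: one bucket per distinct project ID.
        List<Long> projectIds = List.of(10000L, 10001L, 10000L, 10002L, 10000L);
        System.out.println(bucketCounts(projectIds, id -> id));

        // Range-style: one bucket per numeric range.
        List<Integer> storyPoints = List.of(1, 3, 8, 13, 5);
        System.out.println(bucketCounts(storyPoints, BucketSemanticsSketch::pointsRange));

        // Date-histogram-style: one bucket per calendar month.
        List<LocalDate> created = List.of(
                LocalDate.of(2025, 1, 3), LocalDate.of(2025, 1, 20), LocalDate.of(2025, 2, 1));
        System.out.println(bucketCounts(created, YearMonth::from));
    }
}
```

A sub-aggregation corresponds to replacing the per-bucket count with another computation, such as an average per bucket.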

We plan to introduce more aggregations in the future. To suggest a new aggregation, file a suggestion ticket on jira.atlassian.com (JAC) with the Search - Search API component.

To perform aggregation on a field, ensure that the field is indexed with doc values enabled.

Migrating collectors to Aggregation API

The following examples show how to migrate common collectors.

Example: Compute average value of a number custom field

Legacy Lucene:

public class AverageValueCollector extends SimpleCollector {

    private final String customFieldId;

    private double sum = 0.0;
    private long count = 0;
    private NumericDocValues customFieldValues;

    public AverageValueCollector(final String customFieldId) {
        this.customFieldId = customFieldId;
    }

    @Override
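    public void collect(final int docId) throws IOException {
        // getNumericDocValues returns null for segments where the field has no doc values
        if (customFieldValues != null && customFieldValues.advanceExact(docId)) {
            sum += customFieldValues.longValue(); // Assuming the custom field indexes a long value
            count++;
        }
    }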

    @Override
    protected void doSetNextReader(final LeafReaderContext context) throws IOException {
        customFieldValues = context.reader().getNumericDocValues(customFieldId);
    }

    @Override
    public boolean needsScores() {
        return false;
    }

    public double getResult() {
        return count > 0 ? sum / count : 0.0;
    }
}

// ...

final var collector = new AverageValueCollector("customfield_10000");

searchProvider.search(SearchQuery.create(query, searcher), collector);

collector.getResult();

Aggregation API:

// Create an average aggregation on the "customfield_10000"
final var aggregation = new AvgAggregation("customfield_10000");

// Execute the search
final var searchResponse = searchService.search(DocumentSearchRequest.builder()
        .jqlQuery(query)
        .searcher(searcher)
        // define aggregation with name "avg_cf"
        .aggregation("avg_cf", aggregation)
        .build(), new PagerFilter<>(0));

// Read the results
final var avgResult = searchResponse.getAggregations().getAvg("avg_cf");

avgResult.getValue();


Example: Count issues by project

Legacy Lucene:

public class IssuePerProjectCollector extends SimpleCollector {
  
    private final Map<Long, Long> projectToIssueCount = new HashMap<>();
    private SortedDocValues docIdToProjectIdValues;

    @Override
    public void collect(final int docId) throws IOException {
        // getSortedDocValues returns null for segments where the field has no doc values
        if (docIdToProjectIdValues != null && docIdToProjectIdValues.advanceExact(docId)) {
            final long projectId = Long.parseLong(docIdToProjectIdValues.binaryValue().utf8ToString());
            projectToIssueCount.merge(projectId, 1L, Long::sum);
        }
    }

    @Override
    protected void doSetNextReader(final LeafReaderContext context) throws IOException {
        docIdToProjectIdValues = context.reader().getSortedDocValues(DocumentConstants.PROJECT_ID);
    }

    @Override
    public boolean needsScores() {
        return false;
    }

    public Map<Long, Long> getResult() {
        return projectToIssueCount;
    }
}

// ...

final var collector = new IssuePerProjectCollector();

searchProvider.search(SearchQuery.create(query, searcher), collector);

collector.getResult();

Aggregation API:

// Create a terms aggregation on the PROJECT_ID field
final var aggregation = TermsAggregation.builder()
        .withField(DocumentConstants.PROJECT_ID)
        //.withSubAggregation(...) - you can define a sub-aggregation too
        .build();

// Execute the search
final var searchResponse = searchService.search(DocumentSearchRequest.builder()
        .jqlQuery(query)
        .searcher(searcher)
        // define aggregation with name "count_issues_by_project"
        .aggregation("count_issues_by_project", aggregation)
        .build(), new PagerFilter<>(0));

// Read the results
final var termsResult = searchResponse.getAggregations().getTerms("count_issues_by_project");

termsResult.getBuckets().forEach(bucket -> {
    final var projectId = Long.parseLong(bucket.getKey());
    final var issueCount = bucket.getDocCount();

    // Handle the results ...
});


Aggregation buckets limit

OpenSearch limits the number of aggregation buckets in a single response. The default limit is 65,535, but you can adjust it using the search.max_buckets setting. For more information, visit the OpenSearch configuration guide.

Search Stream

The IssueDocumentSearchService#searchStream() method returns search hits as a stream. Unlike the Search method, Search Stream:

  • has no limit on the number of documents returned from a search
  • doesn't support sorting. The consumer is responsible for sorting the results, and any sort clause provided with the query is dropped.
  • retrieves results incrementally when running on OpenSearch, so the entire result set doesn't need to be held in memory within Jira.

This method can return an unbounded amount of data, which might lead to performance or scalability issues when used with large datasets and loose filters. Where possible, we recommend using other mechanisms, such as aggregation or paginated Search, to avoid these issues.
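Because sorting is left to the consumer, one way to keep memory bounded is to retain only the top N hits while the stream is consumed, instead of collecting and sorting everything. The sketch below shows that pattern in plain Java. The topN helper is hypothetical, and the stream of Long values merely stands in for the hits that searchStream() would return.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.stream.Stream;

/** Sketch: keep the n largest elements of an unsorted stream with bounded memory. */
final class StreamTopNSketch {

    /** Returns the n largest elements in descending order, holding at most n + 1 at a time. */
    static <T> List<T> topN(Stream<T> hits, int n, Comparator<T> order) {
        // Min-heap: the smallest retained element sits at the head and is evicted first.
        PriorityQueue<T> heap = new PriorityQueue<>(order);
        hits.forEach(hit -> {
            heap.offer(hit);
            if (heap.size() > n) {
                heap.poll(); // drop the smallest so that only n elements remain
            }
        });
        List<T> result = new ArrayList<>(heap);
        result.sort(order.reversed());
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for a stream of search hits, here represented by issue IDs.
        Stream<Long> issueIds = Stream.of(5L, 42L, 7L, 19L, 3L);
        System.out.println(topN(issueIds, 3, Comparator.naturalOrder()));
    }
}
```

This trades a full sort for a single pass over the stream, which matters when the result set is large.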



Last modified on Jun 4, 2025
