
Elasticsearch cardinality aggregation with text fields

I’m trying to compute the average number of requests per session. I store a session_id with each document when indexing, and I want to count the distinct sessions and derive the average from that count. While checking the mapping, I found that session_id is a text field:

"session_id": {
    "type": "text",
    "fields": {
      "keyword": {
        "type": "keyword",
        "ignore_above": 256
      }
    }
},

For fetching the data I call:

$this->elasticsearch->search([
    'index' => 'nits_media_bid_won',
    'body' => [
        'query' => $query,
        'aggs' => [
            'total_session' => [
                'cardinality' => [
                    'field' => 'session_id',
                    'precision_threshold' => 100,
                ]
            ]
        ]
    ]
]);

But I get an error stating:

{
   "error":{
      "root_cause":[
         {
            "type":"illegal_argument_exception",
            "reason":"Fielddata is disabled on text fields by default. Set fielddata=true on [session_id] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
         }
      ],
      "type":"search_phase_execution_exception",
      "reason":"all shards failed",
      "phase":"query",
      "grouped":true,
      "failed_shards":[
         {
            "shard":0,
            "index":"nits_media_bid_won",
            "node":"q438L5GRSqaHJz1_vRtZXg",
            "reason":{
               "type":"illegal_argument_exception",
               "reason":"Fielddata is disabled on text fields by default. Set fielddata=true on [session_id] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
            }
         }
      ],
      "caused_by":{
         "type":"illegal_argument_exception",
         "reason":"Fielddata is disabled on text fields by default. Set fielddata=true on [session_id] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.",
         "caused_by":{
            "type":"illegal_argument_exception",
            "reason":"Fielddata is disabled on text fields by default. Set fielddata=true on [session_id] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
         }
      }
   },
   "status":400
}

If I point the aggregation at the keyword sub-field instead, as the error suggests:

'total_session' => [
    'cardinality' => [
        'field' => 'session_id.fields.keyword',
        'precision_threshold' => 100,
    ]
]

the aggregation returns a value of 0:

"aggregations": {
  "total_session": {
    "value": 0
  }
}


Answer

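While I didn’t fully check your query, it appears that the name of the keyword field is off. According to the supplied mapping, the keyword sub-field of session_id is session_id.keyword, not session_id.fields.keyword: "fields" is only the mapping parameter that declares sub-fields and is not part of the field path. Because no field named session_id.fields.keyword exists, the cardinality aggregation finds no values and returns 0.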
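As a rough sketch of what the corrected request could look like, the snippet below builds the search body with the aggregation pointed at session_id.keyword. The helper names buildSessionSearchBody() and averageRequestsPerSession() are hypothetical and not part of the Elasticsearch PHP client. The average calculation also assumes that every document in the index represents one request, which the question implies but does not state.

```php
<?php
// Hypothetical helper: builds the search body from the question, with the
// cardinality aggregation targeting the keyword sub-field.
function buildSessionSearchBody(array $query): array
{
    return [
        'query' => $query,
        'aggs' => [
            'total_session' => [
                'cardinality' => [
                    // Sub-fields are addressed as <field>.<sub-field>;
                    // "fields" is a mapping parameter, not part of the path.
                    'field' => 'session_id.keyword',
                    'precision_threshold' => 100,
                ],
            ],
        ],
    ];
}

// Hypothetical helper: average requests per session from a search response.
// Assumption: one document equals one request.
function averageRequestsPerSession(array $response): float
{
    $total = $response['hits']['total'];
    // Depending on the Elasticsearch version, hits.total is either an
    // integer or an object with a "value" key.
    $requests = is_array($total) ? $total['value'] : $total;
    $sessions = $response['aggregations']['total_session']['value'];

    // Avoid dividing by zero when no sessions matched.
    return $sessions > 0 ? $requests / $sessions : 0.0;
}
```

The body would then be passed to the existing call, for example $this->elasticsearch->search(['index' => 'nits_media_bid_won', 'body' => buildSessionSearchBody($query)]). On versions where hits.total is an object, the reported total may be a lower bound unless track_total_hits is enabled, so the average is only as accurate as that count.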

User contributions licensed under: CC BY-SA