Feb 21st, ‘25 / 4 min read

How to Use OpenSearch with Python for Search and Analytics

Learn how to set up, index data, run queries, and secure OpenSearch with Python for efficient search and analytics.

If you're working with search and analytics, you’ve probably heard about OpenSearch—the open-source alternative to Elasticsearch. OpenSearch is a powerful tool, whether you're building a search engine, running log analytics, or implementing full-text search in your applications. And the best part? You can integrate it easily with Python.

This guide will walk you through everything you need to know to get started with OpenSearch using Python, from installation to advanced querying and performance tuning.

What is OpenSearch?

OpenSearch is an open-source search and analytics suite, originally forked from Elasticsearch 7.10. It provides a scalable, distributed search engine with built-in security, observability, and machine-learning features. OpenSearch is widely used for log analytics, full-text search, and business intelligence applications.

Key Features of OpenSearch:

  • Full-Text Search: Powerful search capabilities with tokenization, stemming, and ranking algorithms.
  • Scalability: Distributed architecture that spreads large datasets across multiple nodes and shards.
  • Observability: Integrated tools for monitoring and analyzing logs and metrics.
  • Security: Authentication, access controls, and encryption.
  • Machine Learning: Supports anomaly detection and predictive analytics.
💡
For a deeper dive into Amazon OpenSearch Service and its features, check out this guide on Amazon OpenSearch Service.

Why Use OpenSearch with Python?

Python has become the go-to language for developers working with search technologies, thanks to its rich ecosystem of libraries and ease of use.

Here’s why you might want to integrate OpenSearch with Python:

  • Simple API access – The OpenSearch Python client makes it easy to interact with the search engine.
  • Data analysis capabilities – Python’s data processing libraries (like Pandas and NumPy) complement OpenSearch’s querying power.
  • Automation – Automate indexing, searching, and monitoring tasks using Python scripts.
  • Integration with Machine Learning – Use OpenSearch with machine learning libraries such as TensorFlow and Scikit-learn.
💡
To enhance observability in your Python applications, explore this guide on OpenTelemetry Python Instrumentation.

Step-by-Step Process to Set Up OpenSearch with Python

1. Install and Run OpenSearch Using Docker

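Before connecting OpenSearch to Python, you need a running OpenSearch node. You can start one with Docker:

docker pull opensearchproject/opensearch:latest
docker run -d -p 9200:9200 -e "discovery.type=single-node" -e "OPENSEARCH_INITIAL_ADMIN_PASSWORD=YOUR_ADMIN_PASSWORD" opensearchproject/opensearch:latest

This starts OpenSearch in single-node mode, which is convenient for local testing and development. OpenSearch 2.12 and later will not start the demo security configuration unless OPENSEARCH_INITIAL_ADMIN_PASSWORD is set, so replace YOUR_ADMIN_PASSWORD with a strong password of your own; older images ship with the default admin/admin account. The image enables the security plugin by default, so the node listens on HTTPS with a self-signed demo certificate.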

2. Install the OpenSearch Python Client for Hassle-free Interaction

To interact with OpenSearch from Python, install the OpenSearch client:

pip install opensearch-py

3. Establish a Connection to OpenSearch from Python

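Now, let’s set up a connection to OpenSearch. The Docker image serves HTTPS with a self-signed demo certificate, so the local client enables SSL and skips certificate verification (acceptable for local development only):

from opensearchpy import OpenSearch

client = OpenSearch(
    hosts=[{'host': 'localhost', 'port': 9200}],
    http_auth=('admin', 'admin'),  # on OpenSearch 2.12+, use the admin password set when starting the container
    use_ssl=True,          # the security plugin serves HTTPS by default
    verify_certs=False,    # the demo certificate is self-signed; local development only
    ssl_show_warn=False    # silence the warning about unverified certificates
)

print(client.info())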

If everything is set up correctly, this will return basic cluster information.
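
A freshly started container can take a little while before it accepts requests, so the first call may fail even though nothing is misconfigured. Here is a minimal sketch of a readiness check, assuming only that the client exposes ping(), which opensearch-py does. The function name, retry count, and delay are illustrative choices, not part of the library.

```python
import time


def wait_for_cluster(client, attempts=10, delay_seconds=3.0, sleep=time.sleep):
    """Return True once client.ping() succeeds, or False after all attempts fail."""
    for attempt in range(1, attempts + 1):
        try:
            if client.ping():
                return True
        except Exception:
            # Connection errors are expected while the node is still booting.
            pass
        if attempt < attempts:
            sleep(delay_seconds)
    return False
```

You would call wait_for_cluster(client) once before client.info() and stop the script if it returns False.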

💡
Understanding common errors can improve your OpenSearch integration—check out this guide on Types of Errors in Python.

How to Index and Search Data in OpenSearch

Before you can search, you need to index some data.

1. Setting Up an Index with Mappings and Settings

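index_name = "products"

index_body = {
    "settings": {
        "index": {
            "number_of_shards": 1,
            # On a single-node cluster a replica cannot be allocated,
            # so 1 leaves the index in yellow health; use 0 for local testing.
            "number_of_replicas": 1
        }
    },
    "mappings": {
        "properties": {
            "name": {"type": "text"},       # analyzed for full-text search
            "price": {"type": "float"},
            "in_stock": {"type": "boolean"}
        }
    }
}

# Create the index only if it does not exist yet, so the script can be re-run
if not client.indices.exists(index=index_name):
    client.indices.create(index=index_name, body=index_body)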

2. Adding Documents to the OpenSearch Index

document = {
    "name": "Wireless Keyboard",
    "price": 39.99,
    "in_stock": True
}

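# refresh=True makes the document searchable right away; drop it for high-volume indexing
client.index(index=index_name, body=document, refresh=True)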
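
Because no ID is passed, OpenSearch generates one, and re-running the script adds a second copy of the same product. One way around that is to derive the ID from the document itself. The sketch below assumes the product name is unique, and the helper names are made up for illustration; id is a standard argument of client.index.

```python
import hashlib


def product_doc_id(document):
    """Derive a stable ID from the product name (hypothetical convention)."""
    normalized = document["name"].strip().lower()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def index_product(client, index_name, document):
    """Index a product under a deterministic ID so re-runs overwrite, not duplicate."""
    return client.index(
        index=index_name,
        id=product_doc_id(document),
        body=document,
    )
```

Calling index_product(client, "products", document) twice leaves a single document in the index, with its version incremented.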

Searching Data in OpenSearch

Once data is indexed, you can run queries to retrieve it.

1. Running a Basic Search Query

query = {
    "query": {
        "match": {"name": "keyboard"}
    }
}

response = client.search(index=index_name, body=query)
print(response)
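
Printing the whole response is fine for a first look, but most code only needs the matching documents. The sketch below pulls them out of the standard response layout (hits.hits, with _id, _score, and _source on each hit). The sample response is hand-written for illustration, not captured from a real cluster.

```python
def extract_hits(response):
    """Flatten a search response into (id, score, source) tuples."""
    return [
        (hit["_id"], hit["_score"], hit["_source"])
        for hit in response["hits"]["hits"]
    ]


def total_hits(response):
    """Return the total number of matches reported by the cluster."""
    total = response["hits"]["total"]
    # Current versions return {"value": N, "relation": "eq" or "gte"}.
    return total["value"] if isinstance(total, dict) else total


# Illustrative response shape, trimmed to the fields used above.
sample_response = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "hits": [
            {
                "_id": "abc123",
                "_score": 0.29,
                "_source": {"name": "Wireless Keyboard", "price": 39.99, "in_stock": True},
            }
        ],
    }
}

for doc_id, score, source in extract_hits(sample_response):
    print(doc_id, score, source["name"])
```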

2. Applying Filters to Refine Search Results

query = {
    "query": {
        "bool": {
            "must": [
                {"match": {"name": "keyboard"}}
            ],
            "filter": [
                {"range": {"price": {"gte": 30, "lte": 50}}}
            ]
        }
    }
}

response = client.search(index=index_name, body=query)
print(response)
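
If you build this kind of query in several places, a small helper keeps the structure consistent. This is a sketch only: build_product_query and its parameters are hypothetical names, and the field names come from the products mapping used earlier. Clauses under filter do not affect the relevance score and can be cached by OpenSearch.

```python
def build_product_query(text, min_price=None, max_price=None, in_stock_only=False, size=10):
    """Build a bool query: full-text match on name plus optional filters."""
    filters = []

    price_range = {}
    if min_price is not None:
        price_range["gte"] = min_price
    if max_price is not None:
        price_range["lte"] = max_price
    if price_range:
        filters.append({"range": {"price": price_range}})

    if in_stock_only:
        # term is an exact match, which suits boolean fields.
        filters.append({"term": {"in_stock": True}})

    return {
        "size": size,
        "query": {
            "bool": {
                "must": [{"match": {"name": text}}],
                "filter": filters,
            }
        },
    }
```

The result is passed as the body, for example client.search(index="products", body=build_product_query("keyboard", 30, 50)).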
💡
For better logging in your OpenSearch applications, check out this guide on Python Logging with Structlog.

Performance Optimization Tips

If you're working with large-scale data, performance optimization is key. Here are some best practices:

  • Use bulk indexing – Instead of indexing documents one by one, use the bulk API to send batches of documents.
  • Optimize queries – Avoid wildcard queries and excessive aggregations.
  • Shard wisely – Too many or too few shards can impact performance. Monitor and adjust based on your workload.
  • Cache results – Use OpenSearch’s built-in caching mechanisms for frequently queried data.
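
To illustrate the first tip, here is a rough sketch of feeding the bulk helper from a generator so the full dataset never has to sit in memory. The function names are made up for this example; opensearchpy.helpers.bulk and its chunk_size argument are part of the client library, and 500 is only a starting point to tune.

```python
def iter_bulk_actions(documents, index_name):
    """Lazily turn plain dicts into bulk actions, one at a time."""
    for document in documents:
        yield {"_index": index_name, "_source": document}


def index_in_bulk(client, documents, index_name, chunk_size=500):
    """Send documents in batches of chunk_size via the bulk helper."""
    # Imported here so the generator above stays usable without the client library.
    from opensearchpy.helpers import bulk

    return bulk(client, iter_bulk_actions(documents, index_name), chunk_size=chunk_size)
```

documents can be any iterable, such as rows streamed from a CSV file or a database cursor.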

How to Handle Security and Authentication

OpenSearch includes built-in security features such as TLS encryption and authentication.

1. Enabling Secure Connections with TLS and Authentication

To secure communication between your Python client and OpenSearch, use HTTPS and enable certificate verification:

from opensearchpy import OpenSearch

client = OpenSearch(
    hosts=[{'host': 'localhost', 'port': 9200}],
    http_auth=('admin', 'admin'),  # Replace with secure credentials
    use_ssl=True,
    verify_certs=True
)
  • use_ssl=True ensures that SSL/TLS is enabled.
  • verify_certs=True enforces certificate validation to prevent man-in-the-middle attacks.

For production deployments, use certificates issued by a trusted certificate authority instead of self-signed ones, and restrict network access to the cluster with firewall rules.

2. Using API Keys or IAM for Secure Authentication on AWS

If you're running OpenSearch on AWS, prefer IAM-based authentication, or API keys where your deployment offers them, over basic authentication.

Using IAM Authentication (AWS SigV4 Signing)

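You can authenticate your Python client with AWS IAM credentials using the requests-aws4auth library. AWS4Auth is a requests-style auth object, so the client must be told to use RequestsHttpConnection:

from opensearchpy import OpenSearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth
import boto3

region = 'us-east-1'  # Change to your OpenSearch region
service = 'es'        # 'es' for managed domains; OpenSearch Serverless uses 'aoss'

# Resolve credentials from the default AWS credential chain
credentials = boto3.Session().get_credentials()
aws_auth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

client = OpenSearch(
    hosts=[{'host': 'your-opensearch-domain.us-east-1.es.amazonaws.com', 'port': 443}],
    http_auth=aws_auth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection  # required for requests-style auth objects
)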
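  • boto3.Session().get_credentials() resolves credentials from the default AWS credential chain (environment variables, shared config files, or an attached IAM role). The session token is only set for temporary credentials.
  • AWS SigV4 signing attaches a signature to every request so that AWS can authenticate the caller and check its IAM permissions.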
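
Hard-coding the region and service next to the hostname makes it easy for them to drift apart. The sketch below reads both from the endpoint instead. It assumes the usual endpoint shape of name.region.service.amazonaws.com and is not an official AWS helper. Other partitions (for example amazonaws.com.cn) and custom domains are deliberately rejected.

```python
def sigv4_params(host):
    """Guess (region, service) from a standard AWS OpenSearch endpoint hostname."""
    parts = host.lower().split(".")
    # Expected shape: <name>.<region>.<es|aoss>.amazonaws.com
    if len(parts) < 5 or parts[-2:] != ["amazonaws", "com"]:
        raise ValueError(f"not a recognised AWS OpenSearch endpoint: {host}")
    region, service = parts[-4], parts[-3]
    if service not in ("es", "aoss"):
        raise ValueError(f"unexpected service segment {service!r} in {host}")
    return region, service


region, service = sigv4_params("your-opensearch-domain.us-east-1.es.amazonaws.com")
print(region, service)
```

The two values can then be passed to AWS4Auth in place of the hard-coded region and service.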

Using API Key Authentication

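API keys are not a standard OpenSearch authentication method the way they are in Elasticsearch, so first check that your provider or gateway supports them. If API key authentication is enabled and the key is accepted as a basic-auth credential, you can use it instead of IAM or a user password: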

client = OpenSearch(
    hosts=[{'host': 'your-opensearch-domain', 'port': 443}],
    http_auth=('apikey', 'your-api-key'),
    use_ssl=True,
    verify_certs=True
)
  • API keys should be stored securely (e.g., using environment variables or a secrets manager).
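
As a minimal sketch of the environment-variable approach, the helper below assembles the client arguments from the environment so no secret appears in source code. The variable names (OPENSEARCH_HOST, OPENSEARCH_PORT, OPENSEARCH_USER, OPENSEARCH_PASSWORD, OPENSEARCH_CA_CERTS) are assumptions made for this example. The keyword arguments themselves (hosts, http_auth, use_ssl, verify_certs, ca_certs) are ones the OpenSearch client accepts.

```python
import os


def client_kwargs_from_env(environ=os.environ):
    """Build OpenSearch(...) keyword arguments from environment variables."""
    kwargs = {
        "hosts": [{
            "host": environ.get("OPENSEARCH_HOST", "localhost"),
            "port": int(environ.get("OPENSEARCH_PORT", "9200")),
        }],
        # Raises KeyError if a credential is missing, rather than falling back to a default.
        "http_auth": (environ["OPENSEARCH_USER"], environ["OPENSEARCH_PASSWORD"]),
        "use_ssl": True,
        "verify_certs": True,
    }
    ca_certs = environ.get("OPENSEARCH_CA_CERTS")
    if ca_certs:
        # Path to a CA bundle, needed when the cluster uses a private CA.
        kwargs["ca_certs"] = ca_certs
    return kwargs
```

The client is then created with OpenSearch(**client_kwargs_from_env()).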
💡
Ensure reliable logging in your OpenSearch integration with this guide on Python Logging Best Practices.

Sample OpenSearch Programs Using Python

This section provides sample programs demonstrating how to interact with OpenSearch using Python clients.

1. Bulk Indexing Multiple Documents Efficiently

from opensearchpy.helpers import bulk

docs = [
    {"_index": "products", "_source": {"name": "Mouse", "price": 25.99, "in_stock": True}},
    {"_index": "products", "_source": {"name": "Monitor", "price": 199.99, "in_stock": True}},
]

bulk(client, docs)
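
By default the bulk helper raises an exception when a batch contains failed documents. Passing raise_on_error=False makes it return a (success_count, errors) tuple instead, which you can inspect. The sketch below condenses that error list; the function name is hypothetical, and the item shape (operation type mapped to _id, status, and error) is assumed from the standard bulk response format.

```python
def summarize_bulk_errors(errors):
    """Reduce bulk error items to (operation, doc_id, status, reason) tuples."""
    summary = []
    for item in errors:
        # Each item is keyed by the operation type, e.g. {"index": {...}}.
        for operation, info in item.items():
            error = info.get("error")
            reason = error.get("reason") if isinstance(error, dict) else error
            summary.append((operation, info.get("_id"), info.get("status"), reason))
    return summary


# Typical use with the helper (requires a live client):
#   success_count, errors = bulk(client, docs, raise_on_error=False)
#   for operation, doc_id, status, reason in summarize_bulk_errors(errors):
#       print(f"{operation} of {doc_id} failed with {status}: {reason}")
```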

2. Running Aggregation Queries for Insights

query = {
    "size": 0,  # return only the aggregation result, not the matching documents
    "aggs": {
        "average_price": {"avg": {"field": "price"}}
    }
}
response = client.search(index="products", body=query)
print(response)
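
The number you are after sits under aggregations, keyed by the name chosen in the query (average_price here). Here is a small sketch of reading it; the sample response is hand-written for illustration. An avg aggregation reports a null value when no documents match, which arrives in Python as None.

```python
def average_price(response):
    """Return the avg aggregation rounded to 2 decimals, or None if no documents matched."""
    value = response["aggregations"]["average_price"]["value"]
    return None if value is None else round(value, 2)


# Illustrative response shape, trimmed to the aggregation section.
sample_response = {"aggregations": {"average_price": {"value": 88.65666666666667}}}
print(average_price(sample_response))
```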

Wrapping Up

OpenSearch is a powerful tool for search and analytics, and Python makes it easy to work with. You should now be comfortable setting up OpenSearch with Python, indexing and searching data, and optimizing performance.

💡
Got any questions or use cases you’d like to explore? Our Discord community is open! We have a dedicated channel where you can connect with other developers and explore your specific use case.

Authors
Preeti Dewani

Technical Product Manager at Last9
