Elasticsearch to MongoDB Migration - MongoES

Here are some of the situations that developers simply love to hate!
  • The one-last-thing syndrome - This reminds me of the following quote:
  The first 90 percent of the code accounts for the first 90 percent of the development time. The remaining 10 percent of the code accounts for the other 90 percent of the development time.
Tom Cargill, Bell Labs, from the book `Programming Pearls`
  • QAs declaring certain undocumented features to be bugs - Seriously, this creates trauma for a developer.
  • Interruptions during coding - Here's an idea. Try talking to developers while they code; chances are, you have just about <10% of their attention.
There are some problems which we get used to..

But there are others which make us wanna do this..



  • DISCONNECTION FROM THE SERVER DUE TO BAD INTERNET DURING A MIGRATION - Ouch!! That's gotta hurt real bad.

Talking about ES to MongoDB Migration 

- How hard could that be?

Good Side:
JSON objects are common to both.
There are numerous migration tools to choose from.
Bad Side:
The migration can be hideous and can eat up a lot of system resources. Be ready for a system freeze in case the migration tool uses a queue.
Ugly Side:
It can never be resumed from the point of failure. If the connectivity goes down during the migration, the transferred collection has to be deleted and the data transfer has to be started all over again from the beginning.


Alright, there's nothing to feel bad about.

Enter, MongoES.



MongoES is a pure Python 3 migration tool that moves documents from an Elasticsearch index into MongoDB collections.

It's robust in its own native way: no queues or message brokers are involved, which means that there won't be any memory spikes or system freezes.

This is achievable because MongoES uses a tagging strategy prior to the migration. The tagging happens in the source Elasticsearch index and serves as a checkpoint during the migration.

Why a new custom ID tag, when there's an _id already?

Unless IDs are explicitly assigned at index time, the _id fields in Elasticsearch documents are auto-generated alphanumeric strings that merely identify the documents. These _id values are unusable as checkpoints, since range queries/aggregations cannot be run on them in any meaningful way.
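To make the checkpoint idea concrete, here is a minimal sketch of a tag-then-transfer loop. It is not MongoES's actual code: plain Python lists of dicts stand in for the Elasticsearch index and the MongoDB collection, and the names `tag_documents`, `migrate`, `mongoes_id`, `batch_size` and `fail_after` are all hypothetical. The only point is that a sequential tag lets a restarted run work out where the previous one stopped.

```python
# Illustrative sketch only: in-memory stand-ins, not the real MongoES code.
# `mongoes_id` is a hypothetical name for the sequential tag.

def tag_documents(source):
    """Give every untagged source document a sequential integer tag."""
    for position, doc in enumerate(source, start=1):
        doc.setdefault("mongoes_id", position)


def migrate(source, target, batch_size=1000, fail_after=None):
    """Copy tagged documents in batches, resuming after the highest tag in target.

    `fail_after` simulates a dropped connection once that many batches were sent.
    """
    # The checkpoint: the highest tag that already reached the target.
    last_done = max((doc["mongoes_id"] for doc in target), default=0)
    batches_sent = 0
    while True:
        # Range query on the tag: the next `batch_size` documents after the checkpoint.
        batch = [d for d in source
                 if last_done < d["mongoes_id"] <= last_done + batch_size]
        if not batch:
            return
        if fail_after is not None and batches_sent >= fail_after:
            raise ConnectionError("simulated network drop")
        target.extend(dict(d) for d in batch)
        last_done = max(d["mongoes_id"] for d in batch)
        batches_sent += 1


source = [{"title": f"doc {i}"} for i in range(2500)]
target = []
tag_documents(source)
try:
    migrate(source, target, fail_after=1)  # connection drops after the first batch
except ConnectionError:
    pass
migrate(source, target)  # picks up after the last transferred tag
```

In this sketch the second call does not wipe the target and start over; it reads the highest tag already present and continues from there, which is the behaviour the tagging strategy is meant to provide.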

MongoES - How to:
  1. Install all the Prerequisites.
  2. Clone the repository from https://github.com/datawrangl3r/mongoes.git
  3. Edit the mongoes.json file according to your requirements.

  4. Make sure that both the Elasticsearch and MongoDB services are up and running, and fire up the migration by keying in:

  5. Sit back and relax, for we've got you covered! The default batch size is 1000 documents per transfer.
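
The "up and running" check in step 4 can be done from Python before starting a long run. The sketch below only tests whether something is listening on a TCP port. It assumes both services run on localhost with their usual default ports (9200 for Elasticsearch, 27017 for MongoDB); `is_port_open` and `report_services` are hypothetical helpers, not part of MongoES.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Assumed defaults; adjust to match your own setup.
SERVICES = {
    "Elasticsearch": ("localhost", 9200),
    "MongoDB": ("localhost", 27017),
}


def report_services(services=SERVICES):
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port)
            for name, (host, port) in services.items()}
```

Calling `report_services()` returns a dict keyed by service name. An open port only shows that something is listening, not that the service is healthy, so treat it as a quick sanity check.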
Happy Wrangling!!! :)

9 comments:

  1. Thank you so much it helped me a lot.

  2. Hi

    I am migrating a heavy load from ES to MongoDB. This script works well for indices with a small number of documents, but for an index with a large amount of data it keeps sending records to MongoDB while the record count in the database stays the same after some heavy data migration. Please look into it and reply as soon as possible.

    Replies
    1. Hi Prince, thanks for reading. Sorry for the late reply; I was inactive for a while. Have you solved the problem, or are you still facing the issue?

  3. Hi

    I am new to MongoDB, but my first task is to migrate data from Elasticsearch to MongoDB. I don't know how to do it. Can you please explain, step by step, how to migrate the data, or post a small example video?

    Replies
    1. Hi,

      Thanks for reading.

      You can use the package, https://github.com/datawrangl3r/mongoes for this project.

      The instructions are clearly documented there.

      I'd strongly suggest going through the README of the repository and testing it yourself.

      Let me know if you have any questions

  4. Thanks for the blog filled with so much information. Stopping by your blog helped me to get what I was looking for. Now my task has become as easy as ABC.

    Replies
    1. Thanks a lot, @M. Taha
      Glad it helped.
      And please don't hesitate to raise an issue on GitHub if you run into any problems with MongoES (https://github.com/datawrangl3r/mongoes)

  5. Hi,
    I was trying to execute this script and got the error below.
    How can I fix it?

    R:\mongoes-master>python __init__.py
    int() argument must be a string, a bytes-like object or a number, not 'dict'

