One of the most typical scenarios out there is to integrate big amount of data between systems. Lately the integrations with Salesforce are more common and many companies want to have Salesforce like the main input of data and have a "backup" that allows people back on time and make some restore or data analysis.
After make some research and trying to figure out the best way, the only solution that worked for me to integrate Salesforce and SQL Server using Mulesoft was:
- Create a BULK Load Job in Salesforce (operation:query, object:dynamic_SObject,ContentType:CSV)
- Inject the query in the job (couldn't set a limit on the records, instead you need to specify the WHERE clause).
- Handle a process to check and validate the status of the JOB. in this step I'm checking the Batch info list and waiting all files are generated in order to start downloading them.
- Once the files are ready, I starting to retrieve the results from the JOB. Unfortunately text areas might contain break lines and weird characters that makes the CSV file output been corrupted or malformed, so I needed to implement maybe not the fancy function but functional Java code that allows me to remove those break lines and adjust the result.
- Then I'm converting the CSV file (String payload) into a Java object in order to prepare the dataset before to start the Batch processing.
- Starting to process the data in my Batch flow (even I set the batch size to 100000 it seems only process like 100+ records at the time I would like to make fast this insert). inserting the records in BULK mode using the SQL Driver and finally reporting once the process is done.
For this example I exported 150K records from Salesforce, it chunk the data in files of 100000 records each and the total process takes around 30 mins
For the SQL Database I created a free account in Azure and a SQL Database.
If there's a best way to achieve the total records keeping in mind we need to handle million records would be awesome, for now it will take so long if I want to push around 6M.
Hope this helps and eventually I will post a sample code in github.