Automate your Salesforce Data Exports with Python (force-backup-automator)

The Data Export Automation problem

Salesforce provides a simple interface to schedule mass data exports of objects into CSV format on a weekly or monthly basis. These exports are great, but they lack the API interfaces of the rest of the platform that would allow super admins to automate the download of the resulting files.

This page is not available via REST API 😮

The force-backup-automator

Python is a flexible, easy to learn scripting language. It comes with a vast ecosystem of packages, forums and documentation articles to automate most of your daily tasks.

To solve the backup download problem and save you countless hours of manual work, I created a Python package to automate this process: the force-backup-automator.

The components

The package uses 3 main components:

  1. A built-in login module: Using methods built into the package, you can provide a username and password to log in as you regularly would in the interface. This provides great flexibility, as the user running the package does not need API access. Additionally, if you prefer to handle the login process yourself, the package accepts an optional cookie parameter with the sid and oid cookies. The download mechanism can use these cookies to access the Data Export page already authenticated.
  2. A Selenium-powered download mechanism: To support both the Lightning and Classic experiences, the package uses Selenium to run a headless version of your Chrome browser. This allows the package to navigate to the Salesforce Data Export Service page as a regular browser would and identify all the download links for the files that need to be retrieved.
  3. A stream-based download: Some of the files to be downloaded can be pretty large. To avoid loading an entire file into memory, the package uses the popular requests package in stream mode to write to the file system in chunks. The package also takes the download location as an input, of course!
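
As a rough illustration of the third component, the sketch below streams a response to disk in fixed-size chunks, the way requests allows in stream mode. It is not the package's actual code: the function name stream_to_file, the chunk size, and the choice to pass in an already authenticated session are assumptions made for this example.

```python
from pathlib import Path


def stream_to_file(session, url, destination, chunk_size=1024 * 1024):
    """Download `url` to `destination` without holding the whole file in memory.

    `session` is expected to behave like a requests.Session that already
    carries the authentication cookies. This is a hypothetical helper,
    not part of force-backup-automator.
    """
    destination = Path(destination)
    bytes_written = 0
    # stream=True defers downloading the body until we iterate over it.
    with session.get(url, stream=True) as response:
        response.raise_for_status()
        with destination.open("wb") as file_handle:
            for chunk in response.iter_content(chunk_size=chunk_size):
                if chunk:  # skip empty keep-alive chunks
                    file_handle.write(chunk)
                    bytes_written += len(chunk)
    return bytes_written
```

With a real session this might be used by creating a requests.Session(), setting the sid and oid cookies on it with session.cookies.set(...), and then calling stream_to_file(session, download_link, 'backup.zip'). The cookie values, the download link and the file name are placeholders you would supply yourself.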

Combined, these three fundamental components provide a robust package that will get you started with three lines of code:

Import the package:

from force_backup_automator import BackupController

Create an instance of the Controller:

backup_instance = BackupController(
    driver_location='./chromedriver',
    org_link='ORG MAIN URL',
    is_headless=0)

Download your files:

backup_instance.download_backups(
    download_location='TARGET_LOCATION',
    backup_url='ORG_URL/lightning/setup/DataManagementExport/home',
    user_name='USERNAME',
    password='PASSWORD')

It’s that easy!

To get started on your own, download this package from PyPI or the GitHub repo here:

pip install force-backup-automator

Take your automation to the next level

As a companion to this package, you may need operating system scheduling and scripting tools. These tools allow you to run the package on a regular basis. For example, a common setup on Windows is to use the Task Scheduler to run the package every week, as diagrammed below:

On Linux-based operating systems, crontab and similar tools can be used.
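
A scheduled job needs a small entry-point script to call. The sketch below is one way to write it, reusing the BackupController calls from the sample earlier in this article. The environment variable names (SF_ORG_URL, SF_USERNAME, and so on) and the helper names load_settings and run_backup are assumptions made for this example, not something the package defines.

```python
def load_settings(env):
    """Collect the values the backup needs from a mapping such as os.environ.

    The variable names are hypothetical; pick whatever suits your setup.
    """
    required = ["SF_ORG_URL", "SF_USERNAME", "SF_PASSWORD", "SF_BACKUP_DIR"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    org_url = env["SF_ORG_URL"].rstrip("/")
    return {
        "driver_location": env.get("CHROMEDRIVER_PATH", "./chromedriver"),
        "org_link": org_url,
        # Lightning setup path taken from the article's sample call.
        "backup_url": org_url + "/lightning/setup/DataManagementExport/home",
        "download_location": env["SF_BACKUP_DIR"],
        "user_name": env["SF_USERNAME"],
        "password": env["SF_PASSWORD"],
    }


def run_backup(settings):
    """Run one export download using the calls from the article's sample."""
    # Imported here so load_settings can be used without the package installed.
    from force_backup_automator import BackupController

    controller = BackupController(
        driver_location=settings["driver_location"],
        org_link=settings["org_link"],
        is_headless=0,  # same value as the article's sample
    )
    controller.download_backups(
        download_location=settings["download_location"],
        backup_url=settings["backup_url"],
        user_name=settings["user_name"],
        password=settings["password"],
    )
```

Saved as, say, run_backup.py with an added import os and a final line run_backup(load_settings(os.environ)), this could be triggered weekly by the Task Scheduler or by a crontab entry such as 0 2 * * 0 python3 /path/to/run_backup.py (the path and schedule are placeholders). Reading the credentials from the environment keeps the password out of the script itself.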

I hope this package helps the great Salesforce community save some time. Leave a comment for any questions!

About me

Article by Stefan Zepeda
Hands-on Technical Architect and Salesforce enthusiast with experience collecting requirements, transforming them into solutions and implementing them efficiently on any technology platform.

5 thoughts on “Automate your Salesforce Data Exports with Python (force-backup-automator)”

  1. Hi Stefan!
    Thank you for providing this automation solution.
    I downloaded the package from GitHub and installed it.
    I installed force-backup-automator using the ‘pip install force-backup-automator’ command.
    I executed the 3 lines of code you provided, passing the respective parameters, but I am facing the issue below.
    Could you please help me with this? It would be very helpful if you could provide a solution to this problem.

    “Failed to read descriptor from node connection: A device attached to the system is not functioning”

    • Hey Yashu, yes glad it helps.
      A couple of things to check:
      1. On the GitHub page I mention that you need to download the Chrome driver. I will add it to this article to make it clearer:
      driver_location: The path to the Chrome WebDriver. Make sure you download the proper version for your operating system and Chrome binary here:
      https://chromedriver.chromium.org/downloads
      Once you download your driver, make sure you specify the correct path in the driver_location parameter.

      2. What operating system are you using? Make sure you have Chrome installed in the system.

      Let me know if that works!

  2. Hi Stefan,
    thank you for offering your solution for the weekly export. At the moment I am stuck installing the package. I ran pip install force-backup-automator as written in the description, but VS Code still shows a syntax error after installing the package:
    line 1: from force-backup-automator import BackupController

    unexpected token ‘-‘ at line 1
    unexpected token ‘import’ at line 1

    Which data do I need to install from the GitHub package to run the script? Thank you in advance for your help and great work!

    Regards,
    Pierre

    • Hey @Pierre good catch, there was a typo in my sample code. The package needs to be imported with underscores not dashes. I updated the sample as shown below:

      from force_backup_automator import BackupController

      Some pro tips to help you as you go:

      1. Don’t forget you need to install the Chrome Driver for selenium, more instructions are here in the Github Repo:
      https://github.com/stefanzepeda/force-backup-automator

      2. Keep an eye on the parameters specific to your org, like the path to the backups or the login URLs.

      3. You can set is_headless=1 to see what the driver is doing while you are testing this.

  3. Hello Stefan,

    Thanks for making a solution to speed up Salesforce backups a bit. I have a few issues. If I run the script with headless=0, it will try to log in and I will get a verification page. I can’t just verify once either by simply stepping through the code; it happens each time the Chrome driver runs. The other issue is that if I run headless, I end up with an error about trying to do simultaneous downloads.

    [0401/111043.497:INFO:CONSOLE(13)] “Synchronous XMLHttpRequest on the main thread is deprecated because of its detrimental effects to the end user’s experience. For more help, check https://xhr.spec.whatwg.org/.”, source: https://www.salesforce.com/etc/clientlibs/sfdc-aem-master/clientlibs_www_tags.min.49c634c0df8e725801cecc00b8a87f20.js (13)
    Classic Detected
    Reading file links
    Is there an argument we can pass to simply run one download at a time?

    Thanks in advance,

    Ryan
