tl;dr

Salesforce Data Loader is a tool for bulk import and export of data into applications running on the Salesforce platform. Salesforce says it’s Windows and macOS only. Do not believe the vendor. You can run its command line tools on Linux without trouble.

Update 2020-02-28: This article covers only the command line utilities delivered with the Salesforce Data Loader. After writing it, I was asked by other Salesforce users whether I got the UI elements to work under Linux as well. I started investigating and wrote a second blog article about using the Salesforce Data Loader GUI on Linux.

Update 2020-02-28: I added a section about using wine to completely avoid any dependency on a Windows system.

Salesforce Integration

When integrating with the Salesforce platform, companies often choose a batch data integration approach for its simplicity. They voluntarily buy into all the pitfalls that go with it. They do not consider more maintainable and durable approaches like real-time integration using messaging and APIs. We will not question that here.

Our customers often used a complex ETL tool like Talend Open Studio for Data Integration for the purpose of Salesforce Integration. While a graphical ETL tool brings the benefits of

  1. a quick start without having to dive too deep into Salesforce’s API and
  2. the flexibility to transform data as needed,

for some use cases using the official Salesforce Data Loader is just good enough.

Supported operating systems

The official documentation for installing Salesforce Data Loader mentions only macOS and Windows as supported platforms. As it turns out, at least for command line usage there is no deeper reason for this except enterprise stupidity.

Download and install the Windows package, or let somebody else install it. You surely have a friendly colleague with a higher pain tolerance, still stuck on Windows. They will help you out, won’t they?

You will see that the installation directory essentially contains an uber-jar plus some scripts, helper exe files and samples.

$ tree DataLoader
DataLoader
├── bin
│   ├── dataloader-38.0.0-java-home.exe
│   ├── dataloader-38.0.1-java-home.exe
│   ├── encrypt.bat
│   └── process.bat
├── dataloader-38.0.1.exe
├── dataloader-38.0.1.l4j.ini
├── dataloader-38.0.1-uber.jar
├── dataloader.ico
├── licenses
│   ├── ApacheCommonsBeanutils_license.txt
│   ├── ...
│   └── wsc-license.rtf
├── samples
│   ├── conf
│   │   └── ...
│   ├── data
│   │   └── ...
│   └── status
│       └── ...
└── Uninstaller.exe

6 directories, 38 files

The tool is developed purely in Java, which means it is perfectly portable between operating systems. As far as we know no native library dependencies exist. The only part we need from the installation is dataloader-38.0.1-uber.jar.
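
If you want to check the native-library claim for your own version, you can list the entries of the jar and look for files with typical native-library extensions. The following is only a rough sketch: the function name filter_native_entries and the list of extensions are our assumptions, not anything shipped with Data Loader.

```shell
# Hypothetical helper: reads archive entry names on stdin and prints
# those that look like native libraries (the extension list is an assumption).
filter_native_entries() {
  grep -E '\.(so|dll|dylib|jnilib)$' || true
}

# Usage, assuming unzip is installed (-Z1 lists entry names only):
#   unzip -Z1 dataloader-38.0.1-uber.jar | filter_native_entries
```

A hit need not be a problem for command line use, as long as the code paths you run never load the library in question.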

Using wine if no Windows installation is available

You don’t have a friendly Windows-using colleague? Keep calm! You can use wine to install and extract the vendor package.

To do so, install wine32 and wine64 for your Linux distribution. For Ubuntu this is done by calling:

$ apt-get install wine64 wine64-tools wine32

Depending on your distribution and setup, you may need to install other packages as well. You can now execute the installer by running:

$ wine ApexDataLoader.exe

Agree to the license agreement:

Salesforce Data Loader Installation Wizard: Confirm License agreement

Select the packages to be installed:

Salesforce Data Loader Installation Wizard: Select packages

It doesn’t matter too much what you select here. Just make sure the Command line tools are ticked.

After clicking through the wizard, you will see a message complaining about a missing Java installation. You can safely ignore it. We do not want to run Salesforce Data Loader in wine. We just want to reap the installation results. Salesforce Data Loader will have been installed into ~/.wine/drive_c/Program Files/salesforce.com/Data Loader. The folder structure will look as described above.
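
Since the jar is the only part we need, you can pick it out of the wine prefix and copy it wherever you like. Here is a minimal sketch, assuming the installation path mentioned above. The function name find_uber_jar and the target directory in the usage comment are made up for illustration. Note the quoting, because the path contains spaces.

```shell
# Hypothetical helper: print the path of the first uber-jar found in a
# Data Loader installation directory, or return 1 if there is none.
find_uber_jar() {
  local install_dir="$1" candidate
  for candidate in "${install_dir}"/dataloader-*-uber.jar; do
    if [ -f "${candidate}" ]; then
      printf '%s\n' "${candidate}"
      return 0
    fi
  done
  return 1
}

# Usage (the target directory ~/dataloader is an assumption):
#   jar="$(find_uber_jar "${HOME}/.wine/drive_c/Program Files/salesforce.com/Data Loader")"
#   mkdir -p ~/dataloader && cp "${jar}" ~/dataloader/
```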

How to use Salesforce Data Loader on Linux

We successfully used the tool by rewriting the examples from the Windows batch script documentation with a direct call to Java on Linux.

E.g. the example for creating the encryption key file

$ encrypt.bat -k [path to key file]

becomes

java -cp dataloader-38.0.1-uber.jar com.salesforce.dataloader.security.EncryptionUtil -k [path to key file]

Or the example for importing the data

process.bat "<file path to process-conf.xml>" <process name>

becomes

java -cp dataloader-38.0.1-uber.jar -Dsalesforce.config.dir="<file path to process-conf.xml>" com.salesforce.dataloader.process.ProcessRunner process.name="<process name>"

Otherwise, the Windows Documentation for the Salesforce Data Loader Command-Line applies as usual. You just

  • prepare your key,
  • prepare your .sdl field mapping files and
  • prepare your process-conf.xml

before running the ProcessRunner.
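
Before the first run, a quick check that the prepared files are where you think they are can save you from digging through a Java stack trace. Here is a small sketch, assuming you pass the configuration directory and the key file path as arguments. The function check_dataloader_inputs is our own invention, not part of Data Loader. It does not check the .sdl mapping files, since their location is defined inside your process-conf.xml.

```shell
# Hypothetical helper, not part of Data Loader: verify that the prepared
# inputs exist before starting the JVM. Returns non-zero if anything is missing.
check_dataloader_inputs() {
  local config_dir="$1" key_file="$2" missing=0

  if [ ! -f "${config_dir}/process-conf.xml" ]; then
    echo "missing: ${config_dir}/process-conf.xml" >&2
    missing=1
  fi
  if [ ! -f "${key_file}" ]; then
    echo "missing key file: ${key_file}" >&2
    missing=1
  fi
  return "${missing}"
}

# Usage (both paths are placeholders):
#   check_dataloader_inputs ./conf ./conf/dataloader.key && echo "ready to run"
```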

Optional: Create shell scripts

Instead of calling the functions using the verbose command lines above, you can replace the bat scripts with shell scripts. Modern Linux distributions have their own mechanism for choosing Java environments, so we don’t have to care about JAVA_HOME.
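
Not caring about JAVA_HOME still assumes that some java executable is on the PATH (on Debian and Ubuntu, for example, update-alternatives manages which one). If you want your scripts to fail with a readable message instead of "command not found", a guard like the following may help. It is only a sketch, and the function name require_java is made up.

```shell
# Hypothetical helper: fail early with a readable message if no java
# executable can be found on the PATH.
require_java() {
  if ! command -v java >/dev/null 2>&1; then
    echo "java not found on PATH; install a Java runtime first" >&2
    return 1
  fi
}

# Usage at the top of a wrapper script:
#   require_java || exit 1
```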

Here is an example to get you started for encrypt.bat:

#!/bin/bash
#
# encrypt.sh 
# replaces encrypt.bat

# routine safety measures
set -o nounset
set -o errexit
set -o pipefail

java -cp dataloader-38.0.1-uber.jar com.salesforce.dataloader.security.EncryptionUtil "$@"

… and here for process.bat:

#!/bin/bash
#
# process.sh
# replaces process.bat

# routine safety measures
set -o nounset
set -o errexit
set -o pipefail

if [ -z "${1:-}" ]; then
  cat <<-EOF
Usage: process <configuration directory> [process name]

       <configuration directory>    directory that contains configuration files,
                                    i.e. config.properties, process-conf.xml, database-conf.xml

       <process name>               optional name of a batch process bean in process-conf.xml,
                                    for example:

                                       process ../myconfigdir AccountInsert

                                    If process name is not specified, the parameter values from config.properties
                                    will be used to run the process instead of process-conf.xml,
                                    for example:

                                       process ../myconfigdir
EOF
  exit 1
fi

config_dir="${1}"

# Pass process.name only if a process name was given, so that no empty
# argument reaches the ProcessRunner.
java -cp dataloader-38.0.1-uber.jar -Dsalesforce.config.dir="${config_dir}" com.salesforce.dataloader.process.ProcessRunner ${2:+"process.name=${2}"}

With these scripts, you can just call encrypt.sh instead of encrypt.bat and process.sh instead of process.bat, e.g. with the examples above:

encrypt.sh -k [path to key file]

or

process.sh "<file path to process-conf.xml>" <process name>
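
One caveat: the scripts reference the jar by a relative path, so they only work when called from the directory that contains it. A possible way around this is to resolve the jar relative to the script's own location. The following is a sketch under our own assumptions: the function name resolve_jar and the DATALOADER_JAR override variable are invented here and unknown to Data Loader itself.

```shell
# Hypothetical helper: print the jar path to use. An optional
# DATALOADER_JAR environment variable (our invention) wins; otherwise the
# jar is expected next to the file that defines this function.
resolve_jar() {
  local script_dir
  script_dir="$(cd "$(dirname "${BASH_SOURCE[0]:-$0}")" && pwd)"
  printf '%s\n' "${DATALOADER_JAR:-${script_dir}/dataloader-38.0.1-uber.jar}"
}

# Usage inside encrypt.sh or process.sh:
#   java -cp "$(resolve_jar)" com.salesforce.dataloader.security.EncryptionUtil "$@"
```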

Afterthoughts: Breaking lock-in

At metamorphant we are certainly no Salesforce experts. Most of the time our Salesforce-related tasks revolve around integration and around one of our core businesses: breaking lock-in for customers who fell into a vendor trap, or preventing it beforehand.

Quite often this means

  • fixing the packaging and development sins of commercial vendors,
  • bridging the gap of lacking APIs or
  • promoting open standards and the use of open source software.

This enterprise liberation work involves a broad knowledge of recent, legacy and ancient technologies. It requires the skill to quickly understand complex evolved system landscapes and reshape them for the better.

In the case that inspired this blog post, we were about to decommission a customer’s legacy integration interface relying on commercial integration software. In doing so, we freed up resources by

  • saving license costs,
  • enabling people to invest more time in reusable generic skills instead of vendor-/product-specific skills,
  • simplifying the overall IT architecture.

