Jason McKittrick's Blog

Azure Data Factory or SSIS

 

I have been a on few calls with my customers lately and there seems to be confusion when it comes to Azure Data Factory (ADF) and how it compares to SSIS.  So I thought I would post a blog entry on it.

First ADF is not “SSIS in the cloud”.  What makes them different you might ask? SSIS is an Extract-Transfer-Load(ETL) tool, but ADF is a Extract-Load Tool(ELT), as it does not do any transformations within the tool, instead those would be done by ADF calling a stored procedures on a SQL Server that does the transformation, or calling a Hive job, or a U-SQL job in Azure Data Lake Analytics, for example.  Think of it more of an orchestration tool.  SSIS has the added benefit of doing transformations in memory and is extremely fast utilizing this method of ETL, keeping in mind the performance of any transformation depends on the power of the server hardware that SSIS is installed on.

Here is a table of some of the other differences that may be helpful:

Azure Data Factory SQL Server Integration Services
  • Cloud-based service (via ADF editor in Azure portal)
  • SSIS is a desktop tool (via SQL Server Data Tools)
  • PaaS service does not require hardware or any installation
  • Requires a decent sized server that you manage and install SQL Server with SSIS
  • Uses JSON scripts for its orchestration (coding)
  • Uses drag-and-drop objects (very little to no coding)
  • Pay-as-you-go via an Azure subscription
  • License cost as part of SQL Server
  • Can spin up HDInsight clusters and run Pig and Hive scripts for BIG Data
  • ADF has a basic editor and no intellisense or debugging as of authoring of this blog.
  • Powerful GUI, intellisense, and debugging
  • Administered via the Azure portal
  • Administered via SSMS
  • Does not have a programming SDK, has automation via PowerShell, and no third-party components
  • Programming SDK, automation via BIML, and third-party components
  • No Error Handling
  • Extensive Error Handling
  • Has “data lineage“, tagging and tracking the data from different sources
  • SSIS does not have Data Lineage
  • Currently a limited range of data sources and destinations
  • Wide range of supported data sources and destinations

Think of ADF as a complementary service to SSIS, with its main use case confined to dealing with big data in the cloud. In my opinion I would expect this product offering to evolve or be replaced with SSIS or Informatica functionality(since Microsoft has a large multi-billion investment in Informatica).

 

1 comment for “Azure Data Factory or SSIS

  1. Jamie
    August 7, 2017 at 11:35 am

    Thank you for this post. It was a quick comparison for me to understand ADF and SSIS differences. Also thanks for all the embedded links I appreciate you pointing out the additional referenced reading.

Leave a Reply

Your email address will not be published. Required fields are marked *