So, you want to bring Snowflake’s advanced ML/AI capabilities to bear on your mainframe data? Treehouse Software enables that…

by Dan Vimont, Director of Innovation at Treehouse Software, Inc., and Joseph Brady, Director of Business Development at Treehouse Software, Inc.

The exploding popularity of advanced data analytics platforms such as Snowflake, where an ever-expanding array of machine learning and artificial intelligence tools is available to generate vital insights from your enterprise’s data, has quickly transformed the world of data processing. Your data science teams are sitting at their Snowflake consoles, eagerly awaiting the arrival of critical data from your mainframes to supercharge their predictive analytics and generative AI frameworks.

They’re waiting…

So, what’s the hold-up?

Oh yeah, getting legacy data out of ancient mainframe datastores and into Cloud analytics frameworks is HARD, right?

Um, no, actually — it’s not.

The Treehouse Software solution…

[Diagram: Mainframe to Snowflake data flow]

How does it work?

  1. We start at the source — the mainframe — where a small-footprint agent extracts data, in the context of either bulk-load or CDC processing.
  2. The raw data is securely passed from the mainframe to MDR (Treehouse Mainframe Data Replicator, powered by Rocket® Software), which speedily transforms mainframe-formatted data into Unicode/JSON and publishes the results to a Kafka topic.
  3. Our efficient, autoscaling microservices take it from there. Treehouse Dataflow Toolkit functions consume the data from Kafka, automatically prepare landing tables, views, and supporting infrastructure in Snowflake, and then land the data — all the while adhering to Snowflake’s recommended best practices for bulk data loading, which keeps loads fast and reliable. (A simplified sketch of this consume-and-load step appears after this list.)
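
To make step 3 concrete, here is a minimal, illustrative sketch of the consume-from-Kafka, stage, and bulk-load pattern described above. It is not the Treehouse Dataflow Toolkit’s actual code; the topic, table, stage, and credential values are placeholders, and a production loader would add error handling, schema management, and CDC merge logic.

```python
# Illustrative sketch only (not the Treehouse Dataflow Toolkit implementation):
# consume JSON records from a Kafka topic, batch them to a file, stage the file
# in Snowflake, and bulk-load it with COPY INTO — following Snowflake's guidance
# to prefer staged bulk loads over row-by-row inserts. All names are placeholders.

import json
import tempfile

from kafka import KafkaConsumer          # pip install kafka-python
import snowflake.connector               # pip install snowflake-connector-python

TOPIC = "mainframe.cdc.customers"        # hypothetical topic published by MDR
BATCH_SIZE = 10_000

consumer = KafkaConsumer(
    TOPIC,
    bootstrap_servers="kafka:9092",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    enable_auto_commit=False,
)

conn = snowflake.connector.connect(
    account="my_account", user="loader", password="***",
    warehouse="LOAD_WH", database="MAINFRAME", schema="STAGING",
)
cur = conn.cursor()

# Landing table with a single VARIANT column for the raw JSON documents.
cur.execute("CREATE TABLE IF NOT EXISTS CUSTOMERS_STAGE (RECORD VARIANT)")
cur.execute("CREATE STAGE IF NOT EXISTS LOAD_STAGE FILE_FORMAT = (TYPE = JSON)")

batch = []
for message in consumer:
    batch.append(message.value)
    if len(batch) >= BATCH_SIZE:
        # Write the batch to a newline-delimited JSON file...
        with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
            f.write("\n".join(json.dumps(rec) for rec in batch))
            path = f.name
        # ...stage it, then bulk-load it in a single COPY operation.
        cur.execute(f"PUT file://{path} @LOAD_STAGE AUTO_COMPRESS = TRUE")
        cur.execute("COPY INTO CUSTOMERS_STAGE FROM @LOAD_STAGE PURGE = TRUE")
        consumer.commit()                # commit offsets only after a successful load
        batch.clear()
```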

Snowflake tables and views: something for everybody

Within this framework, the Snowflake staging tables constantly accrue historical data, ideally suited for data scientists doing trend analysis, predictive analytics, ML, and AI work. For business analysts and others who prefer structured representations of potentially complex hierarchical data, the Treehouse framework also automatically provides structured user-views.
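
For illustration, a structured user-view over a JSON staging table in Snowflake might look something like the sketch below, which flattens a nested document into relational columns. The table, view, and field names are hypothetical; the Treehouse framework generates its own views automatically.

```python
# Illustrative only: one way to define a structured view over a VARIANT staging
# table in Snowflake, using path expressions and LATERAL FLATTEN. All object and
# field names are placeholders.

import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="analyst", password="***",
    warehouse="BI_WH", database="MAINFRAME", schema="STAGING",
)

conn.cursor().execute("""
    CREATE OR REPLACE VIEW CUSTOMERS_V AS
    SELECT
        s.RECORD:customerId::NUMBER   AS CUSTOMER_ID,
        s.RECORD:name::STRING         AS NAME,
        a.value:city::STRING          AS CITY,          -- one row per address
        a.value:postalCode::STRING    AS POSTAL_CODE
    FROM CUSTOMERS_STAGE s,
         LATERAL FLATTEN(input => s.RECORD:addresses) a
""")
```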

… and the world keeps on changing, so keep your options open!

Publishing both bulk-load and CDC data to a reliable, scalable framework like Kafka keeps your options open: your legacy data can ultimately feed any number of JSON-friendly ETL tools, target datastores, and data analytics packages (some of which may not even have been invented yet!). In addition to Snowflake, the Treehouse Dataflow Toolkit currently targets Amazon Redshift, Amazon DynamoDB, and Amazon Athena/S3.
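
Because the Kafka topic carries plain JSON, any downstream target that speaks JSON can subscribe to the same stream. As a purely hypothetical example (not Toolkit code, and with a placeholder table name), the decoded records from the consumer shown earlier could just as easily be fanned into an Amazon DynamoDB table:

```python
# Illustrative only: fan already-decoded JSON records (dicts) from the Kafka
# stream into a DynamoDB table. The table name is a placeholder, and each record
# is assumed to carry the table's key attributes.

import boto3                              # pip install boto3

table = boto3.resource("dynamodb").Table("mainframe_customers")

def land_in_dynamodb(records):
    """Write a batch of JSON records to DynamoDB."""
    with table.batch_writer() as writer:  # batches and retries puts automatically
        for rec in records:
            writer.put_item(Item=rec)
```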



Contact Treehouse Software today to discuss your project, or to schedule a demo.