TDT: Much more than a mere “data connector” for Snowflake

by Joseph Brady, Director of Business Development at Treehouse Software, Inc. and Dan Vimont, Director of Innovation at Treehouse Software, Inc.

____0_TDT_Snowflake_Splash

Over the past few months, we have been rolling out information on Treehouse Dataflow Toolkit (TDT), a state-of-the-art, fully automated offering for data transfer from Kafka pipes to Analytics/ML/AI frameworks. TDT is a set of proprietary microservices that assures highly available, auto-scalable, and event-driven data transfers to your data science teams’ favorite analytics frameworks, such as Snowflake, Amazon Redshift, Amazon Athena/S3, Amazon S3 Express One Zone buckets, and Amazon Aurora PostgreSQL, all the while adhering to AWS’s and Snowflake’s recommended best practices for massive data loading. Make no mistake, TDT is MUCH more than merely a “connector”.

In this blog, we will focus on how TDT handles data transfers to perhaps the most complex environment: Snowflake. Of all TDT functions and features, our Snowflake connectivity offers the biggest “value added” to customers, because Snowflake has quickly become a top choice for enterprises looking for a Cloud platform on which they can mobilize data at near-unlimited scale and performance and apply advanced ML/AI capabilities.

Snowflake overview video…

Connectivity using Snowflake’s best practices vs. traditional ODBC…

TDT’s innovative Lambda-based (microservices) approach enables faster data flow than any ODBC-based solution, ODBC being the standard tool behind most “roll your own” approaches and “we have a connector for that” offerings.

To load massive quantities of data to a target, TDT uses Snowflake’s (hugely scalable) bulk load utilities, not ODBC. It is vital to note that Snowflake is not an OLTP (transactional) database, so performing CDC transfers to it via ODBC (with individual update, insert, and delete transactions) goes directly against Snowflake’s “best practices” advice and would almost assuredly result in unwieldy bottlenecks.

____0_TDT_Snowflake01

TDT loads data into “delta tables” in Snowflake, which inherently retain the entire history of source data ever since the source-to-target synchronization began (perfect for time-based trend/predictive/prescriptive analytics). Again, TDT adheres to Snowflake’s best-practices recommendation of pulling data from S3 when bulk loading massive quantities of data…

____0_TDT_Snowflake02
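To make the bulk-load pattern concrete, here is a minimal, hypothetical sketch of loading staged S3 files into a Snowflake staging table with COPY INTO rather than with row-by-row ODBC inserts. This is not TDT’s actual code: the account, stage, table, and bucket names, the storage integration, and the use of the Python connector are all assumptions for illustration only.

```python
# Minimal sketch (illustrative names): bulk-load staged JSON files from S3
# into a Snowflake staging ("delta") table with COPY INTO, not ODBC inserts.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",      # illustrative connection parameters
    user="tdt_loader",
    password="***",
    warehouse="LOAD_WH",
    database="LEGACY_DATA",
    schema="STAGING",
)
try:
    cur = conn.cursor()
    # External stage pointing at the S3 landing area (normally created once, up front).
    cur.execute("""
        CREATE STAGE IF NOT EXISTS kafka_landing
          URL = 's3://example-landing-bucket/customer/'
          STORAGE_INTEGRATION = s3_int
          FILE_FORMAT = (TYPE = JSON)
    """)
    # Bulk load: Snowflake pulls the staged files from S3 in parallel.
    cur.execute("""
        COPY INTO customer_delta (record, load_ts)
        FROM (SELECT $1, CURRENT_TIMESTAMP() FROM @kafka_landing)
        ON_ERROR = 'ABORT_STATEMENT'
    """)
finally:
    conn.close()
```

The point is simply that Snowflake pulls staged files from S3 in parallel, which is the loading model Snowflake’s own guidance recommends for massive data volumes.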

Publishing both bulk-load and CDC data to a reliable and scalable framework like Kafka allows you to maintain a broad array of options to ultimately feed your legacy data to any number of JSON-friendly ETL tools, target data stores, and data analytics packages (some of which have not even been invented yet!). 

The “build vs buy” question is put to rest…

The Snowflake-specific target DDL, metadata, and resources that TDT automatically produces for staging data in Snowflake are complex enough that the “buy” option is easy to justify in customers’ “build vs. buy” conversations. A decision by an enterprise not to use TDT, and instead to build its own Kafka-to-Snowflake solution, could result in any or all of the following:

  • accumulation of technical debt
  • extensive/unpredictable time to production
  • ongoing resource planning to maintain home-grown technologies
  • potential vendor lock-in for maintenance of custom-made technologies designed and developed by consultants
  • managing a mix of manual and automated functions
  • tracking cobbled-together components created by multiple staff and consultants
  • limited agility for future customization and innovation
  • problems adhering to evolving best practices over time
  • higher costs for future growth/scaling
  • potential lack of proper security/ongoing security updates
  • your organization becoming an enterprise software development company, whether or not you intended it, and whether or not you realized it!

Simply put, TDT is a self-contained, turn-key solution that can eliminate months, or years, of research and development time and costs. With TDT, high-speed and massive data movement to Snowflake takes minutes to ramp up.

Download the TDT AWS Partner Solution Brief to share with your team…

DOWNLOAD…AWS_TDT_Product_Brief_Thumb01

Treehouse Dataflow Toolkit (TDT) is Copyright © 2024 Treehouse Software, Inc. All rights reserved.

____Treehouse_AWS_Badges 

Contact Treehouse Software for a Demo Today!

Contact Treehouse Software today for more information or to schedule a product demonstration.

So, You’ve Managed to Start Streaming Your Legacy Data into Kafka Pipelines… Now What?

by Joseph Brady, Director of Business Development at Treehouse Software, Inc. and Dan Vimont, Director of Innovation at Treehouse Software, Inc.

Treehouse_Dataflow_Toolkit_Splash

Treehouse Software is helping customers modernize their valuable enterprise data on Cloud and Hybrid Cloud environments without disrupting the existing critical work on their legacy systems. However, a new strategic imperative has been added to the modernization game: the requirement to utilize today’s advanced Analytics/AI/ML-friendly platforms, such as Amazon Redshift, Snowflake, Amazon Athena/S3, Amazon S3 Express One Zone buckets, and Amazon Aurora PostgreSQL, where an ever-expanding array of AI/ML tools is available to generate vital insights from the customer’s data. Many of these customers are already using software tools provided by Treehouse or other vendors to replicate their data into various target data stores and, more crucially, into Kafka pipelines (e.g., Amazon MSK, Confluent, etc.). Kafka is now the top choice for high-speed streaming of massive volumes of mission-critical data, providing stable performance under extreme loads. This is especially valuable for enterprises that require up-to-the-second data delivery for use cases that include e-commerce, financial services, logistics, telecommunications, and government IT.

Traditionally, Treehouse customers utilized our data replication technologies to load legacy data into Kafka pipelines, and that was where our involvement generally ended…

____0_Traditional_Mainframe_To_Kafka

However, once Kafka is designated as a target in the customer’s architecture, we increasingly find ourselves fielding two questions: “What now?” and “What is the best mechanism for us to rapidly transfer data from Kafka to advanced analytics platforms?” Our answer: look no further than Treehouse Software!

Treehouse Software brings a state-of-the-art, fully automated offering for data transfer from Kafka pipes to Analytics/ML/AI frameworks: the Treehouse Dataflow Toolkit (TDT).  TDT is a set of proprietary microservices that assures highly-available, auto-scalable, and event-driven data transfers to your data science teams’ favorite analytics frameworks, all the while adhering to AWS’s and Snowflake’s recommended best practices for massive data loading, thus assuring shortest and surest loads. Additionally, TDT provides a frictionless and instant implementation, accelerating your path to deep data insights for optimizing business processes.

Why do AWS’s and Snowflake’s best practices recommend against using ODBC?

Your data science teams need large quantities of the very latest data in near-real-time, and ODBC doesn’t really do the job, offering only single-threaded, difficult-to-scale pipes. By contrast, TDT’s approach not only keeps things up-to-date faster than any ODBC-based solution, but the “delta tables” into which it loads data also inherently retain the entire history of source data ever since the source-to-target synchronization began (perfect for time-based trend/predictive/prescriptive analytics). To load massive quantities of data to a target, TDT uses the target vendors’ (massively scalable) bulk load utilities, not ODBC. It’s vital to note that Snowflake and Redshift are not OLTP (transactional) databases, so performing CDC transfers to these targets via ODBC (with individual update, insert, and delete transactions) goes directly against the vendors’ “best practices” advice and would almost assuredly result in unwieldy bottlenecks.

What if my data is not on a mainframe?

No worries. Treehouse Software’s messaging is primarily mainframe-centric, since that has been our area of expertise and bread-and-butter for over 40 years. However, data movement is data movement, and if your mainframe, or non-mainframe, data is being pumped to a Kafka pipeline, TDT will take it from there. When a data replication tool publishes both bulk-load and CDC data in JSON format to a reliable and scalable framework like Kafka, it sets the stage for TDT to feed legacy data to any number of JSON-friendly ETL tools, target data stores, and the latest (or yet to be invented) data analytics packages. TDT is the turn-key solution for the easiest and fastest implementation of Kafka data transfer…

Treehouse_Dataflow_Toolkit03
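As a rough illustration of how “TDT will take it from there” can work, here is a minimal, hypothetical sketch of a consumer that drains JSON records from a Kafka topic and lands them as batch files in S3, ready for a later bulk load. This is not TDT’s implementation: the topic, broker, group, and bucket names are illustrative, and the kafka-python and boto3 libraries are simply one way to express the idea.

```python
# Minimal sketch (illustrative names): consume JSON CDC records from Kafka
# and land them in S3 as newline-delimited JSON batch files.
import json
import time

import boto3
from kafka import KafkaConsumer

s3 = boto3.client("s3")
consumer = KafkaConsumer(
    "legacy.customer.cdc",                 # illustrative topic name
    bootstrap_servers=["broker1:9092"],    # illustrative broker
    group_id="landing-writer",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    enable_auto_commit=False,
)

batch = []
for message in consumer:
    batch.append(message.value)
    if len(batch) >= 10_000:               # flush in large batches, not row by row
        key = f"customer/cdc-{int(time.time())}.json"
        body = "\n".join(json.dumps(rec) for rec in batch)
        s3.put_object(Bucket="example-landing-bucket", Key=key, Body=body.encode("utf-8"))
        consumer.commit()                  # commit offsets only after a durable write
        batch = []
```

Committing offsets only after the S3 write succeeds keeps the pipeline at-least-once: a failure may occasionally cause a batch to be re-landed, but no records are lost.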

TDT allows you to quickly ramp up your data analytics game by providing a rapid flow of data fresh off your enterprise data systems.

Download: TDT AWS Partner Solution Brief to share with your team…

DOWNLOAD…AWS_TDT_Product_Brief_Thumb01

Treehouse Dataflow Toolkit (TDT) is Copyright © 2024 Treehouse Software, Inc. All rights reserved.


____Treehouse_AWS_Badges 

Contact Treehouse Software for a Demo Today!

Contact Treehouse Software today for more information or to schedule a product demonstration.

Treehouse Software Customer Case Study: Infineon Technologies—Real-time and Bi-directional Data Synchronization between Adabas on Unix and MS SQL Server on Windows

Business Use case

Infineon, the largest German semiconductor manufacturer, wanted to update its Manufacturing Execution Systems (MES) by changing the underlying database and replacing the corresponding application, which was rewritten by Systema, a software house specializing in MES.

The customer’s senior management decided that the aging Adabas and Natural system had to be replaced, and the goal was to accomplish this within three years for three production sites.

In a preliminary Proof of Concept (PoC), the Infineon team selected MS SQL Server as the new database system.

In total, three production systems had to be migrated, together with a couple of different subsystems. The related systems for development and testing also needed to be considered in the customer’s plans.

The customer’s primary question during the planning stage was, “Do we try doing the switch all at once, in a big bang, or should we go with a phased approach?” The big bang transition seemed too risky for the customer’s production goals, so the decision was made to plan a phased, incremental migration with defined functional areas. Advantages of the incremental approach were that a fall-back plan could be put in place to address any problems, smaller package sizes could be defined (safe harbors), and results could be easily monitored. Additionally, there would be no downtime for the production system while data replication was occurring.

After setting up a workflow with all the necessary actions, it became clear that a sequential approach would be very time-consuming, so the next question was, “How can we parallelize tasks?” The answer was to find a product that supports real-time replication of data from the old DB to the new DB, so that production would not be interrupted, migrated parts of the application could already run on the new DB, and non-migrated parts of the application could still run on the old DB.

The software also had to guarantee co-existence between the currently used Adabas database and the new MS SQL Server database, and it had to support the change from the non-SQL Adabas DB structures to a MS SQL Server DB with normalized structures.

Bi-directional replication was another requirement, because during a certain phase of the migration, updates had to be replicated in both directions.

The Technology Solution...

Infineon contacted Treehouse Software in 2020 to discuss Rocket Data Replicate and Sync (RDRS), the product formerly called tcVISION, and set up a presentation and discussion. Because the session showed promise for RDRS, a PoC was scheduled to demonstrate whether the above-mentioned use case and requirements could be handled by RDRS. The first part of the PoC ran with Oracle, Infineon’s initial choice of target RDBMS; afterwards, the teams tested with MS SQL Server, Infineon’s final choice. The PoC covered a three-phase migration model, transformation capabilities, performance testing, bi-directional replication, and various other requirements, and it produced successful results.

_0_Infineon_Diagram

Additionally, with RDRS, no replication logic was needed in the application, and a safe switchover was achieved.

Interestingly, Infineon went with the most complicated scenario first, so they could identify any difficulties early in the project. The Infineon and Treehouse technical teams fulfilled the use cases and requirements requested by Infineon senior management in order to move into production.


__TSI_LOGO

About Treehouse Software

Since 1982, Treehouse Software has been serving enterprises worldwide with industry-leading software products and outstanding technical support. Today, Treehouse Software is a global leader in providing data replication and integration solutions for the most complex and demanding heterogeneous environments, as well as feature-rich, accelerated-ROI offerings for information delivery and application modernization. Please contact Treehouse Software at sales@treehouse.com with any questions. We look forward to serving you in your modernization journey!

A Treehouse Software Proof of Concept is the low-risk approach to testing mainframe data replication on Cloud and Hybrid Cloud environments

by Joseph Brady, Director of Business Development / Cloud Alliance Leader at Treehouse Software, Inc.

____0_Mainframe_To_Cloud

Many Treehouse Software customers have discovered the value of saving weeks or months in their mainframe modernization initiatives by engaging in a Rocket Data Replicate and Sync (RDRS) Proof of Concept (POC) for Mainframe-to-Cloud data replication. Depending on the complexity of the customer’s project, an RDRS POC can last as little as 10 business days after the product is installed and all connectivity is set up between the mainframe and Cloud environments.

How does it work?

  1. Treehouse Software provides documentation beforehand that outlines all of the requirements and agenda for the POC, and Treehouse technicians assist in downloading and installing RDRS.
  2. The customer provides a representative subset of z/OS or z/VSE mainframe data (e.g., Db2, Adabas, VSAM, IMS/DB, CA IDMS, CA DATACOM, etc.), use case, and goals for the POC, and the Treehouse team mentors the customer’s technical team via remote screen sharing sessions.
  3. The application is executed at the customer’s facilities, in a non-production environment, and a limited-scope implementation of RDRS is conducted to prove that the product meets the customer’s desired use case.

By the end of the POC, customers will have replicated mainframe data on their Cloud target, tested out product capabilities, and demonstrated a successful, repeatable data replication process with documented results. After the POC, the customer has all the connectivity and processes in place to begin setting up the production phase of their mainframe data modernization project. The minimal cost and resource requirements make an RDRS POC a high-ROI step in the customer’s mainframe modernization journey.

About RDRS…

Many Cloud and Systems Integration partners are recommending RDRS for mainframe data modernization projects. RDRS focuses on changed data capture (CDC) when transferring information between mainframe data sources and Cloud targets. Through an innovative technology, changes occurring in any mainframe application data are tracked and captured, and then published to a variety of RDBMS and other targets.

RDRS utilizes a Windows-based GUI Control Board, which is ideal for non-mainframe programmers. While mainframe experts are required in the design/architecture phase during the POC and occasionally during implementation, the requirement for their involvement is limited. The RDRS Control Board acts as a single point of administration, data modeling and mapping, script generation, and monitoring. Comprehensive monitoring and logging of all data movements ensure transparency across all data exchange processes.

Additionally, once RDRS is up and running, the customer’s legacy mainframe environment can continue as long as needed, while they replicate data – in real time and bi-directionally – on the new Cloud platform. Now the enterprise can quickly take advantage of the latest Cloud services, such as advanced analytics, ML/AI, etc., as well as move data to a variety of highly available and secure databases and data stores.


__TSI_LOGO

Contact Treehouse Software Today…

Contact us to discuss how a Treehouse Software POC can accelerate your mainframe Cloud and hybrid Cloud data modernization journey.

Does your data science team want to accelerate insights and bring advanced ML/AI capabilities to your mainframe data with Amazon Redshift? Sure they do—and Treehouse Software enables that…

by Joseph Brady, Director of Business Development at Treehouse Software, Inc. and Dan Vimont, Director of Innovation at Treehouse Software, Inc.

We are beginning to see a pleasant and welcome trend with Treehouse customers who are looking to modernize their valuable mainframe legacy data on the Cloud: they are including their data science teams in the important planning phase of architecting new Cloud environments and targets. This is especially vital for customers who want to incorporate advanced analytics and ML/AI in their strategic data usage plans on the Cloud. Who can contribute a better understanding of ultimate data usage than your resident data scientists?

____0_Amazon_Redshift

We have heard from many of these data scientists that a primary item on their “wish lists” is a fully managed, AI-powered, massively parallel processing (MPP) architecture for extracting maximum value and insights. They specifically mention Amazon Redshift (which is much more than a data warehouse) as the Cloud data warehouse of choice for driving digitization across the enterprise and helping to personalize customer experiences. Redshift uses SQL to analyze structured and semi-structured data across data warehouses, operational databases, and data lakes, using AWS-designed hardware and ML to deliver the highest performance at any scale. To this, we can answer with a resounding, “Yes, Treehouse Software has you covered with Redshift connectivity!”

The Treehouse Software solution…

Enterprise customers have come to Treehouse Software because we bring not only proven mainframe data replication tools, but also deep subject matter expertise in mainframe technologies, as well as the know-how to target relevant AWS offerings, such as Redshift, S3 (including S3 Express One Zone – see our recent blog on S3 Express One Zone), etc.

The Rocket Data Replicate and Sync (RDRS) solution allows customers’ legacy mainframe environments to operate normally while replicating data on AWS. The technology focuses on changed data capture (CDC) when transferring information between mainframe data sources and Cloud-based databases and applications. Through an innovative set of technologies, changes occurring in any mainframe datastore are tracked and captured, and ultimately published to Redshift.

____0_Mainframe_To_Redshift

How does it work?

  1. We start at the source – the mainframe – where an agent (with a very small footprint) extracts data (in the context of either bulk-load or CDC processing).
  2. The raw data is securely passed from the mainframe to RDRS, which speedily transforms mainframe-formatted data into Unicode/JSON and publishes the results to a Kafka topic.
  3. Our efficient, autoscaling microservices take it from there. Treehouse Dataflow Toolkit functions consume the data from Kafka and land it in S3 buckets, where Treehouse’s proprietary crawler technology automatically prepares landing tables, views, and additional infrastructure in Redshift. Then the mainframe data is loaded into Redshift, all the while adhering to AWS’s recommended “best practices” for massive data loading, thus assuring the shortest and surest loads (a minimal sketch of this kind of COPY-based load follows this list). The inherent reliability and scalability of the entire pipeline infrastructure assure near-real-time synchronization between mainframe sources and Redshift target tables.
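Here is a minimal, hypothetical sketch of the final loading step only: issuing a Redshift COPY command against staged S3 files through the Redshift Data API rather than pushing rows over ODBC. This is not TDT’s actual code: the cluster, database, user, table, bucket, and IAM role names are assumptions for illustration.

```python
# Minimal sketch (illustrative names): bulk-load staged S3 files into a
# Redshift staging table with COPY, issued via the Redshift Data API.
import boto3

rsd = boto3.client("redshift-data")

resp = rsd.execute_statement(
    ClusterIdentifier="analytics-cluster",   # illustrative cluster name
    Database="legacy_data",
    DbUser="tdt_loader",
    Sql="""
        COPY staging.customer_delta
        FROM 's3://example-landing-bucket/customer/'
        IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy-role'
        FORMAT AS JSON 'auto'
    """,
)
print("COPY statement submitted, id:", resp["Id"])
```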

Redshift tables and views: something for everybody

Within this framework, the Redshift staging tables (often referred to as “delta tables”) are constantly accruing historical data, ideally suited for data scientists looking to do trend analysis, predictive analytics, ML, and AI work. For business analysts and others who prefer structured data representations of potentially complex hierarchical data, the Treehouse framework also automatically provides structured user-views that offer the look and feel of a SQL database.

…as innovations move faster along the timeline, keep your options open!

Publishing both bulk-load and CDC data to a reliable and scalable framework like Kafka allows you to maintain a broad array of options to ultimately feed your legacy data to any number of JSON-friendly ETL tools, target datastores, and data analytics packages (some of which may not even have been invented yet!).  In addition to Redshift, the Treehouse Dataflow Toolkit also currently targets Snowflake, Amazon DynamoDB, and Amazon Athena/S3.

Video – Introduction to Data Warehousing on AWS with Amazon Redshift…


__TSI_LOGO

Contact Treehouse Software today to discuss your project, or to schedule a demo of our Mainframe-to-AWS real-time and bi-directional data replication solution. 

Treetip: Treehouse Software can help enterprise mainframe customers accelerate their data analytics, machine learning, and AI journeys by targeting the new Amazon S3 Express One Zone

by Joseph Brady, Director of Business Development and Cloud Alliance Leader at Treehouse Software, Inc.

Treehouse Software specializes in helping enterprise customers with Mainframe-to-Cloud, Multi-Cloud, and Hybrid Cloud data modernization projects. Many times, our customers not only discuss strategies for replicating their mainframe data, but also their plans for what they want to do with that data on the Cloud side. This makes it important for our team to stay current on the latest Cloud offerings that can benefit our customers’ enterprise modernization planning. Consequently, a very exciting announcement caught our attention during the 2023 AWS re:Invent conference: the general availability of a new type of S3 storage service, the Amazon S3 Express One Zone storage class.

For those unfamiliar, Amazon S3 (“simple storage service”) is the basic file storage service of AWS, and as such it forms a foundational pillar of the entire AWS world. Amazon S3 Express One Zone is a new type of S3 bucket called a “directory bucket”, which is purpose-built to deliver consistent, single-digit millisecond data access for an enterprise’s most frequently used data and latency-sensitive applications. The new S3 directory buckets allow customers to store data in a single Availability Zone (AZ) that they specifically select, as opposed to the default of three AZs for standard S3. This eliminates the latency associated with spreading data across multiple AZs, providing applications with lower-latency storage. S3 directory buckets also follow a different request scaling model compared to traditional buckets, and their authentication is based on sessions rather than on a per-request basis. Bottom line… reduction in compute time = greater cost reduction.
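For readers who want to see what a directory bucket looks like in practice, here is a minimal, hypothetical sketch using boto3 (assuming a release recent enough to include directory-bucket support). The bucket name, which must embed the Availability Zone ID and end with the --x-s3 suffix, and the zone ID itself are illustrative.

```python
# Minimal sketch (illustrative names): create an S3 Express One Zone
# "directory bucket" in one explicitly chosen Availability Zone.
import boto3

s3 = boto3.client("s3", region_name="us-east-1")

s3.create_bucket(
    Bucket="tdt-landing--use1-az5--x-s3",    # illustrative directory-bucket name
    CreateBucketConfiguration={
        "Location": {"Type": "AvailabilityZone", "Name": "use1-az5"},
        "Bucket": {"Type": "Directory", "DataRedundancy": "SingleAvailabilityZone"},
    },
)

# Reads and writes then use the familiar S3 object APIs against the directory bucket.
s3.put_object(
    Bucket="tdt-landing--use1-az5--x-s3",
    Key="customer/sample.json",
    Body=b"{}",
)
```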

S3 Express One Zone is ideally suited for services such as Amazon SageMaker Model Training, Amazon Athena, Amazon EMR, and AWS Glue Data Catalog to accelerate Machine Learning (ML) and interactive analytics workloads. With S3 Express One Zone, storage automatically scales up or down based on consumption and need, and customers no longer need to manage multiple storage systems for low-latency workloads.

So, why is S3 Express One Zone important to Treehouse mainframe modernization customers?

____0_Mainframe_To_S3ExpressOneZone

Amazon S3 Express One Zone just made the Amazon S3 targeting in the Treehouse Dataflow Toolkit (TDT) potentially much more potent and valuable to our enterprise mainframe customers.  When an enterprise uses TDT to land their mission critical data in Express One Zone flavored Athena/S3 buckets, it becomes more directly accessible and manipulable by the various AWS ML and AI tools. In short, if customers choose, Express One Zone Athena/S3 becomes an intermediate data store for big data processing workloads and advanced analytics.

So, when we are asked, “What should Treehouse Software be doing to respond to the burgeoning interest in ML, Generative AI, etc.?”, the answer is — We are doing exactly what we need to be doing.  AI and ML frameworks are the newest incentive for people to use RDRS (Rocket Data Replicate and Sync — formerly called tcVISION) and TDT from Treehouse Software to replicate their mainframe data on advanced data analytics frameworks, or possibly into super-charged S3 Express One Zone buckets.  

Video – Deep Dive Introduction to Amazon S3 Express One Zone Storage Class:


__TSI_LOGO

Contact Treehouse Software today to discuss your project, or to schedule a demo of our Mainframe-to-AWS real-time and bi-directional data replication solution. 

3-Minute Video: Data Management and Processing with Rocket Data Replicate and Sync (formerly tcVISION)

by Joseph Brady, Director of Business Development and Cloud Alliance Leader at Treehouse Software, Inc.

Treehouse Software is a worldwide distributor of Rocket Data Replicate and Sync (formerly tcVISION), the leading tool for synchronizing mainframe data through change data capture (CDC) with real-time and bi-directional data replication. This video focuses on the product’s data management and its use of “staged processing” to minimize its footprint on the mainframe system…


__TSI_LOGO

Contact us today for a live, online demo…

Simply fill out our Demonstration Request Form and a Treehouse representative will contact you to set up a time for your requested demonstration.

What is meant by “Regional Data Sovereignty” when replicating enterprise data on AWS?

by Joseph Brady, Director of Business Development and Cloud Alliance Leader at Treehouse Software, Inc.

I have recently been taking some classes in preparation for an AWS certification. In some of these classes, an example scenario has been used that speaks to an issue I’ve often heard mentioned by Treehouse mainframe customers: “Regional Data Sovereignty”. For example, a customer might have government compliance requirements that financial information in Frankfurt cannot leave Germany, and many other countries have similar restrictions and regulatory controls in place.

Fortunately, Regional Data Sovereignty is a critical part of the design of the AWS Global Infrastructure. Within this infrastructure, AWS Regions are designed so that data remains subject to the local laws and statutes of the country in which the Region is located. With the understanding that the customer’s data and applications live and run in specific geographical Regions, there are four business factors a customer should consider when choosing a Region:

  1. Compliance. Before any other factors, customers must first look at their regional compliance requirements to determine if data must live within certain geographical boundaries.
  2. Proximity. How close the enterprise is to its customer base is another major factor because of possible latency issues between countries.  Locating a Region closest to the customer base is generally the best choice.
  3. Feature availability. Sometimes the closest Region may not have all the AWS features a business needs. AWS releases thousands of new features and services every year, many specifically in response to customer requests and needs. But sometimes those new services require new physical hardware that AWS has to build, so a service might roll out one Region at a time.
  4. Pricing. Even when the hardware is equal from one Region to the next, some locations are more expensive to operate in. For example, the same workload could be significantly more expensive to run in Sao Paulo than in Oregon in the United States.

Additionally, events such as natural disasters can cause customers to lose connection to a data center, so a High Availability (HA) cutover plan should also be considered. A customer could run a second data center, but real estate prices alone can make that prohibitive, even before considering the duplicate expense of hardware, employees, electricity, heating and cooling, and security. Most businesses simply end up storing backups somewhere and hoping the disaster never comes. And “hope” is not a good business plan. I recently covered how Treehouse Software can help provide an HA framework for mainframe customers in another blog.

Let’s take a look at the AWS Global Infrastructure and how its Regions are distributed worldwide…

____AWS_Global_Infrastructure

AWS Regions are built close to the areas of highest business traffic demand, such as Paris, Tokyo, Sao Paulo, Dublin, and Ohio. Inside each Region, there are multiple data centers that have all the compute, storage, and other services customers need to run their applications. By utilizing AWS Regions for high availability of their business services, customers can be assured of minimal operational downtime. Regions are connected to each other through AWS’s high-speed private network backbone, which bypasses the public Internet, and the customer’s business decision maker chooses which Region they want to use. Each Region is isolated from every other Region in the sense that absolutely no data goes in or out of the customer’s environment in that Region without explicit permission for that data to be moved. These elements should be part of all critical strategic and security conversations when planning global distribution and availability of an enterprise’s data on AWS.

Video – AWS Global Infrastructure explained…


__TSI_LOGO

Contact Treehouse Software today to discuss your project, or to schedule a demo of our Mainframe-to-AWS real-time and bi-directional data replication solution. 

So, you want to bring Snowflake’s advanced ML/AI capabilities to bear on your mainframe data? Treehouse Software enables that…

by Dan Vimont, Director of Innovation at Treehouse Software, Inc. and Joseph Brady, Director of Business Development at Treehouse Software, Inc.

The exploding popularity of advanced data analytics platforms such as Snowflake, where an ever-expanding array of machine learning and artificial intelligence tools are available to generate vital insights from your enterprise’s data, has quickly transformed the world of data processing.  Your data science teams are sitting there at their Snowflake consoles, eagerly awaiting the arrival of critical data from your mainframes to supercharge their predictive analytics and generative AI frameworks.

They’re waiting…

So, what’s the hold-up?

Oh yeah, getting legacy data out of ancient mainframe datastores and into Cloud analytics frameworks is HARD, right?

Um, no, actually — it’s not.

The Treehouse Software solution…

____0_Mainframe_To_Snowflake01

How does it work?

  1. We start at the source — the mainframe — where an agent (with a very small footprint) extracts data (in the context of either bulk-load or CDC processing).
  2. The raw data is securely passed from the mainframe to MDR (Treehouse Mainframe Data Replicator powered by Rocket® Software) which speedily transforms mainframe-formatted data into Unicode/JSON and publishes the results to a Kafka topic.
  3. Our efficient and autoscaling microservices take it from there. Treehouse Dataflow Toolkit functions consume the data from Kafka, automatically prepare landing tables, views, and additional infrastructure in Snowflake, and then land the data in Snowflake (all the while adhering to Snowflake’s recommended “best practices” for massive data loading, thus assuring shortest and surest loads).

Snowflake tables and views: something for everybody

Within this framework, the Snowflake staging tables are constantly accruing historical data, ideally suited for data scientists looking to do trend analysis, predictive analytics, ML, and AI work.  For business analysts and others who prefer structured data representations of potentially complex hierarchical data, the Treehouse framework also automatically provides structured user-views.
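As a rough illustration of the “something for everybody” idea, here is a minimal, hypothetical sketch of a “current state” user-view defined over a history-accruing staging table, keeping only the most recent record for each business key. This is not the DDL that TDT actually generates: the table, column, and key names are illustrative.

```python
# Minimal sketch (illustrative names): a structured "latest record per key"
# view over a history-accruing Snowflake staging table of JSON records.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="tdt_loader", password="***",
    warehouse="LOAD_WH", database="LEGACY_DATA", schema="STAGING",
)
try:
    conn.cursor().execute("""
        CREATE OR REPLACE VIEW customer_current AS
        SELECT record:custno::NUMBER  AS custno,
               record:name::STRING    AS name,
               record:balance::NUMBER AS balance,
               load_ts
        FROM customer_delta
        QUALIFY ROW_NUMBER() OVER (PARTITION BY record:custno ORDER BY load_ts DESC) = 1
    """)
finally:
    conn.close()
```

Data scientists can keep querying the full history in the staging table, while analysts who just want “the current customer record” query the view as if it were an ordinary SQL table.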

… and the world keeps on changing, so keep your options open!

Publishing both bulk-load and CDC data to a reliable and scalable framework like Kafka allows you to maintain a broad array of options to ultimately feed your legacy data to any number of JSON-friendly ETL tools, target datastores, and data analytics packages (some of which may not even have been invented yet!).  In addition to Snowflake, the Treehouse Dataflow Toolkit also currently targets Amazon Redshift, Amazon DynamoDB, and Amazon Athena/S3.


__TSI_LOGO

Contact Treehouse Software today to discuss your project, or to schedule a demo. 

New Relationship with Rocket Software Strengthens Treehouse Software’s Enterprise Modernization Offerings

Many have seen the announcement that Rocket Software has acquired BOS DigiTec GmbH, developer of tcVISION, which Treehouse Software markets, sells, and supports worldwide. We congratulate both companies on this exciting move to grow the product’s presence in the enterprise modernization market. Please note that Treehouse Software, Inc. was not mentioned in that announcement because Treehouse is not part of this acquisition. However, our customers, partners, and prospects should not be concerned and can be assured that Treehouse Software is still your source and point of contact for the tcVISION and tcACCESS products for years to come. After discussions with the Rocket Software team, we received the following statement, which we are grateful to share:

RocketSoftware_Logo

Dear Treehouse Customers,

We are thrilled to share some exciting news about tcVISION and tcACCESS, the products that have become your trusted solutions for data integration. BOS DigiTec GmbH, the German developer behind tcVISION and tcACCESS, has recently entered into a definitive agreement to be acquired by Rocket Software, a global technology leader that develops enterprise software for some of the world’s largest companies.  

As a result of this acquisition, tcVISION will now be rebranded as Rocket Data Replicate and Sync (RDRS) and will become an essential part of Rocket Software’s portfolio of modernization products. tcVISION, now RDRS, will serve as Rocket Software’s primary solution for real-time data replication, complementing their Rocket Data Virtualization solution (RDV), which focuses on providing modern APIs to access current mainframe data in place.

While the product’s name is changing, we want to assure you that it will remain the same exceptional solution you have come to know and trust from Treehouse Software. It’s the same technology with a new name. Treehouse Software, in collaboration with Rocket Software, is committed to keeping you well-informed throughout this transition.

Here is some additional information:

  • Who is Rocket Software and what do they do? Rocket Software partners with the largest enterprises across all industries globally to address their most complex IT challenges in infrastructure, data, and applications. Trusted by over 10,000 customers, Rocket Software enables enterprises to modernize in place with a hybrid cloud strategy, avoiding the need for costly re-platforming. Rocket Software is a privately held U.S. corporation headquartered in the Boston area, with centers of excellence strategically located throughout North America, Europe, Asia, and Australia.
  • How will the acquisition affect tcVISION’s product roadmap? RDRS, formerly tcVISION, will be integrated into Rocket Software’s portfolio of modernization products, promising enhanced scale, flexibility, and security for your organization’s existing infrastructure. Given Rocket Software’s global reach, we anticipate that product roadmaps and requests for enhancements will accelerate, and we look forward to sharing more details soon, as we prioritize listening to and addressing customer needs.
  • How will this acquisition help you? This acquisition will combine Rocket Software’s extensive mainframe portfolio and expertise with the tcVISION (RDRS) data replication capabilities, offering Treehouse Software’s customers the best of both worlds—the security and reliability of the mainframe and the advanced analytics of the cloud, all without incurring excessive business risk or expenses.  This combination allows you to keep core transaction processing workloads secure on the mainframe while benefiting from real-time data replication to the cloud, enabling the development of new applications and generative AI models simultaneously.
  • How will our customer support be affected? Your customer support experience with Treehouse Software remains unchanged and Treehouse is your point of contact for support. Rocket Software, like Treehouse Software, is dedicated to putting customers first, and your success remains our shared priority.
  • Will there be any interruptions in services during the transition? No, there will be no interruptions in services during the transition.
  • Who can I contact with questions? You can reach out to Treehouse Software team members, Joseph Brady, Director of Business Development at: jbrady@treehouse.com or Lynn McIntyre, Technical Support Leader at: lmcyntire@treehouse.com.  For sales-related questions, contact: sales@treehouse.com.
  • Where is Rocket Software located and what markets does it serve? Rocket Software is headquartered in Waltham, MA, USA, and operates globally with offices in North America, Europe, and Asia/Pacific. It serves a wide range of industries, including Aerospace & Defense, Auto Manufacturing, Banking & Finance, Education, Energy, Government, Healthcare, Insurance, and Retail, among others.
  • How big is Rocket Software? Rocket Software has over 2,500 employees worldwide.

This collaboration between Treehouse Software, a mainframe systems software company since 1983, and a global technology leader like Rocket Software marks a significant milestone in our journey to provide you with even better solutions and services.

We sincerely appreciate the trust and loyalty you have shown us. If you have any questions or require further information, please do not hesitate to reach out to your contact at Treehouse Software. Your satisfaction and success remain our top priority, and we are here to support you every step of the way.

Thank you for your continued partnership with Treehouse Software, and we look forward to delivering even more value to you through this exciting new collaboration with Rocket Software.


__tsi_logo_400x200

Conclusion from Treehouse Software…

Treehouse Software has been in business for over 40 years, serving enterprise mainframe customers worldwide, and looks forward to many more years of innovation and presence in this market space. Additionally, Treehouse Software has represented tcVISION (now RDRS) for 17 years, providing marketing, sales, trials, POCs, technical support, QA/testing, and more, all of which will continue uninterrupted.

Please contact Treehouse Software at sales@treehouse.com with any questions. We look forward to serving you in your modernization journey!

Treehouse Software, Inc. | 2605 Nicholson Rd, Suite 1230 | Sewickley, PA 15143 | USA

T: 1-724-759-7070 | W: www.treehouse.com and www.tcvision.com