Replicating Enterprise Mainframe Data to Cloud-based SQL Databases with tcVISION

by Joseph Brady, Director of Business Development and Cloud Alliance Leader at Treehouse Software, Inc.

Treehouse Software has been helping enterprise mainframe customers since 1982, and in recent years, we have been developing a strong presence in the Mainframe-to-Cloud data replication market space. This blog takes a quick look at three of the most popular Treehouse-supported Cloud-based SQL database services…

Amazon RDS, a collection of managed services that makes it simple to set up, operate, and scale databases in the Cloud. Users can control the type of database, as well as where data is stored. Supported database engines include Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle Database, and SQL Server.

Google Cloud SQL, a fully managed relational database service for MySQL, PostgreSQL, and SQL Server. You can connect with nearly any application, anywhere in the world. Cloud SQL automates backups, replication, and failover to ensure your database is reliable, highly available, and flexible to your performance needs.

Microsoft Azure SQL Database, part of the Azure SQL family, an always-up-to-date, fully managed relational database service built for the Cloud.


Wherever you want to target your mainframe data on the Cloud, Treehouse Software helps to make the process easy…

Treehouse Software is the worldwide distributor of tcVISION, the leading tool for using changed data capture (CDC) when transferring information between most mainframe data sources (IBM Db2, IBM VSAM, IBM IMS/DB, Software AG Adabas, CA IDMS, CA Datacom, or even sequential files) and Cloud and open systems-based databases and applications. Changes occurring in the mainframe application data are then tracked and captured, and published to a variety of targets.

tcVISION_Overall_Diagram_Cloud_OS

Additionally, tcVISION supports bi-directional data replication, where changes on either platform are reflected on the other platform (e.g., a change to a PostgreSQL table in the Cloud is reflected back on mainframe), allowing the customer to modernize their application on the Cloud or open systems without disrupting the existing critical work on the legacy system. tcVISION’s bi-directional replication writes directly to the mainframe database, thereby bypassing all mainframe business logic, so this architecture requires careful planning, as well as thorough and repeated testing.

Sales and technical leaders at the major Cloud platform companies, as well as systems integrators, are engaging with Treehouse Software to take advantage of our tcVISION data replication solution, which helps them tap into the mainframe data that customers want to make available on new technologies.


Further reading: tcVISION is featured on the AWS Partner Network Blog, which shows a walk-through of data replication between Mainframe Db2 z/OS and Amazon Aurora…

AWS Partner Network (APN) Blog: Real-Time Mainframe Data Replication to AWS with tcVISION from Treehouse Software.



Interested in seeing a live, online demo of tcVISION?

Just fill out the Treehouse Software tcVISION Demonstration Request Form and a Treehouse representative will contact you to set up a time for your online tcVISION demonstration.

tcVISION Mainframe Data Replication Solution is Featured in the Microsoft Azure Architecture Center

tcVISION provides an IBM mainframe integration solution for mainframe data replication, data synchronization, data migration, and change data capture (CDC) to multiple Azure data platform services.

____Azure_Architecture_Diagram




Contact Treehouse Software Today…

Treehouse Software is the worldwide distributor of tcVISION, a software product that allows immediate data replication between many Mainframe sources and Cloud and Open Systems targets, enabling government, healthcare, supply chain, financial, and a variety of public service organizations to meet spikes in demand for vital information. No matter where you want your mainframe data to go, whether the Cloud, Open Systems, or any LUW target, tcVISION from Treehouse Software is your answer.

Just fill out the Treehouse Software Product Demonstration Request Form and a Treehouse representative will contact you to set up a time for your online tcVISION demonstration.

Treehouse Software Customer Success: BMF uses tcVISION for Real-Time Data Replication Between Mainframe Adabas and PostgreSQL

BMF_Building

The Bundesministerium der Finanzen (BMF) is Germany’s Ministry of Finance and establishes sustainable fiscal policy that ensures financial empowerment of the federal budget. From tax policy and development of the federal budget to regulation of national and international financial markets, the BMF creates strategies and concepts for these and other fundamental fiscal and economic questions, and implements them. The Federal Tax Administration is part of BMF; it not only controls cross-border goods traffic, but also acts against illegal employment and other crimes. The tax administration also imposes consumer taxes (e.g., energy and tobacco tax, car tax, etc.). Financial relations between the federal government, the states, and the municipalities are also coordinated by BMF.

Department II (federal budget) is the part of the German government in charge of establishing the budget and financial planning of the federation. Throughout the year, it monitors execution of the budget so that it can intervene if necessary (e.g., with a budget freeze or supplementary budget). After the fiscal year closes, the budget and balance sheet are presented. The budget is a legally binding supplement to the budget act.

The central service organization of BMF is the Informationstechnikzentrum Bund, or ITZBund (Federal Information Technology Centre).

BUSINESS BACKGROUND

Drawing up the budget is a yearly, highly time-consuming, and formalized business process. All departments are involved in nearly every sub-process, and budgeting and financial planning is supported by the application, “Haushaltsaufstellung / Budgetgeneration”. The generated reports serve a variety of recipients (e.g., German Federal Government, German Federal Parliament, Federal Council of Germany, finance department in BMF, the employees in the departments, and the public).

Technically, the budget plan of the federation is based on technologies, including the IBM Mainframe with z/OS running Adabas and Natural.

The challenge was to provide an environment for employees in all departments that enables them to do their work quickly, easily, and efficiently. In the BMF, users need editorless, end-user-driven, real-time creation of ready-to-print products. An informative description of the workflow is shown on the website of the BMF.

The federal budget is available as a download, or one can navigate directly through the data using the online application.

BUSINESS ISSUE

Some time ago, BMF decided to re-engineer the application for budget planning and port it to Open Source. To guarantee a seamless transition, the first step is propagation of data out of Adabas on z/OS to PostgreSQL, concluding with permanent synchronization.

The difficulties of this task are the complexities of setting up data definitions for the data structures in Natural and the propagation of data from Adabas on z/OS to PostgreSQL.

TECHNOLOGY SOLUTION: tcVISION

____Adabas_to_PostgreSQL_Diagram

After an analysis of the project, Treehouse Software proposed creating an extension to tcVISION’s change data capture (CDC) functionality for integration, so that tcVISION could enable BMF to continue using the implemented data definitions in a format suitable for the RDBMS.

The extension was developed within a few days, and a two-day on-premises test demonstrated that the solution fit the requirements of BMF.

BMF can now provide its data definitions from Natural LDA to the extension of tcVISION, and after the transformation, onto the PostgreSQL load process for processing. Another advantage of the tcVISION solution is that when needed, other targets can be integrated for propagation of data from the mainframe (e.g., Kafka, which BMF indicated is a future target environment).

Additionally, bi-directional propagation can be added in budget planning when BMF is ready.

Data structures are held in LDA, because this provides the advantages of higher flexibility in development and in the adaptation of the data definitions to new requirements. Had the definitions been ported manually, even in part, to PostgreSQL, the effort would have been much bigger and more error-prone.

Subsequent changes to Adabas structures can now use tcVISION’s newly developed extension to easily regenerate and load the correct definitions to the RDBMS, and tcVISION completely covers the customer’s requirements for special usage of *PEs and *MUs.

After thorough preparation and extensive testing, the solution was released to selected users first, then made available to all users.

* PEs and MUs are special Adabas formats for definition of tables. PE = Periodic Group, MU = Multiple Value Field.
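
To make the footnote more concrete, here is a minimal sketch of one common way repeating Adabas fields can be represented relationally: each MU or PE becomes a child table keyed by the parent record and an occurrence number. This is an illustration only, not tcVISION's actual mapping. The function name, the field names, and the use of the Adabas ISN as the parent key are all assumptions made for the example.

```python
from typing import Any, Dict, Iterable, List, Tuple

Row = Dict[str, Any]


def normalize_record(
    isn: int,
    record: Row,
    mu_fields: Iterable[str],
    pe_groups: Iterable[str],
) -> Tuple[Row, Dict[str, List[Row]]]:
    """Split one record into a parent row plus child rows (hypothetical helper).

    MU fields (lists of values) and PE groups (lists of field groups) each
    become a list of child rows, numbered by occurrence starting at 1.
    """
    mu_fields, pe_groups = set(mu_fields), set(pe_groups)
    parent: Row = {"isn": isn}
    children: Dict[str, List[Row]] = {}
    for name, value in record.items():
        if name in mu_fields:
            # Multiple Value Field: one child row per value.
            children[name] = [
                {"isn": isn, "occurrence": i, "value": v}
                for i, v in enumerate(value, start=1)
            ]
        elif name in pe_groups:
            # Periodic Group: one child row per occurrence of the group.
            children[name] = [
                {"isn": isn, "occurrence": i, **group}
                for i, group in enumerate(value, start=1)
            ]
        else:
            # Ordinary field: stays on the parent row.
            parent[name] = value
    return parent, children


# Made-up sample record with one MU ("keywords") and one PE ("allocations").
record = {
    "title": "Example budget item",
    "keywords": ["roads", "bridges"],
    "allocations": [
        {"year": 2023, "amount": 100},
        {"year": 2024, "amount": 120},
    ],
}
parent, children = normalize_record(7, record, ["keywords"], ["allocations"])
```

In this sketch, the parent row would land in one PostgreSQL table and each entry in `children` in its own table, joined back to the parent on `isn`.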




Further reading: Treehouse Software Customer Success – ETS: tcVISION for Real-Time Synchronization Between Mainframe IDMS and AWS RDS for PostgreSQL

Considerations for Planning Bi-Directional Mainframe Data Replication with tcVISION

by Joseph Brady, Director of Business Development and Cloud Alliance Leader at Treehouse Software, Inc.

Data_Modrnization

Many medium-to-large size enterprises use mainframe systems that house vast amounts of mission-critical data, including historical, customer, and logistics information. Each mainframe site is unique and can have decades’ worth of customizations, requiring innovative approaches to establishing data replication on Cloud and open systems platforms. Fortunately for these customers, Treehouse Software has been in the mainframe software market since 1982, bringing deep experience in mainframe, Cloud, and open systems technologies, as well as delivering the tcVISION mainframe data replication product. Today, Treehouse Software is helping many enterprise mainframe customers accelerate digital transformation and successfully leverage Hybrid Cloud initiatives on the IBM Z platform, storing sensitive data on a private Cloud or local data center and simultaneously leveraging leading technologies on a managed public Cloud.

Treehouse Software’s tcVISION solution focuses on changed data capture (CDC) when transferring information between mainframe data sources and Cloud and open systems-based databases and applications. Changes occurring in the mainframe application data are then tracked and captured, and published to a variety of targets. Additionally, tcVISION supports bi-directional data replication, where changes on either platform are reflected on the other platform (e.g., a change to a PostgreSQL table in the Cloud is reflected back on mainframe), allowing the customer to modernize their application on the Cloud or open systems without disrupting the existing critical work on the legacy system. tcVISION’s bi-directional replication writes directly to the mainframe database, thereby bypassing all mainframe business logic, so this architecture requires careful planning, as well as thorough and repeated testing.

Plan carefully…

The following section offers some real-world customer examples, as well as considerations and recommendations when planning bi-directional replication for any mainframe/RDBMS environment. Bi-directional replication by its nature is a very complicated undertaking, so it is necessary that customers are fully educated in all environments, software, and processes before attempting to write data back to a mainframe database. It is always recommended that customers use the minimum amount of bi-directional replication required to accomplish their goal, and no more. An overblown project with unnecessary bi-directional data replication invites undue complexity and delays.

Real-world customer examples…

Treehouse Software has many customers performing bi-directional data replication, and each scenario is vastly different from the others, even when some share the same sources and targets. For example, some customers utilize a Master/Master, collision-heavy proposition, while others replicate uni-directionally one way, then “flip a switch” and replicate uni-directionally the other way. Another example is a customer who has a “grand circle,” where data hits multiple applications before it finally makes its way back to an RDBMS staging database that tcVISION replicates to the mainframe.

Example of a Treehouse customer’s bi-directional data replication environment using tcVISION:

tcVISION_Adabas_To_AWS_RDS

There are many planning and implementation stages that go into a successful mainframe replication environment, and performance testing is a vital part of a successful project.  For example, customers should do performance tests on how long it takes tcVISION to read a database log, transfer data, process data, etc.  During testing at one of our reference customer sites we found a significant difference in how long it took for their test and prod LPARs to transmit data to the Cloud, based on whether the mainframe TCP/IP stack used a 32-bit or 128-bit setting.
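
As a rough illustration of the stage-by-stage timing described above, the following sketch wraps each stage in a small timer and collects the elapsed seconds per stage. The three stage functions are hypothetical placeholders standing in for reading a log, transferring data, and processing it; they do not call tcVISION, and a real test would measure the actual pipeline instead.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator, List


@contextmanager
def timed(stage: str, results: Dict[str, float]) -> Iterator[None]:
    """Accumulate wall-clock seconds spent in a named stage."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[stage] = results.get(stage, 0.0) + (time.perf_counter() - start)


# Hypothetical placeholder stages, not real tcVISION calls.
def read_log() -> List[int]:
    return list(range(1000))


def transfer(records: List[int]) -> List[int]:
    return list(records)


def process(records: List[int]) -> int:
    return sum(records)


timings: Dict[str, float] = {}
with timed("read_log", timings):
    records = read_log()
with timed("transfer", timings):
    sent = transfer(records)
with timed("process", timings):
    total = process(sent)

for stage, seconds in timings.items():
    print(f"{stage}: {seconds:.6f} s")
```

Running the same measurement on each LPAR and comparing the per-stage numbers is one simple way to spot the kind of test/prod difference mentioned above.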

At another site, where we are helping a large government agency perform bi-directional replication on mainframe data, their original goal was for a significant percentage of mainframe objects to have bi-directional replication. It was determined that it would be impossible to extract business logic from the existing mainframe application for usage in the downstream application. Therefore, they have decided to use a middleware product to perform the “write-back” to the mainframe database.  Given the complexity of the mainframe application, this has proven the safest way for them to proceed.

Because of the variety of customer scenarios as described above, before any site can attempt bi-directional data replication, it is crucial that they have a well-tested uni-directional process with operational controls in place for a significant time period.  “Operational controls” means processes to restart scripts, evaluation of failed transactions, orchestration of mainframe/non-mainframe DBMS changes, etc.

Please contact Treehouse Software to discuss your Mainframe-to-Cloud and Open Systems modernization plans. We can help put in place a roadmap to modernization success.




Providing a High Availability Framework for Mainframe-to-AWS Data Replication

by Dan Vimont, Cloud Solutions Architect at Treehouse Software, Inc.

tcV_HA_on_AWS

Treehouse Software customers are using tcVISION to enable mission-critical mainframe-to-AWS data replication pipelines.  Some of these production pipelines are providing vital near-real-time synchronization between source and target, and thus can’t afford any significant downtime in the event of failure.  So it’s only natural that a number of our customers have been asking for advice in setting up a high availability configuration for their tcVISION components that run on AWS EC2 instances.  The High Availability Framework discussed here provides for a Failover EC2 instance to automatically pick up tcVISION processing should the Primary instance (running in another Availability Zone) go down.

The Core Components:  Primary Instance & Failover Instance

The core components of a tcVISION high availability framework consist of two EC2 instances running in different Availability Zones:  a Primary EC2 instance and a Failover EC2 instance.  Both identically-configured EC2 instances are attached to a shared working-storage file system (either an EFS or FSx volume), which allows the Failover instance to seamlessly and quickly pick up tcVISION processing should the Primary instance suddenly become unavailable.

HA1

Use a Step Function to Automate the Failover Process

In the event of failure of the Primary instance, the recommended framework calls for automatic triggering of a Step Function for reliable failover processing, with steps that include the following:

  • verify that the Primary instance is unavailable (The tcVISION service cannot be active on both instances simultaneously, so this verification is vital.)
  • redirect all network traffic from the Primary instance to the Failover instance (via Route 53)
  • start tcVISION processing on the Failover instance
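
The ordering of the steps above can be sketched in plain Python. This is not the Step Function definition itself, and it does not call any AWS API; the three callables are hypothetical stand-ins for whatever performs the health check, the Route 53 redirection, and the service start in a real deployment. The point of the sketch is the guard: nothing is redirected or started unless the Primary is confirmed unavailable.

```python
from typing import Callable, List


def run_failover(
    primary_is_available: Callable[[], bool],
    redirect_traffic_to_failover: Callable[[], None],
    start_processing_on_failover: Callable[[], None],
) -> List[str]:
    """Run the failover steps in order and return a log of what happened."""
    log: List[str] = []

    # Step 1: the service must never be active on both instances at once,
    # so stop immediately if the Primary still responds.
    if primary_is_available():
        log.append("aborted: primary still available")
        return log
    log.append("verified primary unavailable")

    # Step 2: point network traffic at the Failover instance.
    redirect_traffic_to_failover()
    log.append("redirected traffic to failover")

    # Step 3: start processing on the Failover instance.
    start_processing_on_failover()
    log.append("started processing on failover")
    return log


# Usage with stub callables that simulate a failed Primary instance.
actions: List[str] = []
failover_log = run_failover(
    primary_is_available=lambda: False,
    redirect_traffic_to_failover=lambda: actions.append("redirect"),
    start_processing_on_failover=lambda: actions.append("start"),
)

# A false alarm: the Primary is healthy, so nothing else happens.
false_alarm_actions: List[str] = []
false_alarm_log = run_failover(
    primary_is_available=lambda: True,
    redirect_traffic_to_failover=lambda: false_alarm_actions.append("redirect"),
    start_processing_on_failover=lambda: false_alarm_actions.append("start"),
)
```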

HA2

When Ready, Use a Step Function to Automate the Restoration Process

After operations personnel have completed recovery of the Primary EC2 instance, another Step Function may be manually triggered to reliably transfer tcVISION processing back to the Primary instance.

HA3

Many More Details are Available Upon Request to Treehouse Customers

Full details regarding our recommended High Availability Framework for tcVISION are available upon request to Treehouse customers.  AWS services utilized in the complete recommended framework include Step Functions, Lambda Functions, EventBridge rules, CloudWatch alarms, SNS topics, a Route 53 Private Hosted Zone, and more.  The following diagram is a partial visual inventory of the recommended framework components.

HA5


Some are calling mainframes “dinosaurs”, but many of us see that as a good comparison!

by Joseph Brady, Director of Business Development and Cloud Alliance Leader at Treehouse Software, Inc.

____Cloud_Mainframe_Dinosaur

Since the dinosaur analogy has been used so much to describe mainframe computer systems in recent years, I would like to use this blog to take a look at the parallels between dinosaurs and mainframes as they relate to the current buzz about modernization on the Cloud.

Of course, dinosaurs and mainframes have been around for a long time and are extremely resilient and successful. I especially say “are” in relation to dinosaurs, because many are not extinct at all, and the fossil record shows that several types have adapted to the changing world by evolving into birds. Additionally, during the age of dinosaurs, they branched off into countless varieties during a span of about 165 million years – hardly a failed species. Also, like the dinosaurs, the mainframe has thrived and survived for over six decades and is continuing to adapt – albeit not nearly as long as the reign of the dinosaurs, but an impressive run, nonetheless.

And the mainframe isn’t finished yet! Mainframe systems are still very much in use, running major banking processes, healthcare systems, government IT services, and critical business operations of many Global 2000 companies. As a matter of fact, IBM has been reporting growth year after year, as the IBM Z platform continues to see important innovations, such as with Cloud-native development capabilities, as well as impressive improvements in processing power.

Looking up and moving forward…

____Cloud_Mainframe_Dinosaur04

As with the dinosaurs who did not fear looking to the clouds and taking wing to ensure survival, the new breed of mainframers envision bold and exciting possibilities in Cloud computing. Many see remarkable opportunities for business advantage by modernizing their mainframe environments. This modernization includes replicating mainframe data on Cloud platforms in order to quickly capitalize on the latest Cloud services, such as analytics, auto scaling, machine learning and artificial intelligence (AI), high availability, advanced security, etc., or to move data to a variety of newer Cloud databases, streaming services, container services, and much more. With the proper data replication technology and planning, all of this modernization can occur while keeping the legacy mainframe environment active as long as it is needed!

The IBM Z mainframe isn’t going anywhere, and with visionary and daring leadership, it can continue to evolve and adapt to whatever develops in the Cloud… and beyond.

Ready to move forward, adapt, and evolve? Treehouse Software is here to help!

Treehouse Software is your partner on your journey into future mainframe modernization plans. With our “data first” approach, we can help accelerate digital transformation and successfully leverage Cloud and Hybrid Cloud initiatives on the IBM Z platform, storing sensitive data on a private Cloud or local data center, and simultaneously leveraging leading technologies on a managed public Cloud.

Bidirectional_Data_Replication

Through an innovative changed data capture (CDC) technology, our tcVISION product tracks and captures changes occurring in any mainframe application data, and then publishes them to a variety of Cloud targets. The customer moves only the right data to the right place at the right time – as much, or as little as they want.

The tcVISION data replication solution has a modular design, which enables it to support mass data load from one source to one or more targets, as well as continuous data exchange processes in real time via CDC. This modular architecture and the provided APIs give customers unlimited future potential for continued evolution, and use of new and emerging technologies.



Want to see tcVISION in action?

You can schedule a live, online demonstration that shows tcVISION replicating data from the mainframe to a Cloud target database. Just fill out the Treehouse Software tcVISION Demonstration Request Form and a Treehouse representative will contact you to set up a time for your tcVISION Mainframe-to-Cloud data replication demonstration.

How to Synchronize Data in Real Time Between the Mainframe and AWS with Treehouse Software’s Enterprise CDC Tool

by Joseph Brady, Director of Business Development and Cloud Alliance Leader at Treehouse Software, Inc.

Bidirectional_Data_Replication

Many mainframe integration scenarios require continuous near-real-time replication of relational data to keep a copy of the data synched in the Cloud. Change Data Capture (CDC) is used for this near-real-time transactional replication by capturing change log activity to drive changes in the target dataset.

Just what is CDC anyway?

Simply put, and in relation to Mainframe-to-Cloud and open systems data replication, CDC is the use of processes to identify when data has been changed in a source system, so the replicated upstream or downstream (depending on how you look at it) target can be kept in sync with the changes.
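
To illustrate the idea in the simplest possible terms, here is a minimal sketch of applying a stream of captured changes to a target so that it stays in sync with the source. It uses an in-memory dictionary in place of a real target database, and the `ChangeEvent` shape and operation names are assumptions made for this example, not tcVISION's actual change record format.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One captured change (hypothetical shape, for illustration only)."""
    op: str                               # "insert", "update", or "delete"
    key: int                              # primary key of the affected row
    row: Optional[Dict[str, Any]] = None  # new row image; None for deletes


def apply_changes(target: Dict[int, Dict[str, Any]], events: List[ChangeEvent]) -> None:
    """Replay change events, in order, against an in-memory target table."""
    for event in events:
        if event.op in ("insert", "update"):
            target[event.key] = dict(event.row or {})
        elif event.op == "delete":
            target.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op}")


# Changes captured on the source since the last synchronization.
captured = [
    ChangeEvent("insert", 1, {"name": "Ann", "balance": 100}),
    ChangeEvent("insert", 2, {"name": "Bob", "balance": 50}),
    ChangeEvent("update", 1, {"name": "Ann", "balance": 75}),
    ChangeEvent("delete", 2),
]

target_table: Dict[int, Dict[str, Any]] = {}
apply_changes(target_table, captured)
```

Because only the changes are shipped, the target ends up matching the source without the whole table being copied again.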

In a recent AWS Architecture Blog, readers learn about integration using mainframe data to build Cloud native services with AWS, including transactional replication-based integration via CDC.

____AWS_Mainframe_CDC_Diagram

As mentioned in the blog, AWS Partner CDC Tools are available for connecting data center mainframes to the various data targets, and Treehouse Software’s tcVISION is one of those tools available in the AWS Marketplace.

tcVISION allows changes occurring in any mainframe application data to be tracked and captured, and then published to a variety of target AWS databases and applications. tcVISION provides an easy and fast approach for Hybrid Cloud projects, enabling real-time and bi-directional data replication between the mainframe and AWS.

Example of Db2-to-AWS CDC using tcVISION Mainframe Manager:

tcVISION_Db2_To_AWS_CDC

tcVISION supports several CDC methods, depending on each customer’s use case:

Bulk Transfer

  • Efficient transfer of entire databases
  • Analysis for data consistency (verification)
  • Initial load (ETL) and periodic mass data transfer
  • One-step data transfer

Log Processing

  • Transfer of changed data in near-real-time or on a scheduled time frame
  • Reads both active logs and archived logs

Batch Compare

  • Comparison of data snapshots using checksums
  • Efficient transfer of changed data since last processing
  • Flexible processing options (SORT etc.)
  • Automatic creation of deltas by tcVISION
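
The general idea behind a checksum-based batch compare can be sketched as follows: compute a checksum per row for two snapshots, then derive the inserts, updates, and deletes from the differences. This is an illustration of the technique only, not tcVISION's implementation; the function names, the SHA-256 choice, and the JSON canonicalization are assumptions made for the example.

```python
import hashlib
import json
from typing import Any, Dict, List

Snapshot = Dict[int, Dict[str, Any]]


def row_checksum(row: Dict[str, Any]) -> str:
    """Checksum of a row, independent of the order of its fields."""
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def snapshot_checksums(snapshot: Snapshot) -> Dict[int, str]:
    """Map each key in a snapshot to the checksum of its row."""
    return {key: row_checksum(row) for key, row in snapshot.items()}


def compute_delta(previous: Dict[int, str], current: Dict[int, str]) -> Dict[str, List[int]]:
    """Compare two checksum maps and list the keys that changed."""
    return {
        "insert": sorted(k for k in current if k not in previous),
        "update": sorted(
            k for k in current if k in previous and current[k] != previous[k]
        ),
        "delete": sorted(k for k in previous if k not in current),
    }


# Two made-up snapshots of the same file, taken at different times.
last_run: Snapshot = {
    1: {"name": "Ann", "city": "Berlin"},
    2: {"name": "Bob", "city": "Bonn"},
    3: {"name": "Cid", "city": "Kiel"},
}
this_run: Snapshot = {
    1: {"city": "Berlin", "name": "Ann"},   # same data, different field order
    2: {"name": "Bob", "city": "Cologne"},  # changed
    4: {"name": "Dee", "city": "Ulm"},      # new
}

delta = compute_delta(snapshot_checksums(last_run), snapshot_checksums(this_run))
```

Only the checksums from the previous run need to be kept between runs, which is what makes this approach practical when the source offers no change log.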

DBMS Extension

  • Real-time capture of changed data directly from the DBMS
  • Secure data storage even across DBMS restart
  • Flexible propagation methods


Should You Stay, or Should You Go? You Can Do Both by Incrementally Replicating Your Mainframe Data on the Cloud While Keeping Both Sides Synchronized

by Joseph Brady, Director of Business Development and Cloud Alliance Leader at Treehouse Software, Inc.

Stay_And_Go_Data_Replication

Many of Treehouse Software’s enterprise customers are not close to considering the retirement of their mainframe systems, but instead have long-term data replication projects, or want to indefinitely have their legacy systems co-exist with a new Cloud platform. These organizations are looking for solutions that allow their legacy mainframe environment to continue while replicating data – in real time and bi-directionally – to take advantage of the latest Cloud services, such as analytics, auto scaling, machine learning and artificial intelligence (AI), high availability, advanced security, etc., or move data to a variety of newer Cloud databases, streaming services, container services, and more.

The Transition Doesn’t Have to be a Sudden Big Bang

Much of an enterprise’s mission critical mainframe data is stored in legacy mainframe databases, and the cost to maintain these databases is high.  An added complication is that the data is utilized by many interlinked and dependent programs that have been in place for many years, and sometimes decades. Unlocking the value of this legacy data is also difficult due to many very different types of mainframe databases (e.g., Db2, Adabas, CA Datacom, CA IDMS, etc.).

Immediate data replication on the Cloud is enabling government, healthcare, supply chain, financial, and a variety of public service organizations to meet spikes in demand for vital information, especially in times of crisis. The globalization of markets, increase of data volumes, 24×7 operations, changing business conditions, and high demand for up-to-date information also require new data transfer and exchange solutions for heterogeneous IT architectures.

The Data-First Solution

Treehouse Software is here to help enterprise mainframe customers accelerate digital transformation and successfully leverage Hybrid Cloud initiatives on the IBM Z platform, storing sensitive data on a private Cloud or local data center and simultaneously leveraging leading technologies on a managed public Cloud. Our tcVISION replication solution focuses on changed data capture (CDC) when transferring information between mainframe data sources and modern databases and applications. Through an innovative technology, changes occurring in any mainframe application data are tracked and captured, and then published to a variety of targets. The customer moves only the right data to the right place at the right time – as much, or as little as they want.

The tcVISION replication solution has a modular design, which enables it to support mass data load from one source to one or more targets, as well as continuous data exchange processes in real time via CDC. This modular architecture and the provided APIs give customers unlimited potential for growth and use of new technologies.

tcVISION allows bi-directional, real-time data synchronization of changes on either platform to be reflected on the other platform (e.g., a change to a PostgreSQL table is reflected back on mainframe). The customer can then modernize their application on the cloud, open systems, etc. without disrupting the existing critical work on the legacy system.

In the following example high level architecture diagram, bi-directional data replication between Db2 z/OS and AWS using tcVISION is shown:

___tcVISON_Bidirectional_Db2

tcVISION utilizes a Windows-based GUI Control Board, which is ideal for non-mainframe programmers.  While mainframe experts are required in the design/architecture phase and occasionally during implementation, the requirement for their involvement is limited. The tcVISION Control Board acts as a single point of administration, data modeling and mapping, script generation, and monitoring. Comprehensive monitoring and logging of all data movements ensure transparency across all data exchange processes. In the following example, the mainframe can be seen communicating to an Amazon EC2-based tcVISION replication manager. The tcVISION Control Board shows the user a graphical representation of this replication:

___tcVISION_Control_Board_AWS_Agentless

Additionally, tcVISION supports complex data replication scenarios between multiple data sources and targets, as seen here:

tcVISION_Complex_Replication_Scenarios

With tcVISION, data replication projects can be implemented within a few months, depending on the complexity of the project. This includes the proof of concept and design/architecture stages. After these stages are complete, the customer can start the first production implementation sprint, immediately providing business value. We suggest successive agile sprints to allow for incremental deployment of additional file replication, sprint by sprint.

Supported Sources and Targets

tcVISION supports a vast array of integration scenarios throughout the enterprise, providing easy and fast data replication for Mainframe-to-Cloud and Open Systems application modernization projects.



Contact Treehouse Software for a tcVISION Demo Today…

Just fill out the Treehouse Software tcVISION Demonstration Request Form and a Treehouse representative will contact you to set up a time for your tcVISION demonstration. This will be a live, online demonstration that shows tcVISION replicating data from the mainframe to a Cloud target database.

Mainframe-to-Cloud Data Replication with tcVISION: Recommendations for Roadmapping Your Deployment on a Cloud Environment

by Joseph Brady, Director of Business Development and Cloud Alliance Leader at Treehouse Software

Mainframe_To_Cloud_Roadmap

Careful planning must occur for a Mainframe-to-Cloud data modernization project, including how a customer’s desired Cloud environment will look. This blog serves as a general guide for organizations planning to replicate their mainframe data on Cloud platforms using Treehouse Software‘s tcVISION.

A successful move to the Cloud requires a number of post-migration considerations and solutions in order to modernize an application on the Cloud.  Some examples of these considerations and solutions include: 

Personnel Resource Considerations

Staffing for Mainframe-to-Cloud data replication projects depends on the scale and requirements of your replication project (e.g., bi-directional data replication projects will require more staffing).  

Most customers deploy a data replication product with staff who are knowledgeable in Windows and Linux, at varying levels of seniority.  For the architecture and setup tasks, we recommend senior technical staff to deal with complex requirements around the mainframe, Cloud architecture, networking, security, complex data requirements, and high availability.  Less senior staff are effective for the more repeatable deployment tasks such as mapping new database/file deployments.  Business staff and system staff are rarely required but can be necessary for more complex deployment tasks.  For example, bi-directional replication requires matching keys on both platforms, and input from business and system staff might be required to define them.  Other activities that may need their input include PII considerations, the specifics of data transformation, and data verification requirements.

An example of staffing for a very large deployment might be one very part-time project manager, a part-time mainframe DBA/systems programmer, 1-2 staff to set up and deploy the environment, and an additional 1-2 staff to manage the existing replication processes.

Environment Considerations

As part of the architecture planning, your team needs to decide how many tiers of deployment are needed for your replication project.  Much like with applications, you may want a Dev, QA, and Prod tier.  For each of these tiers, you will need to decide the level of separation.  For example, you might combine Dev and QA, but not Prod.  Many customers will keep production as a distinct environment.  Each environment will have its own set of resources, including mainframe managers (possibly on separate LPARs), Cloud VMs (e.g., EC2) for replication processing, and managed Cloud RDBMSs (such as AWS RDS).

After the required QA testing, changes are deployed to the production environment.  Object promotion test procedures should be detailed and documented, allowing less experienced personnel to work on some testing tasks.  Adherence to details, processes, and extended testing is most important when deploying bi-directional replication, due to the high impact of errors and the difficulty of remediation.

Rollout Planning

A data replication product is typically deployed using Agile methods with sprints.  This allows for incrementally realized business value.  The first phase is typically a planning/architecture phase during which the technical architecture and deployment process are defined.   Files for replication are deployed in groups during sprint planning.  Initial sprint deployments might be low value file replications to shield the business from any interruptions due to process issues.  Once the team is satisfied that the process is effective, replication is working correctly, and data is verified on the source and targets, wide scale deployments can start.  The number of files to deploy in a sprint will depend on the customer’s requirements.  An example would be to deploy 20 mainframe files per 2–3-week sprint.  Technical personnel and business users need to work together to determine which files and deployment order will have the greatest business benefit.
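The sprint arithmetic above can be sketched in a few lines of Python. This is only an illustration, assuming a backlog that has already been ordered by the technical and business teams (low-value files first); the function names, the file names, and the backlog size of 45 are hypothetical, while the 20 files per sprint and 2–3 weeks per sprint come from the example in the text.

```python
from math import ceil


def plan_sprints(files, files_per_sprint=20):
    """Split an ordered list of file names into consecutive sprint batches."""
    if files_per_sprint < 1:
        raise ValueError("files_per_sprint must be at least 1")
    return [files[i:i + files_per_sprint]
            for i in range(0, len(files), files_per_sprint)]


def estimate_weeks(file_count, files_per_sprint=20, weeks_per_sprint=(2, 3)):
    """Return (min_weeks, max_weeks) of sprint time, excluding the planning phase."""
    sprints = ceil(file_count / files_per_sprint)
    return sprints * weeks_per_sprint[0], sprints * weeks_per_sprint[1]


# Hypothetical backlog of 45 files, already ordered so low-value files go first.
backlog = [f"FILE{n:03d}" for n in range(1, 46)]
batches = plan_sprints(backlog)

print(len(batches), [len(batch) for batch in batches])
print(estimate_weeks(len(backlog)))
```

Keeping the ordering decision outside the code is deliberate: which files deliver the most business benefit is a judgment for the project team, and the sketch only turns that ordering into batches and a rough duration.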

Security

For security, both on-premises and in the major Cloud environments, there are several considerations:

  • Data will be replicated between a source and target. The data security for PII data must be considered.  In addition, rules such as HIPAA, FIPS, etc. will govern specific security requirements.
  • The path of the data must be considered, whether it is a private path, or if the data traverses the internet. For example, when going from on-premises to the Cloud, the major Cloud providers have a VPN option which encrypts data going over the internet.  More secure options are also available, such as AWS Direct Connect and Azure ExpressRoute.  With these options, the on-premises network is connected directly to the Cloud provider edge location via a telecom provider, and the data goes over a private route rather than the internet.
  • Additionally, Cloud services such as S3, Azure Blob Storage, and GCP buckets route service connections over the internet by default. Creating a private endpoint (e.g., AWS PrivateLink) allows for a private network connection within the Cloud provider’s network.  Private connections that do not traverse the internet provide better security and privacy.
  • Protecting data at rest is important for both the source and target environments. The modern z/OS mainframe has advanced pervasive encryption capabilities: https://www.redbooks.ibm.com/redbooks/pdfs/sg248410.pdf.  The major Cloud providers all provide extensive at-rest encryption capabilities.  Turning on encryption for Cloud Storage and databases is often just a parameter setting and the Cloud provider takes care of the encryption, keys, and certificates automatically.
  • Protecting data in transit is equally important. There are often multiple transit points to encrypt and protect.  The first is the transit from the on-premises mainframe to the Cloud VM instance.  A mainframe data replication product should protect this path using TLS 1.2, with keys and certificates on both the mainframe and the Cloud.  The second is from the Cloud VM to the Cloud target database or service.  Encryption may be less important here, since these services are often in a private environment.  However, encryption can be achieved as required.
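To make the in-transit point concrete, here is a minimal sketch of how a custom Python client running on the Cloud VM could refuse anything older than TLS 1.2 when connecting to a target service. It uses only Python's standard `ssl` module and is not how tcVISION itself is configured; the function name and the optional certificate file parameters are assumptions for illustration.

```python
import ssl


def make_client_context(ca_file=None, cert_file=None, key_file=None):
    """Build a client-side TLS context that rejects protocol versions below TLS 1.2."""
    # create_default_context enables certificate verification and hostname checking.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    if cert_file:
        # Optional client certificate, for targets that require mutual TLS.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


# With no arguments the system's trusted CA certificates are used.
ctx = make_client_context()
print(ctx.minimum_version.name, ctx.verify_mode.name, ctx.check_hostname)
```

The resulting context can be passed to standard-library clients that accept one, for example `http.client.HTTPSConnection(host, context=ctx)`.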

High Availability

  • During CDC processing, high availability must be maintained in the Cloud environment. The data replication product should keep track of its processing position in two places.  The first can be a Restart file, which keeps track of mainframe log position, target processing position, and uncommitted transactions.  The second can be a container (LUW file) stored on Linux or Windows that holds committed but unprocessed transactions.  Both need to be on highly available storage, preferably storage that spans Availability Zones (AZs), such as Amazon Elastic File System (Amazon EFS) or Amazon FSx for Windows File Server.
  • The Amazon EC2 instance (or other Cloud instance) can be part of an Auto Scaling Group spread across AZs with a minimum and maximum of one Amazon EC2 instance.
  • Upon failure, the replacement Amazon EC2 instance of the replication product’s administrator function is launched and communicates its IP address to the product’s mainframe administrator function. The mainframe then starts communication with the replacement Amazon EC2 instance.
  • Once the Amazon EC2 instance is restarted, it continues processing at the next logical restart point, using a combination of the LUW and Restart files.
  • For production workloads, Treehouse Software recommends turning on Multi-AZ target and metadata databases.
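The restart behavior described in this list can be illustrated with a generic checkpoint pattern. The sketch below is not tcVISION's Restart file format; the JSON layout, field names, and function names are hypothetical. It only shows the idea of writing the position atomically to shared storage so that a replacement instance can pick up where the failed one stopped.

```python
import json
import os
import tempfile


def save_checkpoint(path, log_position, target_position, uncommitted):
    """Write the restart state atomically so a crash never leaves a partial file."""
    state = {
        "log_position": log_position,
        "target_position": target_position,
        "uncommitted": uncommitted,
    }
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Rename within one file system: readers see the old or the new file, never half.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path):
    """Return the last saved state, or a fresh starting state if none exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"log_position": 0, "target_position": 0, "uncommitted": []}


# Demo: a temporary directory stands in for a shared EFS or FSx mount.
with tempfile.TemporaryDirectory() as shared_dir:
    restart_file = os.path.join(shared_dir, "restart.json")
    save_checkpoint(restart_file, log_position=1042, target_position=1040,
                    uncommitted=["TX-17"])
    resumed = load_checkpoint(restart_file)  # what a replacement instance would read

print(resumed["log_position"], resumed["uncommitted"])
```

The write-then-rename step is what makes the checkpoint safe to read after a failure. Rename semantics on network file systems should be verified for the specific storage service before relying on this pattern.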

Scalable Storage

  • With scalable storage provided on most Cloud platforms, the customer pays only for what is used. The data replication product needs file-based storage for its files, which can grow in size if target processing stops for an unexpected reason.  For example, Amazon EFS provides a serverless elastic file system that lets the customer share file data without provisioning or managing storage, and Amazon FSx offers fully managed file systems.

Analytics

  • The top Cloud platform providers offer customers broad and deep portfolios of purpose-built analytics services optimized for a wide range of analytics use cases. Cloud analytics services allow customers to analyze data on demand, and help streamline the business intelligence process of gathering, integrating, analyzing, and presenting insights to enhance business decision making.
  • A data replication product should replicate data to several targets that can easily be consumed by various Cloud-based analytics services. For example, mainframe database data can be replicated to the various Cloud ‘buckets’ in JSON, CSV, or Avro format, which allows for consumption by the various Cloud analytics services.  Bucket types include AWS S3, Azure Blob Storage, Azure Data Lake Storage, and Google Cloud Storage.  Other supported targets used for analytics include Kafka, Elasticsearch, Hadoop, and AWS Kinesis.
  • Kafka has become a common target and can serve as a central data repository. Most customers target Kafka using JSON formatted replicated mainframe data.  Kafka can be installed on-premises or consumed as a managed service, such as Confluent Cloud, Amazon MSK (Managed Streaming for Apache Kafka), or Azure Event Hubs.
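As a rough illustration of what a JSON formatted change record for Kafka might look like, the sketch below builds a key and value pair for one replicated update. The envelope fields (`source_table`, `operation`, `before`, `after`, `captured_at`), the table, and the column names are hypothetical and do not represent tcVISION's actual output format.

```python
import json
from datetime import datetime, timezone

VALID_OPERATIONS = {"insert", "update", "delete"}


def to_kafka_message(table, operation, key, before, after, captured_at):
    """Build a (key_bytes, value_bytes) pair for one replicated change."""
    if operation not in VALID_OPERATIONS:
        raise ValueError(f"unsupported operation: {operation!r}")
    event = {
        "source_table": table,
        "operation": operation,
        "key": key,
        "before": before,  # None for inserts
        "after": after,    # None for deletes
        "captured_at": captured_at.isoformat(),
    }
    # Serializing the record key consistently gives every change to a row the same message key.
    key_bytes = json.dumps(key, sort_keys=True).encode("utf-8")
    value_bytes = json.dumps(event, sort_keys=True).encode("utf-8")
    return key_bytes, value_bytes


key_bytes, value_bytes = to_kafka_message(
    table="CUSTOMER",
    operation="update",
    key={"CUST_ID": 1001},
    before={"CUST_ID": 1001, "CITY": "Pittsburgh"},
    after={"CUST_ID": 1001, "CITY": "Sewickley"},
    captured_at=datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc),
)
print(key_bytes.decode("utf-8"))
print(value_bytes.decode("utf-8"))
```

Using the record key as the Kafka message key is a common choice because keyed messages with the same key are routed to the same partition by the default partitioner, which preserves the order of changes to a given row.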

Monitoring

  • Monitoring is a critical part of any data replication process. There are several levels of monitoring at various points in a data replication project.  For example, each node of the replication including the mainframe, network communication, Cloud VM instances (such as EC2) and the target Cloud database service all can require a level of monitoring.  The monitoring process will also be different in development or QA vs. a full production deployment.
  • A data replication product should also have its own monitoring features. One important area to measure is performance and it is important to determine where any performance bottleneck is located.  Sometimes it could be the mainframe process, the network, the transformation computation process, or the target database.  A performance monitor helps to detect where the bottleneck is occurring and then the customer can drill down into specifics.  For example, if the bottleneck is the input data, areas to examine are the mainframe replication product component performance, or the network connection.  The next step is to monitor the area where the bottleneck is occurring using the data replication product’s statistics, mainframe monitoring tools, or Cloud monitoring such as AWS CloudWatch.
  • A data replication product should also allow the customer to monitor processing functions during the replication process. The data replication product should also have extensive logs and traces that allow for detailed monitoring of the data replication process and produce detailed replication statistics that include a numeric breakdown of processing statistics by table, type of operation (insert, update, delete), and where these operations occurred (mainframe, or target database).
  • CloudWatch collects monitoring and operational data in the form of logs, metrics, and events, providing customers with a unified view of AWS resources, applications, and services that run on AWS, and on-premises servers. You can use CloudWatch to set high resolution alarms, visualize logs and metrics side by side, take automated actions, troubleshoot issues, discover insights to optimize your applications, and ensure they are running smoothly.
  • Some customers are satisfied with basic monitoring that polls every five minutes, while others need more detailed monitoring and can choose polls that occur every minute.
  • CloudWatch allows customers to record metrics for EC2 and other Amazon Cloud Services and display them in a graph on a monitoring dashboard. This provides visual notifications of what is going on, such as CPU per server, query time, number of transactions, and network usage.
  • Given the dynamic nature of AWS resources, proactive measures including the dynamic re-sizing of infrastructure resources can be automatically initiated. Amazon CloudWatch alarms can be sent to the customer, such as a warning that CPU usage is too high, and as a result, an auto scale trigger can be set up to launch another EC2 instance to address the load. Additionally, customers can set alarms to recover, reboot, or shut down EC2 instances if something out of the ordinary happens.
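The CPU alarm mentioned above could be defined along the following lines. This sketch only assembles the parameters for the CloudWatch `put_metric_alarm` call; the helper name, alarm name, 80 percent threshold, three evaluation periods, and instance ID are assumptions chosen for illustration. The period follows the text: five minutes for basic monitoring, one minute for detailed monitoring.

```python
def cpu_alarm_params(instance_id, threshold_percent=80.0,
                     detailed_monitoring=False, alarm_actions=()):
    """Return keyword arguments for a CloudWatch high-CPU alarm on one EC2 instance."""
    # Basic monitoring publishes EC2 metrics every 5 minutes, detailed every 1 minute.
    period_seconds = 60 if detailed_monitoring else 300
    return {
        "AlarmName": f"replication-cpu-high-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": period_seconds,
        "EvaluationPeriods": 3,
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        # ARNs of SNS topics or scaling policies to trigger; empty means no action.
        "AlarmActions": list(alarm_actions),
    }


# Hypothetical instance ID for the replication VM.
params = cpu_alarm_params("i-0123456789abcdef0", detailed_monitoring=True)
print(params["Period"], params["Threshold"])

# With boto3 installed and AWS credentials configured, the alarm could be created with:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```

Separating the parameter construction from the API call keeps the alarm definition easy to review and test without touching an AWS account.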

Disaster Recovery

  • IT disasters such as data center failures, or cyber attacks can not only disrupt business, but also cause data loss, and impact revenue. Most Cloud platforms offer disaster recovery solutions that minimize downtime and data loss by providing extremely fast recovery of physical, virtual, and Cloud-based servers.
  • A disaster recovery solution must continuously replicate machines (including operating system, system state configuration, databases, applications, and files) into a low-cost staging area in a target Cloud account and preferred region.
  • Unlike snapshot-based solutions that update target locations at distinct, infrequent intervals, a Cloud based disaster recovery solution should provide continuous and asynchronous replication.
  • Consult with your Cloud platform provider to make sure you are adhering to their respective best practices.
  • Example: https://docs.aws.amazon.com/whitepapers/latest/disaster-recovery-workloads-on-aws/introduction.html

Artificial Intelligence and Machine Learning

  • Many organizations lack the internal resources to support AI and machine learning initiatives, but fortunately the leading Cloud platforms offer broad sets of machine learning services that put machine learning in the hands of every developer and data scientist. For example, AWS offers SageMaker, GCP has AI Platform, and Microsoft Azure provides Azure AI.
  • Applications that are good candidates for AI or ML are those that need to determine and assign meaning to patterns (e.g., systems used in factories that govern product quality using image recognition and automation, or fraud detection programs in financial organizations that examine transaction data and patterns).

The list goes on…

  • Treehouse Software and our Cloud platform and migration partners can advise and assist customers in designing their roadmaps into the future, taking advantage of the most advanced technologies in the world.
  • Successful customer goals are top priority for all of us, and we can continue to work with our customers on a consulting basis even after they are in production.

Of course, each project will have unique environments, goals, and desired use cases. It is important that specific use cases are determined and documented prior to the start of a project and a tcVISION POC. This planning will allow the Treehouse Software team and the customer to develop a more accurate project timeline, have the required resources available, and realize a successful project.

Your Mainframe-to-Cloud Data Migration Partner…

Treehouse Software is a global technology company and Technology Partner with AWS, Google Cloud, and Microsoft. The company assists organizations with migrating critical workloads of mainframe data to the Cloud.

Further reading on tcVISION from AWS, Google Cloud, and Confluent:

More About tcVISION from Treehouse Software…


tcVISION supports a vast array of integration scenarios throughout the enterprise, providing easy and fast data migration for mainframe application modernization projects. This innovative technology offers comprehensive abilities to identify and capture changes occurring in mainframe and relational databases, then publish the required information to an impressive variety of targets, both Cloud and on-premises.

tcVISION acquires data in bulk or via CDC methods from virtually any IBM mainframe data source (Software AG Adabas, IBM Db2, IBM VSAM, CA IDMS, CA Datacom, and sequential files), and transforms and delivers it to a wide array of Cloud and Open Systems targets, including AWS, Google Cloud, Microsoft Azure, Confluent, Kafka, PostgreSQL, MongoDB, etc. In addition, tcVISION can extract and replicate data from a variety of non-mainframe sources, including Adabas LUW, Oracle Database, Microsoft SQL Server, IBM Db2 LUW and Db2 BLU, IBM Informix, and PostgreSQL.




Video: Mainframe-to-Azure Data Replication with tcVISION from Treehouse Software

by Joseph Brady, Director of Business Development and Cloud Alliance Leader at Treehouse Software, Inc.


Treehouse Software was recently invited by Microsoft Azure Mainframe Modernization technical teams to do a presentation and demonstration of tcVISION, our innovative Mainframe-to-Cloud data replication software product.

In this video, we show an overview of the product, then demonstrate replication of mainframe data on Azure SQL:

Click Here to View the Video



Contact Treehouse Software Today for a tcVISION Demonstration…

No matter where you want your mainframe data to go – the Cloud, Open Systems, or any LUW target – tcVISION from Treehouse Software is your answer.

For more information, please contact customer sales at +1.724.759.7070, email us at sales@treehouse.com, or fill out the Treehouse Software Product Demonstration Request Form and a Treehouse representative will contact you to set up a time for your online tcVISION demonstration.