AWS Services Provide Advanced Monitoring and Analytics of tcVISION’s Mainframe CDC Processing

by Joseph Brady, Director of Business Development and Cloud Alliance Leader at Treehouse Software, Inc.


Many Treehouse Software mainframe modernization customers require continuous, near-real-time replication of mainframe data in order to keep a synchronized copy of that data in the Cloud. These customers use tcVISION from Treehouse Software for change data capture (CDC), which tracks and captures changes occurring in any mainframe application data and publishes them to a variety of AWS targets, including Amazon Simple Storage Service (Amazon S3). Some of these customers are now also asking us to recommend the best Cloud-based tools and methods for monitoring and gaining insight into these complex data processes. While working with a current tcVISION customer, our technicians have been testing two particularly good, fully managed AWS services that work hand-in-hand to address this need:

Amazon Athena

Since tcVISION supports Amazon S3 as a target, customers modernizing their mainframe systems on AWS can use Amazon Athena for monitoring and analysis of CDC processing from an S3 bucket.

Amazon Athena is a serverless, interactive analytics service built on open-source frameworks, supporting open-table and file formats. Athena provides a simplified, flexible way to analyze data in an S3 bucket, as well as many other data sources, including on-premises sources and other Cloud systems. Athena is built on the open-source Trino and Presto engines and Apache Spark frameworks, with no provisioning or configuration effort required.

Figure 1: Example of an Athena query showing bulk-load statistics per table

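A query along these lines can also be run programmatically. Below is a minimal Python (boto3) sketch that submits an Athena query and prints the results; the database, table, column, and bucket names (tcvision_monitoring, tcvision_cdc_logs, rows_loaded, and so on) are hypothetical placeholders, not tcVISION’s actual statistics schema:

    import time
    import boto3

    athena = boto3.client("athena")

    # Hypothetical names: the database, table, columns, and result bucket
    # below are placeholders, not tcVISION's actual statistics schema.
    QUERY = """
        SELECT table_name,
               SUM(rows_loaded) AS total_rows,
               MAX(load_end_ts) AS last_load
        FROM tcvision_cdc_logs
        GROUP BY table_name
        ORDER BY total_rows DESC
    """

    execution = athena.start_query_execution(
        QueryString=QUERY,
        QueryExecutionContext={"Database": "tcvision_monitoring"},
        ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
    )
    query_id = execution["QueryExecutionId"]

    # Athena runs queries asynchronously, so poll until the query finishes.
    while True:
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(2)

    if state == "SUCCEEDED":
        results = athena.get_query_results(QueryExecutionId=query_id)
        for row in results["ResultSet"]["Rows"][1:]:  # skip the header row
            print([col.get("VarCharValue") for col in row["Data"]])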

Amazon QuickSight


Once Athena is set up to monitor an S3 bucket, users can easily view their CDC processing and analytics with Amazon QuickSight. QuickSight offers advanced, machine-learning-powered insights and intuitive dashboards, so end users can quickly make well-informed, data-driven business decisions.
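For teams that prefer to script this wiring, here is a minimal boto3 sketch that registers Athena as a QuickSight data source; the account ID, data source ID, and names are placeholders, and datasets and dashboards would typically be built afterward in the QuickSight console:

    import boto3

    quicksight = boto3.client("quicksight")

    # The account ID, data source ID, and names are placeholders.
    response = quicksight.create_data_source(
        AwsAccountId="111122223333",
        DataSourceId="tcvision-athena-source",
        Name="tcVISION CDC logs (Athena)",
        Type="ATHENA",
        DataSourceParameters={"AthenaParameters": {"WorkGroup": "primary"}},
    )
    print(response["CreationStatus"])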

Figure 2: Example of Amazon QuickSight monitoring the throughput of our data to Snowflake


Figure 3: Example of an Amazon QuickSight pie chart showing the resulting rows loaded for each Snowflake table


Figure 4: Example of an Amazon QuickSight chart showing statistics for our data bulk-load into Snowflake


Figure 5: Example of an Amazon QuickSight chart showing our load time into Snowflake per table


View the Amazon QuickSight video here…



Interested in seeing a live, online demo of tcVISION?

Just fill out the Treehouse Software tcVISION Demonstration Request Form and a Treehouse representative will contact you to set up a time for your online tcVISION demonstration.


Providing a High Availability Framework for Mainframe-to-AWS Data Replication

by Dan Vimont, Cloud Solutions Architect at Treehouse Software, Inc.


Treehouse Software customers are using tcVISION to enable mission-critical mainframe-to-AWS data replication pipelines.  Some of these production pipelines are providing vital near-real-time synchronization between source and target, and thus can’t afford any significant downtime in the event of failure.  So it’s only natural that a number of our customers have been asking for advice in setting up a high availability configuration for their tcVISION components that run on AWS EC2 instances.  The High Availability Framework discussed here provides for a Failover EC2 instance to automatically pick up tcVISION processing should the Primary instance (running in another Availability Zone) go down.

The Core Components:  Primary Instance & Failover Instance

The core components of a tcVISION high availability framework consist of two EC2 instances running in different Availability Zones:  a Primary EC2 instance and a Failover EC2 instance.  Both identically-configured EC2 instances are attached to a shared working-storage file system (either an EFS or FSx volume), which allows the Failover instance to seamlessly and quickly pick up tcVISION processing should the Primary instance suddenly become unavailable.


Use a Step Function to Automate the Failover Process

In the event of failure of the Primary instance, the recommended framework calls for automatic triggering of a Step Function for reliable failover processing, with steps that include the following (a rough sketch of this logic in code appears after the list):

  • verify that the Primary instance is unavailable (The tcVISION service cannot be active on both instances simultaneously, so this verification is vital.)
  • redirect all network traffic from the Primary instance to the Failover instance (via Route 53)
  • start tcVISION processing on the Failover instance
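To make those steps concrete, here is a hedged Python sketch of the kind of logic the Step Function’s Lambda tasks might perform. Every identifier (instance IDs, hosted zone, record name, and the tcVISION start command) is a placeholder; the full Treehouse-recommended implementation differs in its details and is available on request:

    import boto3

    ec2 = boto3.client("ec2")
    route53 = boto3.client("route53")
    ssm = boto3.client("ssm")

    # All identifiers below (instance IDs, hosted zone, record name, and the
    # start command) are placeholders for this sketch.
    PRIMARY_ID = "i-0primary0000000000"
    FAILOVER_ID = "i-0failover000000000"
    HOSTED_ZONE_ID = "Z0123456789EXAMPLE"
    RECORD_NAME = "tcvision.internal.example.com"

    def primary_is_down() -> bool:
        """Step 1: confirm the Primary instance really is unavailable."""
        statuses = ec2.describe_instance_status(
            InstanceIds=[PRIMARY_ID], IncludeAllInstances=True
        )["InstanceStatuses"]
        return not statuses or statuses[0]["InstanceState"]["Name"] != "running"

    def redirect_dns_to(target_ip: str) -> None:
        """Step 2: repoint the Route 53 record at the Failover instance."""
        route53.change_resource_record_sets(
            HostedZoneId=HOSTED_ZONE_ID,
            ChangeBatch={"Changes": [{
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": RECORD_NAME,
                    "Type": "A",
                    "TTL": 60,
                    "ResourceRecords": [{"Value": target_ip}],
                },
            }]},
        )

    def start_tcvision_on_failover() -> None:
        """Step 3: start tcVISION processing on the Failover instance via SSM."""
        ssm.send_command(
            InstanceIds=[FAILOVER_ID],
            DocumentName="AWS-RunShellScript",
            Parameters={"commands": ["systemctl start tcvision"]},  # illustrative
        )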


When Ready, Use a Step Function to Automate the Restoration Process

After operations personnel have completed recovery of the Primary EC2 instance, another Step Function may be manually triggered to reliably transfer tcVISION processing back to the Primary instance.
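Manually triggering that Step Function can be as simple as a single boto3 call; the state machine ARN and execution name below are placeholders:

    import boto3

    sfn = boto3.client("stepfunctions")

    # The state machine ARN and execution name are placeholders.
    sfn.start_execution(
        stateMachineArn="arn:aws:states:us-east-1:111122223333:stateMachine:tcvision-restore",
        name="restore-to-primary-001",
        input="{}",
    )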


Many More Details are Available Upon Request to Treehouse Customers

Full details regarding our recommended High Availability Framework for tcVISION are available upon request to Treehouse customers.  AWS services utilized in the complete recommended framework include Step Functions, Lambda Functions, EventBridge rules, CloudWatch alarms, SNS topics, a Route 53 Private Hosted Zone, and more.  The following diagram is a partial visual inventory of the recommended framework components.


Interested in seeing a live, online demo of tcVISION?

Just fill out the Treehouse Software tcVISION Demonstration Request Form and a Treehouse representative will contact you to set up a time for your online tcVISION demonstration.



How to Synchronize Data in Real Time Between the Mainframe and AWS with Treehouse Software’s Enterprise CDC Tool

by Joseph Brady, Director of Business Development and Cloud Alliance Leader at Treehouse Software, Inc.


Many mainframe integration scenarios require continuous near-real-time replication of relational data to keep a copy of the data synchronized in the Cloud. Change Data Capture (CDC) is used for this near-real-time transactional replication, capturing change-log activity to drive changes in the target dataset.

Just what is CDC anyway?

Simply put, and in relation to Mainframe-to-Cloud and open-systems data replication, CDC is the use of processes to identify when data has been changed in a source system, so that the replicated target (upstream or downstream, depending on how you look at it) can be kept in sync with those changes.
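As a toy illustration only (this is not how tcVISION is implemented), replaying a stream of captured change events against a target copy looks like this:

    # Toy illustration of log-based CDC: captured change events are replayed
    # in order to keep a target copy in sync with the source.
    events = [
        {"op": "insert", "key": 1, "row": {"name": "Acme", "balance": 100}},
        {"op": "update", "key": 1, "row": {"name": "Acme", "balance": 250}},
        {"op": "delete", "key": 1},
    ]

    target: dict[int, dict] = {}

    for event in events:
        if event["op"] in ("insert", "update"):
            target[event["key"]] = event["row"]
        elif event["op"] == "delete":
            target.pop(event["key"], None)

    print(target)  # {} -- the insert, update, and delete were all replayed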

In a recent AWS Architecture Blog, readers learn about integration using mainframe data to build Cloud native services with AWS, including transactional replication-based integration via CDC.


As mentioned in the blog, AWS Partner CDC Tools are available for connecting data center mainframes to the various data targets, and Treehouse Software’s tcVISION is one of those tools available in the AWS Marketplace.

tcVISION allows changes occurring in any mainframe application data to be tracked and captured, and then published to a variety of target AWS databases and applications. tcVISION provides an easy and fast approach for Hybrid Cloud projects, enabling real-time and bi-directional data replication between the mainframe and AWS.

Example of Db2-to-AWS CDC using tcVISION Mainframe Manager


tcVISION supports several CDC methods, depending on each customer’s use case:

Bulk Transfer

  • Efficient transfer of entire databases
  • Analysis for data consistency (verification)
  • Initial load (ETL) and periodic mass data transfer
  • One-step data transfer

Log Processing

  • Transfer of changed data in near real time or on a scheduled time frame
  • Reads both active logs and archived logs

Batch Compare

  • Comparison of data snapshots using checksums
  • Efficient transfer of changed data since last processing
  • Flexible processing options (SORT etc.)
  • Automatic creation of deltas by tcVISION

DBMS Extension

  • Real-time capture of changed data directly from the DBMS
  • Secure data storage even across DBMS restart
  • Flexible propagation methods

Interested in seeing a live, online demo of tcVISION CDC?

Just fill out the Treehouse Software tcVISION Demonstration Request Form and a Treehouse representative will contact you to set up a time for your online tcVISION demonstration.



Treehouse Software Customer Case Study: A State Government Agency’s Real-time Data Synchronization Between IBM Mainframe Adabas and AWS

by Joseph Brady, Director of Business Development and Cloud Alliance Leader at Treehouse Software, Inc.


Software AG’s Adabas is a mainframe database that is still heavily used by government sites throughout the U.S. and the world, and this blog focuses on a current Treehouse Software customer – a U.S. State Government Agency that uses Adabas on their mainframe system.

Business Issue

The Agency’s modernization team was looking for a Change Data Capture (CDC) technology solution that would enable them to synchronize their mainframe Adabas data on AWS, particularly in Amazon RDS. As with most Treehouse customers, the State’s mainframe contains vital data that must always be highly available, so rather than attempting a complete migration off the mainframe, the modernization teams decided to implement a multi-year data replication plan. This allows the mainframe legacy teams to maintain existing critical applications while the modernization team develops new applications on AWS.

After researching various technologies, the Agency discovered tcVISION on the AWS Partner Network Blog and contacted Treehouse Software to discuss their project and to see a demonstration of Mainframe-to-AWS data replication.

Addressing the Uniqueness of Adabas

Having specialized in tools and services complementary to Adabas/Natural applications since 1982, Treehouse Software has successfully encountered and addressed many unique scenarios within the Adabas environment. The Treehouse technical team documented three primary issues with Adabas/Natural that the Agency needed to consider when they began planning data replication on AWS:

  1. Adabas has no concept of “transaction isolation”, in that a program may read a record that another program has updated, in its updated state, even though the update has not been committed.  This means that programmatically reading a live Adabas database—one that is available to update users—will almost inevitably lead to erroneous extraction of data.  Record modifications (updates, inserts and deletes) that are extracted, and subsequently backed out, will be represented incorrectly—or not at all—in the target. Because of this, at Treehouse we say “the only safe data source is a static data source”—not the live database.
  2. Many legacy Adabas applications make use of “record typing”, i.e., multiple logical tables stored in a single Adabas file.  Often, each must be extracted to a separate table in the target RDBMS.  The classic example is that of the “code-lookup file”.  Most shops have a single file containing state codes, employee codes, product-type codes, etc.  Records belonging to a given “code table” may be distinguished by the presence of a value in a particular index (descriptor or superdescriptor in ADABAS parlance), or by a range of specific values.  Thus, the extraction process must be able to dynamically assign data content from a given record to different target tables depending on the data content itself.  (A simplified routing sketch follows this list.)
  3. Adabas is most often used in conjunction with Software AG’s Natural 4GL, and “conveniently” provides for unique datatypes (“D” and “T”) that appear to be merely packed-decimal integers on the surface, but that represent date or date-time values when interpreted using Software AG’s proprietary Natural-oriented algorithm. The most appropriate way to migrate such datatypes is to recognize them and map them to the corresponding native RDBMS datatype (e.g., Oracle DATE) in conjunction with a transformation that decodes the Natural value and formats it to match the target datatype.
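As a simplified, purely illustrative sketch of the record-typing problem in item 2, the routine below routes records from a single code-lookup file to different target tables based on a hypothetical record-type value. tcVISION expresses such rules declaratively in its mapping facility rather than in hand-written code:

    # Purely illustrative: one "code-lookup file" holds several logical code
    # tables, distinguished here by a hypothetical record-type value.
    TARGET_TABLE_BY_TYPE = {
        "ST": "state_codes",
        "EM": "employee_codes",
        "PR": "product_type_codes",
    }

    def route_record(record: dict) -> tuple[str, dict]:
        """Assign a source record to a target table based on its own content."""
        table = TARGET_TABLE_BY_TYPE.get(record["record_type"])
        if table is None:
            raise ValueError(f"unknown record type {record['record_type']!r}")
        return table, {"code": record["code"], "description": record["description"]}

    # A 'state code' record lands in the state_codes table:
    print(route_record({"record_type": "ST", "code": "PA", "description": "Pennsylvania"}))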

The tcVISION Technology Solution...


After technical discussions and a successful proof of concept (POC) that proved out a set of use cases, all teams at the Agency determined that tcVISION real-time mainframe data replication capabilities were the perfect fit for meeting their goals.

tcVISION’s modeling and mapping facilities are utilized to view and capture logical Adabas structures, as documented in Software AG’s PREDICT data dictionary, as well as physical structures as described in Adabas Field Definition Tables (FDTs).  Given that PREDICT is a “passive” data dictionary (there is no requirement that the logical and physical representations agree), it was necessary to scrutinize both to ensure that the source structures were accurately modeled.

Furthermore, tcVISION generates appropriate mappings and transformations for converting Adabas datatypes and structures to corresponding target datatypes and structures, including automatic handling of the proprietary “D” and “T” source datatypes.

The teams examined the three ways that tcVISION can access Adabas data:

  1. ETL – read the active database nucleus
  2. ETL – read datasets containing unloaded Adabas files created by the ADAULD utility
  3. CDC – read the active and archived PLOG datasets

The teams decided to access the data by reading the active and archived PLOG datasets. The schema, mappings, and transformations from the metadata import were tailored to the customer’s specific requirements.  It is also now possible to import an existing RDBMS schema and retrofit it, via drag-and-drop in tcVISION, to the source Adabas elements.

Additionally, the Agency’s teams are very pleased with tcVISION’s minimal usage of mainframe resources. The product’s “staged processing” methodology accomplishes this: the only processing occurring on the mainframe is the capture of changes from Adabas PLOGs, while the bulk of the processing occurs on the AWS side, minimizing tcVISION’s footprint on the mainframe.


The user defines the platform on which each processing stage runs. The guiding principle is to do as little as possible on the mainframe: Stage 0 captures data and sends it (in internal format) to the target, while Stages 1 – 3 process the data in AWS.

Customer Outcome

All requirements were met by tcVISION, which led to a successful project implementation.


Contact Treehouse Software for a tcVISION Demo Today…

No matter where you want your mainframe data to go – the Cloud, open systems, or any LUW target – tcVISION from Treehouse Software is your answer.

Just fill out the Treehouse Software tcVISION Demonstration Request Form and a Treehouse representative will contact you to set up a time for your online tcVISION demonstration.


Further reading:

Many more mainframe data migration and replication customer case studies can be read on the Treehouse Software Website.

Treehouse Software Technology Partner Cognistx Brings Cognitive Computing to Screens Around the World


Cognistx is transforming the customer buying experience using cognitive computing, which it claims has the potential to become the most disruptive technology of the next 20 years. Wake Forest Innovation Quarter’s The Hub features the latest from Cognistx. Read the article here.


About the Treehouse Software / Cognistx Partnership

Since the mid-1990s, Treehouse Software has provided world-class enterprise data acquisition capabilities, and we now offer Cognistx’s cognitive computing capabilities for the most advanced interaction with your customers.


The Cognistx platform complements Treehouse Software’s tcVISION, the comprehensive product that can acquire data in bulk or via change data capture methods, including in real time, from virtually any IBM mainframe data source (Software AG Adabas, IBM DB2, IBM VSAM, IBM IMS/DB, CA IDMS, CA Datacom, even sequential files), and transform and deliver to virtually any target. In addition, the same product can extract and replicate data from a variety of non-mainframe sources, including Adabas LUW, Oracle Database, Microsoft SQL Server, IBM DB2 LUW and DB2 BLU, IBM Informix and PostgreSQL.

If this exciting new technology is of interest to you, we would be happy to have a conversation about your company’s needs, so do not hesitate to contact us if you have questions.  Meanwhile, please visit the Treehouse Software Cognistx Web Page for more information.

 

Two Local Technology Companies Partner to Advance Cognitive Computing; Complementary Areas of Expertise Mean Better Data Integration and Individualization

Treehouse Software, Inc., of Sewickley, PA and Cognistx of Pittsburgh, PA announced a partnership to help customers with improved data integration and individualization to fully leverage the power of cognitive computing.

Technology industry leaders from Accenture to Gartner to McKinsey recognize the future of computing will be cognitive, calling it a disruptive force and estimating the industry to reach $200 billion by 2020. Cognitive computing is based on leading edge technology including artificial intelligence, natural language processing, Big Data, advanced analytics and machine learning algorithms.


The Treehouse – Cognistx partnership will allow customers to ingest massive amounts of data, whether that data is numbers, images, or audio files, and mine it to find insights that lead to action, and ultimately to increased revenue from improved customer engagement.

Since the mid-1990s, Treehouse Software has been a global leader in mainframe data migration, replication and integration, offering robust and flexible solutions for ETL, CDC and real-time, multidirectional replication between databases on various platforms.

Cognistx is an applied technology company harnessing state-of-the-art cognitive computing tools to help retailers reach individuals with intuitive, intelligent and individualized offers based on their past transactions, preferences, context and profile.


Cognistx complements Treehouse’s capability to deliver data with its machine learning algorithms that become more accurate with every transaction, delivering customized, personalized, prescriptive actions in the right context. Together, the two companies will co-market their capabilities, bringing new competitive advantages to customers who want to expand the use of their most valuable asset — data.

“We’re excited to partner with Cognistx to bring our world-class enterprise data acquisition capabilities to companies that recognize the massive opportunity cognitive computing represents,” said Wayne Lashley, Treehouse Chief Business Development Officer. “We provide the data foundation and Cognistx translates that data into insights, those insights into customer actions, and those actions into incremental revenue.”

“Few retailers do a good job of marrying technology with a customized customer experience that is tailored to their behaviors and timed according to how they might use a retailer’s offer,” said Sanjay Chopra, CEO of Cognistx. “With our proprietary algorithms and Treehouse’s enterprise data solutions, both our customers win. Only with large amounts of data can our system learn about the consumer and their preferences and how those change in order to deliver only the smartest, most individualized offers.”

About Treehouse Software, Inc.

Privately-held Treehouse Software was founded in 1982 and is a global leader in providing data migration, replication, and integration solutions for the most complex and demanding heterogeneous environments. Treehouse offers a comprehensive and flexible portfolio of software and tools for mainframe platforms, including feature-rich, accelerated-ROI offerings for information delivery and application modernization. http://www.treehouse.com

About Cognistx

Privately-held Cognistx was founded in 2015 and has a technology hub in Pittsburgh and operations offices in the Innovation Quarter in Winston-Salem, NC, and Raleigh. The company’s co-founders include Sanjay Chopra, a serial technology entrepreneur; Eric Nyberg, professor at Carnegie Mellon University’s School of Computer Science, who consulted with IBM on the Watson project; and Jeffrey Battin, former owner of Communefx, a successful data analytics company. Other partners include Florian Metze, professor at Carnegie Mellon University’s School of Computer Science; Jill Zoria, SVP Enterprise Development; Pete Minnelli, SVP Creative; and Karen Barnes, SVP Operations. http://www.cognistx.com