Reference architecture overview

Page / - last modified by Administrator on 2019/06/21 15:34


Document purpose

This Reference Architecture provides an overview of the architecture of a virtualized business continuity solution that uses the EMC Celerra unified storage platform and Oracle Database 11g on Linux using DNFS and NFS. Key elements, migration details, and hardware and software resources are among the topics discussed.

Information in this document can be used:

  • As the basis for a solution build, white paper, best practices document, or training.
  • By other EMC organizations (for example, the technical services or sales organization) as the basis for producing documentation for a technical services or sales kit.

Solution purpose

The purposes of this solution are to:

  • Demonstrate the functional, performance, resiliency, and scalability capabilities of an Oracle software stack that:
      • Is booted on virtualized hardware and incorporates a four-node VMware High Availability (HA) cluster.
      • Uses either the Oracle Direct NFS (DNFS) protocol (Oracle 11g) or the kernel NFS (KNFS) protocol (Oracle 10g) to access the storage elements for the Oracle Database.

The business challenge

Midsize enterprises face the same challenges as their larger counterparts when it comes to managing database environments. These challenges include:

  • Rising costs
  • Control over resource utilization and scaling
  • Lack of sufficient IT resources to deploy, manage, and maintain complex environments at the departmental level
  • The need to reduce power, cooling, and space requirements

Unlike large enterprises, midsize enterprises are constrained by smaller budgets and cannot afford a custom, one-off solution. This makes the process of creating a database solution for midsize enterprises even more challenging than for large enterprises.

Traditional database deployments have relied on Fibre Channel Protocol (FCP) storage infrastructure in a physically managed environment.

Through significant technology innovations and close collaboration with both Oracle and VMware, EMC is able to offer new advanced solutions, incorporating flexible, tiered storage functionality and multiple protocols, as well as optimized use of virtualization.

Customers’ choice to leverage different storage protocols, functionality, and virtualization should be based on their individual application requirements for utilization, performance, flexibility, and cost.

EMC designed, tested, and documented this solution to specifically address the following challenges:

  • Rising costs

Focus on deploying Oracle over lower-cost NFS protocol environments to reduce cost and improve efficiency of networked storage

  • Resource utilization

Align the technology strengths of networked storage with virtualization software to increase database server and storage utilization, with higher leverage at scale

  • Resource efficiency

Simplify database resource and infrastructure administration utilizing NFS and virtualization. Empower DBAs with enhanced solutions for backup, data movement, and data protection.

This solution documents implementation procedures and best practices for deploying a single-platform solution that achieves these benefits quickly and effectively.

The technology solution

This solution demonstrates how organizations can:

  • Maximize the use of the database-server CPU, memory, and I/O channels by offloading the performance impacts of backup, restore, and recovery operations from the production server.
  • Use virtualization to reduce costs by reducing the number of servers and related IT hardware in the data center.
  • Implement a high-availability solution using a VMware HA cluster.
  • Get the most out of hardware by using VMware DRS to automatically reallocate virtual machines, according to need, across servers.
  • Use DNFS to:
      • Simplify network setup and management by taking advantage of DNFS automated management of tasks, such as load balancing across network connections and tuning of Linux NFS parameters.
      • Increase the capacity and throughput of their existing infrastructure. Transactions per second and user load are both higher with DNFS than with KNFS, enabling more output from the same infrastructure.
  • Use EMC Replication Manager for NFS with EMC SnapSure to free up the database server’s CPU, memory, and I/O channels from the effects of operations relating to backup, restore, and recovery. EMC Replication Manager for NFS with EMC SnapSure writeable checkpoints also helps in creating test/development systems without any impact on the production environment.
  • Ensure business continuity by using Celerra and Oracle Data Guard to provide disaster recovery capability.

Solution components

Introduction
This section describes the components of the solution, and explains some of the key terminology and concepts that are used in this document.

Key terms defined

The following table describes several terms used in this document:

Term | Description
Solution | A solution is a complete stack of hardware and software upon which a customer would choose to run their entire business or business function.
Solution attribute | A solution attribute addresses the entire solution stack, but does so in a way relating to a discrete area of testing. For example, performance testing is a solution attribute.
Solution component | A solution component addresses a subset of the solution stack that consists of a discrete set of hardware or software, and focuses on a single IT function. For example, backup and recovery, and disaster recovery are solution components. A solution component can be either “basic” or “advanced.”
Basic solution component | A basic solution component uses only the features and functionality provided by the Oracle stack. Examples are RMAN for backup and recovery, and Data Guard for disaster recovery.
Advanced solution component | An advanced solution component uses the features and functionality of EMC hardware or software. Examples are EMC SnapView™ for backup and recovery, and EMC MirrorView™ for disaster recovery.

Solution attributes

The following table describes the solution attributes that are included in this solution:

Attribute | Description
Scale-out OLTP | This is the baseline performance test presently used for virtualized solutions. It consists of an industry-standard TPC-C workload running on a cluster of VMware ESX servers. Multiple database images are used for the workload, emulating a software-as-a-service (SaaS) customer. This is a scale-out configuration because users cannot access all rows in the combined databases; each user sees only the data in the database to which that user is connected. Thus, the database is federated.
Resiliency | The purpose of resiliency testing is to validate the fault-tolerance and high-availability features of the hardware and software stack. Faults are inserted into the configuration at various layers in the solution stack. Some of the layers where fault tolerance is tested include the VMware HA cluster interconnect port, the storage processor, and the Data Mover.

Solution components

The following table describes the solution components that are included in this solution:

Component | Description
Basic Backup | This is backup and recovery using Oracle RMAN, the built-in backup and recovery tool provided by Oracle.
Advanced Backup | This is backup and recovery using EMC value-added software or hardware. In this solution EMC SnapSure is used to provide Advanced Backup functionality.
Basic Protect | This is disaster recovery using Oracle Data Guard, Oracle’s built-in remote replication tool.
Test/Dev | A running production OLTP database is cloned with minimal, if any, performance impact on the production server, as well as no downtime. The resulting dataset is provisioned on another server for use for testing and development.
Migration | The ability to migrate a running production Oracle database from one storage protocol to another (FCP/ASM to DNFS or KNFS, or vice versa) is a frequent customer request. This can include full production database cut-over, or the creation of a clone on another storage protocol for backup, test/dev, or other purposes.

Technology solution

Overview

All database objects are stored on an NFS mount. In the case of Oracle Database 11g, datafiles, tempfiles, controlfiles, online redo logfiles, and archive logfiles are accessed using the DNFS protocol.

The Oracle software stack is booted on virtualized hardware and incorporates a four-node VMware High Availability (HA) cluster. For the purpose of high availability, VMotion is used to migrate a virtual machine running the production database from one ESX server to another without any downtime. Each ESX server is connected to the production storage using a dedicated storage network; a separate set of dedicated TCP/IP networks is used for the cluster interconnect. VMware DRS is then used, as needed, to automatically allocate resources to multiple virtual machines for load-balancing purposes.

Two sites connected by a WAN are used in the solution: one site is used for production and the other is used as a disaster recovery target. Oracle 11g or 10g for x86-64 runs on Red Hat Enterprise Linux or Oracle Enterprise Linux on virtual machines.

The solution also includes virtualized servers for use as test/dev and basic protect targets. Virtualization of the test/dev and disaster recovery (DR) target servers is supported using VMware ESX. The Replication Manager server is also virtualized.

Production site

The production site consists of:

  • A Celerra connected to the four-node VMware HA cluster through the production storage network. EMC SnapSure is used to provide an advanced backup solution. SnapSure writeable checkpoints are used to create a test/dev target database.
  • Virtualized Oracle Database 11g or 10g scale-out servers are running on virtual machines hosted on a four-node VMware HA cluster.
  • The virtualized single-instance server is connected to the client, WAN, and target storage networks through virtualized connections on the virtualization server.
  • The Oracle Database 11g or 10g scale-out servers are connected to the client, HA cluster interconnect, WAN, and production storage networks through virtualized connections on the VMware ESX servers.
  • The Replication Manager agent is installed on all production database servers and the mount host where the test/dev target database will be started.

Disaster recovery target site

The disaster recovery target site consists of:

  • A virtualized single-instance Oracle Database 11g or 10g server that is used as:
      • The disaster recovery target for Basic Protect
      • The target for Test/Dev
  • A Celerra connected to the VMware ESX through the disaster recovery storage network. The Oracle Database 11g/10g single-instance target server accesses this network through a virtualized switch on the ESX server.

Storage layout

The following table describes how each Oracle file type and database object is stored for this solution:

What | Where | File-system type
Oracle datafiles, Oracle tempfiles, Oracle online redo logfiles, Oracle controlfiles, VMDK pointer files | FC disks | RAID-protected NFS file system
Archived logfiles, Flashback recovery area, Backup target | SATA II | RAID-protected NFS file system
For implementations using Oracle Database 11g, all files are accessed using DNFS. For implementations using Oracle Database 10g, all files are accessed using KNFS.

  • RAID-protected NFS file systems are designed to satisfy the I/O demands of particular database objects. For example, RAID 5 is sometimes used for the datafiles and tempfiles, but RAID 1 is always used for the online redo logfiles. Two separate RAID configurations are supported. For more information refer to EMC Solutions for Oracle Database 10g/11g for Midsize Enterprises Physically Booted Solutions with EMC Celerra NS40 Unified Storage Platform - Reference Architecture.
  • Oracle datafiles and online redo logfiles reside on separate NFS file systems. Online redo logfiles are mirrored across two different file systems using Oracle software multiplexing. Three NFS file systems are used: one file system for datafiles and tempfiles, and two file systems for online redo logfiles.
  • Oracle controlfiles are mirrored across the online redo logfile NFS file systems.
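
The file-system layout described above can be modeled in a few lines of Python. This is an illustrative sketch only: the mount points (/u02/oradata, /u03/redo1, /u04/redo2), the file names, and the helper functions are hypothetical and are not taken from the validated configuration. It shows one file system for datafiles and tempfiles, and two file systems that each hold one member of every online redo log group and one copy of the controlfile.

```python
# Hypothetical NFS mount points (assumptions for illustration only).
DATA_FS = "/u02/oradata"   # datafiles and tempfiles
LOG1_FS = "/u03/redo1"     # first redo log member and controlfile copy
LOG2_FS = "/u04/redo2"     # second redo log member and controlfile copy


def build_layout(db_name, redo_groups=3):
    """Map each Oracle file type to (file system, file name) pairs."""
    return {
        "datafiles": [(DATA_FS, f"{db_name}_data01.dbf")],
        "tempfiles": [(DATA_FS, f"{db_name}_temp01.dbf")],
        "controlfiles": [
            (LOG1_FS, "control01.ctl"),
            (LOG2_FS, "control02.ctl"),
        ],
        "redo_groups": {
            group: [
                (LOG1_FS, f"redo{group:02d}a.log"),
                (LOG2_FS, f"redo{group:02d}b.log"),
            ]
            for group in range(1, redo_groups + 1)
        },
    }


def is_multiplexed(members):
    """True when the members are spread over at least two file systems."""
    return len({fs for fs, _ in members}) >= 2


layout = build_layout("orcl")

# Collect every distinct file system referenced by the layout.
file_systems = sorted(
    {fs for key in ("datafiles", "tempfiles", "controlfiles")
     for fs, _ in layout[key]}
    | {fs for members in layout["redo_groups"].values() for fs, _ in members}
)
```

A check such as is_multiplexed can be run against each redo log group and against the controlfile list to confirm that losing a single file system does not remove every copy.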

Network architecture

The design implements the following physical connections:

  • TCP/IP provides network connectivity.
  • DNFS provides file system semantics for Oracle Database 11g. KNFS provides file system semantics for NFS file systems on Oracle Database 10g.
  • Client virtual machines run on a VMware ESX server. They are connected to a client network.
  • The client, VMware HA cluster interconnect, and redundant TCP/IP storage networks consist of dedicated network switches and virtual local area networks (VLANs).
  • The VMware HA cluster interconnect and storage networks consist of trunked and virtualized IP connections to balance and distribute network I/O. Jumbo frames are enabled on these networks.

Migration

Introduction

Customers often request the ability to migrate a virtualized Oracle Database across storage protocols. In response to this, the Oracle Consulting (CSV) group has validated that customers who have an Oracle Database operating in a virtualized environment can migrate data from:

  • FCP/ASM storage to an NFS-mounted file system
  • An NFS-mounted file system to FCP/ASM storage

Customers might be interested in cross-protocol migration because testing and development activities can be supported on less expensive, simpler NFS storage, while production can be carried out on higher-performing ASM/FCP storage.

Customers may also want to use the storage replication technologies available on a different protocol from the production database.  For example, customers may wish to use RecoverPoint to support remote replication but have production on NFS.

The migrations were performed with minimal performance impact and no downtime on the virtualized production database.

SAN to NAS migration diagram
The following diagram is a high-level view of the migration component:

[Diagram: SAN to NAS migration]

Migrating an online Oracle Database from SAN to NAS

These steps were followed to perform the SAN to NAS migration operation:

Step | Action
1 | Using EMC Replication Manager, a consistent backup of the running virtualized production database is performed on the EMC CLARiiON® using a SnapView snapshot.
2 | This backup is mounted (but not opened) on the migration server, in this case a VMware virtual machine (a physically booted server would also work). The NFS target array is also mounted on the migration server.
3 | Using Oracle Recovery Manager (RMAN), a backup of this database is taken onto the target location. This backup is performed as a database image, so that the datafiles are written directly to the target NFS mount.
4 | The migration server is then switched to the new database, which has been copied by RMAN to the NFS mount.
5 | The target database is set in Data Guard continuous recovery mode, and Data Guard log ship/log apply is used to catch the virtualized target database up to the virtualized production version.
6 | Once the virtualized target database is caught up to production, Data Guard failover can be used to retarget to the virtualized target database. If appropriate networking configuration is performed, clients will see no downtime when this operation occurs.

The result, as stated above, is that the virtualized production FCP-mounted database can be migrated to NFS with minimal performance impact and no downtime.
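
Steps 3 and 4 above can be sketched as a small Python helper that renders an RMAN script. This is a minimal illustration under stated assumptions, not the procedure used in the validated tests: the function name rman_image_copy_script and the target directory /mnt/nfs_target are hypothetical, and the sketch assumes that the image backup corresponds to the RMAN command BACKUP AS COPY DATABASE and that the switch corresponds to SWITCH DATABASE TO COPY. The Replication Manager and Data Guard steps are outside its scope.

```python
def rman_image_copy_script(target_dir):
    """Render an RMAN script that image-copies the database to target_dir
    and then switches the mounted database to use the copied datafiles."""
    target = target_dir.rstrip("/")  # avoid a doubled slash before %U
    lines = [
        "RUN {",
        # Image copy: datafiles are written directly to the target mount.
        f"  BACKUP AS COPY DATABASE FORMAT '{target}/%U';",
        # Repoint the controlfile at the copied datafiles.
        "  SWITCH DATABASE TO COPY;",
        "}",
    ]
    return "\n".join(lines)


# Hypothetical NFS mount point on the migration server.
script = rman_image_copy_script("/mnt/nfs_target/")
print(script)
```

Generating the script as text keeps the sketch independent of any particular Oracle client installation; the output would be reviewed and run by a DBA against the mounted database.
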
NAS to SAN migration diagram
The following diagram is a high-level view of the NAS to SAN migration component.

[Diagram: NAS to SAN migration]

Migrating an online Oracle Database from NAS to SAN

These steps were followed to perform the NAS to SAN migration operation:

Step | Action
1 | Using EMC Replication Manager, a consistent backup of the running virtualized production database is performed on the EMC Celerra® using a SnapSure checkpoint.
2 | This backup is mounted (but not opened) on the migration server, in this case a VMware virtual machine (a physically booted server would also work). The FCP/ASM target LUNs are also mounted on the migration server.
3 | Using Oracle Recovery Manager (RMAN), a backup of this database is taken onto the target location. This backup is performed as a database image, so that the datafiles are written directly to the target FCP/ASM LUNs.
4 | The migration server is then switched to the new database that has been copied by RMAN to the FCP/ASM LUNs.
5 | The virtualized target database is set in Data Guard continuous recovery mode, and Data Guard log ship/log apply is used to catch the virtualized target database up to the production version.
6 | Once the virtualized target database is caught up to production, Data Guard failover can be used to retarget to the virtualized target database. If appropriate networking configuration is performed, clients will not see any downtime when this operation occurs.

The result, as stated above, is that the virtualized production NFS-mounted database can be migrated to FCP/ASM with minimal performance impact and no downtime.

Key elements

Introduction
This section provides an overview of the technologies that are used in this solution.
EMC Celerra unified storage platform

The Celerra is a remarkably versatile device. Celerra includes a world-class network-attached storage (NAS) array combined with the functionality and performance of the leading midrange storage area network (SAN) array. Celerra provides both NAS and SAN functionality and performance without compromise.

The key features provided by the Celerra are described in the following table:

Feature | Provided by
NAS | Network File System (NFS) and Common Internet File System (CIFS) protocols
iSCSI storage | Celerra’s Data Movers
SAN storage | Fibre Channel Protocol (FCP) through the back-end EMC CLARiiON CX4

EMC Replication Manager

EMC Replication Manager manages EMC point-in-time replication technologies through a centralized management console. Replication Manager coordinates the entire data replication process, from discovery and configuration to the management of multiple application-consistent disk-based replicas. Replication Manager can auto-discover the replication environment and streamline management by scheduling, recording, and cataloging replica information, including auto-expiration.

Replication Manager integrates with the Oracle Database server and provides an easy interface to create and manage Oracle replicas.

EMC Celerra SnapSure

SnapSure creates a logical point-in-time image (checkpoint) of a production file system (PFS) that reflects the state of the PFS at the point in time when the checkpoint is created. SnapSure can maintain up to 96 PFS checkpoints while allowing PFS applications continued access to the real-time data.

How SnapSure works

The principle of SnapSure is “copy on first write” (COFW). When a block within the PFS is modified, a copy containing the block’s original content is saved to a separate volume called the SavVol. Subsequent changes made to the same block in the PFS are not copied into the SavVol. The original blocks from the PFS (in the SavVol) and the unchanged PFS blocks (remaining in the PFS) are read by SnapSure according to a bitmap and blockmap data-tracking structure. These blocks combine to provide a complete point-in-time file system image called a checkpoint.
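
The copy-on-first-write behavior described above can be illustrated with a short Python model. This is a conceptual sketch, not SnapSure code: the class name CheckpointedFileSystem and its methods are hypothetical, a single checkpoint is modeled, and a dictionary stands in for the bitmap and blockmap structures.

```python
class CheckpointedFileSystem:
    """Toy model of a production file system (PFS) with one checkpoint."""

    def __init__(self, blocks):
        self.pfs = list(blocks)   # live PFS blocks
        self.savvol = {}          # block index -> original content
        self.checkpoint_active = False

    def create_checkpoint(self):
        """Start a new point-in-time image; nothing is copied yet."""
        self.savvol.clear()
        self.checkpoint_active = True

    def write(self, index, data):
        """Write to the PFS, saving the original block on first write only."""
        if self.checkpoint_active and index not in self.savvol:
            self.savvol[index] = self.pfs[index]
        self.pfs[index] = data

    def read_checkpoint(self, index):
        """Changed blocks come from the SavVol, unchanged ones from the PFS."""
        if index in self.savvol:
            return self.savvol[index]
        return self.pfs[index]

    def checkpoint_image(self):
        """Assemble the complete point-in-time image."""
        return [self.read_checkpoint(i) for i in range(len(self.pfs))]


fs = CheckpointedFileSystem(["a", "b", "c"])
fs.create_checkpoint()
fs.write(1, "b-v2")   # first write: original "b" is copied to the SavVol
fs.write(1, "b-v3")   # second write to the same block: no further copy
image = fs.checkpoint_image()
```

In this model the live file system sees the latest write, while the checkpoint still returns the content that existed when it was created, and the SavVol holds exactly one saved block.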

VMware ESX
VMware ESX is the flagship virtualization server product from VMware and is the market leader in server virtualization. VMware ESX provides a high-performance, robust, fault-tolerant and high-availability virtualization solution.
Oracle software stack
The Oracle software stack covered by this solution consists of:
  • Oracle Database 11g and 10g
  • Oracle DNFS (11g only; KNFS used on 10g)
  • Oracle Enterprise Linux
Oracle DNFS

DNFS is a new feature introduced in Oracle Database 11g; it integrates the NFS client directly inside the database kernel instead of the operating system kernel. DNFS provides significant performance, manageability, and efficiency benefits over KNFS.

Better performance

Transactions per second and user load are both higher with DNFS than with KNFS, and this enables organizations to gain more output from the same infrastructure. CPU costs on both the database server and the file server are lower. In addition, port scaling with DNFS is much better, enabling higher bandwidth and thus higher scaling.

High availability

Load balancing and high availability (HA) are managed internally within the DNFS client.

Concurrent I/O

The DNFS client performs concurrent I/O by bypassing the operating system. The benefits of this are:

  • Better performance due to the reduction of memory consumption and CPU utilization
  • Consistent NFS performance, which is observed across all operating systems.
Optimized for database workloads

DNFS is optimized for database workloads and supports asynchronous I/O, which is suitable for most databases. It delivers optimized performance by automatically load balancing across the available paths. Load balancing in DNFS is frequently superior to that of the conventional Linux kernel NFS (KNFS).
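
The idea of balancing I/O across the available paths, and of continuing to operate when a path fails, can be shown with a simple round-robin model. This sketch is conceptual only and does not represent how the DNFS client is implemented; the class name PathBalancer and the IP addresses are hypothetical.

```python
class PathBalancer:
    """Round-robin selection over storage paths, skipping failed ones."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.failed = set()
        self._next = 0

    def mark_failed(self, path):
        self.failed.add(path)

    def mark_restored(self, path):
        self.failed.discard(path)

    def pick(self):
        """Return the next healthy path, or raise if none is available."""
        for _ in range(len(self.paths)):
            path = self.paths[self._next % len(self.paths)]
            self._next += 1
            if path not in self.failed:
                return path
        raise RuntimeError("no healthy storage path available")


# Hypothetical addresses of two storage network interfaces.
balancer = PathBalancer(["192.168.1.10", "192.168.2.10"])
balanced = [balancer.pick() for _ in range(4)]

balancer.mark_failed("192.168.1.10")
after_failure = [balancer.pick() for _ in range(3)]
```

With both paths healthy, requests alternate between them; after one path is marked as failed, all requests go to the remaining path without any change on the caller's side.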

VMware Distributed Resource Scheduler
VMware Distributed Resource Scheduler (DRS) is a cluster feature of VMware vCenter Server; it provides dynamic load balancing and resource sharing for multiple virtual machines across ESX servers. DRS uses VMotion as the underlying transport feature to move virtual machines from one ESX server to another.

Physical architecture

Reference architecture diagram
The following diagram depicts the overall physical architecture of the solution.

[Diagram: overall physical architecture of the solution]

Validated environment profile

Profile characteristics
For information on the validated environment profile and performance results, refer to the Proven Solution Guide for this solution. This information can be accessed on EMC Powerlink®, EMC.com, and the EMC|KB.WIKI.

Hardware and software resources

Hardware
The hardware used to validate the solution is listed below.

Equipment: EMC Celerra unified storage platforms (including an EMC CLARiiON CX4 back-end storage array)
Quantity: 2
Configuration:
  • 2 Data Movers
  • 4 GbE network connections per Data Mover
  • 2 or 3 FC shelves
  • 1 SATA shelf
  • 30 or 45 x 73 GB FC disks (depending on configuration)
  • 15 x 1 TB SATA disks
  • 1 Control Station
  • 2 storage processors
  • DART version 5.6.44-4

Equipment: Gigabit Ethernet switches
Quantity: 5
Configuration:
  • 24 ports per switch

Equipment: VMware ESX HA cluster servers
Quantity: 4
Configuration:
  • 2 x 2.66 GHz Intel Pentium 4 quad-core processors
  • 24 GB of RAM
  • 2 x 146 GB 15k internal SCSI disks
  • 2 onboard GbE Ethernet NICs
  • 2 additional Intel PRO/1000 PT quad-port GbE Ethernet NICs
  • 2 SANblade QLE2462-E-SP 4 Gb/s dual-port FC HBAs (4 ports in total)

Equipment: Virtualization server (VMware ESX)
Quantity: 1
Configuration:
  • 4 x 2.86 GHz AMD Opteron quad-core processors
  • 32 GB of RAM
  • 2 x 146 GB 15k internal SCSI disks
  • 2 onboard GbE Ethernet NICs
  • 3 additional Intel PRO/1000 PT quad-port GbE Ethernet NICs
  • 2 SANblade QLE2462-E-SP 4 Gb/s dual-port FC HBAs (4 ports in total)

Software

The software used to validate the solution is listed below.

Software | Version
Oracle Enterprise Linux | 5.2
VMware ESX Server/vSphere | 4.0
Microsoft Windows Server 2003 Standard Edition | 2003
Oracle Database Standard Edition | 11g (11g Ver. 11.1.0.7.0) or 10g
Quest Benchmark Factory for Databases | 5.8.1
EMC Celerra Manager Advanced Edition | 5.6
EMC Navisphere® Agent | 6.26.0.2.24
EMC Replication Manager | 5.2.2.0
EMC FLARE® | 04.28.000.5.504
EMC DART | 5.6.44-4
EMC Navisphere Management | 6.28

Conclusion

Summary
This section provides a summary of the solution and of the business challenges that it addresses.
Reduced total cost of ownership

In any reasonable configuration, the database server's CPU is the most precious component of the entire architecture. Therefore, the over-arching principle of EMC's Oracle Database 11g and 10g solutions for midsize enterprises is to free up the database server's CPU (as well as memory and I/O channels) from utility operations such as backup and recovery, disaster recovery staging, test/dev, and cloning. The highest and best use of the database server’s CPUs is to parse and execute the SQL statements that are required by the application user.

CPU usage

This solution reduces the load on the database server CPU by using:

  • EMC Replication Manager for NFS with EMC SnapSure to carry out a physical backup of an Oracle 11g or 10g production database while offloading all performance impacts of the backup operation from the production server
  • DNFS to achieve better performance due to the reduction of memory consumption and CPU utilization
Virtualization

The use of virtualization for the production database server provides manageability and ease-of-use advantages. In a scale-out context, virtualization can provide superior performance and scalability compared to physically booted configurations - even when using hardware identical to that used in the physically booted configuration.

Utility servers, such as a test/dev target and basic protect target, are more easily and conveniently managed as virtual machines than as physically booted Oracle Database servers. The advantages of consolidation, flexible migration and so on, which are the mainstays of virtualization, apply to these servers as well.

Reduced costs

One of the main challenges faced by the customer is to reduce cost by utilizing infrastructure effectively. Virtualization enables a reduction in the number of servers and related IT hardware in the data center. The other main feature is the ability to move a running production database virtual machine from one physical server to another without any downtime.

Distributed resource scheduling

VMware DRS provides the ability to distribute workload across multiple ESX servers by migrating virtual machines according to resource consumption and demand. VMware DRS dynamically manages pooled resources from multiple ESX servers. VMware DRS aggregates and centrally manages the resources of multiple hosts as resource pools. Different sets of rules and priorities can be defined for individual Oracle virtual machines based on the criticality.

When an Oracle Database server virtual machine operating within a resource pool experiences increased utilization of resources, DRS will try to allocate the resources from the centralized pool based on the rules and priorities.
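
The notion of allocating pooled resources according to rules and priorities can be illustrated with a generic proportional-share allocator. The sketch below is not the DRS algorithm, which this document does not describe; the function name allocate, the virtual machine names, and the numeric values are assumptions chosen for illustration. Each virtual machine receives capacity in proportion to its shares, never more than it demands, and any unused capacity is redistributed to the others.

```python
def allocate(capacity, vms):
    """Share-weighted allocation of a pooled resource.

    vms maps a VM name to a (demand, shares) tuple.
    Returns a dict mapping each VM name to its allocation.
    """
    allocation = {}
    active = dict(vms)
    remaining = float(capacity)
    while active:
        total_shares = sum(shares for _, shares in active.values())
        fair = {
            name: remaining * shares / total_shares
            for name, (_, shares) in active.items()
        }
        # VMs whose demand fits inside their fair share are fully satisfied.
        satisfied = [n for n, (demand, _) in active.items() if demand <= fair[n]]
        if not satisfied:
            # Everyone wants more than their share: split by shares and stop.
            allocation.update(fair)
            break
        for name in satisfied:
            demand = active.pop(name)[0]
            allocation[name] = float(demand)
            remaining -= demand
    return allocation


# Hypothetical pool of 10 CPU units; production has four times the shares.
busy = allocate(10, {"prod_db": (8, 4), "testdev_db": (8, 1)})
quiet = allocate(10, {"prod_db": (3, 4), "testdev_db": (8, 1)})
```

When both virtual machines are busy, the higher-priority production database receives most of the pool; when production demand drops, the spare capacity flows to the test/dev virtual machine.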

High availability

VMotion

The use of VMware VMotion to migrate an Oracle Database instance from one piece of hardware to another was achieved in our tests with virtually no performance impact and no downtime on the running Oracle Database instance. This provides a very high level of manageability and downtime reduction for tasks such as software and hardware upgrades.

VMware HA cluster

The use of a VMware HA cluster provides further high-availability advantages to the virtualized solution. EMC has validated the use of a VMware HA cluster with Oracle Database 11g and 10g in a single-instance scale-out environment. In addition, a VMware HA cluster can be used to automate failover in the event of a hardware or software failure.

Fault Tolerance

The use of VMware Fault Tolerance provides true zero downtime for Oracle VMware-based database servers. This means that a database server running on VMware can be protected from unplanned downtime. This is a significant improvement over the previous version of VMware and is provided on the HA cluster. The use of Fault Tolerance is presently limited to one vCPU, making Fault Tolerance only applicable to smaller Oracle Database servers.

Improved performance
The Direct NFS (DNFS) client performs concurrent I/O by bypassing the operating system. The benefits of this are:
  • Consistent NFS performance is observed across all operating systems.
  • DNFS is optimized for database workloads and supports asynchronous I/O, which is suitable for most databases. It delivers optimized performance by automatically load balancing across the available paths. Load balancing in DNFS is frequently superior to the conventional Linux kernel NFS (KNFS).
Ease of use

The use of DNFS simplifies network setup and management by eliminating administration tasks such as:

  • Setting up network subnets
  • LACP bonding
  • Tuning of Linux NFS parameters

Load balancing and high availability (HA) are managed internally within the DNFS client.

Business continuity

Advanced backup and recovery

EMC Replication Manager is a comprehensive graphical application that provides Oracle storage replication using EMC storage technology. This eliminates the requirement for the customer to write scripts or to manually perform replication tasks. These tasks can now be fully automated and managed by Replication Manager.

In this reference architecture, Replication Manager is used for the Advanced Backup and Recovery solution component, which relies on EMC SnapSure checkpoints.

Test/dev

The ability to deploy a writeable copy of the production database is required by many customers. The process of provisioning this writeable copy must create minimal, if any, performance impact on the production database server. Also, absolutely no downtime can be tolerated. The Test/Dev solution component documented here provides this using EMC Replication Manager for NFS with EMC SnapSure writeable checkpoint.

Robust performance and scaling

The resiliency testing carried out by EMC ensures that the database configuration is reliable. High availability is used at every major layer of the solution, including the database server, NAS file server, and back-end CLARiiON CX4. By testing the fault tolerance of all of these layers, the ability of the application to withstand hardware failures with no downtime is assured.

The performance testing carried out by EMC utilizes an industry-standard OLTP benchmark but does so without exotic tunings that are not compliant with best practices. In addition, real-world configurations that the customer is likely to deploy are used. This enables the customer to be reasonably assured that the configuration that they choose to run their application will do so predictably and reliably.

Next steps

EMC can help accelerate assessment, design, implementation, and management while lowering the implementation risks and costs of an end-to-end solution for an Oracle Database 11g or 10g environment.

To learn more about this and other solutions contact an EMC representative or visit http://www.emc.com/solutions/application-environment/oracle/solutions-for-oracle-database.htm