OVM SPARC (Ldom) root domain

This interesting whitepaper discusses root domains and their use cases on SPARC T4 servers.

  • Root domains are a type of logical domain characterized by the fact that they own one or more PCIe root complexes and therefore do not share I/O components with other logical domains.
  • System resources are simply divided among the domains rather than being virtualized and shared.
  • This type of virtualization is unique to the SPARC systems supported by Oracle VM Server for SPARC and offers a number of benefits, specifically:
    • bare-metal performance (i.e. no virtualization overhead),
    • lower interdependence among domains, I/O fault isolation, and improved security, and
    • simple setup.
  • The number of root domains is limited by the number of PCIe root complexes available in the platform. The SPARC T4 processor has one PCIe root complex per socket, so:
    • a SPARC T4-2 system has two PCIe root complexes and therefore two root domains,
    • a four-socket SPARC T4-4 system has four root domains, and
    • a two-socket SPARC T4-4 system has two root domains.
  • Restrictions:
    • less flexibility: changing a root domain's configuration requires downtime
    • root domains do not support live migration
    • They require more planning, particularly during the purchase phase, to ensure there are enough NICs and HBAs, as these components are not shared across domains.
    • Some root domains may not have access to local disks or networks; these must be provided via PCIe cards or by configuring virtual I/O devices.
  • The best practice with root domains is to use Solaris Zones as the virtualization (V12N) layer. Zone workloads offer:
    • Workload isolation. Applications can each have their own virtual Solaris environment.
    • Administrative isolation. Each Zone can have a different administrator.
    • Security isolation. Each Zone has its own security context, and compromising one Zone does not imply that other Zones are also compromised.
    • Resource control. Solaris has very robust resource management capabilities that fit well with Zones. This includes CPU capping (including for Oracle license boundaries), Fair Share Scheduler, memory capping and network bandwidth monitoring and management.
    • Workload mobility. Zones can be easily copied, cloned and moved, usually without the need to modify the hosted application. More importantly, Zone startup and shutdown is significantly quicker than traditional VM migrations
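As a sketch of the mobility point above, a zone can be cloned on the same system as follows (zone names and paths are hypothetical; the source zone must be installed and halted before cloning):

    # zonecfg -z webzone export -f /tmp/webzone.cfg    (capture the source zone's configuration)
    # zonecfg -z webzone2 -f /tmp/webzone.cfg          (create the new zone from it; adjust zonepath and network first)
    # zoneadm -z webzone2 clone webzone                (clone the halted source zone)
    # zoneadm -z webzone2 boot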
  • Using guest domains for Test and Development
    • For development systems it is often more desirable to implement domains using the service domain model.
    • This affords maximum flexibility where performance is generally not a concern. For maximum flexibility place the application or workload within a Zone inside the domain, as this becomes the entity that is promoted through the development and test lifecycle.
    • Functional test can be implemented the same way. Zones can be copied, cloned and/or moved from development to test. Zones can further be cloned multiple times to test in parallel if desired.
  • Migrating Zones to the Production environment
    • When maximum performance is desired, production and pre-production can be implemented with root domains.
    • Ideally, pre-production will be on hardware identical to the production system and with an identical root domain configuration.
    • Workloads can be migrated from development or functional test guest domains to the root domains that exist in the Prod and Pre-Production environments.
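The migration described above can be sketched as a detach/attach sequence (zone and path names are hypothetical; the zonepath must be made visible to the target root domain, e.g. via shared storage or a ZFS send/receive):

    source# zoneadm -z appzone halt
    source# zoneadm -z appzone detach
    (move /zones/appzone to storage visible to the target root domain)
    target# zonecfg -z appzone create -a /zones/appzone
    target# zoneadm -z appzone attach -u               (-u updates the zone's packages to match the target system)
    target# zoneadm -z appzone boot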
  • Use cases
    • image

T4-4 with four sockets
  • The four possible root complexes are defined by the top level PCI path, and are as follows:
    • RC0: pci@400, green
    • RC1: pci@500, blue
    • RC2: pci@600, purple
    • RC3: pci@700, orange
    • It should be noted that the allocation of disks and on-board 1GbE ports is not spread evenly across all the root complexes.
      • RC0 has half the disks and two 1GbE ports,
      • RC2 has two 1GbE ports, and RC3 has the other half of the disks.
      • This needs to be taken into consideration when creating root domains, as
        • neither RC1 nor RC2 have access to local boot disks, and
        • neither RC1 nor RC3 have access to local 1GbE ports.
        • This is easily overcome by providing PCIe cards with this functionality, or having the other domains provide virtual I/O services.
T4-4 with two sockets

image

  • The two possible root complexes are defined by the top-level PCI path and are as follows:

    • RC0: pci@400, green
    • RC1: pci@500, blue
      As in the four-socket case, there is some asymmetry here as well:
      • only RC0 has access to the onboard 1GbE ports,
      • but the disks are split between the 2 root complexes.

T4-2

image

  • The T4-2 has 2 root complexes, as shown:
    • RC0: pci@400, green
    • RC1: pci@500, blue
      In the case of the T4-2,
    • both root complexes have access to the onboard 1GbE ports, but
    • only the RC0 domain has visibility of the on-board disks.
  • Root domain how-to
  • The following steps are examples; the CPU, memory, and root-complex values may need to be changed to reflect the actual configuration.
    Steps to perform the initial configuration of the control domain:
    # ldm add-vcc port-range=5000-5100 primary-vcc0 primary   (add the virtual console concentrator service)
    # svcadm enable svc:/ldoms/vntsd:default                  (enable the virtual network terminal server daemon)
    # ldm start-reconf primary                                (enter delayed reconfiguration mode)
    # ldm set-vcpu 64 primary
    # ldm set-memory 256G primary
    # ldm remove-io pci@500 primary
    (repeat the remove-io step for pci@600 and pci@700 as required)
    # ldm add-spconfig initial; reboot;                       (save the configuration to the service processor and reboot)
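    After the reboot, the result can be sanity-checked as follows (output varies by platform):
    # ldm list-spconfig                (confirm the saved configuration is in use)
    # ldm list-io                      (show which PCIe buses and devices belong to which domain)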
    Building Root Domains
    Each root domain can now be created as required. The standard domain creation steps are used, except that we also add a PCIe root complex to each domain. The following assumes that the root complex associated with the root domain contains disks suitable for installing Solaris and provides network access:
    Steps to create a root domain:
    # ldm create ldom1
    # ldm set-vcpu 64 ldom1
    # ldm set-memory 256G ldom1
    # ldm add-io pci@500 ldom1
    Repeat as necessary for all other domains.
    Save the logical domain configuration to the service processor:
    # ldm add-spconfig domains
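    Before a root domain can be used it must still be bound and started; a minimal sketch:
    # ldm bind-domain ldom1
    # ldm start-domain ldom1
    # telnet localhost 5000            (console access via a vntsd port from the 5000-5100 range above)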

    Building Root Domains with virtual I/O for boot and networking
    It is possible to configure the root domains so that they can use virtual disks and networks provided by other root domains with direct access to internal disks and networks.
    In this case, it is simply a matter of configuring virtual disk and network services in the domains that own the hardware, and adding the corresponding virtual devices to the logical domain configuration above.
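    As a sketch, assuming the primary domain owns local disks and a network port and ldom2 is a root domain without them (device paths and names are illustrative only):
    # ldm add-vds primary-vds0 primary                      (virtual disk service in the I/O-owning domain)
    # ldm add-vdsdev /dev/dsk/c0t1d0s2 vol1@primary-vds0    (export a backend device through the service)
    # ldm add-vsw net-dev=net0 primary-vsw0 primary         (virtual switch over a physical NIC)
    # ldm add-vdisk vdisk1 vol1@primary-vds0 ldom2          (virtual disk in the consuming domain)
    # ldm add-vnet vnet1 primary-vsw0 ldom2                 (virtual network device in the consuming domain)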

About laotsao 老曹

HopBit GridComputing LLC: Rocks Cluster, Grid Engine, Solaris Zones, Solaris Cluster, OVM SPARC/LDoms, Exadata, SPARC SuperCluster
